<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="" id="magicdomid2"><span
        class="author-g-58dmoeyjkh7xk6yi">Dear list,</span></div>
    <div class="" id="magicdomid3"><br>
    </div>
    <div class="" id="magicdomid4"><span
        class="author-g-58dmoeyjkh7xk6yi">Over the last few months we
        (mostly Maarten, with some work by Lennart) have been working on
        the refactoring of the ProbABEL code. Our aim was twofold:</span></div>
    <div class="" id="magicdomid5"><span
        class="author-g-58dmoeyjkh7xk6yi">1) Create a code base that is
        easier to maintain by splitting of parts of the code into
        separate files (e.g. data.h was getting bigger and bigger,
        contained 'real code' instead of only headers, etc.)</span></div>
    <div class="" id="magicdomid6"><span
        class="author-g-58dmoeyjkh7xk6yi">2) See if we could implement
        performance improvements by using a library for much of the
        matrix algebra involved. </span></div>
    <div class="" id="magicdomid7"><br>
    </div>
    <div class="" id="magicdomid8"><span
        class="author-g-heqqyadjvu8gz122zuva">I am </span><span
        class="author-g-58dmoeyjkh7xk6yi">now proud to announce that
        this work has come to fruitition. As of SVN r1028 the changes
        from the probabel-refactoring branch have been integrated in
        trunc. An official release is planned to occur before the new
        year. </span></div>
    <div class="" id="magicdomid9"><br>
    </div>
    <div class="" id="magicdomid10"><span
        class="author-g-58dmoeyjkh7xk6yi">In the following some more
        details about the changes to the code will be discussed. </span></div>
    <div class="" id="magicdomid11"><br>
    </div>
    <div class="" id="magicdomid12"><span
        class="author-g-58dmoeyjkh7xk6yi">Matrix Algebra library
        implementation:</span></div>
    <div class="" id="magicdomid13"><span
        class="author-g-58dmoeyjkh7xk6yi">After some research and
        testing the Eigen template library (</span><span
        class="author-g-58dmoeyjkh7xk6yi url"><a
          href="http://eigen.tuxfamily.org/%29">http://eigen.tuxfamily.org/)</a></span><span
        class="author-g-58dmoeyjkh7xk6yi"> was chosen. It provides an
        easy way to use the most common forms of matrix manipulation as
        well as various solvers (which we might use in the future).
        Furthermore, since Eigen is a  C++ template library, only the
        header files are needed (at compile time), i.e. no need to </span><span
        class="author-g-heqqyadjvu8gz122zuva">build</span><span
        class="author-g-58dmoeyjkh7xk6yi"> a library or for a user to
        have a library installed. This makes life easier for a user who
        wants to install the software, but doesn't have root permissions
        on his/her system. Furthermore, Eigen has</span><span
        class="author-g-ducgikfxvi8xuntp"> a</span><span
        class="author-g-58dmoeyjkh7xk6yi">n</span><span
        class="author-g-ducgikfxvi8xuntp"> Open Source Initiative
        approved GPL compatible licence and claim</span><span
        class="author-g-58dmoeyjkh7xk6yi">s</span><span
        class="author-g-ducgikfxvi8xuntp"> to be fast for matrix*matrix
        operations (</span><span class="author-g-ducgikfxvi8xuntp url"><a
href="http://download.tuxfamily.org/eigen/btl-results-110323/matrix_matrix.pdf%29">http://download.tuxfamily.org/eigen/btl-results-110323/matrix_matrix.pdf)</a></span><span
        class="author-g-heqqyadjvu8gz122zuva">.</span><span
        class="author-g-ducgikfxvi8xuntp"> I thought of making use of
        BLAS libraries, but this is much less intuitive (look up the
        function of matrix multiplication called DGEMM and compare this
        to the Eigen function *</span><span
        class="author-g-heqqyadjvu8gz122zuva"> (i.e. the * operator)</span><span
        class="author-g-ducgikfxvi8xuntp">). BLAS libraries can also be 
        hard to install to get a good performance</span><span
        class="author-g-heqqyadjvu8gz122zuva"> </span><span
        class="author-g-ducgikfxvi8xuntp">(ATLAS:</span><span
        class="author-g-heqqyadjvu8gz122zuva"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://math-atlas.sourceforge.net/%29">http://math-atlas.sourceforge.net/)</a></span><span
        class="author-g-ducgikfxvi8xuntp"> or proprietary (Intel MKL:</span><span
        class="author-g-heqqyadjvu8gz122zuva"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://software.intel.com/en-us/intel-mkl%29">http://software.intel.com/en-us/intel-mkl)</a></span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-jfz122z6uft3wac00vcj">)</span></div>
    <div class="" id="magicdomid14"><span
        class="author-g-58dmoeyjkh7xk6yi">For the implementation I chose
        to make a parallel implementation, where both the old 'mematrix'
        operations and datastructures can be used, as well as the Eigen
        variants of the code. This was done by making a set of
        'eigen_mematrix' functions that can be called in the same way as
        their non-Eigen counterparts and wrap those calls into Eigen
        calls. This way the other parts of the code can be used
        independently of the matrix backend and don't need to be
        rewritten. This also made comparing the new and old versions of
        the PA tools very easy. </span></div>
    <div class="" id="magicdomid15"><span
        class="author-g-58dmoeyjkh7xk6yi">For now both backends</span><span
        class="author-g-heqqyadjvu8gz122zuva"> will be maintained</span><span
        class="author-g-58dmoeyjkh7xk6yi">, but the old mematrix
        implementation is considered deprecated and will be removed in a
        future release of ProbABEL. </span></div>
    <div class="" id="magicdomid16"><br>
    </div>
    <div class="" id="magicdomid17"><span
        class="author-g-58dmoeyjkh7xk6yi">More general code refactoring:</span></div>
    <div class="" id="magicdomid18"><span
        class="author-g-58dmoeyjkh7xk6yi">Aside from the changes related
        to Eigen, I also split up the code into .cpp and corresponding
        .h files for the various classes (most of which previously lived
        in data.h). This improves the readability of the code as well as
        reducing compile time when changes is made in only one class. </span></div>
    <div class="" id="magicdomid19"><span
        class="author-g-58dmoeyjkh7xk6yi">Furthermore, many function
        arguments were set to 'const' (where appropriate) to help
        prevent bugs from showing up. This effort is not finished and
        will be continued in future releases. </span></div>
    <div class="" id="magicdomid20"><br>
    </div>
    <div class="" id="magicdomid21"><span
        class="author-g-58dmoeyjkh7xk6yi">Build monitoring and other
        tools:</span></div>
    <div class="" id="magicdomid22"><span
        class="author-g-58dmoeyjkh7xk6yi">In order to monitor all these
        changes and ensure that the project remained in a compilable
        state, I installed the Jenkins continuous integration platform
        on </span><span class="author-g-ducgikfxvi8xuntp">my</span><span
        class="author-g-58dmoeyjkh7xk6yi"> workstation. Jenkins monitors
        SVN and each time a change is detected it tries to recompile
        ProbABEL. But that's not all, you can basically run any program.
        In this way we added checks for memory leaks using Valgrind,
        static code analysis using cppcheck, simian to find code
        duplication, etc. This helped us a lot in not only find
        (possible) bugs, but also in making the code cleaner. </span></div>
    <div class="" id="magicdomid23"><span
        class="author-g-heqqyadjvu8gz122zuva">Unfortunately it isn't
        possible to install Jenkins on the GenABEL.org web server as it
        is a java-based webserver and (of course) requires the various
        compile and check tools to be installed. R-forge doesn't seem to
        provide a similar service either. </span></div>
    <div class="" id="magicdomid24"><br>
    </div>
    <div class="" id="magicdomid25"><span
        class="author-g-58dmoeyjkh7xk6yi">Profiling:</span></div>
    <div class="" id="magicdomid26"><span
        class="author-g-58dmoeyjkh7xk6yi">In order to find out where
        ProbABEL (mostly palinear in our tests) spends most of its time
        the application was profiled using Valgrind's callgrind option
        as well as GNU gprof. Data was visualised using kcachegrind and
        Gprof2Dot allowing us to make informed decisions on which parts
        of the code are candidates for speedups. It turns out that more
        than </span><span class="author-g-ducgikfxvi8xuntp">30</span><span
        class="author-g-58dmoeyjkh7xk6yi">% of the time when running
        with the --mmscore option (the main use case in the GenEpi group
        at the ErasmusMC, Rotterdam) was spent in creatin</span><span
        class="author-g-ducgikfxvi8xuntp">g</span><span
        class="author-g-58dmoeyjkh7xk6yi"> the var-covar matrix. </span></div>
    <div class="" id="magicdomid27"><br>
    </div>
    <div class="" id="magicdomid28"><span
        class="author-g-ducgikfxvi8xuntp">The other 69% of the time it
        spend on matrix matrix products. That  is one of the reasons
        Eigen was chosen as it makes use of SSE  instruction in the CPU
        for its calculations. As a result, operation on  doubles is
        approximately twice as fast as before. </span></div>
    <div class="" id="magicdomid29"><br>
    </div>
    <div class="" id="magicdomid30"><span
        class="author-g-58dmoeyjkh7xk6yi">Benchmarking:</span></div>
    <div class="" id="magicdomid31"><span
        class="author-g-58dmoeyjkh7xk6yi">And now the moment you have
        all been waiting for: some benchmarks. These were done</span><span
        class="author-g-ducgikfxvi8xuntp"> on a 12 core intel Xeon
        X5680  @ 3.33GHz with 140 GB memory machine in a  parallel way: 
        The short running jobs had to share the cpu and memory  resource
        with 10 other jobs</span><span class="author-g-58dmoeyjkh7xk6yi">
        (becuase our server was busy at that time)</span><span
        class="author-g-ducgikfxvi8xuntp">.  The long running jobs had
        to share this  resource with only 2 or 3 other tasks.</span></div>
    <div class="" id="magicdomid32"><br>
    </div>
    <div class="" id="magicdomid33"><span
        class="author-g-ducgikfxvi8xuntp">The metrics used in the graph
        arre produced with  /usr/bin/time -f "%e\t%U\t%K\t%M\t%C"
        palinear.</span><span class="author-g-heqqyadjvu8gz122zuva"> </span><span
        class="author-g-ducgikfxvi8xuntp">The options -mmscore ,--chrom
        9 , --no-head --map were enabled  and the input data was</span><span
        class="author-g-heqqyadjvu8gz122zuva"> </span><span
        class="author-g-ducgikfxvi8xuntp">in filevector format.</span><span
        class="author-g-heqqyadjvu8gz122zuva"> As you can see the work
        paid off: a factor of ~ 4 decrease in computation time when
        using mmscore, as well as a reduction in memory usage (for large
        sample sizes).</span></div>
    <div class="" id="magicdomid34"><br>
    </div>
    <div class="" id="magicdomid35"><br>
    </div>
    <div class="" id="magicdomid36"><span
        class="author-g-ducgikfxvi8xuntp">If you have any question I am
        happy to answer them.</span></div>
    <div class="" id="magicdomid37"><br>
    </div>
    <div class="" id="magicdomid38"><br>
    </div>
    <div class="" id="magicdomid39"><span
        class="author-g-ducgikfxvi8xuntp">Kind Regards,</span></div>
    <div class="" id="magicdomid40"><br>
    </div>
    <div class="" id="magicdomid41"><span
        class="author-g-ducgikfxvi8xuntp">Maarten</span></div>
    <div class="" id="magicdomid42"><br>
    </div>
    <div class="" id="magicdomid43"><br>
    </div>
    <div class="" id="magicdomid44"><br>
    </div>
    <div class="" id="magicdomid45"><span
        class="author-g-heqqyadjvu8gz122zuva">Below some URLs to tools
        that were used.</span></div>
    <div class="" id="magicdomid46"><br>
    </div>
    <div class="" id="magicdomid47"><span
        class="author-g-jfz122z6uft3wac00vcj b"><b>Analyses: </b></span></div>
    <div class="" id="magicdomid48"><span
        class="author-g-ducgikfxvi8xuntp i"><i>Profiling</i></span><span
        class="author-g-heqqyadjvu8gz122zuva i"><i>:</i></span></div>
    <div class="" id="magicdomid49"><span
        class="author-g-ducgikfxvi8xuntp">valgrind</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> --tool</span><span
        class="author-g-ducgikfxvi8xuntp padtagsearch
        padtagsearch_callgrind"><a
          href="http://piratepad.net/ep/search?query=callgrind">=callgrind</a></span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://valgrind.org/">http://valgrind.org/</a></span></div>
    <div class="" id="magicdomid50"><span
        class="author-g-ducgikfxvi8xuntp">GNU gprof</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://www.gnu.org/software/binutils/">http://www.gnu.org/software/binutils/</a></span></div>
    <div class="" id="magicdomid51"><span
        class="author-g-ducgikfxvi8xuntp i"><i>Visualisation</i></span><span
        class="author-g-heqqyadjvu8gz122zuva i"><i>:</i></span></div>
    <div class="" id="magicdomid52"><span
        class="author-g-ducgikfxvi8xuntp">Gprof2Dot</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot">http://code.google.com/p/jrfonseca/wiki/Gprof2Dot</a></span></div>
    <div class="" id="magicdomid53"><span
        class="author-g-ducgikfxvi8xuntp">kachegrind</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://kcachegrind.sourceforge.net/html/Home.html">http://kcachegrind.sourceforge.net/html/Home.html</a></span></div>
    <div class="" id="magicdomid54"><br>
    </div>
    <div class="" id="magicdomid55"><span
        class="author-g-jfz122z6uft3wac00vcj b"><b>Development:</b></span></div>
    <div class="" id="magicdomid56"><span
        class="author-g-jfz122z6uft3wac00vcj">jenkins</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://jenkins-ci.org/">http://jenkins-ci.org/</a></span></div>
    <div class="" id="magicdomid57"><span
        class="author-g-ducgikfxvi8xuntp">line</span><span
        class="author-g-heqqyadjvu8gz122zuva">s of code</span><span
        class="author-g-ducgikfxvi8xuntp"> count</span><span
        class="author-g-heqqyadjvu8gz122zuva">L</span><span
        class="author-g-ducgikfxvi8xuntp">  SLOCCount </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://www.dwheeler.com/sloccount/">http://www.dwheeler.com/sloccount/</a></span></div>
    <div class="" id="magicdomid58"><span
        class="author-g-ducgikfxvi8xuntp">cppcheck</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://cppcheck.sourceforge.net/">http://cppcheck.sourceforge.net/</a></span></div>
    <div class="" id="magicdomid59"><span
        class="author-g-ducgikfxvi8xuntp">cpplint.py</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://google-styleguide.googlecode.com/svn/trunk/cpplint/">http://google-styleguide.googlecode.com/svn/trunk/cpplint/</a></span></div>
    <div class="" id="magicdomid60"><span
        class="author-g-ducgikfxvi8xuntp">simian</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://www.harukizaemon.com/simian/">http://www.harukizaemon.com/simian/</a></span><span
        class="author-g-ducgikfxvi8xuntp"> to detect code duplication</span></div>
    <div class="" id="magicdomid61"><span
        class="author-g-jfz122z6uft3wac00vcj">eclipse</span><span
        class="author-g-ducgikfxvi8xuntp"> cdt</span><span
        class="author-g-heqqyadjvu8gz122zuva">:</span><span
        class="author-g-ducgikfxvi8xuntp"> </span><span
        class="author-g-ducgikfxvi8xuntp url"><a
          href="http://www.eclipse.org/cdt/">http://www.eclipse.org/cdt/</a></span></div>
    <div class="" id="magicdomid62"><br>
    </div>
  </body>
</html>