[GSoC-PortA] Welcome to our GSoC PortfolioAnalytics project list

Brian G. Peterson brian at braverock.com
Tue Jun 4 01:37:22 CEST 2013



You're receiving this message from a private list I've set up to help 
keep us all coordinated over this summer's GSoC project.  All of the 
mentors and the student are subscribed and can send mail to the list.


Using the list will let everyone communicate with just one address (the 
list address), and will make it easier to sort these communications out 
in our inboxes (I'm sure all of you, like me, get lots of emails every 
day).  It should also lessen the pressure on everyone, as any of the 
mentors can respond if we have an answer to a student query, when we 
have a moment to do so.


Please use this list for all 'general' or non-time-critical discussions 
about the project, so that everyone stays informed.  Obviously, if you 
are local to one of the mentors, face-to-face meetings are encouraged.
Please do inform the rest of us of major decisions, though.


I anticipate that we will discuss design decisions, implementation 
roadblocks, and various implementation choices that need to be made as 
the summer progresses.  These discussions should take place on this 
mailing list as much as possible.


As you know, the schedule has several milestones:

  * Community Bonding Period: Today, May 27th
  * Official Coding Start: June 17th
  * Mid Term Evaluation: Jul 29th - Aug 2nd
  * Pencils Down: Sept 23rd
  * Final Evaluations Due: Sept 27th

Please add these to your calendar (or add the Google Calendar overlay 
<http://www.google-melange.com/gsoc/events/google/gsoc2013>[0] from 
Melange to yours for the summer).


As the first phase starts today, we want to re-emphasize what our 
objectives are for the short term.  This first phase is intended for you 
to get your bearings and firm up your coding plan.  To do that, you'll 
need to identify and obtain any source and background reading materials.
Start reading and asking questions.  Early on, you'll want to 
identify what data you need for tests and examples.  Inventory any 
existing code.  Familiarize yourself with packages you will use for 
development, and packages you will be committing code into.  Get set up 
with R-Forge and SVN, and make your first commits (no matter how minor).
List out what you want to start implementing and why.  Sketch out a 
project plan in more detail than you have previously.  Start 
communicating about what you are going to do when you start coding.


This project will use R-Forge <http://r-forge.r-project.org/>[1] 
extensively.  R-Forge provides a set of tools for source code management 
(Subversion) and various web-based features.  To use Subversion (svn) on 
R-Forge you'll need to register as a site user 
<https://r-forge.r-project.org/account/register.php>[2] and then log in 
<https://r-forge.r-project.org/account/login.php>[3].  If you are 
unfamiliar with R-Forge, you may want to review a copy of the User's 
Manual <http://download.r-forge.r-project.org/R-Forge_Manual.pdf>[4].
To get you set up with svn commit access on R-Forge, we will need your 
R-Forge id.


R-Forge uses svn for version control, and it will be very important for 
everyone (both mentors and students) to quickly get adept at using svn.
For more specifics about how to use svn, take a look at the book 
Version Control with Subversion <http://svnbook.red-bean.com/>[5].  You 
will need to install the client of your choice (e.g., Tortoise SVN 
<http://tortoisesvn.tigris.org/>[6] on Windows or svnX on Mac OS X) and 
check out the repository.  Please do not hesitate to ask for help if you 
get stuck - this is a critical component of our workflow and will be 
important for keeping everyone up to date with current code.  If you've 
previously checked out the code anonymously, you'll need to check it out 
again using your R-Forge id (and ssh key) before you'll be able to 
commit your changes.


In addition, everyone should join the R-Forge commit list for the 
project the code is being submitted into.  For example, go to the 
returnanalytics 
<https://r-forge.r-project.org/projects/returnanalytics/>[7] project on 
R-Forge.  You'll see a link for mailing lists, with one public mailing 
list called "returnanalytics-commits".  Subscribe to that, and you'll be 
notified by email of any commits made to the project.


Please try to make commits to svn at least daily while coding.  If you 
make an improvement and it works - check it in.  This way mentors will 
be able to test code continuously, and we'll know quickly if something 
is broken.  We suggest an iterative approach to development: first make 
it work, then make it work *correctly*, and finally make it work fast 
(if needed).  Do not try to make it perfect or even pretty before 
checking something in.  Make sure you provide a log message for each 
commit.  Look at the log of the repository you will be working with to 
get a feel for the logging style. Make small changes, frequently.  We 
know from past years that students who make incremental, small progress 
have a much greater chance of successfully finishing the summer.


Document as you write.  It is really important to write the 
documentation as you write functions, perhaps even *before* you write 
the function, at least to describe what it should do.

When documenting a function:

  * make sure equations are correct and cited,
  * make sure all user-facing functions have examples,
  * make sure you know the expected results of the examples (these will
    become tests),
  * make sure relevant literature is cited everywhere, and
  * apply a standard mathematical notation.  In most cases, follow the
    notation from the original paper.


We have converted, or are currently converting, all of our packages' 
documentation to roxygen2 
<http://cran.r-project.org/web/packages/roxygen2/index.html>[8], an 
in-source 'literate programming' 
<http://en.wikipedia.org/wiki/Literate_programming>[9] documentation 
system for generating Rd, collation, and NAMESPACE files.  What that 
means is that the documentation will be in the same file as the 
functions (as comments before each function) which will make writing and 
synchronizing the documentation easier for everyone.  Every function 
file will have the documentation and roxygen tags in the file, and 
roxygenize() will be run before the package build process to generate 
the Rd documentation files required by R.  Roxygen2 is available on CRAN.
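
To make the in-source layout concrete, here is a minimal sketch of what a 
roxygen2-documented function file might look like.  The function name 
excess_return and its arguments are hypothetical, chosen only for 
illustration; they are not taken from any of our packages.

```r
#' Excess return over a risk-free rate
#'
#' Subtracts a risk-free rate from a vector of periodic returns.
#' (Hypothetical illustration; not an existing package function.)
#'
#' @param R numeric vector of periodic returns
#' @param Rf risk-free rate for the same period, either a single number
#'   or a vector the same length as \code{R}
#' @return numeric vector of excess returns, the same length as \code{R}
#' @author Your Name
#' @examples
#' # the expected result is known in advance, so this can become a test
#' excess_return(c(0.02, -0.01), Rf = 0.005)
#' @export
excess_return <- function(R, Rf = 0) {
  # fail early on non-numeric input
  stopifnot(is.numeric(R), is.numeric(Rf))
  R - Rf
}
```

Assuming the file lives in the R/ directory of a package, running 
roxygen2::roxygenize() on the package directory should generate the 
corresponding Rd file under man/ and the export entry in NAMESPACE.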


Equations in documentation should have both full LaTeX code for printing 
in the PDF and a text representation that will be used in the console 
help. Use:


\eqn{\LaTeX}{ascii}

or

\deqn{\LaTeX}{ascii}


Greek letters will also be rendered in the HTML help.  However, the only 
way to get the full mathematical equation layout is in the PDF rendered 
from LaTeX.
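
As an illustration of the two-argument form, the sketch below documents 
a portfolio variance calculation inside a roxygen comment block.  The 
function portfolio_variance is a hypothetical example written for this 
message, not an existing package function; \eqn is used for inline math 
and \deqn for a displayed equation.

```r
#' Portfolio variance
#'
#' Computes the variance of a portfolio from asset weights and the
#' covariance matrix of asset returns:
#' \deqn{\sigma_p^2 = w^T \Sigma w}{var_p = t(w) * Sigma * w}
#' where \eqn{w}{w} is the vector of weights and
#' \eqn{\Sigma}{Sigma} is the covariance matrix.
#' (Hypothetical illustration; not an existing package function.)
#'
#' @param w numeric vector of asset weights
#' @param Sigma square covariance matrix with one row per asset
#' @return the portfolio variance as a single number
#' @examples
#' portfolio_variance(c(0.5, 0.5), diag(c(0.04, 0.09)))
#' @export
portfolio_variance <- function(w, Sigma) {
  # dimensions must agree before the quadratic form is evaluated
  stopifnot(nrow(Sigma) == ncol(Sigma), length(w) == nrow(Sigma))
  as.numeric(t(w) %*% Sigma %*% w)
}
```

The first argument of each macro is what the PDF manual typesets; the 
second is the plain-text fallback shown in the console help.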


For more information on documentation and R package development in 
general, read 'Writing R Extensions 
<http://cran.r-project.org/doc/manuals/R-exts.pdf>'[10].


Although preferences for code style vary, a consistent style matters 
for readability and future maintainability when a package has a number 
of contributors.  You should strive (as much as is 
practical) to match the style in the existing code.  When in doubt, rely 
on Google's R Style Guide 
<http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html>[11] 
or ask the mailing list.


Everyone should know how to build packages from source, although 
once-daily builds may be available on R-Forge.  A *nix machine should 
have everything needed (see Appendix A of 'R Installation and 
Administration 
<http://cran.r-project.org/doc/manuals/R-admin.pdf>'[13]), but a regular 
Windows machine will not.  Windows users will need to install RTools 
<http://cran.r-project.org/bin/windows/Rtools/>[12], a collection of 
resources for building packages for R under Microsoft Windows (see 
Appendix D of 'R Installation and Administration 
<http://cran.r-project.org/doc/manuals/R-admin.pdf>'[13]).  Once all 
tools are in place, you should be able to build and install the package 
by opening a shell, moving to the directory that contains the package 
source directory, and typing 'R CMD INSTALL packagename'.  The R-Forge 
Manual provides more detail in section 4.
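
If you prefer to stay inside an R session, the same install from source 
can be sketched with install.packages().  The helper name 
install_from_source and the checkout path in the comment are assumptions 
for illustration only; substitute the location of your own working copy.

```r
# Hypothetical helper (not part of R or R-Forge): installs a package from
# a source directory, much like 'R CMD INSTALL <dir>' run from a shell.
install_from_source <- function(pkg_dir) {
  # every package source directory contains a DESCRIPTION file
  if (!file.exists(file.path(pkg_dir, "DESCRIPTION"))) {
    message("No package source found at ", pkg_dir)
    return(invisible(FALSE))
  }
  # repos = NULL makes install.packages() treat pkg_dir as a local path
  install.packages(pkg_dir, repos = NULL, type = "source")
  invisible(TRUE)
}

# Assumed checkout location; replace with the path to your working copy.
# install_from_source("returnanalytics/pkg/PortfolioAnalytics")
```

On Windows this still requires Rtools whenever the package contains 
compiled code.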


I know this was a lot of information, but we thought it would make sense 
to get most of it out of the way immediately, in a format that is easily 
referred to throughout the summer.  Please don't hesitate to use the 
list to ask any questions; that's what it is here for.



Regards,


Brian



References:

[0] Google Calendar overlay
    http://www.google-melange.com/gsoc/events/google/gsoc2013

[1] R-Forge
    http://r-forge.r-project.org/

[2] R-Forge registration
    https://r-forge.r-project.org/account/register.php

[3] R-Forge login
    https://r-forge.r-project.org/account/login.php

[4] R-Forge User Manual
    http://download.r-forge.r-project.org/R-Forge_Manual.pdf

[5] SVN Book
    http://svnbook.red-bean.com/

[6] Tortoise SVN
    http://tortoisesvn.tigris.org/

[7] ReturnAnalytics on R-Forge
    https://r-forge.r-project.org/projects/returnanalytics/

[8] roxygen2
    http://cran.r-project.org/web/packages/roxygen2/index.html

[9] literate programming
    http://en.wikipedia.org/wiki/Literate_programming

[10] Writing R Extensions
    http://cran.r-project.org/doc/manuals/R-exts.pdf

[11] Google's R style guide
    http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html

[12] Rtools
    http://cran.r-project.org/bin/windows/Rtools/

[13] R Installation and Administration
    http://cran.r-project.org/doc/manuals/R-admin.pdf



-- 
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock
