[GSoC-PortA] Welcome to our GSoC PortfolioAnalytics project list
Brian G. Peterson
brian at braverock.com
Tue Jun 4 01:37:22 CEST 2013
You're receiving this message from a private list I've set up to help
keep us all coordinated over this summer's GSoC project. All of the
mentors and the student are subscribed and can send mail to the list.
Using the list will let everyone communicate with just one address (the
list address), and will make it easier to sort these communications out
in our inboxes (I'm sure all of you, like me, get lots of emails every
day). It should also lessen the pressure on everyone, as any of the
mentors can respond if we have an answer to a student query, when we
have a moment to do so.
Please use this list for all 'general' or non-time-critical discussions
about the project, so that everyone stays informed. Obviously, if you
are local to one of the mentors, face to face meetings are encouraged.
Please do inform the rest of us of major decisions though.
I anticipate that we will discuss design decisions, implementation
roadblocks, and various implementation choices that need to be made as
the summer progresses. These discussions should take place on this
mailing list as much as possible.
As you know, the schedule has several milestones:
* Community Bonding Period: Today, May 27th
* Official Coding Start: June 17th
* Mid Term Evaluation: Jul 29th - Aug 2nd
* Pencils Down: Sept 23rd
* Final Evaluations Due: Sept 27th
Please add these to your calendar (or add the Google Calendar overlay
<http://www.google-melange.com/gsoc/events/google/gsoc2013>[0] from
Melange to yours for the summer).
As the first phase starts today, we want to re-emphasize what our
objectives are for the short term. This first phase is intended for you
to get your bearings and firm up your coding plan. To do that, you'll
need to identify and obtain any source and background reading materials.
Start reading and asking questions. Early on, you'll want to
identify what data you need for tests and examples. Inventory any
existing code. Familiarize yourself with packages you will use for
development, and packages you will be committing code into. Get set up
with R-forge and SVN, and make your first commits (no matter how minor).
List out what you want to start implementing and why. Sketch out a
project plan in more detail than you have previously. Start
communicating about what you are going to do when you start coding.
This project will use R-Forge <http://r-forge.r-project.org/>[1]
extensively. R-Forge provides a set of tools for source code management
(Subversion) and various web-based features. To use Subversion (svn) on
R-Forge you'll need to register as a site user
<https://r-forge.r-project.org/account/register.php>[2] and then log in
<https://r-forge.r-project.org/account/login.php>[3]. If you are
unfamiliar with R-Forge, you may want to review a copy of the User's
Manual <http://download.r-forge.r-project.org/R-Forge_Manual.pdf>[4].
To get you set up with svn commit access on R-Forge, we will need your
r-forge id.
R-Forge uses svn for version control, and it will be very important for
everyone (both mentors and students) to quickly get adept at using svn.
For more specifics about how to use svn, take a look at the book
Version Control with Subversion <http://svnbook.red-bean.com/>[5]. You
will need to install the client of your choice (e.g., Tortoise SVN
<http://tortoisesvn.tigris.org/>[6] on Windows or svnX on Mac OS X) and
check out the repository. Please do not hesitate to ask for help if you
get stuck - this is a critical component of our workflow and will be
important for keeping everyone up to date with current code. If you've
previously checked out the code anonymously, you'll need to check it out
again using your R-Forge id (and ssh key) before you'll be able to
commit your changes.
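As a rough sketch of what the authenticated checkout looks like: the svn+ssh URL layout and the placeholder id below are assumptions, not something stated in this message, so please copy the exact command from the project's SCM page on R-Forge. The script only prints the command, since running it needs a real id and a registered ssh key.

```shell
#!/bin/sh
# Assumptions: "your_rforge_id" is a placeholder, and the svn+ssh host and
# path follow the layout R-Forge showed on project SCM pages. Verify both
# against the SCM page of the project before use.
RFORGE_ID="your_rforge_id"
PROJECT="returnanalytics"
REPO_URL="svn+ssh://${RFORGE_ID}@svn.r-forge.r-project.org/svnroot/${PROJECT}"

# Print the checkout command instead of running it.
echo "svn checkout ${REPO_URL} ${PROJECT}"
```

An anonymous working copy cannot be committed from, which is why a fresh checkout under your own id is needed.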
In addition, everyone should join the r-forge commit list for the
project the code is being submitted into. For example, go to the
returnanalytics
<https://r-forge.r-project.org/projects/returnanalytics/>[7] project on
R-Forge. You'll see a link for mailing lists, with one public mailing
list called "returnanalytics-commits". Subscribe to that, and you'll be
notified by email of any commits made to the project.
Please try to make commits to svn at least daily while coding. If you
make an improvement and it works - check it in. This way mentors will
be able to test code continuously, and we'll know quickly if something
is broken. We suggest an iterative approach to development: first make
it work, then make it work *correctly*, and finally make it work fast
(if needed). Do not try to make it perfect or even pretty before
checking something in. Make sure you provide a log message for each
commit. Look at the log of the repository you will be working with to
get a feel for the logging style. Make small changes, frequently. We
know from past years that students who make incremental, small progress
have a much greater chance of successfully finishing the summer.
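If you want to practice the update / review / commit loop before touching the project repository, something like the following may help. It is only a sketch: it works in a throwaway local repository created with svnadmin, and the file name, function, and log messages are made up for illustration.

```shell
#!/bin/sh
set -e
# Practice run in a throwaway local repository; nothing here touches R-Forge.
if ! command -v svnadmin >/dev/null 2>&1; then
  echo "Subversion is not installed; skipping the practice run."
  exit 0
fi

WORK=$(mktemp -d)
svnadmin create "$WORK/repo"
svn checkout -q "file://$WORK/repo" "$WORK/wc"
cd "$WORK/wc"

# First small change: add a file and commit it with a log message.
echo 'add2 <- function(x) x + 2' > add2.R   # hypothetical placeholder function
svn add -q add2.R
svn commit -q -m "add add2(), a placeholder function for practice"

# Typical daily loop: update, review what changed, commit.
svn update -q
echo '# TODO: document add2()' >> add2.R
svn status      # lists locally modified files
svn diff        # shows exactly what will be committed
svn commit -q -m "add2: note missing documentation"

# Read the log to get a feel for the logging style.
svn update -q
svn log -l 2
```

The same subcommands (update, status, diff, commit, log) apply unchanged in a working copy checked out from R-Forge.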
Document as you write. It is really important to write the
documentation as you write functions, perhaps even *before* you write
the function, at least to describe what it should do.
When documenting a function:
* make sure equations are correct and cited,
* make sure all user-facing functions have examples,
* make sure you know the expected results of the examples (these will
  become tests),
* make sure relevant literature is cited everywhere, and
* apply a standard mathematical notation. In most cases, follow the
  notation from the original paper.
We have converted, or are currently converting, all of our packages'
documentation to roxygen2
<http://cran.r-project.org/web/packages/roxygen2/index.html>[8], an
in-source 'literate programming'
<http://en.wikipedia.org/wiki/Literate_programming>[9] documentation
system for generating Rd, collation, and NAMESPACE files. What that
means is that the documentation will be in the same file as the
functions (as comments before each function) which will make writing and
synchronizing the documentation easier for everyone. Every function
file will have the documentation and roxygen tags in the file, and
roxygenize() will be run before the package build process to generate
the Rd documentation files required by R. Roxygen2 is available on CRAN.
Equations in documentation should have both full LaTeX code for printing
in the PDF and a text representation that will be used in the console
help. Use:
\eqn{\LaTeX}{ascii}
or
\deqn{\LaTeX}{ascii}
Greek letters will also be rendered in the HTML help. However, the only
way to get the full mathematical equation layout is in the PDF rendered
from LaTeX.
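To make the roxygen and equation conventions above concrete, here is a minimal sketch of a documented function file. The function name sharpe_sketch and the file name are hypothetical and not part of any of our packages; the script just writes the file to a temporary directory and prints it so you can see the layout.

```shell
#!/bin/sh
set -e
# Hypothetical file and function names, for illustration only.
OUT=$(mktemp -d)
cat > "$OUT/sharpe_sketch.R" <<'EOF'
#' Sharpe ratio (illustrative sketch)
#'
#' Computes the ratio of the mean excess return to its standard deviation:
#' \deqn{\frac{\overline{R - R_f}}{\sigma_{R - R_f}}}{mean(R - Rf) / sd(R - Rf)}
#'
#' @param R numeric vector of returns.
#' @param Rf risk-free rate, \eqn{R_f}{Rf}, in the same periodicity as \code{R}.
#' @return A single numeric value.
#' @references Sharpe, W. F. (1994). The Sharpe Ratio.
#'   Journal of Portfolio Management, 21(1), 49-58.
#' @examples
#' sharpe_sketch(c(0.01, 0.02, -0.01), Rf = 0)
#' @export
sharpe_sketch <- function(R, Rf = 0) {
  mean(R - Rf) / sd(R - Rf)
}
EOF

# Show the result; in a real package this file would live under R/ and
# roxygenize() would turn the comments into an Rd file.
cat "$OUT/sharpe_sketch.R"
```

The first argument of \deqn and \eqn is the LaTeX used for the PDF, and the second is the plain-text fallback shown in the console help.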
For more information on documentation and R package development in
general, read 'Writing R Extensions
<http://cran.r-project.org/doc/manuals/R-exts.pdf>'[10].
Preferences for code style vary, but when a package has a number of
contributors, a consistent style is important for the readability and
future maintainability of the code. You should strive (as much as is
practical) to match the style in the existing code. When in doubt, rely
on Google's R Style Guide
<http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html>[11]
or ask the mailing list.
Everyone should know how to build packages from source, although
once-daily builds may be available on R-Forge. A *nix machine should
have everything needed (see Appendix A of 'R Installation and
Administration
<http://cran.r-project.org/doc/manuals/R-admin.pdf>'[13]), but a regular
Windows machine will not. Windows users will need to install RTools
<http://cran.r-project.org/bin/windows/Rtools/>[12], a collection of
resources for building packages for R under Microsoft Windows (see
Appendix D of 'R Installation and Administration
<http://cran.r-project.org/doc/manuals/R-admin.pdf>'[13]). Once all
tools are in place, you should be able to build the package by opening a
shell, moving to the directory of the package, and typing 'R CMD INSTALL
packagename'. The R-Forge Manual provides more detail in section 4.
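One way to check that your build tools work, before trying a real package, is to build and install a throwaway package. The sketch below assumes R is on your PATH; the package name demopkg, its contents, and the temporary library location are all made up for this example.

```shell
#!/bin/sh
set -e
if ! command -v R >/dev/null 2>&1; then
  echo "R is not on the PATH; skipping the practice build."
  exit 0
fi

WORK=$(mktemp -d)
PKG="$WORK/demopkg"                 # hypothetical package name
mkdir -p "$PKG/R" "$WORK/lib"

# Minimal package skeleton: DESCRIPTION, NAMESPACE, and one function.
cat > "$PKG/DESCRIPTION" <<'EOF'
Package: demopkg
Type: Package
Title: Throwaway Package for Practising Builds
Version: 0.0.1
Author: A Student
Maintainer: A Student <student@example.com>
Description: A minimal package used only to try the build tools.
License: GPL-3
EOF
echo 'export(hello)' > "$PKG/NAMESPACE"
echo 'hello <- function() "hello from demopkg"' > "$PKG/R/hello.R"

# Build a source tarball, then install it into a private library.
cd "$WORK"
R CMD build demopkg
R CMD INSTALL --library="$WORK/lib" demopkg_0.0.1.tar.gz

# Load the installed package and call its one function.
RESULT=$(Rscript -e 'library(demopkg, lib.loc = commandArgs(TRUE)[1]); cat(hello())' "$WORK/lib")
echo "$RESULT"
```

Installing into a private library keeps the experiment out of your main R library; for a real package you would normally drop the --library option.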
I know this was a lot of information, but we thought it would make sense
to get most of it out of the way immediately, in a format that is easily
referred to throughout the summer. Please don't hesitate to use the
list to ask any questions; that's what it is here for.
Regards,
Brian
References:
[0] Google Calendar overlay
http://www.google-melange.com/gsoc/events/google/gsoc2013
[1] R-Forge
http://r-forge.r-project.org/
[2] R-Forge registration
https://r-forge.r-project.org/account/register.php
[3] R-Forge login
https://r-forge.r-project.org/account/login.php
[4] R-Forge User Manual
http://download.r-forge.r-project.org/R-Forge_Manual.pdf
[5] SVN Book
http://svnbook.red-bean.com/
[6] Tortoise SVN
http://tortoisesvn.tigris.org/
[7] ReturnAnalytics on R-Forge
https://r-forge.r-project.org/projects/returnanalytics/
[8] roxygen2
http://cran.r-project.org/web/packages/roxygen2/index.html
[9] literate programming
http://en.wikipedia.org/wiki/Literate_programming
[10] Writing R Extensions
http://cran.r-project.org/doc/manuals/R-exts.pdf
[11] Google's R style guide
http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html
[12] Rtools
http://cran.r-project.org/bin/windows/Rtools/
[13] R Installation and Administration
http://cran.r-project.org/doc/manuals/R-admin.pdf
--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock