[GenABEL-dev] GenABEL tutorials to SVN

Yurii Aulchenko yurii.aulchenko at gmail.com
Sun Feb 24 17:27:28 CET 2013


Dear Lennart, Maarten, All,

Lennart - thank you for drawing attention to the problem that not being
able to compile the document affects the whole thing (I personally would
never change the code of something I am not able to compile :) ). Maarten,
thanks a lot for suggesting this elegant Jenkins solution - I was not aware
of this system.

Altogether, I see two ways to proceed:

1) Solution based on Jenkins - the code is open and can be modified, but
the build happens in a 'private' environment

2) I replace the data sets we cannot distribute with some small fake
data sets. This will make the code technically compilable, though all the
interpretation of the "results" will be screwed up

I like solution (1) technically, but I do not like that it does not really
address the underlying point: we cannot share these data sets. In a way,
people can look at these PDFs, but they cannot use these parts as a
tutorial, because they do not have the data sets! Therefore, if I had to
choose, I would be inclined towards solution (2).

But I think we do not need to choose. We could keep both the "full old PDF"
and the "incomplete new" one in the tutorial section of genabel.org. Next,
I am going to try to construct a smarter Makefile, which could build either
the 'private' or the 'public' version depending on the availability of the
data files. Then we could combine both solutions :)
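
To make the idea concrete, here is a minimal sketch in Python of the
selection logic such a Makefile could delegate to. It is only an
illustration under stated assumptions: the file names and the
choose_variant helper are hypothetical, and nothing here is existing
GenABEL code.

```python
from pathlib import Path

# Hypothetical file names: the actual restricted data sets are not named here.
RESTRICTED_DATA = ["restricted_a.RData", "restricted_b.RData"]


def choose_variant(data_dir, required=RESTRICTED_DATA):
    """Decide which tutorial variant can be built from data_dir.

    Returns a tuple (variant, missing): variant is 'private' when every
    required data file is present and 'public' otherwise; missing lists
    the required files that were not found.
    """
    data_dir = Path(data_dir)
    missing = [name for name in required if not (data_dir / name).is_file()]
    variant = "public" if missing else "private"
    return variant, missing
```

A build script could call choose_variant("data") and pass the result on to
the document build, so that chapters depending on missing files are skipped
or built with replacement data in the 'public' version.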

I think the next steps are 1) for me to look up how many chapters in the
GenABEL tutorial become meaningless when I remove these data sets, and
2) to try to write a smart Makefile - hopefully with your help.

Let me know what you think, and I will keep you updated.

best wishes,
Yurii

On Wed, Feb 20, 2013 at 10:40 PM, Maarten Kooyman <kooyman at gmail.com> wrote:

> Dear All,
>
> I think in the long run replacing the data is the best thing to do
> (although it will take quite some effort).
>
> As a temporary solution we could use a build server with Jenkins
> (http://jenkins-ci.org/) that recreates the document after each
> change in SVN and publishes it in a public place (by copying it to a
> web server). On this build server the data sets are secure in a trusted
> environment and the results are visible to the outside world. I also use
> Jenkins for monitoring ProbABEL, but the goal is the same: keep the
> quality of the code in check.
>
> This solution avoids copying binary files to SVN, and it can be done in a
> completely automated way.
>
> Kind regards,
>
> Maarten
>
>
>
> On 02/20/2013 06:54 PM, L.C. Karssen wrote:
>
>> Dear Yurii,
>>
>> Great idea. I'm all for putting the tutorials in SVN. They are already
>> of high quality and together with our community we can make them even
>> better.
>> I do see the problem with the data sets, of course.
>>
>> You are using Sweave, right? I'm wondering how much not having the data
>> will impact the possibility to tweak the document. Fixing small typos
>> will be alright, but before you know it a typo can mess up the LaTeX or
>> R code and since you can't compile the document to check it this may
>> lead to a lot of bug hunting for you, once you recompile it again.
>> That's the only potential problem I see.
>>
>> How about also including the latest PDF version of the tutorial (I know,
>> this is against SVN's principles) each time you compile a version? This
>> way people who don't have the data know what it is supposed to look like
>> and could even help creating replacement data sets.
>>
>>
>> Best,
>>
>> Lennart.
>>
>> On 02/20/2013 04:35 PM, Yurii Aulchenko wrote:
>>
>>> Dear All,
>>>
>>> For a long time I have been thinking that the GenABEL tutorial(s)
>>> should be part of the project - the same logic as with the code, with
>>> the same idea that in such a case people can easily contribute by
>>> submitting patches and new pieces.
>>>
>>> The problem was (and still is) that the tutorial uses some data sets
>>> which are not in the public domain, and it is quite awkward if we as
>>> the project start re-distributing them. Little by little I am trying to
>>> switch the whole thing to the use of only public and simulated data,
>>> but this is a lengthy process.
>>>
>>> So I thought that maybe a good solution is to put the code of the
>>> tutorials on our SVN, and put the data there only if these are either
>>> public or simulated. Of course, in this way the tutorials will not be
>>> really "functional" (e.g. they would not compile right away), but this
>>> may become a starting point for others to build up something new and
>>> really free-for-all-to-use-and-contribute.
>>>
>>> Let me know what you think,
>>> best regards,
>>> Yurii
>>>
>>>
>>>