[Rsiena-help] RSiena on a computer cluster

Tue Sep 24 15:11:59 CEST 2013

Tom's questions:

1. Yes, the memory issue is general and it would be good to have a note
in the manual.

2. I am viewing a profile on a Mac as I write, of a phase 2 using 2
waves of Tobias' data, and the scores are using about 22 percent of the
CPU time. This may differ between architectures and will depend on the
model. Many slow-to-calculate effects will reduce the proportion spent
in accumulating scores: with the extra time-varying covariate effects
(16 of 48 effects), which I have excluded from the current run, it was
about 15 percent. Whether turning Dolby off is a good idea depends on
its effect on convergence; I have not done enough tests to know.
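
As a rough illustration of what these proportions could mean in practice, the Python sketch below computes the best-case gain per iteration if the time spent accumulating scores disappeared completely when Dolby is turned off. That is an assumption, not something measured here, and the function name best_case_speedup is made up for the example. It says nothing about the number of iterations needed, which depends on convergence.

```python
def best_case_speedup(score_fraction):
    """Upper bound on the per-iteration speed-up if all score time vanished.

    score_fraction is the share of CPU time spent on scores (0 <= x < 1).
    Assumption: the rest of the iteration is unaffected.
    """
    if not 0.0 <= score_fraction < 1.0:
        raise ValueError("score_fraction must be in [0, 1)")
    return 1.0 / (1.0 - score_fraction)


# The two profile figures quoted above: 22 percent and 15 percent.
for share in (0.22, 0.15):
    gain = best_case_speedup(share)
    print(f"scores at {share:.0%} of CPU time: "
          f"at most {gain:.2f}x faster per iteration")
```

Under that assumption the gain per iteration is bounded by roughly 1.28 for the 22 percent profile and 1.18 for the 15 percent one, so any real benefit has to be weighed against a possible change in the number of iterations.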

Tobias' question: I have not found anything very unexpected in my tests.
So a few comments might be helpful:

1) To compare performance, I suggest using just simulation/phase 3. The
effect on any particular iteration should be similar in phase 2, but the
random length of phase 2 complicates comparisons.

2) Do try to check that your machine is not heavily loaded or short of
memory while you run comparisons. Keeping them short (100 iterations or
so) will assist in this.

3) For small numbers of cores on a single machine an almost proportional
decrease in time per iteration should be expected. The benefit will be
reduced if, e.g., the machine starts hyperthreading, or stops using
'turbo' mode. (And this may not be under your control.) So 'small
numbers of cores' is relative to the architecture of the computer.

4) With the current parallel processing in RSiena, each core needs a
copy of the basic data (a potentially large memory requirement), but
after the start-up, communication is small enough not to affect speeds
within one machine very much.

5) If using more than one machine, the communication speed may well
reduce the benefit markedly.
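
The memory side of point 4 can be made concrete with the figures reported in the 21 September message quoted below (a 5.5 GB master process and about 1 GB per sub-process on an 8 GB machine, against about 1 GB per process when the saved objects are loaded into a fresh R session). The Python sketch below is only back-of-the-envelope arithmetic: the function names and the 1 GB of headroom for the operating system are assumptions made for the example, not measurements.

```python
def total_memory_gb(master_gb, worker_gb, n_workers):
    """Memory used by one master process plus n_workers sub-processes."""
    return master_gb + worker_gb * n_workers


def fits(total_gb, machine_gb, headroom_gb=1.0):
    """True if the processes fit, leaving headroom_gb free (assumed value)."""
    return total_gb + headroom_gb <= machine_gb


# Data created and siena07 run in the same R session, nbrNodes=2.
same_session = total_memory_gb(master_gb=5.5, worker_gb=1.0, n_workers=2)

# Objects saved first, then loaded into a fresh R session, nbrNodes=2.
fresh_session = total_memory_gb(master_gb=1.0, worker_gb=1.0, n_workers=2)

for label, total in (("same session", same_session),
                     ("fresh session", fresh_session)):
    print(f"{label}: {total:.1f} GB, fits in 8 GB: {fits(total, 8.0)}")
```

With these numbers the same-session run needs 7.5 GB and leaves almost nothing spare on an 8 GB machine, while the fresh-session run needs about 3 GB.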

Ruth

On 23/09/2013 10:13, Tom Snijders wrote:
> Dear Ruth,
> 
> Thanks a lot for your help with this. I have a couple of further remarks.
> 1. Do you think it would be useful for me to add a remark to the manual and help page that, in view of memory use in general and particularly when using multiple processes, it can help to create the RSiena objects first and then use them for estimation in a separate R session? In other words, do you think your findings are sufficiently general (or generalizable) to warrant such advice?
> 2. When adding the Dolby option I did not think that it would cause a considerable increase in computing time. Apparently I was wrong.
> 
> Best regards,
> Tom
> 
> 
> ================================================================
> Tom A.B. Snijders
> Professor of Statistics in the Social Sciences
> Department of Politics and Department of Statistics
> Nuffield College
> University of Oxford
> tel. +44-01865-278599
> 
> -----Original Message-----
> From: Ruth Ripley [mailto:ruth at stats.ox.ac.uk]
> Sent: 21 September 2013 10:40
> To: rsiena-help at lists.r-forge.r-project.org
> Cc: Mark Ortmann; Tom Snijders
> Subject: Re: [Rsiena-help] RSiena on a computer cluster
> 
> Dear Tobias,
> 
> After a few hours of experiments, I have not finished one complete run yet, but have some initial comments about the odd behaviour you report. For the benefit of Tom and Mark I have included some more technical comments.
> 
> First, I had memory problems on a machine with 8 GB. Creating the data and running siena07 straight afterwards resulted in a master process using 5.5 GB and subprocesses using 1 GB each. I failed to get a single run with nbrNodes=2 to function on a machine with 8 GB. I recommend you run two R jobs: one to create the siena data object, model and effects object and save just these three, and another which just loads these three objects and RSiena and runs siena07. Each process then uses about 1 GB, startup is much quicker and I have 4 processes running happily on the 8 GB machine. (1 with useCluster=FALSE, 2 from a run with nbrNodes=2, and 1 with useCluster=FALSE and dolby=FALSE in the model.)
> 
> Secondly, the model is running into estimation problems in phase 2. The subphases are being repeated different numbers of times, so the total number of iterations in phase 2 ends up varying a lot.
> 
> Thirdly, with dolby=TRUE the phase 2 iterations are taking longer because they need to accumulate the scores - hence my experiment with turning it off. This estimation is winning at the moment (nearing the end of phase 2.2) largely because it restarted (after 50 iterations) twice in phase 2.1 and then terminated the phase after another 50, while the other single process did 338 slower iterations in phase 2.1 with no restarts and the job with nbrNodes=2 did 269 iterations in phase 2.1 and has restarted phase 2.2 twice after 50 iterations.
> 
> It does not seem to be communication that is the problem with the extra need for scores, but the calculation of them.
> 
> I think in phase 1 the expected behaviour does occur, but the memory issues may reduce the benefit by causing more paging.
> 
> The networks are undirected and use model type 3 - this adds a little extra processing to each iteration but does not seem to be the main cause of the unpredictable timings.
> 
> Regards,
> 
> Ruth
> 
> On 18/09/2013 17:56, Ruth Ripley wrote:
>> Dear Tobias,
>>
>> Parallel processing always seems to have costs which can noticeably
>> reduce the expected time savings. In the case of RSiena (normal
>> forward processing) with the parallel package the major extra costs
>> are 1) every sub-process is sent a copy of the data - this will take
>> a while to set up and need a fair amount of memory if the data is
>> large (but luckily only needs to be done once) and 2) every
>> sub-process needs to receive the parameters at each iteration and
>> return the statistics and scores - this is not a great overhead.
>>
>> In fact, if one is using Linux or Mac, it is not necessary to send the
>> data across explicitly, but I did it because one must on Windows, and
>> because it is a one-off cost. Any data that is updated during the
>> iteration will need to be copied by the operating system, though.
>>
>> In my experience there is a consistent benefit, provided a few 'serious'
>> effects are being fitted, unless the processes overwhelm the available
>> CPUs and compete.
>>
>> I would be interested to see if I can replicate the behaviour you
>> describe - if you would like to send me your data and commands
>> (please email them direct to me), I will investigate. It may be that
>> the rather random element, the length of time in phase 2 (where the
>> controller must wait for the slowest process at each iteration, and
>> the end conditions depend on the results of the simulations), is
>> causing the strange results.
>>
>> The actual communication speeds will vary between architectures. What
>> operating system were you using? And which versions of R and RSiena?
>>
>> Regards,
>>
>> Ruth
>>
>>
>>
>> On 18/09/2013 17:14, Tom Snijders wrote:
>>> Dear Tobias,
>>>
>>> The multicluster option in RSiena is based on the R package parallel. It has the disadvantage of requiring a fair amount of communication between the processors. How this works out in practice depends strongly on the hardware configuration. In my experience, using multiple processes does have an advantage over the use of only one process. I would guess that a really large number of processes brings no further gain, and 16 already seems quite a large number in this respect. The result that using 8 processes takes more time than 4, and 16 takes less time, seems to me entirely hardware-specific.
>>> But I do not know a lot about this, and if anybody else can correct me or say more specific things, that would be great.
>>>
>>> We are still hoping that the settings model will be implemented some time in the future, which should be much more reasonable and less time-consuming for large networks. But this is not yet close at hand.
>>>
>>> Best wishes,
>>> Tom
>>>
>>> -----Original Message-----
>>> From: rsiena-help-bounces at lists.r-forge.r-project.org [mailto:rsiena-help-bounces at lists.r-forge.r-project.org] On Behalf Of Tobias Stark
>>> Sent: 18 September 2013 06:55
>>> To: rsiena-help at lists.r-forge.r-project.org
>>> Subject: [Rsiena-help] RSiena on a computer cluster
>>>
>>> Dear RSiena developers,
>>>
>>> I hope to increase the speed of my analyses using a computer cluster. I ran the exact same test analysis with a large network (approx. 1,000 nodes) and varied the number of cores on which SIENA could run. I noticed that there was hardly any gain in speed using more cores. In fact, the analysis took longer when I ran it on 8 cores instead of 4 cores (no matter if the cores were on one machine or distributed across the cluster). The analyses were considerably faster on 16 cores, but using 26 or even 32 cores did not result in quicker results.
>>>
>>> I wonder if there is a restriction within SIENA that prevents additional gains in speed with more cores or if the problem lies with the communication between machines in the computer cluster. Do you have a hint for me?
>>>
>>> Thanks,
>>> Tobias
>>> _______________________________________________
>>> Rsiena-help mailing list
>>> Rsiena-help at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rsiena-help
>>>
>>> Nuffield College is a Registered Charity No. 1137506. Registered Office: Nuffield College, New Road, Oxford, OX1 1NF
>>>
>>
> 
> --
> Ruth M. Ripley,                         Email:ruth at stats.ox.ac.uk
> Dept. of Statistics,                    http://www.stats.ox.ac.uk/~ruth/
> University of Oxford,                   Tel:   01865 282857
> 1 South Parks Road, Oxford OX1 3TG, UK  Fax:   01865 272595
> 
>