[adegenet-forum] dataset too large? Follow-up

Thomas, Evert (Bioversity-Colombia) E.Thomas at CGIAR.ORG
Wed Jul 6 14:44:34 CEST 2011

Dear Thibaut,


Thanks for this. I have now tried running it overnight several times,
but each time I get the message:



I am running Windows 7 on a 64-bit system with 4 x 2.4 GHz and 4 GB RAM,
so I don't think the problem is related to my PC?

Many thanks for any suggestions you might have...


Cheers Evert


(PS: when reading in my CSV I use "stringsAsFactor=F", so that my marker
data is read in as characters. Could that be the problem?)

From: Jombart, Thibaut [mailto:t.jombart at imperial.ac.uk] 
Sent: Monday, July 04, 2011 11:33 AM
To: Thomas, Evert (Bioversity-Colombia);
adegenet-forum at r-forge.wu-wien.ac.at
Subject: RE: [adegenet-forum] dataset too large? Follow-up


Dear Thomas, 

The algorithm for translating your data into individual frequencies does
not scale linearly. RAM saturation is likely to cause additional delays
in any case, and Windows is prone to freezing or crashing applications
in such cases ("R has stopped working...send a report"). How much
memory do you have on your computer? In any case I would recommend
running it overnight to check whether it simply takes ages but does
eventually work.

We are looking at a big dataset, but it is merely 2-3 times bigger than
eHGDP, which was not such a pain to obtain.

As for multicore, the package is not available for Windows.

Importing your data from STRUCTURE won't help; it will actually take
longer and be more RAM-demanding.

On the bright side, once you have your data imported, analysis should
be slightly less time-consuming.





From: adegenet-forum-bounces at r-forge.wu-wien.ac.at
[adegenet-forum-bounces at r-forge.wu-wien.ac.at] on behalf of Thomas,
Evert (Bioversity-Colombia) [E.Thomas at CGIAR.ORG]
Sent: 04 July 2011 16:18
To: adegenet-forum at r-forge.wu-wien.ac.at
Subject: Re: [adegenet-forum] dataset too large? Follow-up



The problem does not seem to be related to my commands, since I do get
results for subsets of my data (1000 individuals takes 40 seconds), but
it does not seem to work for my entire dataset of >25000 individuals
(should theoretically take 16.6 minutes, but after 4 hours still no
result) ... any suggestions?  
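
The 16.6-minute figure above comes from scaling the 40-second subset
timing linearly with the number of individuals. A minimal sketch in base
R of that extrapolation, next to a purely hypothetical quadratic case for
comparison (the exponent k = 2 is an assumption for illustration; the
actual scaling of the conversion is not given in this thread):

```r
# Timing reported in the thread: 1000 individuals take about 40 seconds.
n_subset <- 1000
t_subset <- 40      # seconds
n_full   <- 25000   # the thread says ">25000"; 25000 is used here

# Extrapolate run time assuming it grows as n^k.
# k = 1 is the linear assumption behind the 16.6-minute estimate.
# k = 2 is a hypothetical illustration, not a measured property of df2genind.
extrapolate_seconds <- function(n, k) t_subset * (n / n_subset)^k

linear_minutes  <- extrapolate_seconds(n_full, 1) / 60
quadratic_hours <- extrapolate_seconds(n_full, 2) / 3600

cat(sprintf("linear (k = 1):    %.1f minutes\n", linear_minutes))
cat(sprintf("quadratic (k = 2): %.1f hours\n", quadratic_hours))
```

Under the hypothetical quadratic case the estimate grows from minutes to
hours, before counting any slowdown from memory swapping.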

many thanks in advance



From: adegenet-forum-bounces at r-forge.wu-wien.ac.at
[mailto:adegenet-forum-bounces at r-forge.wu-wien.ac.at] On Behalf Of
Thomas, Evert (Bioversity-Colombia)
Sent: Friday, July 01, 2011 1:56 PM
To: adegenet-forum at r-forge.wu-wien.ac.at
Subject: [adegenet-forum] dataset too large?


Dear colleagues,


I am new to R, so apologies for my ignorance, but I have a couple of
questions.


I am trying to use adegenet (on a 64-bit system, Windows 7) for
analyzing an SSR dataset. It consists of 96 loci and I have >25000
individuals (after resampling). I have loaded the database as a data
frame in R, but am not able to convert it to genind format (PC physical
memory becomes saturated, while only 10% of CPU is used). Could this be
related to the size of my dataset? Any suggestions?
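
As a rough sense of scale, the size of one individuals-by-alleles table
of frequencies can be estimated from the figures above. This is only a
back-of-the-envelope sketch in base R: the average number of alleles per
locus and the number of temporary copies are assumptions, not values
from this thread, and the table is assumed to be stored as 8-byte
doubles.

```r
n_ind  <- 25000   # individuals (the thread says ">25000")
n_loci <- 96      # SSR loci, from the thread

alleles_per_locus <- 15   # ASSUMPTION: hypothetical average per locus
bytes_per_double  <- 8    # ASSUMPTION: frequencies stored as doubles

# One table: one row per individual, one column per allele.
table_bytes <- n_ind * n_loci * alleles_per_locus * bytes_per_double
table_gib   <- table_bytes / 1024^3
cat(sprintf("one table:  %.2f GiB\n", table_gib))

# Temporary copies made during a conversion multiply this figure.
n_copies <- 3             # ASSUMPTION: illustrative only
cat(sprintf("%d copies:   %.2f GiB\n", n_copies, n_copies * table_gib))
```

Under these assumptions a single table is well below 4 GB, which would
point to intermediate objects rather than the final table as the main
memory cost, though nothing in the thread confirms this.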


On another note: as an alternative, I tried importing my data into a
genind object from the corresponding file in STRUCTURE format. However,
my version of STRUCTURE (2.3.3) does not seem to generate .stru or .str
files. Any solution there?


And a last point: I am unable to install/load the R package multicore
because it is not in the package list...


This is what I have done:


I did a read.csv with "header=T", and then rownames<-cacaoCSV[,1]


The problem occurs with the following command:

cacao<-df2genind(cacaoCSV, sep="/",ind.names=NULL, loc.names=NULL,
pop=cacaoCSV[,2], missing=NA, ploidy=2, type="codom")
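
For reference, a minimal sketch in base R of one way the input could be
prepared before conversion. It assumes, as the commands above suggest,
that column 1 holds individual IDs and column 2 holds populations, with
every other column being one locus. The toy values are made up for
illustration. Note that "rownames<-cacaoCSV[,1]" as written would create
an object called rownames rather than set the row names of the data
frame.

```r
# Toy stand-in for cacaoCSV (all values are invented):
# column 1 = individual ID, column 2 = population,
# remaining columns = one SSR locus each, alleles separated by "/".
cacaoCSV <- data.frame(
  id   = c("ind1", "ind2", "ind3"),
  pop  = c("A", "A", "B"),
  loc1 = c("120/124", "120/120", NA),
  loc2 = c("200/204", "204/208", "200/200"),
  stringsAsFactors = FALSE
)

# Keep only the genotype columns; carry IDs and populations separately
# so they are not treated as loci.
genotypes <- cacaoCSV[, -(1:2), drop = FALSE]
rownames(genotypes) <- cacaoCSV[, 1]
populations <- factor(cacaoCSV[, 2])

str(genotypes)

# With adegenet loaded, the conversion would then be along the lines of:
# cacao <- df2genind(genotypes, sep = "/", pop = populations,
#                    ploidy = 2, type = "codom")
```

This only changes which columns are handed to the conversion; it is not
claimed to resolve the memory saturation described above.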



Many thanks in advance for any advice or suggestion you might have!


Enjoy the weekend

Evert Thomas, PhD

Associate Expert, Conservation and Use of 

Forest Genetic Resources in Latin America


Bioversity International

Regional Office for the Americas

Recta Cali-Palmira Km 17 - CIAT

Cali, Colombia

P.O. Box 6713


Tel. 57 2 4450048 / 49 Ext 113

Fax 57 2 4450096

Email: e.thomas at cgiar.org

Skype: evertthomas




-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/png
Size: 21954 bytes
Desc: image001.png
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20110706/d5fb1aeb/attachment-0001.png>
