[Basta-users] memory usage and categorical covariates
Caroline Chong
caroline.chong at anu.edu.au
Sun Aug 3 01:46:34 CEST 2014
>>
>> Thanks Owen, Fernando
>>
>> We are encountering problems with memory usage when running basta
>> analyses (using the latest version), and would be most grateful for
>> your suggestions on how to resolve the following situation. We're
>> running basta on a 64-bit linux and are able to commence basta runs
>> successfully (both in serial and parallel), but rapidly occupy 128 GB
>> RAM and all available swap (90 GB). The number of iterations is
>> currently set to 2 million but we have also attempted 1 million. We
>> are keen to run the final analyses, and will look forward to your
>> feedback (and from the basta community) with great anticipation.
>>
>> - All iterations are currently stored in the PAR matrix. Is it
>> possible to store only the thinned chain in memory and write the full
>> chain to disc every (for example) 100000 iterations?
>>
>> - If so, would this be a reasonably straightforward alteration to
>> implement in the current basta version, and what would this look like?
>>
>> - Alternatively, could you please advise of any other methods to
>> reduce memory usage?
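
The buffering idea in the bullets above can be sketched in plain R. This is only an illustration under stated assumptions: it is not BaSTA's sampler, and `run_chain`, `flush_every`, `thin` and the toy random-walk update are hypothetical stand-ins. The full chain goes into a fixed-size buffer that is appended to a CSV file whenever it fills, so only the thinned draws and one buffer stay in memory.

```r
# Toy random-walk Metropolis sampler for the mean of a normal sample.
# Hypothetical sketch: NOT BaSTA code; all names here are made up.
run_chain <- function(y, niter, thin = 20, flush_every = 1000,
                      out_file = tempfile(fileext = ".csv")) {
  stopifnot(niter %% thin == 0, flush_every >= 1)
  log_post <- function(mu) sum(dnorm(y, mean = mu, sd = 1, log = TRUE))

  buffer  <- matrix(NA_real_, nrow = flush_every, ncol = 2,
                    dimnames = list(NULL, c("iter", "mu")))
  thinned <- numeric(niter / thin)   # the only chain kept in memory
  filled  <- 0L
  mu      <- 0
  lp      <- log_post(mu)

  for (i in seq_len(niter)) {
    # Metropolis step with a symmetric normal proposal
    prop    <- mu + rnorm(1, sd = 0.5)
    lp_prop <- log_post(prop)
    if (log(runif(1)) < lp_prop - lp) {
      mu <- prop
      lp <- lp_prop
    }

    filled <- filled + 1L
    buffer[filled, ] <- c(i, mu)
    if (i %% thin == 0) thinned[i / thin] <- mu

    # Flush the buffer to disk when it is full or at the last iteration
    if (filled == flush_every || i == niter) {
      write.table(buffer[seq_len(filled), , drop = FALSE], out_file,
                  sep = ",", row.names = FALSE,
                  col.names = !file.exists(out_file),
                  append = file.exists(out_file))
      filled <- 0L
    }
  }
  list(thinned = thinned, file = out_file)
}

set.seed(1)
res <- run_chain(y = rnorm(50, mean = 2), niter = 5000, thin = 20,
                 flush_every = 1000)
```

Setting `flush_every = 100000` would match the interval suggested above, and `read.csv(res$file)` recovers the full chain afterwards.
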
>>
>> Example of basta commands are below. I am aiming to compare model
>> types (exponential, gompertz, logistic) run for each of 30 or more
>> species:
>>
>> for (s in 1:length(species.list))
>> {
>>   ### iterate through species list, read input matrix file
>>   species <- species.list[s]
>>   iM.spec <- read.delim(paste("inputmatg.", species, ".txt", sep = ""),
>>                         header = TRUE, sep = ",")
>>   colnames(iM.spec)[4:112] <- 1:109
>>
>>   iM.spec[, 4:112] <- sapply(iM.spec[, 4:112], as.character)
>>   iM.spec[, 4:112] <- sapply(iM.spec[, 4:112], as.numeric)
>>
>>   ### perform data check on input matrix
>>   iM.spec.basta <- DataCheck(iM.spec, studyStart = 1, studyEnd = 109,
>>                              autofix = rep(1, 7), silent = FALSE)
>>
>>   ### basta
>>   speciesout <- basta(object = iM.spec.basta$newData, studyStart = 1,
>>                       studyEnd = 109, model = "GO", shape = "simple",
>>                       nsim = 4, parallel = TRUE, ncpus = 16,
>>                       updateJumps = TRUE, niter = 2000000, burnin = 8001,
>>                       lifeTable = TRUE)
>> }
>>
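
One way to keep the loop above from accumulating results in memory is to write each fit to disk and drop it before the next run. The sketch below assumes a hypothetical wrapper `fit_fun(species, model)` that would hold the `read.delim`, `DataCheck` and `basta` calls from the loop; here it is replaced by a dummy so the example runs without BaSTA. The model code `"LO"` for the logistic model is also an assumption. This does not reduce the footprint of a single run, so it only helps if memory grows across runs.

```r
# Hypothetical helper: run every species x model combination, save each
# result with saveRDS(), and free it before the next fit.
fit_all <- function(species.list, models, fit_fun, out_dir = tempdir()) {
  grid <- expand.grid(species = species.list, model = models,
                      stringsAsFactors = FALSE)
  grid$file <- file.path(out_dir,
                         paste0("basta_", grid$species, "_", grid$model, ".rds"))
  for (k in seq_len(nrow(grid))) {
    out <- fit_fun(grid$species[k], grid$model[k])
    saveRDS(out, grid$file[k])
    rm(out)
    gc()   # release the freed memory before the next fit
  }
  grid
}

# Dummy stand-in for the real fitting code (assumption, for illustration)
dummy_fit <- function(species, model) list(species = species, model = model)

done <- fit_all(c("species1", "species2"), c("EX", "GO", "LO"), dummy_fit)
```

Each saved fit can later be reloaded one at a time with `readRDS(done$file[k])` for model comparison.
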
>> Example input matrix showing data for the first three individuals of
>> species1 only, for brevity:
>>
>> "ID","birth","death","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107","108","109"
>> "1",1,0,68,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
>> "2",2,0,68,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
>> "3",3,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
>>
>>
>> My second query is as per last week regarding the "Error in FUN(newX[,
>> i], ...) : invalid 'type' (character) of argument" error encountered
>> when attempting to include a single or multiple covariates
>> (particularly of type categorical), but re-posted here to the mailing
>> list in case the community can also help out. Do you have any
>> suggestions on how to code or name categorical covariates to
>> circumvent this error?
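
For reference, the same message can be reproduced in base R whenever a numeric summary is applied column-wise to a table that contains a character column. This does not show that it is the cause inside basta (the source of the error there has not been identified in this thread), and the data frame below is made up. The second part shows one generic way to recode a categorical covariate as 0/1 indicator columns with `model.matrix`; whether basta accepts covariates in that form should be checked against its documentation.

```r
# Made-up covariate table: one categorical and one numeric covariate
covs <- data.frame(ID   = 1:6,
                   SPEC = c("a", "b", "c", "a", "b", "c"),
                   mass = c(1.2, 0.8, 1.5, 1.1, 0.9, 1.4),
                   stringsAsFactors = FALSE)

# apply() coerces the data frame to a character matrix, so sum() fails
msg <- tryCatch(apply(covs[, c("SPEC", "mass")], 2, sum),
                error = function(e) conditionMessage(e))
print(msg)

# Recode SPEC as indicator columns; "- 1" keeps one column per level
X <- model.matrix(~ SPEC + mass - 1, data = covs)
print(colnames(X))
```
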
>>
>> Many thanks for your help,
>> best regards
>> Caroline.
>>
>>
>>
>> On 21 Jul 2014, at 23:52, caroline <caroline.chong at anu.edu.au> wrote:
>>
>> Dear Owen/Fernando,
>>
>> I was wondering whether you had any updated advice on how to code a
>> single or multiple categorical covariates to avoid the "FUN(newX[,i])"
>> error (similarly reported by Richard on 30 May 2014). I have checked
>> that the covariate names do not start with the same characters and
>> have also tried simplified names (a, b, c) to no avail so far. I would
>> be most happy to provide you with some of my data set if that would be
>> helpful for more context and to find the solution. I also intend to
>> use basta with both categorical and numeric covariates so would
>> appreciate any suggestions you may have on covariate naming.
>>
>> Also, I would be grateful for your advice on deciphering the following
>> (run on a cluster with 32 cpus available):
>>
>> iM.spec.basta <- DataCheck(iM.spec, studyStart = 1, studyEnd = 109,
>> autofix = rep(1, 7), silent = FALSE)
>>
>> exspeciesout <- basta(object = iM.spec.basta$newData, studyStart = 1,
>> studyEnd = 109, model = "EX", shape= "simple", nsim = 4, parallel =
>> TRUE, ncpus = 16, updateJumps = TRUE, niter = 2000000, burnin= 8001,
>> lifeTable=TRUE)
>>
>> Total MCMC computing time: 10.48 hours.
>>
>> Survival parameters converged appropriately.
>> DIC was calculated.
>> Error: cannot allocate vector of size 109.5 Gb
>> Execution halted
>> Warning message:
>> system call failed: Cannot allocate memory
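
As a rough guide to the scale of that allocation: R stores a numeric vector at 8 bytes per element, and the "Gb" in this message is computed with 1024^3 bytes. The helpers below (hypothetical names, not part of basta) convert between element counts and size. Which object basta was trying to allocate is not known from the output above, so the per-iteration figure is only a back-of-envelope reading.

```r
# Size in GiB of a numeric (double) object with n elements
size_gib <- function(n, bytes_per_element = 8) {
  n * bytes_per_element / 1024^3
}

# Number of doubles that fit in an object of a given size in GiB
n_doubles <- function(gib, bytes_per_element = 8) {
  gib * 1024^3 / bytes_per_element
}

# A 2,000,000-iteration chain stored for 10 parameters as one double matrix
size_gib(2e6 * 10)

# Elements in the failed 109.5 Gb allocation, in total and per iteration
n_doubles(109.5)
n_doubles(109.5) / 2e6
```

If the vector held doubles for all 2 million iterations, that would correspond to roughly 7,350 stored values per iteration.
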
>>
>> Warmest thanks for your assistance,
>> with best regards,
>> Caroline.
>>
>>
>>
>> On 01/11/2013, at 8:54 AM, Owen Jones wrote:
>>
>>> Dear Caroline,
>>>
>>> This is a possible bug in one of the sub-functions in basta that
>>> deals with the covariates.
>>>
>>> We're investigating and will get back to you shortly.
>>>
>>> Best wishes,
>>> Owen
>>>
>>>
>>>
>>>
>>> On 1 Nov 2013, at 14:01, caroline <caroline.chong at anu.edu.au> wrote:
>>>
>>>> Dear All,
>>>>
>>>> I am running a simple Gompertz model with one categorical
>>>> covariate, species (coded as SPEC = a, b, c, ..., ah for
>>>> simplicity). After running DataCheck with autofix set to on, I
>>>> encounter the following error "in FUN(newX[,i]...)" - would you
>>>> have any experience with a similar situation, and could provide any
>>>> help or suggestions on how to interpret and trouble-shoot the
>>>> problem? I am unsure as to how to decipher where the error lies.
>>>> N.b. this data set seems to run ok when I do not include the
>>>> covariate matrix.
>>>>
>>>> Very grateful for your help,
>>>> best regards
>>>> Caroline.
>>>>
>>>> Caroline Chong
>>>> Postdoctoral Fellow
>>>> Research School of Biology
>>>> Australian National University, Canberra ACT
>>>>
>>>> speciesout <- basta(object = inputMat2$newData, studyStart = 1,
>>>> studyEnd = 109, covarsStruct = "all.in.mort", nsim = 4, parallel =
>>>> TRUE, ncpus = 4, updateJumps = TRUE, niter = 100000)
>>>>
>>>> No problems were detected with the data.
>>>>
>>>> Error in FUN(newX[, i], ...) : invalid 'type' (character) of argument