[Basta-users] errors in inputMat and `colnames<-`(`*tmp*`, value = c("b0", "b1"))?
Fernando Colchero
colchero at imada.sdu.dk
Tue Sep 17 09:07:37 CEST 2013
Hi Caroline,
I suspect that the problem still has to do with the way the variables are named. We haven't been able to fix this in BaSTA yet, but a quick solution would be to rename all your covariates with simpler names. I could send you some optional code for that if you like.
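For what it's worth, a minimal sketch of such a renaming might look like the following. It assumes a covariate matrix with the ID in the first column, as in the code quoted below. The toy matrix `covars_demo`, its column names, and the helper objects `name_pool` and `name_key` are made up for illustration and are not part of BaSTA. One thing to watch: `letters` holds only 26 names, so `letters[1:n]` gives NA names once n exceeds 26, which could matter with several categorical covariates. The sketch extends the pool with two-letter names; whether BaSTA accepts those has not been checked here.

```r
# Toy covariate matrix standing in for the output of MakeCovMat();
# the column names are invented for illustration only.
covars_demo <- cbind(ID = 1:3,
                     "SPECIESsp. A" = c(1, 0, 0),
                     "CLADEgroup-1" = c(0, 1, 0),
                     "LOCATsite.1"  = c(0, 0, 1))

# Number of covariate columns (everything except the ID column).
n_cov <- ncol(covars_demo) - 1

# `letters` has only 26 entries, so extend it with "aa", "ab", ..., "zz".
name_pool <- c(letters, paste0(rep(letters, each = 26), letters))
stopifnot(n_cov <= length(name_pool))
new_names <- name_pool[seq_len(n_cov)]

# Keep a lookup table so the simple names can be mapped back later.
name_key <- data.frame(simple   = new_names,
                       original = colnames(covars_demo)[-1],
                       stringsAsFactors = FALSE)

# Rename the covariate columns, leaving the ID column untouched.
colnames(covars_demo)[-1] <- new_names
print(name_key)
```

Here `name_key` keeps the original names, so the estimated parameters can be matched back to the covariate levels afterwards.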
Best,
Fernando
Fernando Colchero
Assistant Professor
Department of Mathematics and Computer Science
Max-Planck Odense Center on the Biodemography of Aging
Tlf. +45 65 50 23 24
Email colchero at imada.sdu.dk
Web www.sdu.dk/staff/colchero
Pers. web www.colchero.com
Adr. Campusvej 55, 5230, Odense, Dk
University of Southern Denmark
On 17 Sep 2013, at 04:08, Caroline Chong <caroline.chong at anu.edu.au> wrote:
> Dear Fernando,
>
> I have attempted to run using multiple categorical covariates (input covariates csv attached) but seem to have encountered a similar problem ("Latest recorded death year" reports as 149, which is outside my range of observations).
>
> Would you be able to suggest how to fix this in order to run MultiBasta? Also, may I enquire what might be an expected run time for model = "GO", nsim = 4, parallel = TRUE, ncpus = 4, updateJumps = TRUE - i.e. in the order of minutes, hours (or days?)
>
> Greatest thanks for your assistance,
> with best regards
> Caroline.
>
>
> cv <- read.csv("CaptHist.csv")
> rd <- cv$ROBSDATES
> class(rd)
> sum(is.na(cv))
> rd<-as.Date(rd)
> Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
> head(Y)
> sum(is.na(cv))
> birthDeath <- read.csv("penults_birthdeath.csv")
> covar <- read.csv("~/fixeda_covars.csv")
> covars <- MakeCovMat(x= c("SPECIES", "SUBGEN", "CLADE", "SECT", "LOCAT"), data = covar)
>
> colnames(covars)[-1] <- letters[1:(ncol(covars) - 1)]
> dat <- data.frame(birthDeath, Y[, -1], covars[, -1])
> dat2 <- DataCheck(dat, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent = FALSE)
>
> The following rows deaths occur before observations start:
> [1] 550 689
> These records have been removed from the Dataframe
> The following rows have observations that occur after the year of death:
> [1] 2 20 22 41 42 ..........
> Observations that post-date year of death have been removed.
>
> The following rows have observations that occur before the year of birth:
> [1] 298 299 300 301 .........
> [673] 1706 1707 1710 1711 1712 1715 1716 1717 1718
> Observations that pre-date year of birth have been removed.
>
> The following rows have a one in the recapture matrix in the birth year:
> [1] 14 25 36 47 58 80 91 102 113 114 125 136 147 158 169 191 213 224 225 236
> [21] 247 258 269 280
> *DataSummary*
> - Number of individuals = 1,718
> - Number with known birth year = 1,260
> - Number with known death year = 361
> - Number with known birth
> AND death years = 97
>
> - Total number of detections
> in recapture matrix = 8,789
>
> - Earliest detection time = 1
> - Latest detection time = 109
> - Earliest recorded birth year = 1
> - Latest recorded birth year = 108
> - Earliest recorded death year = 16
> - Latest recorded death year = 149
>
> > source("/Users/caroline/BASTA/MultiBaSTA.r")
> > multiOut <- MultiBaSTA(dat2$newDat, studyStart = 1, studyEnd = 109, nsim=4, parallel = TRUE, ncpus = 4, models = c("GO"), shape = "simple", covarStruct="all.in.mort", updateJumps = TRUE)
>
> --------------------------
> Run number 1, model: Go.Si
> --------------------------
> No problems were detected with the data.
>
> Starting simulation to find jump sd's...
> On 14/09/2013, at 2:24 PM, Fernando Colchero wrote:
>
>> Hi Caroline,
>>
>> In principle yes, but let us know if you find any problems.
>>
>> best,
>>
>> Fernando
>>
>> On Sep 14, 2013, at 4:38 AM, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>>
>>> Hi Fernando,
>>>
>>> Oh, that's brilliant. Thanks for explaining the problem! I have managed to run using the temporary fix and will let you know if I encounter any problems as I try out different covariate combinations etc.
>>> To confirm - should this code be ok to run regardless of the type or combination of covariates I select to run? e.g. multiple categorical covariates, or a mixture of integer and categorical covariates - I am intending to incorporate these into the analysis also.
>>>
>>> Many thanks,
>>> Caroline.
>>>
>>> On 13/09/2013, at 11:01 PM, Fernando Colchero wrote:
>>>
>>>> Hi Caroline,
>>>>
>>>> I found the problem. The issue is not with the data but with a bug in the code that assigns parameter names to the covariates. The names in your CLADE covariates conflicted with the way BaSTA processes the results and finds the parameters. We still have to fix it but, for the time being, here's a temporary solution so you can run your analyses:
>>>>
>>>> cv <- read.csv("CaptHist.csv")
>>>>
>>>> rd <- cv$ROBSDATES
>>>> class(rd)
>>>> sum(is.na(cv))
>>>> rd<-as.Date(rd)
>>>> Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
>>>> head(Y)
>>>> sum(is.na(cv))
>>>>
>>>> birthDeath <- read.csv("penults_birthdeath.csv")
>>>> covar <- read.csv("fixed_covars.csv")
>>>>
>>>> covars <- MakeCovMat(x= "CLADE", data = covar)
>>>>
>>>> # Here's the way to avoid the problem:
>>>> colnames(covars)[-1] <- letters[1:(ncol(covars) - 1)]
>>>> dat <- data.frame(birthDeath, Y[, -1], covars[, -1])
>>>>
>>>> dat2 <- DataCheck(dat, studyStart = 1, studyEnd = 109, autofix = rep(1, 7),
>>>> silent = FALSE)
>>>> out <- basta(dat2$newDat, studyStart = 1, studyEnd = 109, updateJumps = FALSE)
>>>>
>>>>
>>>> Let me know if this works. Best,
>>>>
>>>> Fernando
>>>>
>>>> On 13 Sep 2013, at 12:08, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>>>>
>>>>> Hi Fernando,
>>>>> These errors are with the data attached here (and with my latest "errors in inputMat" email).
>>>>>
>>>>> Thanks so much for taking a look!!
>>>>>
>>>>> best,
>>>>> Caroline.
>>>>> "CaptHist.csv" = census matrix
>>>>> "penults_birthdeath.csv" = birthDeath matrix
>>>>> "fixed_covars.csv" = covariate matrix
>>>>>
>>>>>
>>>>> On 13/09/2013, at 7:49 PM, Fernando Colchero wrote:
>>>>>
>>>>>> Hi Caroline,
>>>>>>
>>>>>> Are these errors with the data you sent me? If so, I'll run them myself and get back to you asap.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Fernando
>>>>>>
>>>>>> On 13 Sep 2013, at 09:58, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>>>>>>
>>>>>>> Dear BaSTA
>>>>>>>
>>>>>>> Owen, and Fernando - thanks for your assistance with my previous birth-death coding issue (rowSums error)- happy to report that I was able to re-code this successfully and DataCheck now passes with no errors.
>>>>>>>
>>>>>>> However I am running into the below two errors - would you be able to solve or decipher what the issue is? Firstly in the final compiled matrix (im2 = inputMat), every single observation now has a recorded Death observation whereas this is not the case in my input birthDeath matrix. I tried editing this via bd.na (below code) but this didn't work. Is there some possible issue when reading 0s in the birth and death columns?
>>>>>>>
>>>>>>> ##e.g. original births and deaths observation matrix
>>>>>>> > head(birthDeath)
>>>>>>> ID birth death
>>>>>>> 1 1 0 68
>>>>>>> 2 2 0 68
>>>>>>> 3 3 0 0
>>>>>>> 4 4 1 0
>>>>>>> 5 5 1 0
>>>>>>> 6 6 1 0
>>>>>>> ## whereas final merged matrix below shows:
>>>>>>> > head(im2)
>>>>>>> ID birth death 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
>>>>>>> 1 1 0 53 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
>>>>>>> 10 2 0 99 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
>>>>>>> 100 3 0 99 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
>>>>>>> 1000 4 8 103 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>>>>> 1001 5 8 103 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>>>>> 1002 6 8 103 0 0
>>>>>>> > dc <- DataCheck(im2, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent=FALSE)
>>>>>>> No problems were detected with the data.
>>>>>>>
>>>>>>> *DataSummary*
>>>>>>> - Number of individuals = 1,720
>>>>>>> - Number with known birth year = 1,428
>>>>>>> - Number with known death year = 1,720
>>>>>>> - Number with known birth
>>>>>>> AND death years = 1,428
>>>>>>>
>>>>>>> - Total number of detections
>>>>>>> in recapture matrix = 10,339
>>>>>>>
>>>>>>> - Earliest detection time = 1
>>>>>>> - Latest detection time = 109
>>>>>>> - Earliest recorded birth year = 1
>>>>>>> - Latest recorded birth year = 107
>>>>>>> - Earliest recorded death year = 2
>>>>>>> - Latest recorded death year = 109
>>>>>>>
>>>>>>>
>>>>>>> Secondly on running basta (run time to error 20mins) I get returned e.g.
>>>>>>>
>>>>>>> > out <- basta(object = im2, studyStart = 1, studyEnd = 109)
>>>>>>> No problems were detected with the data.
>>>>>>>
>>>>>>> Starting simulation to find jump sd's... done.
>>>>>>>
>>>>>>> Simulation started...
>>>>>>>
>>>>>>> Error in `colnames<-`(`*tmp*`, value = c("b0", "b1")) :
>>>>>>> length of 'dimnames' [2] not equal to array extent
>>>>>>>
>>>>>>> It appears that the matrix does not have 2 columns - is this correct? I am unsure how to interpret or fix this.
>>>>>>>
>>>>>>> looking forward to hearing back,
>>>>>>> many thanks for your help,
>>>>>>> best regards
>>>>>>> Caroline.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> cv <- read.csv("~/CaptHist.csv")
>>>>>>> rd <- cv$ROBSDATES
>>>>>>> class(rd)
>>>>>>> sum(is.na(cv))
>>>>>>> rd<-as.Date(rd)
>>>>>>> Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
>>>>>>> head(Y)
>>>>>>> sum(is.na(cv))
>>>>>>>
>>>>>>> birthDeath <- read.delim("~/penults_birthdeath.csv", sep=",", header=T)
>>>>>>>
>>>>>>> bd.na <- t(                # the below returns a transposed matrix, so we have to re-transpose it back to normal
>>>>>>>   apply(                   # foreach row (hence 1) in the birth dates matrix
>>>>>>>     birthDeath, 1,
>>>>>>>     function(r) {          # apply by row this function (hence r)
>>>>>>>       if (r[2] == 0) {     # if birth [2] is 0
>>>>>>>         r[2] <- NA         # replace birth value with NA (R's missing data value)
>>>>>>>       }
>>>>>>>       if (r[3] == 0) {     # if death [3] is 0
>>>>>>>         r[3] <- NA         # replace death value with NA (R's missing data value)
>>>>>>>       }
>>>>>>>       return(r)            # return the whole row
>>>>>>>     }
>>>>>>>   ))
>>>>>>>
>>>>>>> table(is.na(bd.na[,3]))
>>>>>>> BD <- bd.na
>>>>>>> head(BD)
>>>>>>> covar <- read.delim("~/fixed_covars.csv", sep=",", header=T)
>>>>>>> head(covar)
>>>>>>> covMat <- MakeCovMat(x=c("CLADE"), data = covar)
>>>>>>> days <- as.numeric(colnames(Y)[2:ncol(Y)])
>>>>>>> y <- as.matrix(Y)
>>>>>>> bd <- apply(y,1,function(r) min(as.numeric(days[as.logical(r[2:ncol(Y)])]))) -1
>>>>>>> dd <- apply(y,1,function(r) max(as.numeric(days[as.logical(r[2:ncol(Y)])])))
>>>>>>> inputMat <- as.data.frame(cbind(BD, Y[, -1], covMat[, -1]))
>>>>>>> ##inputMat <- merge(BD, Y, by.x = "ID", by.y = "ID")
>>>>>>> ##inputMat <- merge(inputMat, covMat, by.x = "ID", by.y = "ID")
>>>>>>> dim(inputMat)
>>>>>>> colnames(inputMat)
>>>>>>> im2 <- inputMat
>>>>>>> im2[,2] <- bd
>>>>>>> im2[,3] <- dd
>>>>>>> head(im2)
>>>>>>> dc <- DataCheck(im2, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent=FALSE)
>>>>>>> names(dc)
>>>>>>> head(inputMat)
>>>>>>> # outMat <- dc$newData
>>>>>>> out <- basta(object = im2, studyStart = 1, studyEnd = 109)