[Basta-users] errors in inputMat and `colnames<-`(`*tmp*`, value = c("b0", "b1"))?

Caroline Chong caroline.chong at anu.edu.au
Tue Sep 17 09:20:58 CEST 2013


Thanks, Fernando

Is the issue likely covariate name length combined with other issues? Would appreciate any code tips that you might have for added clarity!

Many thanks for your help,
best
Caroline.

On 17/09/2013, at 5:07 PM, Fernando Colchero wrote:

Hi Caroline,

    I suspect that the problem still has to do with the way the variables are named. We haven't been able to fix this in BaSTA yet, but a quick solution would be to rename all your covariates with simpler names. I could send you some optional code for that if you like.
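A minimal sketch of what such renaming code might look like, assuming the covariate matrix carries an ID column first, as the scripts in this thread assume. The toy column names and the "cov" prefix are made up for illustration, and the thread does not confirm which names BaSTA handles safely. One base-R detail worth knowing: letters has only 26 elements, so letters[1:n] yields NA names once n exceeds 26, which several categorical covariates together could reach.

```r
# Toy stand-in for a covariate matrix; these column names are hypothetical.
covars <- data.frame(ID = 1:3,
                     SPECIESsp.one = c(1, 0, 0),
                     CLADEclade.B0 = c(0, 1, 0),
                     LOCATsite_b1  = c(0, 0, 1))

# Build short, numbered names and keep a lookup table so that results
# can be mapped back to the original covariate names afterwards.
nCov <- ncol(covars) - 1
key <- data.frame(short = paste0("cov", seq_len(nCov)),
                  original = colnames(covars)[-1],
                  stringsAsFactors = FALSE)
colnames(covars)[-1] <- key$short

# Indexing letters beyond its 26 elements returns NA rather than an error.
beyond <- letters[27]

print(key)
```

Numbered names avoid running out of labels, and the key table records which short name stands for which original covariate level.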

   Best,

   Fernando

Fernando Colchero
Assistant Professor
Department of Mathematics and Computer Science
Max-Planck Odense Center on the Biodemography of Aging

Tlf.               +45 65 50 23 24
Email           colchero at imada.sdu.dk
Web             www.sdu.dk/staff/colchero
Pers. web   www.colchero.com
Adr.              Campusvej 55, 5230, Odense, Dk

University of Southern Denmark

On 17 Sep 2013, at 04:08, Caroline Chong <caroline.chong at anu.edu.au> wrote:

Dear Fernando,

I have attempted to run using multiple categorical covariates (input covariates csv attached) but seem to have encountered a similar problem. ("Latest recorded death year" reports as 149 which is outside my range of observations).
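One way to see which records produce a death year beyond the study window is to filter the birth/death table directly. The sketch below uses a made-up table and assumes, as in the data shown later in this thread, that 0 marks an unknown year; the column names are illustrative only.

```r
# Toy birth/death table; values are illustrative only.
studyStart <- 1
studyEnd <- 109
birthDeath <- data.frame(ID = 1:5,
                         birth = c(0, 3, 10, 0, 50),
                         death = c(68, 0, 149, 16, 0))

# Treat 0 as "unknown" and look only at recorded deaths.
known <- birthDeath$death > 0

# Row numbers whose recorded death lies outside [studyStart, studyEnd].
outside <- which(known & (birthDeath$death < studyStart |
                          birthDeath$death > studyEnd))

print(birthDeath[outside, ])
print(range(birthDeath$death[known]))
```

Running the same filter on the real table before calling DataCheck() would show whether the value 149 comes from the input file or appears later in the processing.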

Would you be able to suggest how to fix this in order to run MultiBaSTA? Also, may I enquire what might be an expected run time for model = "GO", nsim = 4, parallel = TRUE, ncpus = 4, updateJumps = TRUE, i.e. in the order of minutes, hours or days?

Greatest thanks for your assistance,
with best regards
Caroline.


cv <- read.csv("CaptHist.csv")
rd <- cv$ROBSDATES
class(rd)
sum(is.na(cv))
rd<-as.Date(rd)
Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
head(Y)
sum(is.na(cv))
birthDeath <- read.csv("penults_birthdeath.csv")
covar <- read.csv("~/fixeda_covars.csv")
covars <- MakeCovMat(x= c("SPECIES", "SUBGEN", "CLADE", "SECT", "LOCAT"), data = covar)

colnames(covars)[-1] <- letters[1:(ncol(covars) - 1)]
dat <- data.frame(birthDeath, Y[, -1], covars[, -1])
dat2 <- DataCheck(dat, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent = FALSE)

The following rows deaths occur before observations start:
[1] 550 689
These records have been removed from the Dataframe
The following rows have observations that occur after the year of death:
  [1]    2   20   22   41   42   ..........
Observations that post-date year of death have been removed.

The following rows have observations that occur before the year of birth:
  [1]  298  299  300  301  .........
[673] 1706 1707 1710 1711 1712 1715 1716 1717 1718
Observations that pre-date year of birth have been removed.

The following rows have a one in the recapture matrix in the birth year:
 [1]  14  25  36  47  58  80  91 102 113 114 125 136 147 158 169 191 213 224 225 236
[21] 247 258 269 280
*DataSummary*
- Number of individuals         =    1,718
- Number with known birth year  =    1,260
- Number with known death year  =     361
- Number with known birth
 AND death years                =      97

- Total number of detections
 in recapture matrix            =    8,789

- Earliest detection time       =       1
- Latest detection time         =     109
- Earliest recorded birth year  =       1
- Latest recorded birth year    =     108
- Earliest recorded death year  =      16
- Latest recorded death year    =     149

> source("/Users/caroline/BASTA/MultiBaSTA.r")
> multiOut <- MultiBaSTA(dat2$newDat, studyStart = 1, studyEnd = 109, nsim=4, parallel = TRUE, ncpus = 4, models = c("GO"), shape = "simple", covarStruct="all.in.mort", updateJumps = TRUE)

--------------------------
Run number 1, model: Go.Si
--------------------------
No problems were detected with the data.

Starting simulation to find jump sd's...

On 14/09/2013, at 2:24 PM, Fernando Colchero wrote:

Hi Caroline,

   In principle yes, but let us know if you find any problems.

  best,

  Fernando








On Sep 14, 2013, at 4:38 AM, Caroline Chong <caroline.chong at anu.edu.au> wrote:

Hi Fernando,

Oh, that's brilliant. Thanks for explaining the problem! I have managed to run using the temporary fix and will let you know if I encounter any problems as I try out different covariate combinations etc.
To confirm - should this code be ok to run regardless of the type or combination of covariates I select to run? e.g. multiple categorical covariates, or a mixture of integer and categorical covariates - I am intending to incorporate these into the analysis also.

Many thanks,
Caroline.

On 13/09/2013, at 11:01 PM, Fernando Colchero wrote:

Hi Caroline,

   I found the problem. The issue is not with the data but a bug in the code when assigning parameter names to the covariates. The names in your CLADE covariates conflicted with the way BaSTA processes the results and finds the parameters. We have to fix it but, for the time being, here's a temporary solution so you can run your analyses:

cv <- read.csv("CaptHist.csv")

rd <- cv$ROBSDATES
class(rd)
sum(is.na(cv))
rd<-as.Date(rd)
Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
head(Y)
sum(is.na(cv))

birthDeath <- read.csv("penults_birthdeath.csv")
covar <- read.csv("fixed_covars.csv")

covars <- MakeCovMat(x= "CLADE", data = covar)

# Here's the way to avoid the problem:
colnames(covars)[-1] <- letters[1:(ncol(covars) - 1)]
dat <- data.frame(birthDeath, Y[, -1], covars[, -1])

dat2 <- DataCheck(dat, studyStart = 1, studyEnd = 109, autofix = rep(1, 7),
                  silent = FALSE)
out <- basta(dat2$newDat, studyStart = 1, studyEnd = 109, updateJumps = FALSE)


  Let me know if this works. Best,

   Fernando


On 13 Sep 2013, at 12:08, Caroline Chong <caroline.chong at anu.edu.au> wrote:

Hi Fernando,
These errors are with the data attached here (and with my latest "errors in inputMat" email).

Thanks so much for taking a look!!

best,
Caroline.
"CaptHist.csv" = census matrix
"penults_birthdeath.csv" = birthDeath matrix
"fixed_covars.csv" = covariate matrix


On 13/09/2013, at 7:49 PM, Fernando Colchero wrote:

Hi Caroline,

   Are these errors with the data you sent me? If so, I'll run them myself and get back to you asap.

  Best,

  Fernando


On 13 Sep 2013, at 09:58, Caroline Chong <caroline.chong at anu.edu.au> wrote:

Dear BaSTA

Owen and Fernando - thanks for your assistance with my previous birth-death coding issue (rowSums error). I am happy to report that I was able to re-code this successfully and DataCheck now passes with no errors.

However I am running into the below two errors - would you be able to solve or decipher what the issue is? Firstly in the final compiled matrix (im2 = inputMat), every single observation now has a recorded Death observation whereas this is not the case in my input birthDeath matrix. I tried editing this via bd.na (below code) but this didn't work. Is there some possible issue when reading 0s in the birth and death columns?

##e.g. original births and deaths observation matrix
> head(birthDeath)
  ID birth death
1  1     0    68
2  2     0    68
3  3     0     0
4  4     1     0
5  5     1     0
6  6     1     0
## whereas final merged matrix below shows:
> head(im2)
ID birth death 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
1     1     0    53 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
10    2     0    99 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
100   3     0    99 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
1000  4     8   103 0 0 0 0 0 0 0 0 1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
1001  5     8   103 0 0 0 0 0 0 0 0 1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
1002  6     8   103 0 0
> dc <- DataCheck(im2, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent=FALSE)
No problems were detected with the data.

*DataSummary*
- Number of individuals         =    1,720
- Number with known birth year  =    1,428
- Number with known death year  =    1,720
- Number with known birth
 AND death years                =    1,428

- Total number of detections
 in recapture matrix            =   10,339

- Earliest detection time       =       1
- Latest detection time         =     109
- Earliest recorded birth year  =       1
- Latest recorded birth year    =     107
- Earliest recorded death year  =       2
- Latest recorded death year    =     109


Secondly on running basta (run time to error 20mins) I get returned e.g.

> out <- basta(object = im2, studyStart = 1, studyEnd = 109)
No problems were detected with the data.

Starting simulation to find jump sd's...  done.

Simulation started...

Error in `colnames<-`(`*tmp*`, value = c("b0", "b1")) :
  length of 'dimnames' [2] not equal to array extent

It appears that the dimensions of the matrix are not 2 - is this correct? I am unsure how to interpret or fix this.
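The error message itself can be reproduced in base R, which may help with interpreting it. In the sketch below the matrix m is a made-up stand-in; which object inside basta() triggers the error is not known from this thread. The message arises when the number of names supplied differs from the number of columns, so it points to a column count other than 2 rather than to the number of dimensions.

```r
# A matrix with three columns, standing in for whatever basta() builds.
m <- matrix(0, nrow = 4, ncol = 3)

# Assigning two column names to three columns fails; capture the message.
msg <- tryCatch({
  colnames(m) <- c("b0", "b1")
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)

# With a matching number of names the assignment succeeds.
m2 <- matrix(0, nrow = 4, ncol = 2)
colnames(m2) <- c("b0", "b1")
print(colnames(m2))
```

Read this way, the error suggests that a parameter matrix ended up with more or fewer columns than the two names "b0" and "b1" being assigned to it.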

looking forward to hearing back,
many thanks for your help,
best regards
Caroline.




cv <- read.csv("~/CaptHist.csv")
rd <- cv$ROBSDATES
class(rd)
sum(is.na(cv))
rd<-as.Date(rd)
Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
head(Y)
sum(is.na(cv))

birthDeath <- read.delim("~/penults_birthdeath.csv", sep=",", header=T)

bd.na <- t( # the below returns a transposed matrix, so we have to re-transpose it back to normal
  apply( # foreach row (hence 1) in the birth dates matrix
    birthDeath, 1,
    function(r) { # apply by row this function (hence r)
      if(r[2] == 0) { # if birth [2] is 0
        r[2] <- NA # replace birth value with NA (R's missing data value)
      }
      if (r[3] == 0) { # if death [3] is 0
        r[3] <- NA # replace death value with NA (R's missing data value)
      }
      return(r) # return the whole row
    }
  ))

table(is.na(bd.na[,3]))
BD <- bd.na
head(BD)
covar <- read.delim("~/fixed_covars.csv", sep=",", header=T)
head(covar)
covMat <- MakeCovMat(x=c("CLADE"), data = covar)
days <- as.numeric(colnames(Y)[2:ncol(Y)])
y <- as.matrix(Y)
bd <- apply(y,1,function(r) min(as.numeric(days[as.logical(r[2:ncol(Y)])]))) -1
dd <- apply(y,1,function(r) max(as.numeric(days[as.logical(r[2:ncol(Y)])])))
inputMat <- as.data.frame(cbind(BD, Y[, -1], covMat[, -1]))
##inputMat <- merge(BD, Y, by.x = "ID", by.y = "ID")
##inputMat <- merge(inputMat, covMat, by.x = "ID", by.y = "ID")
dim(inputMat)
colnames(inputMat)
im2 <- inputMat
im2[,2] <- bd
im2[,3] <- dd
head(im2)
dc <- DataCheck(im2, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent=FALSE)
names(dc)
head(inputMat)
# outMat <- dc$newData
out <- basta(object = im2, studyStart = 1, studyEnd = 109)
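A note on the script above: the lines im2[,2] <- bd and im2[,3] <- dd replace the birth and death columns with the day before first detection and the day of last detection, so every individual ends up with a death value regardless of the input table. The sketch below reproduces that effect on made-up data and also shows a shorter way to recode 0 as NA than the row-wise apply(). All values are illustrative, and whether basta() expects 0 or NA for unknown years is not settled in this thread.

```r
# Toy birth/death table; 0 stands for "unknown", as in the data above.
birthDeath <- data.frame(ID = 1:4,
                         birth = c(0, 0, 1, 1),
                         death = c(68, 0, 0, 0))

# Toy capture history: rows are individuals, columns are days 1 to 5.
Y <- matrix(c(1, 0, 1, 0, 0,
              1, 0, 0, 0, 1,
              0, 1, 0, 0, 0,
              0, 0, 1, 1, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, as.character(1:5)))
days <- as.numeric(colnames(Y))

# Day of last detection per individual, computed as dd is in the script.
lastSeen <- apply(Y, 1, function(r) max(days[as.logical(r)]))

# Overwriting the death column has the same effect as im2[, 3] <- dd.
overwritten <- birthDeath
overwritten$death <- lastSeen

nBefore <- sum(birthDeath$death > 0)   # deaths recorded in the input
nAfter  <- sum(overwritten$death > 0)  # deaths after the overwrite
print(c(before = nBefore, after = nAfter))

# Vectorised recoding of 0 to NA, one column at a time.
bdNA <- birthDeath
bdNA$birth[bdNA$birth == 0] <- NA
bdNA$death[bdNA$death == 0] <- NA
print(bdNA)
```

Leaving out the two overwriting lines, as the later scripts in this thread do by passing birthDeath straight into data.frame(), keeps the original birth and death values.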

<penults_birthdeath.csv><fixed_covars.csv><CaptHist.csv>






<fixeda_covars.csv><CaptHist.csv><penults_birthdeath.csv>

