[Basta-users] errors in inputMat and `colnames<-`(`*tmp*`, value = c("b0", "b1"))?

Owen Jones jonesor at gmail.com
Tue Oct 15 19:17:44 CEST 2013


Sorry for the delay in getting to this.
We've been super busy here!

The CensusToCaptHist function is expecting integer values, not dates (we're working on extending the functionality of that!).

Most of the studies we've worked on have focussed on annual censuses, so we designed the function to work with a vector of IDs and a vector of the years in which those individuals were seen (repeated observations in the same year are handled).
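
To illustrate the idea (a base-R sketch with made-up IDs and years, not BaSTA's own implementation), a capture-history matrix of this kind has one row per individual and one column per year, with a 1 wherever the individual was seen at least once:

```r
# Hypothetical sightings: individual 1 is seen twice in 2001
ids   <- c(1, 1, 1, 2, 2, 3)
years <- c(2001, 2001, 2003, 2001, 2002, 2003)

# Keep a column for every year in the study window, even years with no sightings
yearsF <- factor(years, levels = min(years):max(years))

# Count sightings per individual and year, then collapse repeats to 0/1
counts  <- unclass(table(ID = ids, year = yearsF))
captToy <- (counts > 0) + 0L
captToy
```

The repeated 2001 sighting of individual 1 collapses to a single 1, which is what "dealt with" means above.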

In your case, it looks like the observation window should be either the day or the week of the study. You can do this with the following code:

captHist = read.csv("/Users/orj/Desktop/CaptHist.csv")
head(captHist)

# Convert your dates to a Date object.
captHist$Date = as.Date(captHist$ROBSDATES)

# Extract the Julian day (days since 1970-01-01).
captHist$julDay = julian(captHist$Date)

# Convert to day of study for convenience (this step is not essential, but it makes interpretation easier).
captHist$DayOfStudy = (captHist$julDay - min(captHist$julDay)) + 1

# Apply the function.
Y = CensusToCaptHist(captHist$ID, captHist$DayOfStudy)
dim(Y)

# Or by week:
captHist$WeekOfStudy = ceiling(captHist$DayOfStudy/7)
Y = CensusToCaptHist(captHist$ID, captHist$WeekOfStudy)
dim(Y)
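
As a quick check of the conversion, here is the same arithmetic applied to the four observation dates quoted below for one individual. This is a base-R sketch only: it does not call CensusToCaptHist, and the variable names are just for illustration.

```r
# The four observation dates listed in the message below
dates <- as.Date(c("2012-07-14", "2012-08-04", "2012-08-18", "2012-09-04"))

# Julian day as a plain number (as.numeric drops the "origin" attribute)
julDay <- as.numeric(julian(dates))

# Day of study: the first observation date becomes day 1
dayOfStudy <- julDay - min(julDay) + 1

# Week of study: days 1-7 fall in week 1, days 8-14 in week 2, and so on
weekOfStudy <- ceiling(dayOfStudy / 7)

dayOfStudy   # 1 22 36 53
weekOfStudy  # 1 4 6 8
```

Days 1, 22, 36 and 53 are the first four of the "years" reported for individual 2 below, so those columns are days of study rather than calendar years.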

Regarding the covariates - your code looks OK. However, I would recommend starting simple (no covariates) and building up to more complicated models so you can get a feel for the outputs.
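
One reason to build up gradually is that the number of covariate columns grows quickly, especially once interactions are involved (see the 2,000 versus 45 comparison further down the thread). The sketch below uses base R's model.matrix() as a stand-in for the covariate matrix; that is an assumption, since MakeCovMat may build its columns differently, and the covariate table is made up:

```r
# Toy covariate table (hypothetical values, not the data from this thread)
covarsToy <- data.frame(
  spec  = factor(c("a", "b", "c", "d", "a", "b", "c", "d")),
  clade = factor(c("x", "x", "y", "y", "z", "z", "x", "y"))
)

# Number of design-matrix columns for increasingly complex formulas
nNone <- ncol(model.matrix(~ 1, data = covarsToy))             # intercept only
nAdd  <- ncol(model.matrix(~ spec + clade, data = covarsToy))  # additive
nInt  <- ncol(model.matrix(~ spec * clade, data = covarsToy))  # with interactions

c(none = nNone, additive = nAdd, interaction = nInt)
```

Even with four species and three clades the interaction formula doubles the column count relative to the additive one, so checking ncol() at each step is a cheap way to see how large the model is getting.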

I hope this helps.

Cheers,
Owen



On 23 Sep 2013, at 04:10, Caroline Chong <caroline.chong at anu.edu.au> wrote:

> Hi Fernando, Owen,
> 
> Thanks again for your assistance in helping me through the veritable storm. I have for today two persisting problems I want to clarify with you and have attached my working datasets just to confirm that we have the same versions. I have checked that my input birthdeath matrix contains values from 0 - 109 (and in future datasets will aim to code from 0...n to avoid the negatives).
> 
> First, the obvious - I installed the fixed BaSTA version you sent last week and checked that I am running this version in R (I typically am using a Mac, but also tried running this in Windows and Linux to be sure).
> 
> I am encountering an error that is potentially populating from the Y <- CensusToCaptHist step. Is it possible that the input dates in CaptHist are not being translated entirely? For example, for individual 2, I have four observation dates recorded in CaptHist:
> 
>   ID  ROBSDATES
> 1  1 2012-07-14
> 2  1 2012-08-04
> 3  1 2012-08-18
> 4  1 2012-09-04
> 5  2 2012-07-14
> 6  2 2012-08-04
> 7  2 2012-08-18
> 8  2 2012-09-04
> 9  3 2012-07-14
> 
> and for this individual 2, birthdate = 0 (no data) and deathdate = 68.
> 
> However on running Y <- CensusToCaptHist I get returned that individual 2 has nine observation dates, in "years" 1, 22, 36, 53, 68, 86, 99, which is not represented in my input data. Would you be able to solve this translation discrepancy from the input CaptHist to output Y matrix? This results in many "errors" being returned when I run DataCheck.
> 
> Secondly, and hopefully as a minor check, I have a finalised covariate matrix containing four categorical covariates named: spec, subg, clade, sect. (File attached). If I wanted to include spec, subg and clade in the analysis, would
> covars <- MakeCovMat(~spec + subg + clade, data = covarsRaw)
> dat <- data.frame(birthDeath2, Y[, -1], covars[, -1])
> 
> be the correct code? And should I be able to proceed to use MultiBaSTA, (which I am eager to use)?
> 
> Again, am incredibly appreciative of your feedback and help.
> with thanks,
> best,
> Caroline.
> 
> On 20/09/2013, at 7:00 PM, Fernando Colchero wrote:
> 
>> Hi Caroline,
>> 
>>    I have to admit that your case was like the perfect storm for BaSTA! We've sorted out the issues with the dataset you sent us. Here are some ways of dealing with it:
>> 
>>    First, install the attached version of BaSTA which has several bug fixes that apply to your case. To install it just save it in a folder, say "C:/Documents/Temp/" and then run the following command on the R console:
>> 
>> install.packages("C:/Documents/Temp/BaSTA_1.9.1.tar.gz", type = "source")
>> 
>>    Then do the following: 
>> 
>>   cv <- read.csv(sprintf("%sCaptHist.csv", path))
>>   
>>   rd <- cv$ROBSDATES
>>   rd<-as.Date(rd)
>>   
>>   Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
>>   
>>   birthDeath <- read.csv(sprintf("%spenults_birthdeath.csv", path))
>>   birthDeath2 <- birthDeath
>>   birthDeath2[birthDeath != 0] <- birthDeath[birthDeath != 0] + 100
>>   covarsRaw <- read.csv(sprintf("%sfixed_covars.csv", path))
>>   covars <- MakeCovMat(~SPECIES + CLADE, data = covarsRaw)
>>   # Change the colnames of two of the covariates that overlap with another two covariates:
>>   colnames(covars)[c(21, 31)] <- c("SPECIESmyrrh01", "CLADEA201")
>>   dat <- data.frame(birthDeath2, Y[, -1], covars[, -1])
>>   
>>   dat2 <- DataCheck(dat, studyStart = 101, studyEnd = 209, autofix = rep(1, 7), 
>>       silent = FALSE)
>>   out <- basta(dat2$newDat, studyStart = 101, studyEnd = 209, thetaStart = c(-10, 0.001))
>> 
>> 
>>    I hope this really solves it. If not please let us know. Best,
>> 
>>    Fernando
>> 
>> 
>> 
>> Fernando Colchero
>> Assistant Professor
>> Department of Mathematics and Computer Science
>> Max-Planck Odense Center on the Biodemography of Aging
>> 
>> Tlf.               +45 65 50 23 24
>> Email           colchero at imada.sdu.dk
>> Web             www.sdu.dk/staff/colchero
>> Pers. web   www.colchero.com
>> Adr.              Campusvej 55, 5230, Odense, Dk
>> 
>> University of Southern Denmark
>> 
>> On 18 Sep 2013, at 16:54, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>> 
>>> Hi Fernando,
>>> Thanks so much for your explanations, deciphering and help. I'll re-inspect my input files as well to see if I can simplify any parameters further. I'll look forward to keeping in touch about how this goes, and to your updates on any function fixes.
>>> Thanks again,
>>> best,
>>> caroline.
>>> 
>>> On 18/09/2013, at 11:35 PM, "Fernando Colchero" <colchero at imada.sdu.dk> wrote:
>>> 
>>>> Hi Caroline,
>>>> 
>>>>   Well, the latest recorded year being 149 is because for individual 1465 in your "penults_birthdeath.csv" table you have that the death date is 149. 
>>>> 
>>>>    The second problem, which is an issue with MakeCovMat() that we need to fix, is that, if you specify the covariates just by their names, the function will assume that you want to model all the covariates with interactions. With the tables you gave me, that produces more than 2,000 covariates, which is one quarter of the total number of points in your dataset. To avoid this, you can use the following code:
>>>> 
>>>> covarsRaw <- read.csv("fixed_covars.csv")
>>>> covars <- MakeCovMat(~SPECIES + CLADE + POP, data = covarsRaw)
>>>> 
>>>>   In this way, you'll have only 45, which is still a lot, but it's better than 2,000... 
>>>> 
>>>>   Another problem is that your range of dates goes from -5 to 109, so when you specify a 0 in the death or birth dates, it's not clear which of the 0s are missing values and which are actual dates. This is not your fault and it's something that we need to fix in BaSTA. If you could add, say, 200 to the real dates, then a 0 would always mean a missing value.
>>>> 
>>>>    There's still an issue when prepping the parameters that keeps the algorithm from moving. We're working on it. We'll get you an answer as soon as possible.
>>>> 
>>>>    Best,
>>>> 
>>>>    Fernando
>>>>   
>>>> 
>>>> 
>>>> On Sep 17, 2013, at 4:08 AM, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>>>> 
>>>>> Dear Fernando,
>>>>> 
>>>>> I have attempted to run using multiple categorical covariates (input covariates csv attached) but seem to have encountered a similar problem. ("Latest recorded death year" reports as 149 which is outside my range of observations).
>>>>> 
>>>>> Would you be able to suggest how to fix this in order to run MultiBasta? Also, may I enquire what might be an expected run time for model = "GO", nsim = 4, parallel = TRUE, ncpus = 4, updateJumps = TRUE - i.e. in the order of minutes, hours (or days?)
>>>>> 
>>>>> Greatest thanks for your assistance,
>>>>> with best regards
>>>>> Caroline.
>>>>> 
>>>>> 
>>>>> cv <- read.csv("CaptHist.csv")
>>>>> rd <- cv$ROBSDATES
>>>>> class(rd)
>>>>> sum(is.na(cv))
>>>>> rd<-as.Date(rd)
>>>>> Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
>>>>> head(Y)
>>>>> sum(is.na(cv))
>>>>> birthDeath <- read.csv("penults_birthdeath.csv")
>>>>> covar <- read.csv("~/fixeda_covars.csv")
>>>>> covars <- MakeCovMat(x= c("SPECIES", "SUBGEN", "CLADE", "SECT", "LOCAT"), data = covar)
>>>>> 
>>>>> colnames(covars)[-1] <- letters[1:(ncol(covars) - 1)]
>>>>> dat <- data.frame(birthDeath, Y[, -1], covars[, -1])
>>>>> dat2 <- DataCheck(dat, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent = FALSE)
>>>>> 
>>>>> The following rows deaths occur before observations start:
>>>>> [1] 550 689
>>>>> These records have been removed from the Dataframe
>>>>> The following rows have observations that occur after the year of death:
>>>>>   [1]    2   20   22   41   42   ..........
>>>>> Observations that post-date year of death have been removed.
>>>>> 
>>>>> The following rows have observations that occur before the year of birth:
>>>>>   [1]  298  299  300  301  .........
>>>>> [673] 1706 1707 1710 1711 1712 1715 1716 1717 1718
>>>>> Observations that pre-date year of birth have been removed.
>>>>> 
>>>>> The following rows have a one in the recapture matrix in the birth year:
>>>>>  [1]  14  25  36  47  58  80  91 102 113 114 125 136 147 158 169 191 213 224 225 236
>>>>> [21] 247 258 269 280
>>>>> *DataSummary*
>>>>> - Number of individuals         =    1,718 
>>>>> - Number with known birth year  =    1,260 
>>>>> - Number with known death year  =     361 
>>>>> - Number with known birth
>>>>>  AND death years                =      97 
>>>>> 
>>>>> - Total number of detections
>>>>>  in recapture matrix            =    8,789 
>>>>> 
>>>>> - Earliest detection time       =       1 
>>>>> - Latest detection time         =     109 
>>>>> - Earliest recorded birth year  =       1 
>>>>> - Latest recorded birth year    =     108 
>>>>> - Earliest recorded death year  =      16 
>>>>> - Latest recorded death year    =     149 
>>>>> 
>>>>> > source("/Users/caroline/BASTA/MultiBaSTA.r")
>>>>> > multiOut <- MultiBaSTA(dat2$newDat, studyStart = 1, studyEnd = 109, nsim=4, parallel = TRUE, ncpus = 4, models = c("GO"), shape = "simple", covarStruct="all.in.mort", updateJumps = TRUE)
>>>>> 
>>>>> --------------------------
>>>>> Run number 1, model: Go.Si
>>>>> --------------------------
>>>>> No problems were detected with the data.
>>>>> 
>>>>> Starting simulation to find jump sd's... 
>>>>> 
>>>>> On 14/09/2013, at 2:24 PM, Fernando Colchero wrote:
>>>>> 
>>>>>> Hi Caroline,
>>>>>> 
>>>>>>    In principle yes, but let us know if you find any problems.
>>>>>> 
>>>>>>   best,
>>>>>> 
>>>>>>   Fernando
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Sep 14, 2013, at 4:38 AM, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>>>>>> 
>>>>>>> Hi Fernando,
>>>>>>> 
>>>>>>> Oh, that's brilliant. Thanks for explaining the problem! I have managed to run using the temporary fix and will let you know if I encounter any to the contrary as I try out different covariate combinations etc.
>>>>>>> To confirm - should this code be ok to run regardless of the type or combination of covariates I select to run? e.g. multiple categorical covariates, or a mixture of integer and categorical covariates - I am intending to incorporate these into the analysis also.
>>>>>>> 
>>>>>>> Many thanks,
>>>>>>> Caroline.
>>>>>>> 
>>>>>>> On 13/09/2013, at 11:01 PM, Fernando Colchero wrote:
>>>>>>> 
>>>>>>>> Hi Caroline,
>>>>>>>> 
>>>>>>>>    I found the problem. The issue is not with the data but a bug in the code when assigning parameter names to the covariates. The names in your CLADE covariates conflicted with the way BaSTA processes the results and finds the parameters. We have to fix it but, for the time being, here's a temporary solution so you can run your analyses:
>>>>>>>> 
>>>>>>>> cv <- read.csv("CaptHist.csv")
>>>>>>>> 
>>>>>>>> rd <- cv$ROBSDATES
>>>>>>>> class(rd)
>>>>>>>> sum(is.na(cv))
>>>>>>>> rd<-as.Date(rd)
>>>>>>>> Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
>>>>>>>> head(Y)
>>>>>>>> sum(is.na(cv))
>>>>>>>> 
>>>>>>>> birthDeath <- read.csv("penults_birthdeath.csv")
>>>>>>>> covar <- read.csv("fixed_covars.csv")
>>>>>>>> 
>>>>>>>> covars <- MakeCovMat(x= "CLADE", data = covar)
>>>>>>>> 
>>>>>>>> # Here's the way to avoid the problem:
>>>>>>>> colnames(covars)[-1] <- letters[1:(ncol(covars) - 1)]
>>>>>>>> dat <- data.frame(birthDeath, Y[, -1], covars[, -1])
>>>>>>>> 
>>>>>>>> dat2 <- DataCheck(dat, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), 
>>>>>>>>                   silent = FALSE)
>>>>>>>> out <- basta(dat2$newDat, studyStart = 1, studyEnd = 109, updateJumps = FALSE)
>>>>>>>> 
>>>>>>>> 
>>>>>>>>   Let me know if this works. Best,
>>>>>>>> 
>>>>>>>>    Fernando
>>>>>>>> 
>>>>>>>> On 13 Sep 2013, at 12:08, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>>>>>>>> 
>>>>>>>>> Hi Fernando,
>>>>>>>>> These errors are with the data attached here (and with my latest "errors in inputMat" email).
>>>>>>>>> 
>>>>>>>>> Thanks so much for taking a look!!
>>>>>>>>> 
>>>>>>>>> best,
>>>>>>>>> Caroline.
>>>>>>>>> "CaptHist.csv" = census matrix
>>>>>>>>> "penults_birthdeath.csv" = birthDeath matrix
>>>>>>>>> "fixed_covars.csv" = covariate matrix
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 13/09/2013, at 7:49 PM, Fernando Colchero wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Caroline,
>>>>>>>>>> 
>>>>>>>>>>    Are these errors with the data you sent me? If so, I'll run them myself and get back to you asap.
>>>>>>>>>> 
>>>>>>>>>>   Best,
>>>>>>>>>> 
>>>>>>>>>>   Fernando
>>>>>>>>>> 
>>>>>>>>>> On 13 Sep 2013, at 09:58, Caroline Chong <caroline.chong at anu.edu.au> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Dear BaSTA
>>>>>>>>>>> 
>>>>>>>>>>> Owen, and Fernando - thanks for your assistance with my previous birth-death coding issue (rowSums error)- happy to report that I was able to re-code this successfully and DataCheck now passes with no errors.
>>>>>>>>>>> 
>>>>>>>>>>> However I am running into the below two errors - would you be able to solve or decipher what the issue is? Firstly in the final compiled matrix (im2 = inputMat), every single observation now has a recorded Death observation whereas this is not the case in my input birthDeath matrix. I tried editing this via bd.na (below code) but this didn't work. Is there some possible issue when reading 0s in the birth and death columns?
>>>>>>>>>>> 
>>>>>>>>>>> ##e.g. original births and deaths observation matrix
>>>>>>>>>>> > head(birthDeath)
>>>>>>>>>>>   ID birth death
>>>>>>>>>>> 1  1     0    68
>>>>>>>>>>> 2  2     0    68
>>>>>>>>>>> 3  3     0     0
>>>>>>>>>>> 4  4     1     0
>>>>>>>>>>> 5  5     1     0
>>>>>>>>>>> 6  6     1     0
>>>>>>>>>>> ## whereas final merged matrix below shows:
>>>>>>>>>>> > head(im2)
>>>>>>>>>>> ID birth death 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
>>>>>>>>>>> 1     1     0    53 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
>>>>>>>>>>> 10    2     0    99 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
>>>>>>>>>>> 100   3     0    99 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
>>>>>>>>>>> 1000  4     8   103 0 0 0 0 0 0 0 0 1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
>>>>>>>>>>> 1001  5     8   103 0 0 0 0 0 0 0 0 1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
>>>>>>>>>>> 1002  6     8   103 0 0 
>>>>>>>>>>> > dc <- DataCheck(im2, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent=FALSE)
>>>>>>>>>>> No problems were detected with the data.
>>>>>>>>>>> 
>>>>>>>>>>> *DataSummary*
>>>>>>>>>>> - Number of individuals         =    1,720 
>>>>>>>>>>> - Number with known birth year  =    1,428 
>>>>>>>>>>> - Number with known death year  =    1,720 
>>>>>>>>>>> - Number with known birth
>>>>>>>>>>>  AND death years                =    1,428 
>>>>>>>>>>> 
>>>>>>>>>>> - Total number of detections
>>>>>>>>>>>  in recapture matrix            =   10,339 
>>>>>>>>>>> 
>>>>>>>>>>> - Earliest detection time       =       1 
>>>>>>>>>>> - Latest detection time         =     109 
>>>>>>>>>>> - Earliest recorded birth year  =       1 
>>>>>>>>>>> - Latest recorded birth year    =     107 
>>>>>>>>>>> - Earliest recorded death year  =       2 
>>>>>>>>>>> - Latest recorded death year    =     109 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Secondly on running basta (run time to error 20mins) I get returned e.g.
>>>>>>>>>>> 
>>>>>>>>>>> > out <- basta(object = im2, studyStart = 1, studyEnd = 109)
>>>>>>>>>>> No problems were detected with the data.
>>>>>>>>>>> 
>>>>>>>>>>> Starting simulation to find jump sd's...  done.
>>>>>>>>>>> 
>>>>>>>>>>> Simulation started...
>>>>>>>>>>> 
>>>>>>>>>>> Error in `colnames<-`(`*tmp*`, value = c("b0", "b1")) : 
>>>>>>>>>>>   length of 'dimnames' [2] not equal to array extent
>>>>>>>>>>> 
>>>>>>>>>>> It appears that the matrix does not have 2 columns - is this correct? I am unsure how to interpret or fix this.
>>>>>>>>>>> 
>>>>>>>>>>> looking forward to hearing back,
>>>>>>>>>>> many thanks for your help,
>>>>>>>>>>> best regards
>>>>>>>>>>> Caroline.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> cv <- read.csv("~/CaptHist.csv")
>>>>>>>>>>> rd <- cv$ROBSDATES
>>>>>>>>>>> class(rd)
>>>>>>>>>>> sum(is.na(cv))
>>>>>>>>>>> rd<-as.Date(rd)
>>>>>>>>>>> Y <- CensusToCaptHist(ID = cv[,1], d=rd, timeInt="D")
>>>>>>>>>>> head(Y)
>>>>>>>>>>> sum(is.na(cv))
>>>>>>>>>>> 
>>>>>>>>>>> birthDeath <- read.delim("~/penults_birthdeath.csv", sep=",", header=T)
>>>>>>>>>>> 
>>>>>>>>>>> bd.na <- t( # the below returns a transposed matrix, so we have to re-transpose it back to normal
>>>>>>>>>>>   apply( # foreach row (hence 1) in the birth dates matrix
>>>>>>>>>>>     birthDeath, 1,
>>>>>>>>>>>     function(r) { # apply by row this function (hence r)
>>>>>>>>>>>       if(r[2] == 0) { # if birth [2] is 0
>>>>>>>>>>>         r[2] <- NA # replace birth value with NA (R's missing data value)
>>>>>>>>>>>       }
>>>>>>>>>>>       if (r[3] == 0) { # if death [3] is 0
>>>>>>>>>>>         r[3] <- NA # replace death value with NA (R's missing data value)
>>>>>>>>>>>       }
>>>>>>>>>>>       return(r) # return the whole row
>>>>>>>>>>>     }
>>>>>>>>>>>   ))
>>>>>>>>>>> 
>>>>>>>>>>> table(is.na(bd.na[,3]))
>>>>>>>>>>> BD <- bd.na
>>>>>>>>>>> head(BD)
>>>>>>>>>>> covar <- read.delim("~/fixed_covars.csv", sep=",", header=T)
>>>>>>>>>>> head(covar)
>>>>>>>>>>> covMat <- MakeCovMat(x=c("CLADE"), data = covar)
>>>>>>>>>>> days <- as.numeric(colnames(Y)[2:ncol(Y)])
>>>>>>>>>>> y <- as.matrix(Y)
>>>>>>>>>>> bd <- apply(y,1,function(r) min(as.numeric(days[as.logical(r[2:ncol(Y)])]))) -1
>>>>>>>>>>> dd <- apply(y,1,function(r) max(as.numeric(days[as.logical(r[2:ncol(Y)])])))
>>>>>>>>>>> inputMat <- as.data.frame(cbind(BD, Y[, -1], covMat[, -1]))
>>>>>>>>>>> ##inputMat <- merge(BD, Y, by.x = "ID", by.y = "ID")
>>>>>>>>>>> ##inputMat <- merge(inputMat, covMat, by.x = "ID", by.y = "ID")
>>>>>>>>>>> dim(inputMat)
>>>>>>>>>>> colnames(inputMat)
>>>>>>>>>>> im2 <- inputMat
>>>>>>>>>>> im2[,2] <- bd
>>>>>>>>>>> im2[,3] <- dd
>>>>>>>>>>> head(im2)
>>>>>>>>>>> dc <- DataCheck(im2, studyStart = 1, studyEnd = 109, autofix = rep(1, 7), silent=FALSE)
>>>>>>>>>>> names(dc)
>>>>>>>>>>> head(inputMat)
>>>>>>>>>>> # outMat <- dc$newData
>>>>>>>>>>> out <- basta(object = im2, studyStart = 1, studyEnd = 109)
>>>>>>>>>>> 
>>>>>>>>>>> <penults_birthdeath.csv><fixed_covars.csv><CaptHist.csv>
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> <penults_birthdeath.csv><fixed_covars.csv><CaptHist.csv>
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> <fixeda_covars.csv><CaptHist.csv><penults_birthdeath.csv>
>>>> 
>> 
> 
> <CaptHist.csv><fixedb_covars.csv><penults_birthdeath.csv><BaSTA_1.9.1.tar.gz>


