[Traminer-users] Using the process time axis when converting from SPELL to STS

Gilbert Ritschard Gilbert.Ritschard at unige.ch
Fri Mar 7 15:47:46 CET 2014


Hi Aart-Jan,

After a quick glance at your code, I notice that the 'birthyear' values you provide (the 'start' values in pdata) are greater than BEGINTIME. This should generate negative ages, which is probably the source of your problem.

In your example I see that the first spell for id 1 starts at time 1. (Is that a calendar date?) What are the begin times of the first spells for cases 2, 3 and 4?
What does this begin time of observation correspond to?
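
A quick way to check this could look like the sketch below. It uses only base R and the rows shown in your message (so only id 1 appears in the spell data here). The object names first_begin and chk and the column offset are mine, for illustration only, and the difference "first begin minus start" is just a rough stand-in for the age at the first spell, not necessarily the exact computation done inside seqformat.

```r
# Toy copies of the two data frames shown in the original message
wr <- data.frame(
  ID          = c(1, 1, 1, 1, 1),
  BEGINTIME   = c(1, 17, 19, 21, 22),
  ENDTIME     = c(16, 18, 20, 21, 22),
  SOCECST_rec = c(1, 4, 4, 4, 3)
)
starttime <- data.frame(
  ID    = c(1, 2, 3, 4, 5),
  start = c(13, 13, 25, 1, 25)
)

# First observed begin time for each case
first_begin <- aggregate(BEGINTIME ~ ID, data = wr, FUN = min)
names(first_begin)[2] <- "first_begin"

# Put it next to the 'start' value supplied through pdata
chk <- merge(first_begin, starttime, by = "ID")

# Illustrative offset (an assumption, see above): negative values mean
# the first spell begins before the supplied start time
chk$offset <- chk$first_begin - chk$start

# Cases whose first spell begins before their 'start' value
chk[chk$offset < 0, ]
```

On the full data, this table would list every case whose first spell begins before the start value given in pdata.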

Gilbert


-----Original Message-----
From: traminer-users-bounces at lists.r-forge.r-project.org [mailto:traminer-users-bounces at lists.r-forge.r-project.org] On Behalf Of Arie Riekhoff
Sent: Wednesday, March 05, 2014 14:13
To: traminer-users at r-forge.wu-wien.ac.at
Subject: [Traminer-users] Using the process time axis when converting from SPELL to STS

Hello!

The TraMineR package has been doing a wonderful job with my data and it's a real pleasure to work with it, even as a beginner in R. I have just run into a problem with the process axis function in the seqformat command and I haven't managed to figure out what I'm doing wrong.

My data comes in SPELL format and I want to convert it to STS before creating a sequence object. Following the instructions in the TraMineR user's guide, this works very nicely if I select process = FALSE.
However, I need to use the process time axis, because I'm working with a cohort of 3 consecutive birth years and I want to start counting from the year in which they reach a specific age and then follow them for the next 10 years (i.e. 120 months). My data starts in 1999 and registers statuses per month. I have recoded the dates so that January 1999 is 1, February 1999 is 2, etc. I want to start the time axis for the respondents from the 3 consecutive birth years at t = 1 (year 1), t = 13 (year 2) and t = 25 (year 3). I have imported a separate file with the IDs of the respondents and the different start times.
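
For reference, the date recoding described above can be sketched as follows. This is only an illustration: month_index is a hypothetical helper name, not a TraMineR function, and it assumes the calendar year and month are available as separate numbers.

```r
# Hypothetical helper: month counter with January 1999 = 1
month_index <- function(year, month, origin_year = 1999) {
  (year - origin_year) * 12 + month
}

month_index(1999, 1)  # 1  (start of year 1)
month_index(1999, 2)  # 2
month_index(2000, 1)  # 13 (start of year 2)
month_index(2001, 1)  # 25 (start of year 3)

# Last month of a 120-month follow-up window for each of the three start times
c(1, 13, 25) + 120 - 1  # 120 132 144
```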

So, my command was the following:

> wr.sts.process <- seqformat(wr, id = "ID", begin = "BEGINTIME", end = "ENDTIME",
+     status = "SOCECST_rec", from = "SPELL", to = "STS", process = TRUE,
+     pdata = starttime, pvar = c("ID", "start"), limit = 120)
    [>] SPELL data converted into 2088 STS sequences

But it results in missing values (NA) for almost every status.

My wr dataframe with spell data looks something like this:

> wr [1:5,]

     ID    BEGINTIME    ENDTIME    SOCECST_rec
1   1             1         16             1
2   1             17        18             4
3   1             19        20             4
4   1             21        21             4
5   1             22        22             3

I had followed the user guide's advice to convert the status variable into an integer.

And my starttime dataframe like this:

> starttime [1:5,]

     ID   start
1    1      13
2    2      13
3    3      25
4    4       1
5    5      25

I also tried converting directly from the spell data into a sequence object with the seqdef() function, but ran into the same problem ([!] sequence with index: 1,2,3 etc contains only missing values).

Like I wrote, when I use process = FALSE, both seqformat and seqdef work perfectly well, so it's not the wr data that's the problem. I guess I'm doing something wrong with the process time axis.

Someone might have asked a similar question here before, but I couldn't find any definite answers anywhere. I hope that someone can point me in the right direction or give me a hint as to the solution of my problem!

Thanks in advance,

Aart-Jan

