[Traminer-users] how is NA treated in traminer

Thu Apr 21 09:40:28 CEST 2011

Hi Phill,

Indeed, if the presence of missing statuses in your sequences is somehow 
linked to the type of trajectory, then omitting missing values from your 
analysis will introduce considerable bias, as you say.

But remember that you can add the missing state as one additional state 
of the alphabet when computing pairwise distances (add with.missing=TRUE 
as an argument to seqdist). However, if you have sequences containing many 
missing values, they will probably end up in the same cluster, since they 
will tend to be very similar to each other. The best thing is to try it.
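
As a rough illustration of the with.missing=TRUE route, here is a minimal sketch. The toy data frame, its state labels, the constant substitution costs, the indel cost and the two-cluster cut are all assumptions made up for the example, not recommendations for your data.

```r
library(TraMineR)

# Hypothetical toy data: four short sequences with holes coded as NA
df <- data.frame(
  t1 = c("school", "school", "school", NA),
  t2 = c("uni",    "work",   NA,       NA),
  t3 = c("uni",    "work",   NA,       "work"),
  t4 = c("uni",    NA,       "work",   "work")
)

# Keep left, gap and right NAs as explicit missing states
seqdata <- seqdef(df, left = NA, gaps = NA, right = NA)

# Constant substitution costs, with the missing state included in the matrix
sm <- seqsubm(seqdata, method = "CONSTANT", with.missing = TRUE)

# Optimal matching distances, treating the missing state as an extra state
dist.om <- seqdist(seqdata, method = "OM", indel = 1, sm = sm,
                   with.missing = TRUE)

# Hierarchical clustering on the distances; k = 2 is arbitrary here
clusters <- cutree(hclust(as.dist(dist.om), method = "ward.D"), k = 2)
```

Cross-tabulating the cluster membership against the number of missing states per sequence is one way to check whether a cluster is driven mainly by the holes.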

Otherwise, you could also use just a single imputation.

Hope these ideas help.

Best regards,
Alexis.

On 20.04.11 11:49, Philip Parker wrote:
> Ah of course! Silly me.
>
> OK, a second question, if that is all right. What would be the typical 
> approach to dealing with missing states in sequence analysis? My 
> initial feeling is to impute; however, I am not sure how one would 
> combine the results gained from TraMineR using multiple imputation.
>
> Alternatively, should one run TraMineR on something like a na.omit 
> data.frame and then impute transition clusters for further analysis? 
> This strategy would seem to introduce considerable bias, and thus I 
> think it is somewhat pointless.
>
> Regards,
>
> Phil
>
> On 20.04.2011 11:32, Gilbert Ritschard wrote:
>> Hi Phil,
>>
>> The seqdplot function (seqplot with the type="d" argument) calls the 
>> seqstatd function 
>> (http://mephisto.unige.ch/traminer/doc/seqstatd.html) in which the 
>> with.missing argument is, by default, set as FALSE.
>>
>> The solution to display the missings in the sequence distribution 
>> plot is to add with.missing=TRUE to the argument list of seqdplot:
>>
>> seqdplot(seqdata, with.missing=TRUE)
>>
>> Hope this helps.
>> Gilbert
>>
>>
>> On 20-Apr-11 10:50, Philip Parker wrote:
>>> Hi All,
>>>
>>> I am looking at using TraMineR to explore post-school transition 
>>> pathways. I would like to identify some theoretical typical 
>>> transitions (straight to university, gap year, vocational training, 
>>> uncertain (or frequent change)). I would then like to use these 
>>> derived clusters taken from TraMineR and look at what variables at 
>>> school predict entry into these different pathways.
>>>
>>> So far so good; however, my data has about 15% missing data (coded as 
>>> left=NA, gaps=NA, right=NA). I note that when I run TraMineR plots 
>>> after clustering (and before), the state distribution plots give 
>>> group transitions without missing data, but the sequence frequency 
>>> plots show the missing data.
>>>
>>> I was wondering whether TraMineR imputes the most likely transition 
>>> pattern for the missing data, so that the state distribution plots 
>>> are based on some sort of imputed full sample, or whether the state 
>>> distribution plots represent some sort of listwise deletion by 
>>> plotting only complete cases. This becomes important after clustering, 
>>> as my four hypothesised groups turn up. The problem is that the 
>>> fourth cluster represents the uncertain (or frequent change) group in 
>>> the distribution plots, but the state frequency plots suggest that the 
>>> group consists largely of people with missing data holes.
>>>
>>> Regards,
>>>
>>> Phil
>>> _______________________________________________
>>> Traminer-users mailing list
>>> Traminer-users at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/traminer-users 
>>>
>>>
>>
>