[Traminer-users] How is NA treated in TraMineR?

Wed Apr 20 10:50:09 CEST 2011

Hi All,

I am looking at using TraMineR to explore post-school transition 
pathways. I would like to identify some theoretically typical 
transitions (straight to university, gap year, vocational training, 
uncertain or frequent change). I would then like to take the clusters 
derived from TraMineR and look at which variables at school predict 
entry into these different pathways.

So far so good. However, about 15% of my data are missing (coded with 
left=NA, gaps=NA, right=NA). I note that when I run the TraMineR plots, 
both before and after clustering, the state distribution plots show the 
group transitions without any missing data, whereas the sequence 
frequency plots do show the missing data.

I was wondering whether TraMineR imputes the most likely transition 
pattern for the missing data, so that the state distribution plots are 
based on some sort of imputed full sample, or whether the state 
distribution plots reflect some sort of listwise deletion, plotting 
only complete cases. This becomes important after clustering, as my 
four hypothesised groups do turn up. The problem is that the fourth 
cluster looks like the uncertain (or frequent change) group in the 
state distribution plots, but the sequence frequency plots suggest that 
this group consists largely of people with gaps of missing data.
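
To make the distinction concrete, here is a toy sketch of two ways a 
state distribution could be computed when some states are missing. It 
is written in Python purely for illustration. The sequences are made 
up, the function names are hypothetical, and none of this is taken from 
TraMineR's code. It only shows that dropping missing states position by 
position and dropping incomplete sequences altogether can give 
different pictures.

```python
from collections import Counter

# Toy sequences (made-up data); None marks a missing state.
sequences = [
    ["school", "uni", "uni", "uni"],
    ["school", "gap", "uni", "uni"],
    ["school", None, "work", None],
    ["school", "work", None, "uni"],
]


def state_distribution(seqs):
    """Share of each state at every position, ignoring missing entries."""
    result = []
    for position in zip(*seqs):
        observed = [s for s in position if s is not None]
        counts = Counter(observed)
        total = len(observed)
        # An all-missing position yields an empty distribution.
        result.append({state: n / total for state, n in counts.items()} if total else {})
    return result


def complete_cases(seqs):
    """Listwise deletion: keep only sequences with no missing state."""
    return [s for s in seqs if None not in s]


# Option 1: use every observed state at each position (available cases).
available_case = state_distribution(sequences)

# Option 2: drop incomplete sequences first, then compute the distribution.
listwise = state_distribution(complete_cases(sequences))

for t, (a, l) in enumerate(zip(available_case, listwise)):
    print(t, "available:", a, "| listwise:", l)
```

In this toy data the "work" state shows up at the second position under 
the first option but disappears entirely under listwise deletion. That 
is the kind of difference I am worried about in my fourth cluster.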

Regards,

Phil