[Traminer-users] IDs in graphs/ table of values

Judith Krüger judithkrueger1 at googlemail.com
Thu Sep 29 09:50:11 CEST 2011

Hey everyone,

I use R and the package TraMineR to analyse payment data in 151 cases. As
you might expect, I have some trouble...

Question 1: I successfully ran an optimal matching analysis and a
cluster analysis, and I generated the necessary graphs. *How do I get the
IDs into the graphs* OR *how can I get a table of every cluster and the
sequences that belong to it?*

Question 2: When plotting, e.g., the mean time spent in each state per
cluster (seqmtplot) or other graphs, *how can I get a table with the
corresponding values* (since I can only guess the values from the graphs)?

Thanks a lot in advance and good luck with all your projects!


> library(TraMineR)
> library(foreign)
> library(cluster)
> library(RColorBrewer)
> datenR <-
to.data.frame=TRUE, use.value.labels=FALSE)
> datenR.labels <- c("ES6", "ES7", "ES8", "ES9", "ES10", "ES11", "ES12",
"ES13", "ES14", "Ruhendes Arb.verh.", "Austritt",
"Muttersch., Erz.-Elt.zeit", "Wehrdienst", "Weiterbildung")
> datenR.seq <- seqdef(datenR, var=20:103, labels=datenR.labels, id="auto")
 [>] found missing values ('NA') in sequence data
 [>] preparing 151 sequences
 [>] coding void elements with '%' and missing values with '*'
 [>] 14 distinct states appear in the data:
     1 = 6
     2 = 7
     3 = 8
     4 = 9
     5 = 10
     6 = 11
     7 = 12
     8 = 13
     9 = 14
     10 = 44
     11 = 55
     12 = 66
 [>] alphabet (state labels):
     1 = 6 (ES6)
     2 = 7 (ES7)
     3 = 8 (ES8)
     4 = 9 (ES9)
     5 = 10 (ES10)
     6 = 11 (ES11)
     7 = 12 (ES12)
     8 = 13 (ES13)
     9 = 14 (ES14)
     10 = 44 (Ruhendes Arb.verh.)
     11 = 55 (Austritt)
     12 = 66 (Muttersch., Erz.-Elt.zeit)
      ... (14 states)
 [>] no color palette attributed, provide one to use graphical functions
 [>] 151 sequences in the data set
 [>] min/max sequence length: 2/84
 [!] no automatic color palete attributed, number of states>12.
     Use 'cpal' argument to define one.
> cpal(datenR.seq) <- c("white", "yellow", "orange", "hotpink", "red1",
"red3", "darkred", "skyblue", "blue", "grey80", "grey60", "springgreen",
"grey20", "purple")
> subcostmatrix <- seqsubm(datenR.seq, method="TRATE")
 [>] creating substitution-cost matrix using transition rates ...
 [>] computing transition rates for states
6/7/8/9/10/11/12/13/14/44/55/66/77/88 ...
> round(subcostmatrix, 2)
      6->  7->  8->  9-> 10-> 11-> 12-> 13-> 14-> 44-> 55-> 66-> 77-> 88->
6->  0.00 1.75 2.00 1.75 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00  2.0
7->  1.75 0.00 1.89 1.98 1.99 2.00 2.00 2.00 2.00 2.00 1.93 1.92 1.72  2.0
8->  2.00 1.89 0.00 1.89 1.99 2.00 2.00 2.00 2.00 2.00 1.86 1.88 1.90  1.4
9->  1.75 1.98 1.89 0.00 1.82 2.00 2.00 2.00 2.00 1.75 2.00 1.98 1.95  2.0
10-> 2.00 1.99 1.99 1.82 0.00 1.85 2.00 2.00 2.00 2.00 2.00 1.99 1.98  2.0
11-> 2.00 2.00 2.00 2.00 1.85 0.00 1.91 2.00 2.00 2.00 2.00 1.90 2.00  2.0
12-> 2.00 2.00 2.00 2.00 2.00 1.91 0.00 1.99 2.00 2.00 2.00 1.92 2.00  2.0
13-> 2.00 2.00 2.00 2.00 2.00 2.00 1.99 0.00 1.99 2.00 2.00 1.97 2.00  2.0
14-> 2.00 2.00 2.00 2.00 2.00 2.00 2.00 1.99 0.00 2.00 2.00 2.00 2.00  2.0
44-> 2.00 2.00 2.00 1.75 2.00 2.00 2.00 2.00 2.00 0.00 2.00 1.75 2.00  2.0
55-> 2.00 1.93 1.86 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.00 1.98 2.00  2.0
66-> 2.00 1.92 1.88 1.98 1.99 1.90 1.92 1.97 2.00 1.75 1.98 0.00 2.00  2.0
77-> 2.00 1.72 1.90 1.95 1.98 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.00  2.0
88-> 2.00 2.00 1.40 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00  0.0
> datenR.om <- seqdist(datenR.seq, method="OM", indel=2, sm=subcostmatrix)
 [>] 151 sequences with 14 distinct events/states
 [>] 147 distinct sequences
 [>] min/max sequence length: 2/84
 [>] computing distances using OM metric
 [>] total time: 0.33 secs
The substitution cost matrix is not symmetric.
> clusterward <- agnes(datenR.om, diss=TRUE, method="ward")
> plot(clusterward, which.plots=2)
> cluster4 <- cutree(clusterward, k=4)
> cluster4 <- factor(cluster4, labels=c("Cluster 1", "Cluster 2",
"Cluster 3", "Cluster 4"))
> table(cluster4)
Cluster 1 Cluster 2 Cluster 3 Cluster 4
       32        17        35        67
> seqfplot(datenR.seq, group=cluster4, pbarw=T, tlim=0, border=NA)
> seqmtplot(datenR.seq, group=cluster4)
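
A minimal sketch of the two tables asked about above, offered as one possible approach and not as a confirmed answer from the list. So that it runs on its own, it assumes the mvad example data that ships with TraMineR in place of datenR; the names mvad.seq, members and mean.times are made up for illustration. The same calls should carry over to datenR.seq and cluster4.

```r
library(TraMineR)
library(cluster)

# Assumption: mvad example data from TraMineR stands in for datenR.
# Columns 17:86 hold the monthly states; only 100 cases are used to keep it fast.
data(mvad)
mvad.seq <- seqdef(mvad[1:100, ], var = 17:86)

# Same pipeline as in the session above: TRATE costs, OM distances, Ward clustering.
sm <- seqsubm(mvad.seq, method = "TRATE")
om <- seqdist(mvad.seq, method = "OM", indel = 2, sm = sm)
ward <- agnes(om, diss = TRUE, method = "ward")
cluster4 <- factor(cutree(ward, k = 4), labels = paste("Cluster", 1:4))

# Question 1: one row per sequence with its ID (the row name) and its cluster.
members <- data.frame(id = rownames(mvad.seq), cluster = cluster4)
print(split(members$id, members$cluster))

# The sequences of one cluster, printed in compact state-permanence (SPS) form.
print(mvad.seq[cluster4 == "Cluster 1", ], format = "SPS")

# Question 2: the values behind seqmtplot, one column per cluster.
# seqmeant() returns the mean time spent in each state for the given sequences.
mean.times <- sapply(levels(cluster4), function(k) {
  seqmeant(mvad.seq[cluster4 == k, ])[, 1]
})
print(round(mean.times, 2))
```

With id="auto" in seqdef, as in the session above, the row names are just 1 to 151; passing a vector of the real case IDs as the id argument would make them show up in the members table instead.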

Judith Krüger
Ph.D. Student