[Traminer-users] IDs in graphs/ table of values

Alexis Gabadinho Alexis.Gabadinho at unige.ch
Mon Oct 3 13:46:24 CEST 2011


Hi Judith,

Question 1:
To get the sequence ids (indexes) for each cluster, you can use for example:

which(cluster4=="Cluster 1")
which(cluster4=="Cluster 2")
...

Each call returns the indexes (row positions in datenR.seq) of the 
sequences classified in that cluster.
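
If a single table of cluster membership is more convenient, the same idea can be sketched with split(). This is only an illustration on toy data: the `ids` vector and the six-case `cluster4` factor below are made up, and with your data you would presumably use rownames(datenR.seq) for the ids.

```r
# Toy stand-in for the cluster4 factor from the original post (hypothetical values)
cluster4 <- factor(c(1, 2, 1, 3, 4, 2),
                   labels = c("Cluster 1", "Cluster 2", "Cluster 3", "Cluster 4"))

# Hypothetical case ids; with real data, rownames(datenR.seq) would be used instead
ids <- paste0("case", seq_along(cluster4))

# One list element per cluster, each holding the ids of its members
members <- split(ids, cluster4)
print(members)

# Alternatively, a two-column table with one row per sequence
membership <- data.frame(id = ids, cluster = cluster4)
print(membership[order(membership$cluster), ])
```

The list form answers "which sequences are in cluster k", while the data frame can be exported with write.csv() if a table is needed outside R.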

Question 2:
Each plot type has a corresponding function that computes the plotted 
statistics. For seqmtplot(), this is the seqmeant() function. See our 
paper in the Journal of Statistical Software 
(http://www.jstatsoft.org/v40/i04) and the manual pages.
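
As a rough sketch of how the per-cluster values could be tabulated, the example below calls seqmeant() on each subset of sequences. It uses the mvad example data shipped with TraMineR and the `male` variable as a stand-in grouping, which is an assumption made only so the example is self-contained; with your data, the sequence object would be datenR.seq and the grouping would be cluster4.

```r
library(TraMineR)

# Example data shipped with TraMineR; columns 17:86 hold the monthly states
data(mvad)
mvad.seq <- seqdef(mvad, 17:86)

# Stand-in grouping variable (in the original post this would be cluster4)
groups <- factor(mvad$male)

# Mean time spent in each state, computed separately for each group.
# seqmeant() returns a one-column matrix; [, 1] turns it into a named vector.
mt <- sapply(levels(groups), function(g) {
  seqmeant(mvad.seq[groups == g, ])[, 1]
})

# Rows are states, columns are groups
print(round(mt, 2))
```

The resulting matrix should hold the values that seqmtplot(..., group=...) displays as bars, one column per group.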

All the best,
Alexis

On 29.09.11 09:50, Judith Krüger wrote:
>
> Hey everyone,
>
> I use R and the package TraMineR to analyse payment data in 151 cases. 
> As you might expect, I have some trouble...
>
> Question 1: I successfully calculated an optimal matching analysis and 
> a cluster analysis. I also generated the necessary graphs. *How do I 
> get the IDs into the graphs* OR *How can I get a table of every 
> cluster and the relevant sequences?*
>
> Question 2: E.g., when plotting the mean times spent in each state per 
> cluster (seqmtplot) or other graphs, *how can I get a table with the 
> corresponding values* (since I can only guess the values in the 
> graphs)?
>
> Thanks a lot in advance and good luck with all your projects!
>
> Greetz
> Judie
>
> > library(TraMineR)
> > library(foreign)
> > library(cluster)
> > library(RColorBrewer)
> > datenR <- 
> read.spss("Y:\\DOKTORARBEIT\\1_Dissertation\\3_Methode\\2_Erwerbsverläufe\\5_R\\datenR.sav", 
> to.data.frame=TRUE, use.value.labels=FALSE)
> > datenR.labels <- c("ES6", "ES7", "ES8", "ES9", "ES10", "ES11", 
> "ES12", "ES13", "ES14", "Ruhendes Arb.verh.", "Austritt", "Muttersch., 
> Erz.-Elt.zeit", "Wehrdienst", "Weiterbildung")
> > datenR.seq <- seqdef(datenR, var=20:103, labels=datenR.labels, id="auto")
>  [>] found missing values ('NA') in sequence data
>  [>] preparing 151 sequences
>  [>] coding void elements with '%' and missing values with '*'
>  [>] 14 distinct states appear in the data:
>      1 = 6
>      2 = 7
>      3 = 8
>      4 = 9
>      5 = 10
>      6 = 11
>      7 = 12
>      8 = 13
>      9 = 14
>      10 = 44
>      11 = 55
>      12 = 66
>       ...
>  [>] alphabet (state labels):
>      1 = 6 (ES6)
>      2 = 7 (ES7)
>      3 = 8 (ES8)
>      4 = 9 (ES9)
>      5 = 10 (ES10)
>      6 = 11 (ES11)
>      7 = 12 (ES12)
>      8 = 13 (ES13)
>      9 = 14 (ES14)
>      10 = 44 (Ruhendes Arb.verh.)
>      11 = 55 (Austritt)
>      12 = 66 (Muttersch., Erz.-Elt.zeit)
>       ... (14 states)
>  [>] no color palette attributed, provide one to use graphical functions
>  [>] 151 sequences in the data set
>  [>] min/max sequence length: 2/84
> Warning message:
>  [!] no automatic color palete attributed, number of states>12.
>      Use 'cpal' argument to define one.
> > cpal(datenR.seq) <- c("white", "yellow", "orange", "hotpink", "red1", 
> "red3", "darkred", "skyblue", "blue", "grey80", "grey60", 
> "springgreen", "grey20", "purple")
> > subcostmatrix <- seqsubm(datenR.seq, method="TRATE")
>  [>] creating substitution-cost matrix using transition rates ...
>  [>] computing transition rates for states 
> 6/7/8/9/10/11/12/13/14/44/55/66/77/88 ...
> > round(subcostmatrix, 2)
>       6->  7->  8->  9-> 10-> 11-> 12-> 13-> 14-> 44-> 55-> 66-> 77-> 88->
> 6->  0.00 1.75 2.00 1.75 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00  2.0
> 7->  1.75 0.00 1.89 1.98 1.99 2.00 2.00 2.00 2.00 2.00 1.93 1.92 1.72  2.0
> 8->  2.00 1.89 0.00 1.89 1.99 2.00 2.00 2.00 2.00 2.00 1.86 1.88 1.90  1.4
> 9->  1.75 1.98 1.89 0.00 1.82 2.00 2.00 2.00 2.00 1.75 2.00 1.98 1.95  2.0
> 10-> 2.00 1.99 1.99 1.82 0.00 1.85 2.00 2.00 2.00 2.00 2.00 1.99 1.98  2.0
> 11-> 2.00 2.00 2.00 2.00 1.85 0.00 1.91 2.00 2.00 2.00 2.00 1.90 2.00  2.0
> 12-> 2.00 2.00 2.00 2.00 2.00 1.91 0.00 1.99 2.00 2.00 2.00 1.92 2.00  2.0
> 13-> 2.00 2.00 2.00 2.00 2.00 2.00 1.99 0.00 1.99 2.00 2.00 1.97 2.00  2.0
> 14-> 2.00 2.00 2.00 2.00 2.00 2.00 2.00 1.99 0.00 2.00 2.00 2.00 2.00  2.0
> 44-> 2.00 2.00 2.00 1.75 2.00 2.00 2.00 2.00 2.00 0.00 2.00 1.75 2.00  2.0
> 55-> 2.00 1.93 1.86 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.00 1.98 2.00  2.0
> 66-> 2.00 1.92 1.88 1.98 1.99 1.90 1.92 1.97 2.00 1.75 1.98 0.00 2.00  2.0
> 77-> 2.00 1.72 1.90 1.95 1.98 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.00  2.0
> 88-> 2.00 2.00 1.40 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00  0.0
> > datenR.om <- seqdist(datenR.seq, method="OM", indel=2, sm=subcostmatrix)
>  [>] 151 sequences with 14 distinct events/states
>  [>] 147 distinct sequences
>  [>] min/max sequence length: 2/84
>  [>] computing distances using OM metric
>  [>] total time: 0.33 secs
> Warning message:
> The substitution cost matrix is not symmetric.
> > clusterward <- agnes(datenR.om, diss=TRUE, method="ward")
> > plot(clusterward, which.plots=2)
> > cluster4 <- cutree(clusterward, k=4)
> > cluster4 <- factor(cluster4, labels=c("Cluster 1", "Cluster 2", 
> "Cluster 3", "Cluster 4"))
> > table(cluster4)
> cluster4
> Cluster 1 Cluster 2 Cluster 3 Cluster 4
>        32        17        35        67
> > seqfplot(datenR.seq, group=cluster4, pbarw=T, tlim=0, border=NA)
> > seqmtplot(datenR.seq, group=cluster4)
>
> Judith Krüger
> Ph.D. Student
> Germany
>
>
