[Traminer-users] IDs in graphs/ table of values
Alexis Gabadinho
Alexis.Gabadinho at unige.ch
Mon Oct 3 13:46:24 CEST 2011
Hi Judith,
Question 1:
To get the sequence ids (indexes) for each cluster, you can use for example:
which(cluster4=="Cluster 1")
which(cluster4=="Cluster 2")
...
Each call returns the row indexes of the sequences assigned to that
cluster (cluster 1, 2, ...).
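To get every cluster's members in one table-like object, something along these lines might help. It is only a sketch on toy data: the six-element cluster4 and the ids vector are made up for illustration, and with the real data rownames(datenR.seq) would presumably supply the ids.

```r
# Toy stand-in for the cluster4 factor from the session (hypothetical values)
cluster4 <- factor(c(1, 2, 1, 3, 2, 4),
                   labels = c("Cluster 1", "Cluster 2", "Cluster 3", "Cluster 4"))

# Hypothetical sequence ids, one per sequence
ids <- paste0("case", seq_along(cluster4))

# Row indexes of the sequences classified in cluster 1
which(cluster4 == "Cluster 1")

# All clusters at once: a named list with one vector of ids per cluster
members <- split(ids, cluster4)
members
```

split() keeps the factor levels as list names, so members[["Cluster 2"]] gives the ids for that cluster directly.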
Question 2:
Each plot type has a corresponding function that computes the
statistics shown in the plot. For seqmtplot(), this is the seqmeant()
function. See our paper in the Journal of Statistical Software
(http://www.jstatsoft.org/v40/i04) and the manual pages.
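As a rough illustration of getting the values per group, here is a sketch using the mvad example data shipped with TraMineR instead of your data. The grouping variable (gcse5eq) and the object names are only stand-ins; with your objects you would subset datenR.seq by cluster4 in the same way.

```r
library(TraMineR)

# Example data from the package: monthly states are in columns 17 to 86
data(mvad)
mvad.seq <- seqdef(mvad, 17:86)

# Stand-in grouping factor (plays the role of cluster4)
grp <- mvad$gcse5eq

# Mean time spent in each state, all sequences together
seqmeant(mvad.seq)

# Same statistic computed separately for each group
mean_times <- lapply(levels(grp), function(g) seqmeant(mvad.seq[grp == g, ]))
names(mean_times) <- levels(grp)
mean_times
```

Each list element holds the mean times for one group, which should be the values plotted by seqmtplot() with the group argument.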
All the best,
Alexis
On 29.09.11 09:50, Judith Krüger wrote:
>
> Hey everyone,
>
> I use R and the package TraMineR to analyse payment data in 151 cases.
> As you might expect, I have some trouble...
>
> Question 1: I successfully calculated an optimal matching analysis and
> a cluster analysis. I also generated the necessary graphs. *How do I
> get the IDs into the graphs* OR *How can I get a table of every
> cluster and the relevant sequences?*
>
> Question 2: E.g., when plotting the mean times spent in each state per
> cluster (seqmtplot) or other graphs, *how can I get a table with the
> corresponding values* (since I can only guess the values in the
> graphs)?
>
> Thanks a lot in advance and good luck with all your projects!
>
> Greetz
> Judie
>
> > library(TraMineR)
> > library(foreign)
> > library(cluster)
> > library(RColorBrewer)
> > datenR <-
> read.spss("Y:\\DOKTORARBEIT\\1_Dissertation\\3_Methode\\2_Erwerbsverläufe\\5_R\\datenR.sav",
> to.data.frame=TRUE, use.value.labels=FALSE)
> > datenR.labels <- c("ES6", "ES7", "ES8", "ES9", "ES10", "ES11",
> "ES12", "ES13", "ES14", "Ruhendes Arb.verh.", "Austritt", "Muttersch.,
> Erz.-Elt.zeit", "Wehrdienst", "Weiterbildung")
> > datenR.seq <- seqdef(datenR, var=20:103, labels=datenR.labels, id="auto")
> [>] found missing values ('NA') in sequence data
> [>] preparing 151 sequences
> [>] coding void elements with '%' and missing values with '*'
> [>] 14 distinct states appear in the data:
> 1 = 6
> 2 = 7
> 3 = 8
> 4 = 9
> 5 = 10
> 6 = 11
> 7 = 12
> 8 = 13
> 9 = 14
> 10 = 44
> 11 = 55
> 12 = 66
> ...
> [>] alphabet (state labels):
> 1 = 6 (ES6)
> 2 = 7 (ES7)
> 3 = 8 (ES8)
> 4 = 9 (ES9)
> 5 = 10 (ES10)
> 6 = 11 (ES11)
> 7 = 12 (ES12)
> 8 = 13 (ES13)
> 9 = 14 (ES14)
> 10 = 44 (Ruhendes Arb.verh.)
> 11 = 55 (Austritt)
> 12 = 66 (Muttersch., Erz.-Elt.zeit)
> ... (14 states)
> [>] no color palette attributed, provide one to use graphical functions
> [>] 151 sequences in the data set
> [>] min/max sequence length: 2/84
> Warning message:
> [!] no automatic color palete attributed, number of states>12.
> Use 'cpal' argument to define one.
> > cpal(datenR.seq) <- c("white", "yellow", "orange", "hotpink", "red1",
> "red3", "darkred", "skyblue", "blue", "grey80", "grey60",
> "springgreen", "grey20", "purple")
> > subcostmatrix <- seqsubm(datenR.seq, method="TRATE")
> [>] creating substitution-cost matrix using transition rates ...
> [>] computing transition rates for states
> 6/7/8/9/10/11/12/13/14/44/55/66/77/88 ...
> > round(subcostmatrix, 2)
> 6-> 7-> 8-> 9-> 10-> 11-> 12-> 13-> 14-> 44-> 55-> 66-> 77-> 88->
> 6-> 0.00 1.75 2.00 1.75 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.0
> 7-> 1.75 0.00 1.89 1.98 1.99 2.00 2.00 2.00 2.00 2.00 1.93 1.92 1.72 2.0
> 8-> 2.00 1.89 0.00 1.89 1.99 2.00 2.00 2.00 2.00 2.00 1.86 1.88 1.90 1.4
> 9-> 1.75 1.98 1.89 0.00 1.82 2.00 2.00 2.00 2.00 1.75 2.00 1.98 1.95 2.0
> 10-> 2.00 1.99 1.99 1.82 0.00 1.85 2.00 2.00 2.00 2.00 2.00 1.99 1.98 2.0
> 11-> 2.00 2.00 2.00 2.00 1.85 0.00 1.91 2.00 2.00 2.00 2.00 1.90 2.00 2.0
> 12-> 2.00 2.00 2.00 2.00 2.00 1.91 0.00 1.99 2.00 2.00 2.00 1.92 2.00 2.0
> 13-> 2.00 2.00 2.00 2.00 2.00 2.00 1.99 0.00 1.99 2.00 2.00 1.97 2.00 2.0
> 14-> 2.00 2.00 2.00 2.00 2.00 2.00 2.00 1.99 0.00 2.00 2.00 2.00 2.00 2.0
> 44-> 2.00 2.00 2.00 1.75 2.00 2.00 2.00 2.00 2.00 0.00 2.00 1.75 2.00 2.0
> 55-> 2.00 1.93 1.86 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.00 1.98 2.00 2.0
> 66-> 2.00 1.92 1.88 1.98 1.99 1.90 1.92 1.97 2.00 1.75 1.98 0.00 2.00 2.0
> 77-> 2.00 1.72 1.90 1.95 1.98 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.00 2.0
> 88-> 2.00 2.00 1.40 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 0.0
> > datenR.om <- seqdist(datenR.seq, method="OM", indel=2, sm=subcostmatrix)
> [>] 151 sequences with 14 distinct events/states
> [>] 147 distinct sequences
> [>] min/max sequence length: 2/84
> [>] computing distances using OM metric
> [>] total time: 0.33 secs
> Warning message:
> The substitution cost matrix is not symmetric.
> > clusterward <- agnes(datenR.om, diss=TRUE, method="ward")
> > plot(clusterward, which.plots=2)
> > cluster4 <- cutree(clusterward, k=4)
> > cluster4 <- factor(cluster4, labels=c("Cluster 1", "Cluster 2",
> "Cluster 3", "Cluster 4"))
> > table(cluster4)
> cluster4
> Cluster 1 Cluster 2 Cluster 3 Cluster 4
> 32 17 35 67
> > seqfplot(datenR.seq, group=cluster4, pbarw=T, tlim=0, border=NA)
> > seqmtplot(datenR.seq, group=cluster4)
>
> Judith Krüger
> Ph.D. Student
> Germany
>
>
> _______________________________________________
> Traminer-users mailing list
> Traminer-users at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/traminer-users