[Traminer-users] seqfplot issue

Gilbert Ritschard Gilbert.Ritschard at unige.ch
Wed May 18 11:00:53 CEST 2011


Hi Adam,

You are defining a color palette of size 14 (12 colors from brewer.pal 
plus "cyan" and "white"), while your alphabet seems (judging from the 
length of your labels argument) to contain only 13 states.
This mismatch would explain the color mix in your plots.
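
A quick way to check this is to compare the two lengths directly. The 
lines below are only a minimal sketch: they reuse the labels and palette 
from your message, assume the alphabet really has one state per label, 
and drop the last color ("white"), which is an arbitrary choice made 
purely for illustration.

```r
library(RColorBrewer)

# State labels and palette exactly as given in the original message
fx.lab <- c("AD", "CO", "FI", "GM", "HR", "LE", "MF", "MK", "OT",
            "RD", "SC", "SL", "SV")
fx.pal <- c(brewer.pal(n = 12, name = "Set3"), "cyan", "white")

length(fx.lab)  # 13 labels
length(fx.pal)  # 14 colors

# Keep one color per label (assumption: dropping "white" is acceptable)
fx.pal <- fx.pal[seq_along(fx.lab)]
stopifnot(length(fx.pal) == length(fx.lab))

# With the sequence object from the message, the palette could then be
# assigned through TraMineR's accessor:
# cpal(fx.seq) <- fx.pal
```

If length(alphabet(fx.seq)) turns out to differ from the number of 
labels, the palette should be sized to the alphabet rather than to 
fx.lab.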

Gilbert

On 17-May-11 19:19, Kleinbaum, Adam M. wrote:
>
> Hi all,
>
> I’m having an issue with seqfplot I’m hoping you can help me with. I 
> have a series of career sequences over a 77-month observation period 
> for a group of 15,000 or so individuals. I use the following code to 
> read in my data (including labels), define a color palette, and drop 
> any observations with missing data:
>
> fx <- read.csv(file="Seq_function.txt", header=FALSE)
> fx.lab <- c("AD", "CO", "FI", "GM", "HR", "LE", "MF", "MK", "OT",
>             "RD", "SC", "SL", "SV")
> fx.seq <- seqdef(fx, 2:78, labels=fx.lab)
> attr(fx.seq,"cpal") <- c(brewer.pal(n=12, name="Set3"), "cyan", "white")
> fx <- fx[!is.na(fx[,78]), ]
> fx.seq <- fx.seq[seqlength(fx.seq)==77,]
>
> Then I use TraMineR to calculate the substitution cost matrix and do 
> optimal matching. Next I use the “cluster” package to do a cluster 
> analysis and I find that there are 9 prototypical sequences. I’d like 
> to do a bit of graphing for each of the 9 clusters, so I run the 
> following code:
>
> for (i in 1:k.best.om.fx) {
>   seqdplot(fx.seq[pam.best.fx$clustering==i,], withlegend=FALSE,
>            title=paste("Cluster ", i))
>   seqfplot(fx.seq[pam.best.fx$clustering==i,], withlegend=FALSE,
>            title=paste("Cluster ", i))
> }
>
> seqlegend(fx.seq)
>
> where fx.seq contains my sequences, as shown above; pam.best.fx is the 
> pam object that came out of the clustering algorithm and 
> pam.best.fx$clustering contains the index of each actor’s cluster 
> assignment.
>
> The seqdplot command produces a series of 9 beautiful distribution 
> plots, one for each cluster. No problem.
>
> What I want seqfplot to do is, for each cluster, graph out the 
> frequency of sequences within each cluster, from most frequent on down 
> the list. To take one example, the medoid of one cluster is 77 
> observations of all the same job function – people who stay in sales 
> and don’t move. People assigned to that cluster should have spent 
> nearly all 77 periods in that function. The output should look like a 
> long sequence of blocks that are mostly the same color, with another 
> sequence above it that’s a little bit different, but also mostly all 
> the same color. But instead, every single block is a different color 
> from the one next to it. There’s obviously some problem with either 
> how I’m calling the function or how I’m defining my color palette, but 
> I can’t figure out what it is. I’m especially perplexed that seqdplot 
> works properly while an identical call to seqfplot does not. Any 
> ideas? Thanks in advance,
>
> All the best,
>
> Adam
>
> --
>
> Adam M. Kleinbaum
>
> Assistant Professor
>
> Tuck School of Business
>
> Dartmouth College
>
> http://bit.ly/kleinbaum
>
> 603.646.6447
>
>

-- 
Gilbert Ritschard, Department of Economics and
Institute for Demographic and Life Course Studies,
University of Geneva, 40, bd du Pont-d'Arve, CH-1211 Genève 4, Switzerland
http://mephisto.unige.ch


