[Traminer-users] seqfplot issue
Alexis Gabadinho
Alexis.Gabadinho at unige.ch
Wed May 18 11:04:26 CEST 2011
Hi Adam,
I can see that you have 13 states in your alphabet, but you define only
12 colors in your palette. Indeed, brewer.pal() can provide at most 12
colors.
You have to use a "home-made" color palette like:
my.pal <- c("blue", "red", "yellow", ....)
See colors() for the available color names.
Then you can pass your color palette directly to seqdef():
fx.seq <- seqdef(fx, 2:78, labels=fx.lab, cpal=my.pal)
I'm pretty sure this is the problem.
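
As a quick sanity check before calling seqdef(), here is a minimal sketch in base R. The 13 color names are an arbitrary illustrative choice, not a recommended palette, and the seqdef() call is left commented out because it needs TraMineR and the data from the original post:

```r
# Alphabet from the original post: 13 state labels.
fx.lab <- c("AD", "CO", "FI", "GM", "HR", "LE", "MF", "MK", "OT",
            "RD", "SC", "SL", "SV")

# Hand-made palette of base R color names (see colors()).
# Assumption: these 13 names are picked only for illustration.
my.pal <- c("blue", "red", "yellow", "green", "orange", "purple", "cyan",
            "brown", "pink", "grey", "black", "darkgreen", "gold")

# Stop early unless every name is a known color, no color is used
# twice, and there is exactly one color per state.
stopifnot(all(my.pal %in% colors()),
          !anyDuplicated(my.pal),
          length(my.pal) == length(fx.lab))

# With TraMineR loaded and fx read in as in the original post:
# fx.seq <- seqdef(fx, 2:78, labels = fx.lab, cpal = my.pal)
```

If stopifnot() raises an error, the palette and the alphabet do not match and the plots cannot be trusted.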
All the best,
Alexis.
On 18.05.11 10:32, Matthias Studer wrote:
> Hi all,
>
> I hope I have understood your question correctly. Usually, the problem
> with seqfplot is that the most frequent sequences are the stable ones,
> especially when sequences are long (and 77 months is long). This is
> because every small difference is taken into account, so two sequences
> that differ in a single month count as distinct. To overcome this
> problem, you may use "representative sequences". This procedure tries
> to group sequences that differ only marginally and to identify a set
> of sequences that represents a given percentage of the sequences in
> that group (see the help page of the seqrep function).
>
> The procedure and the concept of representative sequences are described
> in this reference: Gabadinho A, Ritschard G, Studer M, Müller NS
> (2011). "Extracting and Rendering Representative Sequences", In A
> Fred, JLG Dietz, K Liu, J Filipe (eds.), /Knowledge Discovery,
> Knowledge Engineering and Knowledge Management/, volume 128 of
> /Communications in Computer and Information Science (CCIS)/, pp.
> 94-106. Springer-Verlag.
>
> I have another suggestion regarding your code. You may use the group
> argument (available in all seq*plot functions) to automatically draw a
> separate plot for each group of sequences.
>
> seqdplot(fx.seq, group=pam.best.fx$clustering)
>
>
> Hope this helps.
> Matthias Studer
>
> On 17.05.2011 19:19, Kleinbaum, Adam M. wrote:
>>
>> Hi all,
>>
>> I'm having an issue with seqfplot I'm hoping you can help me with. I
>> have a series of career sequences over a 77-month observation period
>> for a group of 15,000 or so individuals. I use the following code to
>> read in my data (including labels), define a color palette, and drop
>> any observations with missing data:
>>
>> fx <- read.csv(file="Seq_function.txt", header=FALSE)
>> fx.lab <- c("AD", "CO", "FI", "GM", "HR", "LE", "MF", "MK", "OT",
>>             "RD", "SC", "SL", "SV")
>> fx.seq <- seqdef(fx, 2:78, labels=fx.lab)
>> attr(fx.seq,"cpal") <- c(brewer.pal(n=12, name="Set3"), "cyan", "white")
>> fx <- fx[!is.na(fx[,78]), ]
>> fx.seq <- fx.seq[seqlength(fx.seq)==77,]
>>
>> Then I use TraMineR to calculate the substitution cost matrix and do
>> optimal matching. Next I use the "cluster" package to do a cluster
>> analysis and I find that there are 9 prototypical sequences. I'd
>> like to do a bit of graphing for each of the 9 clusters, so I run the
>> following code:
>>
>> for (i in 1:k.best.om.fx)
>> {
>>   seqdplot(fx.seq[pam.best.fx$clustering==i,], withlegend=F,
>>            title=paste("Cluster ", i))
>>   seqfplot(fx.seq[pam.best.fx$clustering==i,], withlegend=F,
>>            title=paste("Cluster ", i))
>> }
>>
>> seqlegend(fx.seq)
>>
>> where fx.seq contains my sequences, as shown above; pam.best.fx is
>> the pam object that came out of the clustering algorithm and
>> pam.best.fx$clustering contains the index of each actor's cluster
>> assignment.
>>
>> The seqdplot command produces a series of 9 beautiful distribution
>> plots, one for each cluster. No problem.
>>
>> What I want seqfplot to do is, for each cluster, graph out the
>> frequency of sequences within each cluster, from most frequent on
>> down the list. To take one example, the medoid of one cluster is 77
>> observations of all the same job function -- people who stay in sales
>> and don't move. People assigned to that cluster should have spent
>> nearly all 77 periods in that function. The output should look like
>> a long sequence of blocks that are mostly the same color, with
>> another sequence above it that's a little bit different, but also
>> mostly all the same color. But instead, every single block is a
>> different color from the one next to it. There's obviously some
>> problem with either how I'm calling the function or how I'm defining
>> my color palette, but I can't figure out what it is. I'm especially
>> perplexed that seqdplot works properly while an identical call to
>> seqfplot does not. Any ideas? Thanks in advance,
>>
>> All the best,
>>
>> Adam
>>
>> --
>>
>> Adam M. Kleinbaum
>>
>> Assistant Professor
>>
>> Tuck School of Business
>>
>> Dartmouth College
>>
>> http://bit.ly/kleinbaum
>>
>> 603.646.6447
>>
>>
>> _______________________________________________
>> Traminer-users mailing list
>> Traminer-users at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/traminer-users