[Traminer-users] dendrogram

Gilbert Ritschard Gilbert.Ritschard at unige.ch
Thu Aug 2 12:25:47 CEST 2012


The identify.hclust function works on dendrograms obtained by plotting 
the outcome of hclust. The following example works perfectly on my system:

####### start of example ###########
## generating the state sequence object (from the preview on the TraMineR
## web page)
library(TraMineR)
data(mvad)
mvad.alphabet <- c("employment", "FE", "HE", "joblessness", "school",
                   "training")
mvad.labels <- c("employment", "further education", "higher education",
                 "joblessness", "school", "training")
mvad.scodes <- c("EM", "FE", "HE", "JL", "SC", "TR")
mvad.seq <- seqdef(mvad, 17:86, alphabet = mvad.alphabet,
                   states = mvad.scodes, labels = mvad.labels, xtstep = 6)

## OM distances
dist.om1 <- seqdist(mvad.seq, method = "OM", indel = 1, sm = "TRATE")

## generating the hierarchical clusters with hclust
hca <- hclust(as.dist(dist.om1))
plot(hca)

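## now using identify on the dendrogram produced by the above plot function
## (interactive: click on clusters in the plot window; x then lists the
## members of each selected cluster)
(x <- identify(hca))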

###### end example ########
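
If clicking on the plot is not convenient, a non-interactive route may
also help to link individuals with clusters: cutree() returns the cluster
number of each case, and rect.hclust() frames the clusters on the
dendrogram. The following is only a minimal sketch. The number of clusters
(k = 4) is an arbitrary choice, and the built-in USArrests data with
Euclidean distances is used as a stand-in so that the code runs without
TraMineR; the name hc.demo is likewise just for illustration.

```r
## Stand-in tree: Euclidean distances on the built-in USArrests data
## (an assumption for illustration; with sequences, use as.dist() of the
## seqdist() output instead)
hc.demo <- hclust(dist(USArrests))

## Cut the tree into k groups; k = 4 is an arbitrary illustrative choice
cl <- cutree(hc.demo, k = 4)

## cl is a named integer vector: one cluster number per case
table(cl)
names(cl)[cl == 1]   # labels of the cases assigned to cluster 1

## Draw the dendrogram and frame the k clusters on it
plot(hc.demo)
rect.hclust(hc.demo, k = 4)
```

With the example above, the same calls should apply to hca, for instance
cutree(hca, k = 4), which would give one cluster number per sequence of
mvad.seq. Note that the cluster numbers returned by cutree() need not
follow the left-to-right order of the rectangles drawn by rect.hclust().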

Best,
Gilbert


On 01-Aug-12 17:54, so.bellit wrote:
>
> I would like to use the "identify" function to identify clusters in the
> dendrogram, but it does not seem to work with Optimal Matching Analysis.
>
> Thanks
> Sonia
>
>     > Message of 01/08/12 11:29
>     > From: "so.bellit"
>     > To: "Users questions"
>     > Cc:
>     > Subject: [Traminer-users] dendrogram
>     >
>     Hi,
>
>     Does anyone know whether we can link the clusters or individuals with
>     the dendrogram?
>
>     Thanks
>     Sonia
>
>         > Message of 23/07/12 17:49
>         > From: "Joel Schwartz"
>         > To: "Users questions"
>         > Cc:
>         > Subject: Re: [Traminer-users] how to sort a sequence plot
>         > (seqIplot) by more than one variable
>         >
>         Hi Jan,
>
>         Yes, that worked. Using the order() function, I created a new,
>         sorted version of the sequence object and got the sort order I
>         wanted.
>
>         Thanks!
>
>         Joel
>
>         On Jul 19, 2012, at 10:53 AM, Jan Goebel wrote:
>
>             Dear Joel,
>
>             the help page on plot.stslist states that sortv has to be a
>             variable name:
>
>             sortv: name of an optional variable used to sort the
>             sequences before plotting.
>
>             So my guess is that you have to create a new variable within
>             your data frame rather than supply an "external" vector.
>
>             However, ?seqIplot is in my opinion somewhat fuzzy on this,
>             because there you find:
>             "The ‘sortv’ argument can be used to pass a vector of
>             numerical values for sorting the sequences. See
>             ‘plot.stslist’ for a complete list of optional arguments."
>
>             Best wishes,
>
>             Jan
>
>             On 07/17/2012 03:25 PM, Joel Schwartz wrote:
>
>                 Hi Alexis,
>
>                 I tried your suggestion and for some reason it's not
>                 working, even when I create a sorting variable that uses
>                 a single variable to sort on. Here's what I did:
>
>                 # Original version, which works fine. See first attached
>                 # file below, which shows plot is clearly sorted by f09as.
>                 seqIplot(df.seq09, border = NA, withlegend = "right",
>                          sortv = df09$f09as)
>
>                 # Create sorting variable
>                 term.sort = order(df09$f09as)
>
>                 # Create plot again, using sorting variable. This version
>                 # of the plot is unsorted. See second attached file.
>                 seqIplot(df.seq09, border = NA, withlegend = "right",
>                          sortv = term.sort)
>
>                 I checked the sorting variable by looking at what it does
>                 to the original data frame and it looked exactly like it
>                 should, so the sorting variable doesn't seem to have a
>                 problem. Here's the command I used for that:
>                 df09[order(df09$f09as), ]
>
>                 Any idea what could be going wrong? If it would help to
>                 look at the sequence object or data.frame I'm working
>                 with, I can send those.
>
>                 Thanks again,
>
>                 Joel
>
>                 On Jul 16, 2012, at 12:10 AM, Alexis gabadinho wrote:
>
>                     Hi Joel,
>
>                     First use the order function to create a single
>                     sorting variable, and then pass this variable to the
>                     seqIplot function. Here is an example with the biofam
>                     data frame where sequences are sorted by gender and
>                     birthyr:
>
>                     data(biofam)
>                     biofam.seq <- seqdef(biofam, 10:25)
>                     csort <- order(biofam$sex, biofam$birthyr)
>                     seqIplot(biofam.seq, sortv = csort)
>
>                     All the best,
>
>                     Alexis
>
>                     On 16. 07. 12 08:12, Joel Schwartz wrote:
>
>                         I'm a new TraMineR user and just ran into a
>                         problem while using the seqIplot function. I'm
>                         making a plot of hundreds of sequences. To make it
>                         possible to see patterns, I'm trying to sort the
>                         sequences by more than one variable. It works as
>                         expected when I sort by one variable. But when I
>                         try to sort by more than one, I get the exact same
>                         result as when I sort by one variable.
>
>                         Here are the two commands I'm using:
>
>                         # Sort by one variable
>                         seqIplot(df.seq, border = NA, withlegend = "right",
>                                  sortv = df$s09as)
>
>                         # Sort by two variables
>                         seqIplot(df.seq, border = NA, withlegend = "right",
>                                  sortv = c(df$s09as, df$f09as))
>
>                         I get the exact same plot either way, and no
>                         warnings or errors.
>
>                         Is there a way to sort by more than one variable?
>
>                         Thanks for your help.
>
>                         Best Wishes,
>
>                         Joel Schwartz
>
>                         _______________________________________________
>                         Traminer-users mailing list
>                         Traminer-users at lists.r-forge.r-project.org
>                         https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/traminer-users
>
>             --
>             -----------------------------------------
>             Dr. Jan Goebel
>             Head of the Division
>             Data Operation and Research Data Center
>
>             DIW Berlin
>             Socio-Economic Panel Study (SOEP)
>             Mohrenstr. 58
>             D-10117 Berlin -- Germany --
>             phone: +49 30 89789-377
>             -----------------------------------------
>
> _______________________________________________
> Traminer-users mailing list
> Traminer-users at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/traminer-users



