[Traminer-users] Re : Re: error splitting sts Object
Alexis Gabadinho
Alexis.Gabadinho at unige.ch
Fri Mar 16 13:16:39 CET 2012
Yes, if you have NA values in your CLUSTER variable then the expression
CLUSTER==1 produces a logical vector containing NAs, which in turn
introduces sequences containing only NA values in your sequence object
when you select cases with CLUSTER==1.
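
As a rough illustration only (the toy data frame `seqs` and the membership values below are made up, and a plain data frame stands in for a real sequence object), this sketch shows how an NA in the selection vector turns into an all-NA row:

```r
# Hypothetical stand-in for a sequence object: a plain data frame
seqs <- data.frame(t1 = c("A", "B", "A", "C"),
                   t2 = c("B", "B", "C", "C"),
                   stringsAsFactors = FALSE)

# Hypothetical cluster membership with one missing value
CLUSTER <- c(1, NA, 1, 2)

# Comparing with == propagates the NA into the logical vector
sel <- CLUSTER == 1

# Row-subsetting with an NA index returns a row filled with NA
sub_na <- seqs[sel, ]
print(sel)
print(sub_na)
```

Here the subset has three rows rather than two, and the extra one contains only NA values.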
The solution proposed by Jan surely works. But you should investigate
why you have NA in the CLUSTER variable and make sure that each row in
CLUSTER corresponds to the same observation in your sequence object. If
you have clustered your data using a pairwise distance matrix computed on
your sequence object then there should not be any NA in your resulting
cluster membership vector.
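
A minimal sketch of the checks I have in mind, again on made-up data (the data frame `seqs` is a hypothetical stand-in for your sequence object, and `CLUSTER` holds invented values): verify that there is one membership value per sequence, locate the missing ones, and select with a condition that cannot produce NA.

```r
# Hypothetical stand-in for a sequence object and its membership vector
seqs <- data.frame(t1 = c("A", "B", "A", "C"),
                   t2 = c("B", "B", "C", "C"),
                   stringsAsFactors = FALSE)
CLUSTER <- c(1, NA, 1, 2)

# There should be exactly one membership value per sequence
stopifnot(length(CLUSTER) == nrow(seqs))

# Positions of the observations that have no cluster assigned
missing_rows <- which(is.na(CLUSTER))

# Jan's condition: the NA entries become FALSE instead of NA
keep <- !is.na(CLUSTER) & CLUSTER == 1
sub_ok <- seqs[keep, ]

# which() is an alternative, since it silently drops NA
sub_ok2 <- seqs[which(CLUSTER == 1), ]
```

Both selections keep only the rows that really belong to cluster 1; `missing_rows` tells you which observations to look at to understand where the NA values come from.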
By the way I would like to mention that Chris and Jan are the first
non-TraMineR-team members to answer a user's question! This is also what
the users list is intended for. Many thanks to them.
All the best,
Alexis
On 16.03.12 12:54, Jan Goebel wrote:
> Try
> !is.na(cluster) & cluster == 1
>
> Best,
> Jan
>
> Hadrien Commenges --- [Traminer-users] Re : Re: error splitting sts
> Object ---
>
> From: "Hadrien Commenges" <hc at parisgeo.cnrs.fr>
> To: "Users questions" <traminer-users at r-forge.wu-wien.ac.at>
> Date: Fri, 16.03.2012 12:48
> Subject: [Traminer-users] Re : Re: error splitting sts Object
>
> ------------------------------------------------------------------------
>
> Thank you Alexis and Chris for your help. The CLUSTER variable is not
> added to the sequence object (a mistake in the first mail) and it has
> the same length as the sequence object. This variable is the result of
> a cluster analysis and it contains 5 distinct values (groups from 1 to
> 5) and some NA values. Can these NA values be a problem even if I
> select individuals with a non-NA value (like individuals with CLUSTER==1)?
>
>
>
> ----- Original message -----
> From: Chris Cameron <cjc73 at cornell.edu>
> To: Users questions <traminer-users at r-forge.wu-wien.ac.at>
> Sent: Thu, 15 Mar 2012 20:13:11 +0100 (CET)
> Subject: Re: [Traminer-users] error splitting sts Object
>
> I think Alexis was correct in saying " The vector containing group
> membership information should be a standalone vector and should by no
> way be added as a further column to your sequence object." Please
> examine this further, as the code below demonstrates that you are
> introducing an error in your sequences even if it is not the source of
> your particular error message.
>
> Try not appending the cluster labels vector back into the sequence
> object. This definitely changes the sequences in the subsets (in my
> example and testing). Though this is not apparent in your code, I am
> not sure if your summary(stsNWObject) was generated before or after
> you added the cluster variable.
>
> In case it helps, I think the nr variable referenced in the error is a
> variable that refers to the number of rows. You can produce an error
> message that shows nr in this context by summarizing an empty subset
> of the sequence object.
>
> # summary(atus.lim[atus.lim$CLUSTER=='foo',])
> # where "foo" is not present in the CLUSTER list.
>
> ## atus.lim is the sequence dataset
> # atus.lab will be the list of numbers corresponding to the
>
> # Choose Costs
> # Let's suppose that activities that are frequently observed together
> # are more interchangeable
> sub_cost = seqsubm(atus.lim, method="TRATE")
> # if sequence lengths were equal, then
> #indel_cost = 2
> indel_cost = .45*max(sub_cost[upper.tri(sub_cost, diag=FALSE)])
> # Check to see how many subs will not be allowed
> # (they will be deleted and inserted instead)
> sub_cost <= 2*indel_cost
>
> # Compute Distances with Optimal Matching('OM') and costs
> seq_dist = seqdist(atus.lim, method='OM', indel=indel_cost,
> sm=sub_cost, full.matrix=FALSE)
>
> # seq.cluster <- agnes(seq_dist, diss = TRUE, method = "ward")
> # The agnes function does not seem to be working, but we can use hclust
> # Using package fastcluster with overwritten hclust
> seq.cluster <- hclust(seq_dist, method = "ward")
> plot(seq.cluster)
>
>
> # This creates 3 clusters and produces atus.lab, which I think is what
> # you want stsNWObject$CLUSTER to be
> seq.c <- cutree(seq.cluster, k = 3)
> atus.lab <- factor(seq.c, labels = paste("c", 1:3))
>
> # Make a subset using the standalone label vector:
> atus.c1 = atus.lim[atus.lab=='c 1',]
> summary(atus.c1)
>
> [>] sequence object created with TraMineR version 1.8-1
> [>] 622 sequences in the data set, 619 unique
> [>] min/max sequence length: 7/12
> [>] alphabet (state labels):
> 1=1 (Sleep)
> 2=2 (Groom)
> 3=3 (Eat)
> 4=4 (Help)
> 5=5 (Chores)
> 6=6 (Work)
> 7=7 (Local)
> 8=8 (Relax)
> [>] dimensionality of the sequence space: 84
> [>] colors: 1=#7FC97F 2=#BEAED4 3=#FDC086 4=#FFFF99 5=#386CB0
> 6=#F0027F 7=#BF5B17 8=#666666
> [>] symbol for void element: %
>
> # Using your method of appending the cluster column to the sequence data
> # Note this changes the length and dimensionality of the sequences!
> atus.lim$CLUSTER <- factor(seq.c, labels = paste(1:3))
> atus.c1 = atus.lim[atus.lim$CLUSTER==1,]
> summary(atus.c1)
>
> [>] sequence object created with TraMineR version 1.8-1
> [>] 622 sequences in the data set, 619 unique
> [>] min/max sequence length: 8/13
> [>] alphabet (state labels):
> 1=1 (Sleep)
> 2=2 (Groom)
> 3=3 (Eat)
> 4=4 (Help)
> 5=5 (Chores)
> 6=6 (Work)
> 7=7 (Local)
> 8=8 (Relax)
> [>] dimensionality of the sequence space: 91
> [>] colors: 1=#7FC97F 2=#BEAED4 3=#FDC086 4=#FFFF99 5=#386CB0
> 6=#F0027F 7=#BF5B17 8=#666666
> [>] symbol for void element: %
>
> seq.c <- cutree(seq.cluster, k = 10)
> atus.lim$CLUSTER <- factor(seq.c, labels = paste(1:10))
> atus.c1 = atus.lim[atus.lim$CLUSTER==1,]
> summary(atus.c1)
>
>
> On Mar 15, 2012, at 1:10 PM, Hadrien Commenges wrote:
>
> > Hi,
> >
> > I've created a sts object with the seqdef function and I'd like to
> split this object by a factor (cluster). I can use some functions with
> the "group=" option, but I need to work with smaller objects and I
> really want to split- [Your data was truncated.]
>
>