[Traminer-users] Possible TraMineR bug: Displaying an event sequence object crashes R (reproducible)

Matthias Studer Matthias.Studer at unige.ch
Mon Nov 26 12:38:23 CET 2012


Dear Bertolt Meyer,

Many thanks for your detailed bug report. It is very important for us to 
know when something is going wrong.

Your example does not crash on my machine. I think that the reason is 
that this bug has already been reported (see 
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1368&group_id=743&atid=2975 
). If this is the case, it has been fixed in the latest version of 
TraMineR. Please try to update TraMineR. To do that, you will need to 
update R (to version 2.15.2). After that, please run your example again 
and report back your results.

Your sequences contain a lot of different events. Because of that, if 
you plan to search for frequent subsequences, you will have to set the 
maxK parameter to a low value. Otherwise, the computation will take far 
too long (i.e. years)... There is an interaction between the minimum 
support and the maxK value: if you set pMinSupport to a high value, 
you can set maxK to a higher value. You can probably also set maxK to a 
higher value if you use different kinds of time constraints.

seqefsub(my_seqe, pMinSupport=0.2, maxK=3)

In your case, I suggest that you consider using tevent="state". It will 
generate one event per "state" and not one per transition between states 
(as the default procedure does).

my_seqe <- seqecreate(my_seq, tevent="state")
my_seqe
seqefsub(my_seqe, pMinSupport=0.2, maxK=3)
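
To give a rough sense of why the number of distinct events and maxK matter so much, here is a back-of-the-envelope sketch in base R. It is not what seqefsub() actually does (the real search prunes candidates using the minimum support); it only counts how many ordered subsequences of length 1 to maxK could be formed in principle. The helper n_candidates() is a hypothetical name introduced for illustration, and the event counts are assumptions: 12 events with tevent="state", and at most 12 + 12*11 = 144 events with transition events if every start state and every transition between the 12 states occurred in the data.

```r
# Upper bound on the number of ordered subsequences of length 1..maxK
# that can be built from n_events distinct events.
# Simplifying assumption: one event per position, no simultaneous events.
n_candidates <- function(n_events, maxK) {
  sum(n_events^seq_len(maxK))
}

n_states <- 12

# Assumed event counts (illustrative upper bounds, not read from the data):
n_events_state      <- n_states                             # one event per state
n_events_transition <- n_states + n_states * (n_states - 1) # start states + transitions

n_candidates(n_events_state, 3)       # 1884
n_candidates(n_events_transition, 3)  # 3006864

# The bound grows geometrically with maxK:
sapply(1:5, function(k) n_candidates(n_events_transition, k))
```

Under these assumptions, the candidate space with transition events is more than a thousand times larger at maxK=3, and the gap widens with every additional step of maxK.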

Many thanks again for your bug report.
Kind regards,
Matthias Studer


Le 26.11.2012 12:00, Bertolt Meyer a écrit :
> Dear TraMineR developers,
>
> I may have come across a bug in TraMineR: R (Version 2.14.2) crashes if I try to display my event sequence object. I tried it on different machines and the problem is reproducible. An executable code snippet reproducing the problem is at the bottom of the mail.
>
> I have a data set containing 45 sequences, min/max sequence length: 176/415. These are 45 group discussions, for which the discussants' speech acts have been coded as the states. Creating the sequence works fine:
>
>> my_seq <- seqdef(my_seq_data_wide, 2:416, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>   [>] found missing values ('NA') in sequence data
>   [>] preparing 45 sequences
>   [>] coding void elements with '%' and missing values with '*'
>   [>] alphabet (state labels):
>       1 = C (Content_Neither...nor)
>       2 = CP (Content_Proposal)
>       3 = CQ (Content_Question)
>       4 = R (Regulation_Neither...nor)
>       5 = RP (Regulation_Proposal)
>       6 = RQ (Regulation_Question)
>       7 = SEn (SENegative_Neither...nor)
>       8 = SEnP (SENegative_Proposal)
>       9 = SEnQ (SENegative_Question)
>       10 = SEp (SEPositive_Neither...nor)
>       11 = SEpP (SEPositive_Proposal)
>       12 = SEpQ (SEPositive_Question)
>   [>] 45 sequences in the data set
>   [>] min/max sequence length: 176/415
>
> Calculating transition rates with seqtrate() works well. However, executing the second line of the following two lines of code crashes R:
>
> my_seqe <- seqecreate(my_seq)
> my_seqe
>
> I assume that there is something in the data that TraMineR cannot deal with, because it works if I limit the length of the sequences to 29, i.e.,
>
> my_seq_short <- seqdef(my_seq_data_wide, 2:30, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>
> my_seqe <- seqecreate(my_seq_short)
> my_seqe
>
> Could this be a bug? Please find the complete code for reproducing this problem below.
>
> Best greetings,
> Bertolt
>
> # Complete code for reproducing the problem:
> # Load data set (takes a while, ~2 MB):
> my_seq_data_long <- read.csv(file = "http://dl.dropbox.com/u/5384027/ex_data_bertolt.csv")
> str(my_seq_data_long)
>
> # Convert to wide format
> my_seq_data_wide <- reshape(my_seq_data_long[,c("Group_number", "Number", "main_min_category")],
>                        idvar = "Group_number", timevar = "Number", direction = "wide")
>
> names(my_seq_data_wide) # The sequence events span variables 2:416
>
> # Create labels for sequence data
> my_seq_data.labels <- levels(my_seq_data_long$main_min_category)
> my_seq_data.labels
>
> # Create alphabet
> my_seq_data.alphabet <- c("C", "CP", "CQ", "R", "RP", "RQ",
>                      "SEn", "SEnP", "SEnQ", "SEp", "SEpP", "SEpQ")
> my_seq_data.alphabet
>
> # Create state sequence object
> library(TraMineR)
> my_seq <- seqdef(my_seq_data_wide, 2:416, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>
> # Calculate transition rates
> round(seqtrate(my_seq), 2)
>
> # Creating an event sequence object works if I use short sequences:
> my_seq_short <- seqdef(my_seq_data_wide, 2:30, states = my_seq_data.alphabet, labels = my_seq_data.labels)
> my_seqe <- seqecreate(my_seq_short)
> my_seqe
>
> # However, if I use the full data set, R crashes when calling the second line of
> # the following two lines:
> my_seqe <- seqecreate(my_seq)
> my_seqe
>
>
> --
> Dr. Bertolt Meyer, Dipl.-Psych.
> Lehrstuhlvertretung
> Institut für Psychologie
> Wirtschafts-, Organisations- und Sozialpsychologie
>
> TU Chemnitz
> Wilhelm-Raabe-Straße 43
> 09107 Chemnitz
>
> Tel: +49 (0)371 531-32972
> Fax: +49 (0)371 531-839627
>
> mail: bertolt.meyer at psychologie.tu-chemnitz.de
> web: http://www.tu-chemnitz.de/hsw/psychologie/professuren/sozpsy/meyer.php
>
>


-- 
Matthias Studer
Institut d'études démographiques et du parcours de vie
et Département des sciences économiques
Uni-Mail, bureau 5205
40, bd du Pont d'Arve
1211 Genève 4
Tel: +41 22 379 82 15
Fax: +41 22 379 82 99
