[Traminer-users] Possible TraMineR bug: Displaying an event sequence object crashes R (reproducible)
Matthias Studer
Matthias.Studer at unige.ch
Mon Nov 26 12:38:23 CET 2012
Dear Bertolt Meyer,
Many thanks for your detailed bug report. It is very important for us to
know when something is going wrong.
Your example does not crash on my machine. I think that the reason is
that this bug has already been reported (see
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1368&group_id=743&atid=2975
). If this is the case, it has been fixed in the latest version of
TraMineR. Please try to update TraMineR. To do that, you will need to
update R (to version 2.15.2). After that, please run your example again
and report back your results.
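
A quick way to see what is currently installed is sketched below. It assumes only base R functions; the 2.15.2 threshold is the version mentioned above, and the install.packages() call is left commented out because it needs network access.

```r
# Minimum R version mentioned above for the fixed TraMineR release
needed_r <- "2.15.2"
r_ok <- getRversion() >= needed_r
cat("Running", R.version.string, "- new enough:", r_ok, "\n")

# Report the installed TraMineR version, if the package is present
if (requireNamespace("TraMineR", quietly = TRUE)) {
  cat("TraMineR version:", as.character(packageVersion("TraMineR")), "\n")
} else {
  cat("TraMineR is not installed\n")
}

# To fetch the latest release from CRAN, uncomment the next line:
# install.packages("TraMineR")
```
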
Your sequences contain many distinct events. Because of that, if
you plan to search for frequent subsequences, you will have to set the
maxK parameter to a low value. Otherwise the computation will take far
too long (i.e. years). The minimum support and maxK interact: if you
set pMinSupport to a high value, you can set maxK to a higher value.
You can probably also set maxK to a higher value if you use some kind
of time constraints.
seqefsub(my_seqe, pMinSupport=0.2, maxK=3)
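
As a rough sketch of what such a time constraint could look like, the lines below use the actcal example data shipped with TraMineR rather than your data. The argument spellings (pMinSupport, maxK, maxGap, windowSize) follow the ones used in this mail; other TraMineR releases may spell them differently, so please check ?seqefsub and ?seqeconstraint. The constraint values 2 and 6 are purely illustrative assumptions.

```r
if (requireNamespace("TraMineR", quietly = TRUE)) {
  library(TraMineR)
  data(actcal)                        # example data shipped with TraMineR
  ex_seq  <- seqdef(actcal, 13:24)    # monthly activity states
  ex_seqe <- seqecreate(ex_seq)       # default: one event per transition

  # Illustrative values only: at most 2 time units between two consecutive
  # events, and the whole subsequence must fit in a window of 6 time units.
  tc <- seqeconstraint(maxGap = 2, windowSize = 6)

  # With the constraint in place, a somewhat larger maxK may stay tractable.
  fsub <- seqefsub(ex_seqe, pMinSupport = 0.2, maxK = 4, constraint = tc)
  print(fsub)
}
```
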
In your case, I suggest you consider using tevent="state". It will
generate one event per "state" rather than one per transition between
states (as the default procedure does).
my_seqe <- seqecreate(my_seq, tevent="state")
my_seqe
seqefsub(my_seqe, pMinSupport=0.2, maxK=3)
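
To give a feeling for why the event coding and maxK matter so much, here is a back-of-the-envelope calculation in base R. It is only a crude upper bound (the number of ordered lists of k events), not what seqefsub actually enumerates: it ignores simultaneous events, time constraints and the pruning done by the minimum support. The event counts are assumptions derived from your 12-state alphabet: one start event per state plus one event per ordered pair of distinct states for the default coding, and one event per state for tevent="state".

```r
n_states <- 12   # alphabet size reported by seqdef()

# Assumption: default coding = 12 start events + 12 * 11 transition events
n_events_transition <- n_states + n_states * (n_states - 1)
# Assumption: tevent="state" = one event per state
n_events_state <- n_states

# Crude upper bound on ordered lists of k events for each coding
k <- 1:5
upper_bound <- data.frame(
  k          = k,
  transition = n_events_transition^k,
  state      = n_events_state^k
)
print(upper_bound)
```

Under these assumptions the default coding already allows close to three million ordered triples of events at k = 3, against 1728 with tevent="state".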
Many thanks again for your bug report.
Kind regards,
Matthias Studer
On 26.11.2012 12:00, Bertolt Meyer wrote:
> Dear TraMineR developers,
>
> I may have come across a bug in TraMineR: R (Version 2.14.2) crashes if I try to display my event sequence object. I tried it on different machines and the problem is reproducible. An executable code snippet reproducing the problem is at the bottom of the mail.
>
> I have a data set containing 45 sequences, min/max sequence length: 176/415. These are 45 group discussions, for which the discussants' speech acts have been coded as the states. Creating the sequence works fine:
>
>> my_seq <- seqdef(my_seq_data_wide, 2:416, states = my_seq_data.alphabet, labels = my_seq_data.labels)
> [>] found missing values ('NA') in sequence data
> [>] preparing 45 sequences
> [>] coding void elements with '%' and missing values with '*'
> [>] alphabet (state labels):
> 1 = C (Content_Neither...nor)
> 2 = CP (Content_Proposal)
> 3 = CQ (Content_Question)
> 4 = R (Regulation_Neither...nor)
> 5 = RP (Regulation_Proposal)
> 6 = RQ (Regulation_Question)
> 7 = SEn (SENegative_Neither...nor)
> 8 = SEnP (SENegative_Proposal)
> 9 = SEnQ (SENegative_Question)
> 10 = SEp (SEPositive_Neither...nor)
> 11 = SEpP (SEPositive_Proposal)
> 12 = SEpQ (SEPositive_Question)
> [>] 45 sequences in the data set
> [>] min/max sequence length: 176/415
>
> Calculating transition rates with seqtrate() works well. However, executing the second line of the following two lines of code crashes R:
>
> my_seqe <- seqecreate(my_seq)
> my_seqe
>
> I assume that there is something in the data that TraMineR cannot deal with, because it works if I limit the length of the sequences to 29, i.e.,
>
> my_seq_short <- seqdef(my_seq_data_wide, 2:30, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>
> my_seqe <- seqecreate(my_seq_short)
> my_seqe
>
> Could this be a bug? Please find the complete code for reproducing this problem below.
>
> Best greetings,
> Bertolt
>
> # Complete code for reproducing the problem:
> # Load data set (takes a while, ~2 MB):
> my_seq_data_long <- read.csv(file = "http://dl.dropbox.com/u/5384027/ex_data_bertolt.csv")
> str(my_seq_data_long)
>
> # Convert to wide format
> my_seq_data_wide <- reshape(my_seq_data_long[,c("Group_number", "Number", "main_min_category")],
> idvar = "Group_number", timevar = "Number", direction = "wide")
>
> names(my_seq_data_wide) # The sequence events span variables 2:416
>
> # Create labels for sequence data
> my_seq_data.labels <- levels(factor(my_seq_data_long$main_min_category))
> my_seq_data.labels
>
> # Create alphabet
> my_seq_data.alphabet <- c("C", "CP", "CQ", "R", "RP", "RQ",
> "SEn", "SEnP", "SEnQ", "SEp", "SEpP", "SEpQ")
> my_seq_data.alphabet
>
> # Create state sequence object
> library(TraMineR)
> my_seq <- seqdef(my_seq_data_wide, 2:416, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>
> # Calculate transition rates
> round(seqtrate(my_seq), 2)
>
> # Creating an event sequence object works if I use short sequences:
> my_seq_short <- seqdef(my_seq_data_wide, 2:30, states = my_seq_data.alphabet, labels = my_seq_data.labels)
> my_seqe <- seqecreate(my_seq_short)
> my_seqe
>
> # However, if I use the full data set, R crashes when calling the second line of
> # the following two lines:
> my_seqe <- seqecreate(my_seq)
> my_seqe
>
>
> --
> Dr. Bertolt Meyer, Dipl.-Psych.
> Lehrstuhlvertretung
> Institut für Psychologie
> Wirtschafts-, Organisations- und Sozialpsychologie
>
> TU Chemnitz
> Wilhelm-Raabe-Straße 43
> 09107 Chemnitz
>
> Tel: +49 (0)371 531-32972
> Fax: +49 (0)371 531-839627
>
> mail: bertolt.meyer at psychologie.tu-chemnitz.de
> web: http://www.tu-chemnitz.de/hsw/psychologie/professuren/sozpsy/meyer.php
>
>
--
Matthias Studer
Institut d'études démographiques et du parcours de vie
et Département des sciences économiques
Uni-Mail, bureau 5205
40, bd du Pont d'Arve
1211 Genève 4
Tel: +41 22 379 82 15
Fax: +41 22 379 82 99