[Traminer-users] Possible TraMineR bug: Displaying an event sequence object crashes R (reproducible)

Bertolt Meyer bertolt.meyer at psychologie.tu-chemnitz.de
Mon Nov 26 21:06:40 CET 2012


Dear Matthias,

Thank you so much for your swift reply. Indeed, it seems to have been the same issue: it works after updating TraMineR to the latest version and R to 2.15.2. I apologize for not realizing that this was the same bug. Thanks again for your very fast help.
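
For anyone who runs into the same crash, here is a minimal sketch of how one might check the installed versions before and after updating. It uses only base R functions. The install.packages() call is left commented out because it needs network access and a CRAN mirror, and which exact TraMineR version contains the fix is not stated in this thread.

```r
# Show the running R version (this thread reports success with R 2.15.2)
print(R.version.string)

# Show the installed TraMineR version, if the package is present at all
if (requireNamespace("TraMineR", quietly = TRUE)) {
  print(packageVersion("TraMineR"))
} else {
  message("TraMineR is not installed")
}

# Install or refresh TraMineR from CRAN (uncomment to run; needs network access)
# install.packages("TraMineR")
```

After updating, restarting R before re-running the example avoids picking up the old package version that is still loaded in the session.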

Best greetings,
Bertolt


On 26.11.2012 at 12:38, Matthias Studer <Matthias.Studer at unige.ch> wrote:

> Dear Bertolt Meyer,
> 
> Many thanks for your detailed bug report. It is very important for us to know when something is going wrong. 
> 
> Your example does not crash on my machine. I think that the reason is that this bug has already been reported (see https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1368&group_id=743&atid=2975 ). If this is the case, it has been fixed in the latest version of TraMineR. Please try to update TraMineR. To do that, you will need to update R (to version 2.15.2). After that, please run your example again and report back your results.
> 
> Your sequences contain a lot of different events. Because of that, if you plan to search for frequent subsequences, you will have to set the maxK parameter to a low value. Otherwise the computation will take far too much time (i.e. years). The minimum support and the maxK value interact: if you set pMinSupport to a high value, you can set maxK to a higher value. You can probably also set maxK to a higher value if you use time constraints.
> 
> seqefsub(my_seqe, pMinSupport=0.2, maxK=3)
> 
> In your case, I suggest you consider using tevent="state". It will generate one event per "state" and not one per transition between states (as the default procedure does).
> 
> my_seqe <- seqecreate(my_seq, tevent="state")
> my_seqe
> seqefsub(my_seqe, pMinSupport=0.2, maxK=3) 
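> 
> To give a rough idea of why maxK and the choice of tevent matter, the following sketch compares upper bounds on the number of ordered k-event subsequences under the two settings. It is only an illustration under stated assumptions: a 12-state alphabet as in your data, at most one event per time point, and a "transition" coding that can produce one event per starting state plus one per ordered pair of distinct states. The variable names are made up for this example, and seqefsub() prunes candidates by support, so these are bounds and not the counts it actually evaluates.
> 
> ```r
> # Assumption: 12 states, as in the alphabet of the reported data
> n_states <- 12
> 
> # Assumed number of distinct events under each coding
> n_events_transition <- n_states + n_states * (n_states - 1)  # start states + transitions
> n_events_state <- n_states                                   # one event per state
> 
> # Upper bound on ordered subsequences of exactly k events: (number of events)^k
> k <- 1:5
> candidates <- data.frame(
>   k = k,
>   transition = n_events_transition^k,
>   state = n_events_state^k
> )
> print(candidates)
> ```
> 
> Under these assumptions, k = 3 already allows close to three million candidates with transition events, against 1728 with state events.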
> 
> Many thanks again for your bug report.
> Kind regards,
> Matthias Studer
> 
> 
> On 26.11.2012 12:00, Bertolt Meyer wrote:
>> Dear TraMineR developers,
>> 
>> I may have come across a bug in TraMineR: R (version 2.14.2) crashes if I try to display my event sequence object. I tried it on different machines and the problem is reproducible. An executable code snippet reproducing the problem is at the bottom of this mail.
>> 
>> I have a data set containing 45 sequences, min/max sequence length: 176/415. These are 45 group discussions, for which the discussants' speech acts have been coded as the states. Creating the sequence works fine:
>> 
>> 
>>> my_seq <- seqdef(my_seq_data_wide, 2:416, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>>> 
>>  [>] found missing values ('NA') in sequence data
>>  [>] preparing 45 sequences
>>  [>] coding void elements with '%' and missing values with '*'
>>  [>] alphabet (state labels): 
>>      1 = C (Content_Neither...nor)
>>      2 = CP (Content_Proposal)
>>      3 = CQ (Content_Question)
>>      4 = R (Regulation_Neither...nor)
>>      5 = RP (Regulation_Proposal)
>>      6 = RQ (Regulation_Question)
>>      7 = SEn (SENegative_Neither...nor)
>>      8 = SEnP (SENegative_Proposal)
>>      9 = SEnQ (SENegative_Question)
>>      10 = SEp (SEPositive_Neither...nor)
>>      11 = SEpP (SEPositive_Proposal)
>>      12 = SEpQ (SEPositive_Question)
>>  [>] 45 sequences in the data set
>>  [>] min/max sequence length: 176/415
>> 
>> Calculating transition rates with seqtrate() works well. However, executing the second line of the following two lines of code crashes R:
>> 
>> my_seqe <- seqecreate(my_seq)
>> my_seqe
>> 
>> I assume that there is something in the data that TraMineR cannot deal with, because it works if I limit the length of the sequences to 29, i.e., 
>> 
>> my_seq_short <- seqdef(my_seq_data_wide, 2:30, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>> 
>> my_seqe <- seqecreate(my_seq_short)
>> my_seqe
>> 
>> Could this be a bug? Please find the complete code for reproducing this problem below.
>> 
>> Best greetings,
>> Bertolt
>> 
>> # Complete code for reproducing the problem:
>> # Load data set (takes a while, ~2 MB):
>> my_seq_data_long <- read.csv(file = "http://dl.dropbox.com/u/5384027/ex_data_bertolt.csv")
>> str(my_seq_data_long)
>> 
>> # Convert to wide format
>> my_seq_data_wide <- reshape(my_seq_data_long[,c("Group_number", "Number", "main_min_category")], 
>>                       idvar = "Group_number", timevar = "Number", direction = "wide")
>> 
>> names(my_seq_data_wide) # The sequence events span variables 2:416
>> 
>> # Create labels for sequence data
>> my_seq_data.labels <- levels(factor(my_seq_data_long$main_min_category))
>> my_seq_data.labels
>> 
>> # Create alphabet
>> my_seq_data.alphabet <- c("C", "CP", "CQ", "R", "RP", "RQ", 
>>                     "SEn", "SEnP", "SEnQ", "SEp", "SEpP", "SEpQ")
>> my_seq_data.alphabet
>> 
>> # Create state sequence object
>> library(TraMineR)
>> my_seq <- seqdef(my_seq_data_wide, 2:416, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>> 
>> # Calculate transition rates
>> round(seqtrate(my_seq), 2)
>> 
>> # Creating an event sequence object works if I use short sequences:
>> my_seq_short <- seqdef(my_seq_data_wide, 2:30, states = my_seq_data.alphabet, labels = my_seq_data.labels)
>> my_seqe <- seqecreate(my_seq_short)
>> my_seqe
>> 
>> # However, if I use the full data set, R crashes when calling the second line of 
>> # the following two lines:
>> my_seqe <- seqecreate(my_seq)
>> my_seqe
>> 
>> 
>> --
>> Dr. Bertolt Meyer, Dipl.-Psych.
>> Lehrstuhlvertretung 
>> Institut für Psychologie
>> Wirtschafts-, Organisations- und Sozialpsychologie
>> 
>> TU Chemnitz
>> Wilhelm-Raabe-Straße 43
>> 09107 Chemnitz
>> 
>> Tel: +49 (0)371 531-32972
>> Fax: +49 (0)371 531-839627
>> 
>> mail: 
>> bertolt.meyer at psychologie.tu-chemnitz.de
>> 
>> web: 
>> http://www.tu-chemnitz.de/hsw/psychologie/professuren/sozpsy/meyer.php
>> 
>> 
>> 
>> 
> 
> 
> -- 
> Matthias Studer
> Institut d'études démographiques et du parcours de vie
> et Département des sciences économiques
> Uni-Mail, bureau 5205
> 40, bd du Pont d’Arve
> 1211 Genève 4
> Tel: +41 22 379 82 15
> Fax: +41 22 379 82 99
> 


