[Traminer-users] clustering event sequence data
Hugo Varet
varethugo at gmail.com
Mon Jul 9 19:33:07 CEST 2012
Hello Mat,
A few months ago, I wanted to cluster event sequences, and Matthias Studer
told me how to do this with the TraMineRextras package. I think you can find
his message in the archives of the mailing list (he sent it on
March 16th, 2012).
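For reference, here is a minimal sketch of one way to cluster event
sequences using only core TraMineR functions. It is not the method from
Matthias's message: as a stand-in dissimilarity it uses a Jaccard distance
on the presence/absence of frequent subsequences. The toy data, the 0.3
support threshold and k = 2 are all assumptions chosen for illustration.

```r
library(TraMineR)  # seqecreate, seqefsub, seqeapplysub, seqecmpgroup
library(cluster)   # pam

# Hypothetical toy data in the same long format as the example
# (one row per offence: subject ID, offence type, age at offence).
crimes <- data.frame(
  sid    = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6),
  oftype = c("B&E", "motor vehicle", "alcohol",
             "assault", "rape",
             "B&E", "theft", "alcohol",
             "assault", "assault",
             "theft", "B&E",
             "assault", "rape"),
  age    = c(18, 19, 24, 20, 25, 19, 21, 30, 22, 27, 17, 18, 23, 31)
)

# Build the event sequence object (rows are already sorted by sid and age).
eseq <- seqecreate(id = crimes$sid, timestamp = crimes$age,
                   event = crimes$oftype)

# Frequent subsequences; the 30% minimum support is an arbitrary choice.
fsub <- seqefsub(eseq, pmin.support = 0.3)

# 0/1 matrix: one row per sequence, one column per frequent subsequence.
pres <- seqeapplysub(fsub, method = "presence")

# Assumed dissimilarity: Jaccard distance on subsequence presence.
d <- dist(pres, method = "binary")

# Partition around medoids; k = 2 is an assumption, not a recommendation.
cl <- pam(d, k = 2)
groups <- factor(cl$clustering)

# Subsequences that best discriminate between the clusters.
discr <- seqecmpgroup(fsub, group = groups)
print(discr)
```

With real data, the number of clusters and the support threshold would need
to be chosen and checked carefully; agnes() from the same cluster package
accepts the same dist object if a hierarchical solution is preferred.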
To extract association rules from event sequences and to get the
corresponding hazard ratios, you have to use the seqerulesdisc function
available in the same package.
Hope this helps, best regards,
Hugo
2012/7/9 Weldon, Mat <m.weldon at lancaster.ac.uk>
> Hello,
>
> I’m doing a project with a set of criminal histories (i.e. lists of
> age-stamped offences). Here is an example: oftype is the type of crime, and
> sid is the subject ID:
>
>          sid   oftype         age
> 5556.1   5556  B&E            18
> 5556.2   5556  motor vehicle  18
> 5556.3   5556  motor vehicle  18
> 5556.4   5556  B&E            22
> 5556.5   5556  alcohol        24
> 5556.6   5556  miscellaneous  29
>
> Since these are events, I’m using the event methods in TraMineR to analyse
> them. I’ve created a seqe object, and run a frequent sub-sequence analysis.
> Here is the top 10:
>
>     Subsequence           Support    Count
> 1   (assault)             0.6261261  417
> 2   (child molestation)   0.6246246  416
> 3   (rape)                0.5000000  333
> 4   (theft)               0.4429429  295
> 5   (B&E)                 0.4159159  277
> 6   (noncontact SO)       0.3963964  264
> 7   (public order)        0.3858859  257
> 8   (alcohol)             0.3183183  212
> 9   (assault)-(assault)   0.3018018  201
> 10  (assault)-(rape)      0.2882883  192
>
> Computed on 666 event sequences
>
>     Constraint   Value
>     countMethod  COBJ
>
> I’d like to compute clusters of sequences, either using agnes or pam
> algorithms, and then run a discriminating sequence analysis on the clusters
> (as demonstrated by Studer et al. 2010). However, I’m a bit stuck and I
> haven’t been able to find any help in the documentation. I have a few
> questions:
>
> 1. Is there a function for computing dissimilarity measures,
>    like seqdist, that works with event sequences? Something that I can feed
>    into a clustering algorithm? I don’t know how Studer et al. did it because
>    no code was provided.
> 2. Is there a way to constrain frequent subsequences to be
>    maximal, in the sense that if “(assault)-(assault)” is frequent then
>    “(assault)” will not be listed, for example?
> 3. Is there a way to calculate association rules for sequences
>    using a hazard ratio measure similar to that described in Muller et al.
>    (2010)?
>
> Many thanks in advance. Best wishes,
>
> Mat
>
> Mat Weldon
> Department of Mathematics and Statistics
> Room B18, Fylde College
> Lancaster University
> Lancaster, LA1 4YF
> Tel: 07929 310475
> Email: m.weldon at lancaster.ac.uk