[Traminer-users] clustering event sequence data
Weldon, Mat
m.weldon at lancaster.ac.uk
Mon Jul 9 15:57:19 CEST 2012
Hello,
I'm working on a project with a set of criminal histories (i.e. lists of age-stamped offences). Here is an example for one subject, where sid is the subject ID, oftype is the type of crime, and age is the age at the time of the offence:
        sid        oftype age
5556.1 5556           B&E  18
5556.2 5556 motor vehicle  18
5556.3 5556 motor vehicle  18
5556.4 5556           B&E  22
5556.5 5556       alcohol  24
5556.6 5556 miscellaneous  29
Since these are events, I'm using the event sequence methods in TraMineR to analyse them. I've created a seqe object and run a frequent subsequence analysis. Here are the top 10 subsequences:
           Subsequence   Support Count
1            (assault) 0.6261261   417
2  (child molestation) 0.6246246   416
3               (rape) 0.5000000   333
4              (theft) 0.4429429   295
5                (B&E) 0.4159159   277
6      (noncontact SO) 0.3963964   264
7       (public order) 0.3858859   257
8            (alcohol) 0.3183183   212
9  (assault)-(assault) 0.3018018   201
10    (assault)-(rape) 0.2882883   192

Computed on 666 event sequences
   Constraint Value
  countMethod  COBJ
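
In the output above, Support is Count divided by the 666 sequences (for example, 417/666 ≈ 0.626). The following is a minimal sketch in Python, not TraMineR code, of that per-sequence counting: each sequence contributes at most once to a subsequence's count, however many times the pattern occurs in it. The function names and the toy data are hypothetical (only the first toy sequence is taken from subject 5556 above), and the sketch ignores time constraints and simultaneous events.

```python
def contains(sequence, pattern):
    """Return True if the events of `pattern` occur in `sequence`
    in the same order, with gaps allowed between them."""
    remaining = iter(sequence)
    # `event in remaining` consumes the iterator up to the first match,
    # so each later event is searched for only after the previous one.
    return all(event in remaining for event in pattern)


def support(sequences, pattern):
    """Count the sequences containing `pattern` at least once and
    return (count, count / number of sequences)."""
    count = sum(contains(seq, pattern) for seq in sequences)
    return count, count / len(sequences)


# Toy data: the first sequence is subject 5556; the rest are invented.
toy = [
    ["B&E", "motor vehicle", "motor vehicle", "B&E", "alcohol", "miscellaneous"],
    ["assault", "theft", "assault"],
    ["assault", "rape"],
    ["theft"],
]

count, share = support(toy, ("assault", "assault"))
print(count, share)
```

Subject 5556 counts only once towards (motor vehicle) here, even though the offence appears twice in that history.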
I'd like to cluster the sequences, using either the agnes or the pam algorithm, and then run a discriminating subsequence analysis on the clusters (as demonstrated by Studer et al. 2010). However, I'm a bit stuck, and I haven't been able to find any help in the documentation. I have a few questions:
1. Is there a function for computing dissimilarity measures, like seqdist, that works with event sequences? Something that I can feed into a clustering algorithm? I don't know how Studer et al. did it because no code was provided.
2. Is there a way to constrain frequent subsequences to be maximal, in the sense that if "(assault)-(assault)" is frequent then "(assault)" will not be listed, for example?
3. Is there a way to calculate association rules for sequences using a hazard ratio measure similar to that described in Muller et al. (2010)?
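
To make the notion of "maximal" in question 2 concrete, here is a minimal sketch in Python rather than R. It is an illustration of the filtering idea only, not an existing TraMineR feature. It assumes that each subsequence is a tuple of single events and that a frequent subsequence is dropped whenever it is contained, in order, in another frequent subsequence. The function names are hypothetical; the example patterns are taken from the table above.

```python
def is_subsequence(short, long):
    """Return True if the events of `short` appear in `long` in order."""
    remaining = iter(long)
    return all(event in remaining for event in short)


def maximal_only(frequent):
    """Keep only the patterns that are not contained in any other
    pattern from the same list of frequent subsequences."""
    return [
        p for p in frequent
        if not any(p != q and is_subsequence(p, q) for q in frequent)
    ]


frequent = [
    ("assault",),
    ("rape",),
    ("theft",),
    ("assault", "assault"),
    ("assault", "rape"),
]

print(maximal_only(frequent))
# (assault) and (rape) are dropped because (assault)-(assault) and
# (assault)-(rape) contain them; (theft) is kept.
```

This compares every pair of patterns, which is fine for a few hundred frequent subsequences but would need a smarter approach for very long lists.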
Many thanks in advance. Best wishes,
Mat
Mat Weldon
Department of Mathematics and Statistics
Room B18, Fylde College
Lancaster University
Lancaster, LA1 4YF
Tel: 07929 310475
Email: m.weldon at lancaster.ac.uk