[Traminer-users] Flagging for the most discriminating subsequences on another data set in Traminer

bert.carremans at bnpparibasfortis.com
Wed Sep 14 14:32:08 CEST 2016


I am working with TraMineR to find subsequences that discriminate between two groups. My original data set CJ contains events for about 110K people.
To limit the analysis time, I took a sample of 10K people named CJ_SAMPLE.
On this sample I searched for the most frequent subsequences with the following code:
fsubseq_sample <- seqefsub(cj_sample.seq, pMinSupport = 0.05, maxK= 4)
This list of subsequences is then analyzed to find the most discriminating subsequences:
discr = seqecmpgroup(fsubseq_sample, group=cj_tbl$groupID, method="bonferroni")
Now I want to create dummy variables on my original data set CJ to flag whether these discriminating subsequences occurred for each person in CJ. So I created the list of sequences with
cj.seq <- seqecreate(id=cj$person_id, time=cj$date_in, event=cj$event)
I was able to do this on CJ_SAMPLE with
t = seqeapplysub(discr[1:51], method = 'presence')
But how can I do this on the original data set CJ, using the subsequences found on the sample? Thanks!

Best regards,

Bert

