[Traminer-users] transition rates as substitution costs with HAM metric in seqdistmc

Florian Hertel fhertel at bigsss.uni-bremen.de
Sat Feb 5 14:46:12 CET 2011


  Hi all,

first of all: thanks for TraMineR!

I encountered a problem running a multichannel sequence analysis using the 
HAM metric with substitution costs based on transition rates. I started 
by creating a substitution cost matrix (scm) for each channel based on 
the transition rates, not allowing the rates to vary over time. The 
seqdistmc command (using the HAM metric), fed with my scm, reported that 
the dimensions of the substitution matrix are not equal to the number of 
episodes in my sequences. I therefore created an array replicating the 
substitution cost matrix for each point in time. From the report, I 
gather that seqdistmc then created a "time-varying substitution cost 
matrix using 1 as a constant value", thereby discarding my substitution 
cost matrix. To achieve my goal I instead used the DHD metric with my 
replicated substitution cost matrix, which seemed to work fine.

Here is a small example:

dat25_1 <- matrix(c("HighCl", "LowCl", "LowCl",
                   "LowCl", "HighCl", "LowCl",
                   "HighCl", "LowCl", "HighCl"),
                   3,3,dimnames=list(c("pers1","pers2","pers3"),
                   c("class1","class2","class3")))
dat25_2 <- matrix(c("BMW", "VW", "VW",
                   "BMW", "BMW", "VW",
                   "BMW", "VW", "VW"),
                   3,3,dimnames=list(c("pers1","pers2","pers3"),
                   c("car1","car2","car3")))

seqdat1 <- seqdef(dat25_1)
seqdat2 <- seqdef(dat25_2)

subm1 <- seqsubm(seqdat1,method="TRATE",time.varying=FALSE)
subm2 <- seqsubm(seqdat2,method="TRATE",time.varying=FALSE)

subm1
subm2


# The first trial, which stops with an error message
seqdistmc(channels=list(seqdat1,seqdat2),method="HAM", sm=list(subm1,subm2))

subm13 <- array(subm1,c(2,2,3),dimnames(subm1))
subm23 <- array(subm2,c(2,2,3),dimnames(subm2))

# The second trial: my sm list is ignored and a constant value of 1
# is used for all substitutions.
seqdistmc(channels=list(seqdat1,seqdat2),method="HAM", 
sm=list(subm13,subm23))

#The third trial, working fine (hopefully)
seqdistmc(channels=list(seqdat1,seqdat2),method="DHD", 
sm=list(subm13,subm23))

As far as I understand, the first trial did not work because my 
substitution cost matrices (scm) are only 2-dimensional. But the second 
trial, which should be "OM without indels", did not use my scm either: 
seqdistmc fell back to the default mode with a constant substitution 
cost of 1. Why did seqdistmc ignore my matrices? Is it because the 
Hamming distance is simply the number of positions at which the 
corresponding states differ? If so, why is it possible to supply one's 
own scm at all?
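
To make the distinction concrete, here is a minimal base R sketch of what 
"OM without indels" would compute for two sequences: the sum of the 
substitution costs at each position, compared with the plain count of 
differing positions. The helper weighted_hamming and the cost value 1.5 
are made up for illustration; they are not part of TraMineR and not the 
output of seqsubm. The two sequences are pers1 and pers2 from dat25_1.

```r
# Hypothetical helper (not a TraMineR function): position-wise distance
# between two equal-length sequences, summing the substitution cost of
# the pair of states found at each position.
weighted_hamming <- function(x, y, sm) {
  stopifnot(length(x) == length(y))
  # A two-column character matrix indexes sm by its row/column names,
  # giving sm[x[t], y[t]] for every position t.
  sum(sm[cbind(x, y)])
}

# Illustrative cost matrix; 1.5 is an assumed value, not a transition rate
states <- c("HighCl", "LowCl")
sm <- matrix(c(0,   1.5,
               1.5, 0),
             nrow = 2, byrow = TRUE,
             dimnames = list(states, states))

# Rows pers1 and pers2 of dat25_1 (the matrix is filled column-wise)
pers1 <- c("HighCl", "LowCl", "HighCl")
pers2 <- c("LowCl", "HighCl", "LowCl")

d_weighted <- weighted_hamming(pers1, pers2, sm)  # uses the custom costs
d_plain    <- sum(pers1 != pers2)                 # constant cost of 1

print(c(weighted = d_weighted, plain = d_plain))
```

With these assumed costs the two sequences differ at all three positions, 
so the weighted version gives 3 * 1.5 = 4.5 and the plain count gives 3. 
The second trial appears to return the latter kind of result.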

Many thanks in advance and apologies for any inconvenience caused by the 
long message.

Best,
Florian

-- 

Florian Hertel

--
Institute of Sociology
Social Science Faculty
University of Bremen

Bremen International Graduate School of Social Sciences (BIGSSS)

--
Office:
BIGSSS / University of Bremen
Wiener Straße
28359 Bremen
Germany

phone: ++49.(0)421.218-66418
mail : fhertel at bigsss.uni-bremen.de
web  : www.bigsss-bremen.de/index.php?id=fhertel


