# [Traminer-users] transition rates as substitution costs with HAM metric in seqdistmc

Florian Hertel fhertel at bigsss.uni-bremen.de
Sat Feb 5 14:46:12 CET 2011

```
Hi all,

first of all: thanks for TraMineR!

I encountered a problem running a multiple sequence analysis using the
HAM metric with substitution costs based on transition rates. I started
by creating a substitution cost matrix (scm) for each channel based on
the transition rates, not allowing the rates to vary over time. The
seqdistmc command (using the HAM metric), fed with my scm, reported that
the dimensions of the substitution matrix are not equal to the number of
episodes in my sequences. Thus, I created an array replicating the
substitution cost matrix for each possible point in time. From the
report, I read that seqdistmc created a "time-varying substitution cost
matrix using 1 as a constant value", thereby abandoning my
substitution cost matrix. To achieve my goal, I simply used the DHD
metric with my replicated substitution cost matrix, which seemed to
work fine.

Here is a small example:

dat25_1 <- matrix(c("HighCl", "LowCl", "LowCl",
"LowCl", "HighCl", "LowCl",
"HighCl", "LowCl", "HighCl"),
3,3,dimnames=list(c("pers1","pers2","pers3"),
c("class1","class2","class3")))
dat25_2 <- matrix(c("BMW", "VW", "VW",
"BMW", "BMW", "VW",
"BMW", "VW", "VW"),
3,3,dimnames=list(c("pers1","pers2","pers3"),
c("car1","car2","car3")))

seqdat1 <- seqdef(dat25_1)
seqdat2 <- seqdef(dat25_2)

subm1 <- seqsubm(seqdat1,method="TRATE",time.varying=FALSE)
subm2 <- seqsubm(seqdat2,method="TRATE",time.varying=FALSE)

subm1
subm2

# The first trial, with error message
seqdistmc(channels=list(seqdat1,seqdat2),method="HAM", sm=list(subm1,subm2))

subm13 <- array(subm1,c(2,2,3),dimnames(subm1))
subm23 <- array(subm2,c(2,2,3),dimnames(subm2))

# The second trial, ignoring my sm list but using a constant value of 1
# for all substitutions.
seqdistmc(channels=list(seqdat1,seqdat2),method="HAM",
sm=list(subm13,subm23))

#The third trial, working fine (hopefully)
seqdistmc(channels=list(seqdat1,seqdat2),method="DHD",
sm=list(subm13,subm23))

From what I understood, the first trial did not work because my
substitution cost matrices (scm) are only 2-dimensional. But the second
trial, which should be "OM without indels", did not use my scm either;
seqdistmc chose the default mode instead, with a constant
substitution cost of 1. Why did the seqdist program ignore my matrices?
Is it because the Hamming distance is the number of positions at which
the corresponding states differ? If so, why is it then possible to
supply one's own scm?

Many thanks in advance and apologies for any inconvenience caused by the
long message.

Best,
Florian

--

Florian Hertel

--
Institute of Sociology
Social Science Faculty
University of Bremen

Bremen International Graduate School of Social Sciences (BIGSSS)

--
Office:
BIGSSS / University of Bremen
Wiener Straße
28359 Bremen
Germany

phone: ++49.(0)421.218-66418
mail : fhertel at bigsss.uni-bremen.de
web  : www.bigsss-bremen.de/index.php?id=fhertel

```
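
The workaround in the post rests on one equivalence: a DHD-type distance computed from a cost array that repeats the same matrix at every position should equal a Hamming-type distance weighted by that matrix. The base-R sketch below illustrates this for a single channel. It rests on assumptions: `ham_dist` and `dhd_dist` are hypothetical helpers, not TraMineR functions; the cost of 1.4 is a made-up value, not a transition-rate estimate from the example data; and nothing here is meant to reproduce how `seqdistmc` works internally.

```r
# Hypothetical helpers in base R only (no TraMineR calls).

# Hamming-type distance: sum of position-wise substitution costs taken
# from one time-constant states x states matrix.
ham_dist <- function(x, y, sm) {
  stopifnot(length(x) == length(y))
  idx <- cbind(match(x, rownames(sm)), match(y, colnames(sm)))
  sum(sm[idx])
}

# DHD-type distance: the cost at position t is read from slice t of a
# states x states x time array.
dhd_dist <- function(x, y, sm3) {
  stopifnot(length(x) == length(y), dim(sm3)[3] == length(x))
  idx <- cbind(match(x, dimnames(sm3)[[1]]),
               match(y, dimnames(sm3)[[2]]),
               seq_along(x))
  sum(sm3[idx])
}

states <- c("HighCl", "LowCl")

# Made-up symmetric cost matrix (assumption: 1.4 is purely illustrative).
sm <- matrix(c(0, 1.4, 1.4, 0), 2, 2, dimnames = list(states, states))

# The same matrix replicated for each of the 3 positions.
sm3 <- array(sm, c(2, 2, 3), dimnames = list(states, states, NULL))

x <- c("HighCl", "LowCl", "LowCl")
y <- c("LowCl", "LowCl", "HighCl")

# Both distances add up the same per-position costs.
stopifnot(isTRUE(all.equal(ham_dist(x, y, sm), dhd_dist(x, y, sm3))))
```

If the two values agree, as the final check asserts, then replicating a time-constant matrix across positions loses nothing relative to a cost-weighted Hamming distance, which is consistent with the third trial being a reasonable substitute.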