<div dir="ltr"><div><div><div>Dear TraMineR Experts,<br><br></div>I have been using future-based substitution costs, as described in Studer and Ritschard (2014) and implemented in TraMineR, to compute dissimilarity matrices. Unfortunately, the resulting substitution costs often violate the triangle inequality. Whether this happens seems to depend on which subset of my data I use, but I don't know why the triangle inequality is violated in some cases and not in others.<br></div><div><br>Is there an appropriate way to fix this (e.g. altering one of the substitution costs manually)? Does the violation point to a problem elsewhere in my analysis, or does it say something about my data?<br></div>I have pasted the output from my latest analysis below in case it is useful.<br><br>Thanks,<br><br></div>Jeremy<br><div><div><br><br><span class="Apple-style-span" style="border-collapse:separate;color:rgb(0,0,0);font-family:'Lucida Console';font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:15px;text-align:-webkit-left;text-indent:0px;text-transform:none;white-space:pre-wrap;word-spacing:0px;background-color:rgb(225,226,229)"><pre tabindex="0" class="GEWYW5YBFEB" id="rstudio_console_output" style="font-family:'Lucida Console';font-size:10pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;word-break:break-all;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;line-height:1.2"><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">###########
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue"># Calculate Substitution Costs using future similarity
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">###########
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">######
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">#default lag of 1
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">######
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">future <- seqcost(seq.hc, method="FUTURE", lag=1)
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] creating substitution-cost matrix using common future...
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] computing transition rates for states 1/2/3/4/5 ...
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">dimnames(future) = list( c("M", "S", "F", "O", "U"), c("M", "S", "F", "O", "U"))
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">round(future, 4)
</span> M S F O U
M 0.0000 0.1701 0.4147 0.6304 0.3960
S 0.1701 0.0000 0.2375 0.6125 0.3910
F 0.4147 0.2375 0.0000 0.7700 0.5552
O 0.6304 0.6125 0.7700 0.0000 0.4573
U 0.3960 0.3910 0.5552 0.4573 0.0000<br><br><span class="Apple-style-span" style="border-collapse:separate;color:rgb(0,0,0);font-family:'Lucida Console';font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:15px;text-align:-webkit-left;text-indent:0px;text-transform:none;white-space:pre-wrap;word-spacing:0px;background-color:rgb(225,226,229)"><pre tabindex="0" class="GEWYW5YBFEB" id="rstudio_console_output" style="font-family:'Lucida Console';font-size:10pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;word-break:break-all;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;line-height:1.2"><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">#largest substitution cost (F vs O)
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">maxsub <- max(future)
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">maxsub
</span>[1] 0.7699533
<span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">#smallest substitution cost (M vs S)
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">minsub <- min(future[lower.tri(future)])
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">minsub
</span>[1] 0.1701276</pre></span><br><br><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">######
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">#lag of 2
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">######
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">future2 <- seqcost(seq.hc, method="FUTURE", lag=2)
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] creating substitution-cost matrix using common future...
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] computing transition rates for states 1/2/3/4/5 ...
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">dimnames(future2) = list( c("M", "S", "F", "O", "U"), c("M", "S", "F", "O", "U"))
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">round(future2, 4)
</span> M S F O U
M 0.0000 0.1015 0.2740 0.4063 0.2184
S 0.1015 0.0000 0.1692 0.4116 0.2428
F 0.2740 0.1692 0.0000 0.5508 0.3660
O 0.4063 0.4116 0.5508 0.0000 0.2577
U 0.2184 0.2428 0.3660 0.2577 0.0000<br><br><span class="Apple-style-span" style="border-collapse:separate;color:rgb(0,0,0);font-family:'Lucida Console';font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:15px;text-align:-webkit-left;text-indent:0px;text-transform:none;white-space:pre-wrap;word-spacing:0px;background-color:rgb(225,226,229)"><pre tabindex="0" class="GEWYW5YBFEB" id="rstudio_console_output" style="font-family:'Lucida Console';font-size:10pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;word-break:break-all;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;line-height:1.2"><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">#largest substitution cost (F vs O)
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">maxsub2 <- max(future2)
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">maxsub2
</span>[1] 0.5508451
<span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">#smallest substitution cost (M vs S)
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">minsub2 <- min(future2[lower.tri(future2)])
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">minsub2
</span>[1] 0.1014867</pre></span><br></pre></span><br><span class="Apple-style-span" style="border-collapse:separate;color:rgb(0,0,0);font-family:'Lucida Console';font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:15px;text-align:-webkit-left;text-indent:0px;text-transform:none;white-space:pre-wrap;word-spacing:0px;background-color:rgb(225,226,229)"><pre tabindex="0" class="GEWYW5YBFEB" id="rstudio_console_output" style="font-family:'Lucida Console';font-size:10pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;word-break:break-all;margin-top:0px;margin-right:0px;margin-bottom:0px;margin-left:0px;line-height:1.2"><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">##########################################
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue"># make distance matrices
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">#########################################
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue"><br>> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">disomf1i5 <- seqdistOO(seq.hc, method = "OM", indel = .5*maxsub, sm = future)
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] 10923 sequences with 5 distinct events/states
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] 10923 distinct sequences
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] min/max sequence length: 18/18
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] computing distances using OM metric
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] total time: 59.34 secs
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)">Warning message:
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [!] at least, one substitution cost doesn't respect the triangle inequality.
[!] replacing 1 with 2 (cost=0.1701276) and then 2 with 3 (cost=0.2375203)
[!] costs less than replacing directly 1 with 3 (cost=0.4147297)
[!] total difference ([1=>2] + [2=>3] - [1=>3]): -0.007081773
</span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue"><br></span><span class="GEWYW5YBJEB ace_keyword" style="white-space:pre;color:blue">> </span><span class="GEWYW5YBMDB ace_keyword" style="color:blue">disomf2i5 <- seqdistOO(seq.hc, method = "OM", indel = .5*maxsub2, sm = future2)
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] 10923 sequences with 5 distinct events/states
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] 10923 distinct sequences
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] min/max sequence length: 18/18
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] computing distances using OM metric
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [>] total time: 59.04 secs
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)">Warning message:
</span><span class="GEWYW5YBAEB ace_constant" style="color:rgb(197,6,11)"> [!] at least, one substitution cost doesn't respect the triangle inequality.
[!] replacing 1 with 2 (cost=0.1014867) and then 2 with 3 (cost=0.1691506)
[!] costs less than replacing directly 1 with 3 (cost=0.27397)
[!] total difference ([1=>2] + [2=>3] - [1=>3]): -0.003332678 </span></pre></span><br clear="all"><div><div><div><br>-- <br><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr">********************<br>Dr. Jeremy Reynolds<br>Professor<br>Department of Sociology<br>116 Baldwin Hall<br>University of Georgia<br>Athens, GA 30602-1611<br>Phone: (706) 583-8072<br>Web: <a href="http://uga.edu/soc/people/faculty/reynolds_jeremy.php" target="_blank">http://uga.edu/soc/people/faculty/reynolds_jeremy.php</a><br>Fax: (706) 542-4320</div></div></div></div>
</div></div></div></div></div></div>
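<div>For reference, here is a minimal sketch of how the violation reported in the warnings above can be located directly in a substitution-cost matrix, without running the full distance computation. It uses base R only. The matrix is retyped from the rounded lag-1 output above (so values are accurate to 4 decimals only), and <code>find_triangle_violations</code> is a hypothetical helper name, not a TraMineR function.</div>

```r
# Lag-1 substitution costs, retyped from the round(future, 4) output above.
# Assumption: 4-decimal rounding is precise enough to reproduce the violation.
states <- c("M", "S", "F", "O", "U")
future <- matrix(c(
  0.0000, 0.1701, 0.4147, 0.6304, 0.3960,
  0.1701, 0.0000, 0.2375, 0.6125, 0.3910,
  0.4147, 0.2375, 0.0000, 0.7700, 0.5552,
  0.6304, 0.6125, 0.7700, 0.0000, 0.4573,
  0.3960, 0.3910, 0.5552, 0.4573, 0.0000
), nrow = 5, byrow = TRUE, dimnames = list(states, states))

# Hypothetical helper: list every triplet (from, via, to) for which the
# detour from -> via -> to is cheaper than the direct substitution from -> to.
# Assumes a symmetric matrix, so each unordered pair (from, to) is checked once.
find_triangle_violations <- function(sm, tol = 1e-12) {
  n <- nrow(sm)
  out <- list()
  for (i in seq_len(n - 1)) {
    for (k in (i + 1):n) {
      for (j in seq_len(n)) {
        if (j == i || j == k) next
        # Negative gap means the triangle inequality is violated
        gap <- sm[i, j] + sm[j, k] - sm[i, k]
        if (gap < -tol) {
          out[[length(out) + 1]] <- data.frame(
            from = rownames(sm)[i],
            via  = rownames(sm)[j],
            to   = rownames(sm)[k],
            gap  = gap,
            stringsAsFactors = FALSE
          )
        }
      }
    }
  }
  if (length(out) == 0) {
    return(data.frame(from = character(0), via = character(0),
                      to = character(0), gap = numeric(0),
                      stringsAsFactors = FALSE))
  }
  do.call(rbind, out)
}

violations <- find_triangle_violations(future)
print(violations)
```

<div>With the rounded lag-1 costs, this flags a single triplet, M via S to F, with a gap of about -0.0071. That matches the warning's states 1, 2 and 3 and its reported difference of -0.007081773.</div>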