<div class="gmail_quote">Hi together,
<div><br></div><div>here is the problem I needed dput for: <a href="http://www.fileuploadx.de/287440">http://www.fileuploadx.de/287440</a> (sorry, I know that this filehoster is annoying because you have to wait until you can download the file; I hope you have a coffee machine close by ;-)</div>

<div><br></div><div>In this attachment, I basically load in two data.tables DT1 and DT2 that I want to join, i.e. DT2[DT1], according to the keyed columns "Company_Code" and "intDatum" in DT1 and "DSCD" and "intDatum" in DT2. However, while "DSCD" is formatted as a character-column, "Company_Code" is formatted as a factor-column. As you can see from the long structure-object, there are plenty of levels here (the actual data.tables are very small).</div>


<div><br></div><div>Now, when I try to join those with DT2[DT1], I get:</div><div><br></div><div><span style="border-collapse:separate"><span style="text-align:-webkit-left"><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

        DSCD intDatum MONTH MV SICClass
 [1,] 997859      151  <NA> NA       44
 [2,] 997859      152  <NA> NA       44
 [3,] 998064      151  <NA> NA       15
 [4,] 998064      152  <NA> NA       15
 [5,] 142268      151  <NA> NA       53
 [6,] 142268      152  <NA> NA       53
 [7,] 142859      151  <NA> NA       56
 [8,] 142859      152  <NA> NA       56
 [9,] 143415      151  <NA> NA       63
[10,] 143415      152  <NA> NA       63
[11,] 307045      151  <NA> NA       15
[12,] 307045      152  <NA> NA       15</pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<br></pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

Basically, data.table finds no values for MV and MONTH for any DSCD and intDatum combination. However, as DT2[DSCD=="142268"] clearly shows, there are values for that DSCD:</pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<br></pre><pre style="outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;margin-top:0px;margin-bottom:0px">

<span style="border-collapse:separate"><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

       DSCD      MONTH      MV intDatum
[1,] 142268 1997-08-28 1901.12      151
[2,] 142268 1997-09-28 1829.00      152</pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<br></pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

Those, however, only show up in the join after i get rid of all the unused levels (equivalently, I can also transform the Company_Code to a character column):</pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<br></pre><pre style="outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;margin-top:0px;margin-bottom:0px">

<font face="'Lucida Console'"><span style="font-size:12px;line-height:14px;white-space:pre-wrap">DT1[, Company_Code := factor(Company_Code)]
DT2[DT1]</span></font></pre><pre style="outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;margin-top:0px;margin-bottom:0px">

<span style="border-collapse:separate;font-family:Arial;white-space:normal;font-size:medium"><span style="font-family:'Lucida Console';font-size:12px;line-height:14px;white-space:pre-wrap"><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

        DSCD intDatum      MONTH      MV SICClass
 [1,] 997859      151       <NA>      NA       44
 [2,] 997859      152       <NA>      NA       44
 [3,] 998064      151       <NA>      NA       15
 [4,] 998064      152       <NA>      NA       15
 [5,] 142268      151 1997-08-28 1901.12       53
 [6,] 142268      152 1997-09-28 1829.00       53
 [7,] 142859      151       <NA>      NA       56
 [8,] 142859      152       <NA>      NA       56
 [9,] 143415      151       <NA>      NA       63
[10,] 143415      152       <NA>      NA       63
[11,] 307045      151       <NA>      NA       15
[12,] 307045      152       <NA>      NA       15</pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<br></pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

I'm pretty sure this behaviour occurred only with version 1.8.0, probably because data.table coerced every key to factor before (see the NEWS to 1.8.0). So my question is: Is what happens here intended behavior? I'm honest with you: I'm working now for a while with R and factors are one of those things that I never got. I just don't see their use and every so often they cause me huge problems (as in this case). So I'm probably making something stupid here. The nasty thing about this issue here is that mostly, however, the joins just work as expected (believe me, I tried to produce a simple example with one column factor and one character that would reproduce this behavior, but no matter what I did, the joins afterwards always worked as expected). So in case this is expected behavior, should data.table give at least a warning that says something like "You join two data.tables whereby one keyed column is a factor, one is a character. That is probably not your intention. Convert the factor column to character or vice versa."?</pre>


<pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<br></pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

Thanks for your help!</pre><span class="HOEnZb"><font color="#888888"><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

<br></pre><pre style="font-family:'Lucida Console';font-size:9pt!important;outline-style:none;outline-width:initial;outline-color:initial;border-top-style:none;border-right-style:none;border-bottom-style:none;border-left-style:none;border-width:initial;border-color:initial;white-space:pre-wrap!important;margin-top:0px;margin-bottom:0px;line-height:1.2">

Christoph</pre></font></span></pre></span></span></pre></span></pre></span></span></div></div><br>