<div dir="ltr"><div><div><div>Hi<br>I see this problem too. I was not using data.table before 1.9, so I did not realize it ever behaved differently. In the examples I've tried, any calculation that I expect to return a factor instead returns an integer vector holding the factor's internal integer codes. <br>
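<br>To make the distinction concrete, here is a small base R sketch of the difference between a factor's labels and its internal integer codes. It uses the factor from the quoted example; the names codes and labs are only illustrative, and nothing here depends on data.table. <br>

```r
# Factor from the quoted example: the levels are the strings "2006" ... "2012"
X <- factor(2006:2012)

# Internal integer codes: positions of each value in levels(X)
codes <- as.integer(X)

# Labels: the level strings themselves
labs <- as.character(X)

# paste() on the labels gives the 1.8.10-style result ("2006 - 2", ...)
paste(labs, 2, sep = " - ")

# paste() on the codes gives the 1.9.2-style result ("1 - 2", ...)
paste(codes, 2, sep = " - ")
```

If a column comes back looking like the second result, the factor attributes were probably dropped somewhere and only the codes survived. <br>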
<br>When I noticed this, I thought maybe I needed more explicit casting to make the result come out as a factor. Here's my function to lag a factor; it beats the point into the ground. <br><br>lagFactor <- function(x, N){<br>
 xold <- x<br> if (is.factor(x)) {<br> xlev <- levels(x)<br> xnum <- as.numeric(x)<br> } else {<br> xlev <- unique(x)<br> xnum <- match(x, xlev)<br> }<br> xlag <- c(rep(NA, N), xnum[-(length(xnum):(length(xnum)-N+1))])<br>
 xlagf <- factor(xlev[xlag], levels = xlev)<br> xlagf<br>}<br><br></div><div>dat is a data.table with many rows; I can give you a copy if you want. <br><br></div><div>Now I'll show you that the result differs inside and outside a data.table.<br>
</div><div><br>> xx <- lagFactor(dat$east2b, 1)<br></div></div>> table(xx)<br>xx<br> Yes No <br>130232 151885 <br>> levels(xx)<br>[1] "Yes" "No" <br>> dat[ , xx := lagFactor(east2b, 1), by = c("sippid"), roll = TRUE]<br>
> table(dat$xx)<br><br> 1 2 <br>114963 130095 <br>> levels(dat$xx)<br>NULL<br>> table(xx, dat$xx)<br> <br>xx 1 2<br> Yes 114963 0<br> No 0 130095<br><br><br></div><div>For my case, the only fix is an explicit re-factoring. <br>
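<br>A minimal base R sketch of what such an explicit re-factoring could look like. The vectors east and codes are hypothetical stand-ins for dat$east2b and for the integer column left behind by the grouped := call; they are not taken from the session above. <br>

```r
# Hypothetical stand-in for dat$east2b: a factor with levels "Yes" and "No"
east <- factor(c("Yes", "No", "No", "Yes"), levels = c("Yes", "No"))

# Hypothetical stand-in for the integer column produced inside the data.table
# (the leading NA mimics the first lagged value)
codes <- c(NA, 1L, 2L, 2L)

# Index the original levels by the codes, then rebuild the factor
# with the same level order as the source column
restored <- factor(levels(east)[codes], levels = levels(east))

levels(restored)
table(restored)
```

The key assumption is that the integer codes still refer to the level order of the original column, which is what the cross-tabulation above suggests. <br>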
</div><div><br></div><div> pj<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Mar 28, 2014 at 5:29 AM, DERVIEUX Christophe <span dir="ltr"><<a href="mailto:christophe.dervieux@rte-france.com" target="_blank">christophe.dervieux@rte-france.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div link="blue" vlink="purple" lang="FR">
<div>
<p class="MsoNormal"><span lang="EN-US">Hi, <br>
<br>
I recently updated the data.table package from 1.8.10 to 1.9.2 and found errors in my existing code.
<br>
<br>
See the reproducible example below: <br>
<br>
On 1.8.10:<br>
DT<-data.table(X=factor(2006:2012),Y=rep(1:7,2))<br>
DT[,Z:=paste(X,.N,sep=" - "),by=list(X)][]<br>
<br>
X Y Z<br>
1: 2006 1 2006 - 2<br>
2: 2007 2 2007 - 2<br>
3: 2008 3 2008 - 2<br>
4: 2009 4 2009 - 2<br>
5: 2010 5 2010 - 2<br>
6: 2011 6 2011 - 2<br>
7: 2012 7 2012 - 2<br>
8: 2006 1 2006 - 2<br>
9: 2007 2 2007 - 2<br>
10: 2008 3 2008 - 2<br>
11: 2009 4 2009 - 2<br>
12: 2010 5 2010 - 2<br>
13: 2011 6 2011 - 2<br>
14: 2012 7 2012 - 2<br>
<br>
In column Z, I get the level of the factor column X <br>
pasted with the count '.N', as expected.<br>
<br>
However, in 1.9.2, with the same code:<br>
DT<-data.table(X=factor(2006:2012),Y=rep(1:7,2))<br>
DT[,Z:=paste(X,.N,sep=" - "),by=list(X)][]<br>
<br>
X Y Z<br>
1: 2006 1 1 - 2<br>
2: 2007 2 2 - 2<br>
3: 2008 3 3 - 2<br>
4: 2009 4 4 - 2<br>
5: 2010 5 5 - 2<br>
6: 2011 6 6 - 2<br>
7: 2012 7 7 - 2<br>
8: 2006 1 1 - 2<br>
9: 2007 2 2 - 2<br>
10: 2008 3 3 - 2<br>
11: 2009 4 4 - 2<br>
12: 2010 5 5 - 2<br>
13: 2011 6 6 - 2<br>
14: 2012 7 7 - 2<br>
<br>
As a result, I do not get the levels of factor column X but the integer codes associated with those levels.
<br>
<br>
Is this the intended behavior? Why has it changed? Is it a bug? <br>
<br>
I use this kind of procedure to make labels for ggplot. None of my previous code works anymore, which is rather annoying.<br>
<br>
</span>Thanks<span class="HOEnZb"><font color="#888888"><br>
<br>
Christophe<u></u><u></u></font></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
</blockquote></div><br><br clear="all">
<br>-- <br>Paul E. Johnson<br>Professor, Political Science Assoc. Director<br>1541 Lilac Lane, Room 504 Center for Research Methods<br>University of Kansas University of Kansas<br><a href="http://pj.freefaculty.org" target="_blank">http://pj.freefaculty.org</a> <a href="http://quant.ku.edu" target="_blank">http://quant.ku.edu</a>
</div></div>