[datatable-help] using set to create a large date vector converted to factor

Zachary O'Keeffe zach.okeeffe at gmail.com
Sat Aug 13 20:07:25 CEST 2016


Hello,

I'm a huge fan of data.table and use it almost exclusively. I've not used
this mailing list before, but I signed up to report an error I encountered
because the error message asked me to.

I have a fairly large data.table (480,000 rows) with over 6,000 unique
dates. I can create a factor version of the date variable using the :=
syntax, but not with set(), which I generally try to use as per the
recommendation in the package documentation. Factoring a date column with
set() does work for smaller data.tables though. See below.

Note that besides as.factor(BigDT[["Date"]]) I also tried
as.factor(BigDT[,Date]). I also tried creating the factor vector outside of
set() first. Creating the vector works, but set() still fails when I pass
the vector to it.
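
For anyone who wants to try the comparison without the original data, here is a minimal sketch. SimDT is a hypothetical stand-in for BigDT (which is not included in this report), with the same row count and roughly the same number of distinct dates. The names SimDT, date_factor and DateFactor2 are invented for illustration. A freshly built table like this may well not trigger the internal error, so treat it as a template for the two approaches rather than a confirmed reproduction.

```r
library(data.table)

# Hypothetical stand-in for BigDT: 480,743 rows drawn from 6,119 distinct dates.
set.seed(1)
n_rows <- 480743L
dates  <- seq(as.Date("2000-01-01"), by = "1 day", length.out = 6119L)
SimDT  <- data.table(x = seq_len(n_rows),
                     Date = sample(dates, n_rows, replace = TRUE))

# Approach 1: build the factor outside set(), then assign it by reference.
date_factor <- as.factor(SimDT[["Date"]])
set(SimDT, NULL, "DateFactor", date_factor)

# Approach 2: assign by reference with := inside [.data.table.
SimDT[, DateFactor2 := as.factor(Date)]

# The two columns are expected to hold the same factor.
same <- identical(SimDT[["DateFactor"]], SimDT[["DateFactor2"]])
print(same)
```

If both approaches succeed on the synthetic table but set() fails on the real one, that would suggest the problem lies in the state of the particular table rather than in its size or the number of factor levels.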

Best,

Zach

> TestDT<-data.table(x=1:10,Date=seq(as.Date("2012-01-01"),as.Date("2012-01-10"),by="1 day"))
> set(TestDT,NULL,"DateFactor",as.factor(TestDT[["Date"]]))
> set(BigDT,NULL,"DateFactor",as.factor(BigDT[["Date"]]))
Error in set(BigDT, NULL, "DateFactor", as.factor(BigDT[["Date"]])) :
  Internal error, please report (including result of sessionInfo()) to datatable-help: oldtncol (0) < oldncol (41) but tl of class is marked.
> BigDT[,DateFactor:=as.factor(Date)]
> nrow(BigDT)
[1] 480743
> nlevels(BigDT[["DateFactor"]])
[1] 6119
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.9.6

loaded via a namespace (and not attached):
[1] tools_3.2.1  chron_2.3-47
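
The error message compares oldtncol (0) with oldncol (41). A possible diagnostic, sketched below, is to compare the number of columns in use with the number of column slots the table has allocated. This rests on two assumptions that the report does not confirm: that oldtncol corresponds to what truelength() reports, and that BigDT lost its over-allocated column slots at some point (for example after being loaded from disk). DT here is a hypothetical small table. With the real data the same calls would be made on BigDT.

```r
library(data.table)

# Hypothetical small table; with the real data, inspect BigDT instead.
DT <- data.table(x = 1:10,
                 Date = seq(as.Date("2012-01-01"), as.Date("2012-01-10"),
                            by = "1 day"))

n_cols  <- length(DT)       # columns currently in use
n_slots <- truelength(DT)   # column slots allocated for adding columns by reference

# A healthy table is expected to have at least as many slots as columns.
print(n_slots >= n_cols)

# setDT() on an existing data.table is meant to restore the over-allocation
# when it has been lost (assumption: that is what happened to BigDT).
setDT(DT)
set(DT, NULL, "DateFactor", as.factor(DT[["Date"]]))
```

If truelength(BigDT) returns 0 while length(BigDT) returns 41, that would match the numbers in the error message, and calling setDT(BigDT) before set() might be worth trying.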