[datatable-help] "Chris crash" - followup questions

DM tb2usd at gmail.com
Fri Jan 13 18:04:24 CET 2012


I've managed to get the code to proceed correctly by replacing the "myDT[,
newCol := oldCol]" with:

    tmpCol = myDT$oldCol
    myDT$newCol = tmpCol

This has avoided the issue.  Forgive me if I am using datatable incorrectly
or abusing it, but this seems to do the trick for now, and I had several
other questions that arose from all of this inspection.

1. As instructed in an earlier post, to a different user, I tried
"gcinfo(TRUE)" and "options(datatable.verbose=TRUE)".  The former didn't
give any information that could be helpful, but the latter was quite
interesting.  I noticed that the following messages occurred frequently:

  - setkey changed the type of column 'i' from numeric to integer, no
fractional data present.
  - First column i failed radixorder1, reverting to regularorder1
  - setkey incurred a copy of the whole table, due to the coercion(s) above.
  - Non-first column 2 failed radixorder1, reverting to regularorder1

1A: Would I benefit from changing the types to integers pre-emptively, so
that setkey doesn't have to do these coercions? (See Q 2 - how do I do
that?)
1B: Why is the whole table copied if one column is coerced?  That may get
to be problematic for larger tables, or multiple copies (due to multiple
keys that are not yet coerced to integers).
1C: What can I make of the 'failed radixorder1' messages?

2: My data table objects are created from several different sources, and
several have matching columns.  However, the types are different in the
different objects - some are numeric, some integer.  Integer is a perfectly
fine universal type for these particular columns.  However, it seems that
data.table only makes this coercion when "setkey()" is executed, rather
than at the creation of the datatables.  How can I make this coercion?
Solely via DT[, selCol := as.integer(selCol)] ?  This would seem to speed
up all of that coercion & copying.
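
To make question 2 concrete, here is a minimal sketch of the up-front coercion being asked about. The table `DT` and its columns `selCol` and `val` are hypothetical, and the sketch assumes a reasonably recent data.table release in which assigning a whole column with `:=` can change that column's type; it is an illustration of the idea, not a statement of how setkey behaves internally.

```r
library(data.table)

# Hypothetical table: selCol holds whole numbers but is stored as double.
DT <- data.table(selCol = c(3, 1, 2), val = c("c", "a", "b"))

# Only coerce when no fractional data would be lost.
is_whole <- all(DT$selCol == round(DT$selCol), na.rm = TRUE)

if (is_whole) {
  # Replace the whole column by reference with its integer version.
  DT[, selCol := as.integer(selCol)]
}

# The key column is already integer by the time setkey() sees it.
setkey(DT, selCol)
```

Running this with `options(datatable.verbose=TRUE)` set should show whether the coercion and copy messages still appear for the table in question.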

3: In the datatable vignette, p. 12 (at least in my version), there is a
statement that NA is type logical in R.  I don't know if this is causing
issues, but that's only part of the picture.  A bare NA is logical, but
one can have an NA in a numeric - e.g. `x <- c(pi, NA); str(x)`.
 3A: I have numeric NAs in my data tables - could this be related to issues
observed (i.e. type complaints and segfaults)?
 3B: Some columns are entirely (numeric) NA in some of the data tables.  Is
this setting me up for heartache?  :)  These are combined with other
datatables via rbind and merge.

Thanks!