[datatable-help] Error in coercing matrices within j expressions
Frank Erickson
FErickson at psu.edu
Tue Sep 17 23:22:50 CEST 2013
Well, rbindlist(list()) prints "Null data.table" (though the result doesn't
pass the is.null() test). Maybe someone else has an idea how to deal with the
no-results case. By the way, it's best to use "reply to all" so that your
reply goes to the mailing list, too; they should still be able to see your
message quoted below, though.
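
For the no-results case, a minimal sketch of the zero-length check described in the quoted message below might look like this. The helper name collect_pairs is hypothetical (it is not part of data.table), and the sketch assumes the pairs are gathered as a list of two-element lists named i and j:

```r
library(data.table)

# Hypothetical helper (not a data.table function): bind the collected pairs
# once at the end, falling back to a typed zero-row table when none were found.
collect_pairs <- function(pairs) {
  if (length(pairs) == 0L) {
    # No results: keep the same two integer columns so every group matches.
    return(data.table(i = integer(), j = integer()))
  }
  rbindlist(pairs)
}

empty <- collect_pairs(list())
some  <- collect_pairs(list(list(i = 2L, j = 6L), list(i = 4L, j = 5L)))
```

Returning the typed empty table, rather than the null data.table that rbindlist(list()) gives, keeps the column names and types the same across groups.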
--Frank
On Tue, Sep 17, 2013 at 5:03 PM, Nathaniel Graham <npgraham1 at gmail.com> wrote:
> Frank,
>
> Thanks. This seems to have done the trick, so long as I'm careful to check
> for zero-length lists and return data.table(i = integer(), j = integer())
> in those cases. Essentially, I have to test every combination of i and j to
> see if it's "interesting" or not, and some groups have a lot of rows. At the
> moment I'm attacking some other low-hanging fruit, like speeding up the
> comparisons I have to do.
>
> As a side note, it would be kind of nice if there were a simple way to clue
> data.table in to the fact that there are no rows to return, like returning
> NULL or NA or similar.
>
> -------
> Nathaniel Graham
> npgraham1 at gmail.com
> npgraham1 at uky.edu
>
>
> On Tue, Sep 17, 2013 at 4:22 PM, Frank Erickson <FErickson at psu.edu> wrote:
>
>> Hi,
>>
>> I guess you could put them into a list and then rbind at the end:
>>
>> indi <- list()
>> k <- 1
>> indi[[k]] <- list(i=2L,j=6L); k <- k+1  # first item carries the names
>> indi[[k]] <- list(4L,5L); k <- k+1      # later items can be unnamed
>> rbindlist(indi)
>> #    i j
>> # 1: 2 6
>> # 2: 4 5
>>
>> For some reason, I couldn't get rbindlist to work unless the first item
>> in indi had explicit names ("i" and "j"), but names aren't needed for later
>> items.
>>
>> This should be better than dynamically growing with rbind each time, but
>> there may be a faster way. If your criteria for selecting (i,j) can be
>> written down, there's likely a much faster way than looping like this.
>>
>> Best,
>>
>> --Frank
>>
>>
>>
>> On Tue, Sep 17, 2013 at 2:13 PM, Nathaniel Graham <npgraham1 at gmail.com> wrote:
>>
>>> I'm currently using a (moderately) complex function, call
>>> it f(), as a j expression to analyze my data. The data itself
>>> is about 1.2M rows, which I analyze by group.
>>> A group may have as few as one row or as many as 10K.
>>> The output from the function is a two-column data.table
>>> where the rows are interesting (for my work) pairs of
>>> observations--I have no idea how many pairs will be
>>> interesting until the function runs, but in abstract it could
>>> be every unique combination (so as many as 50M rows
>>> of output for one call to f()). It is common, and not an
>>> error, for groups to have no meaningful pairs to return.
>>>
>>> I've been using the following line to create the output for
>>> f():
>>>
>>> indices <- data.table(i = integer(), j = integer())
>>>
>>> I then append to 'indices' any useful pairs using:
>>>
>>> indices <- rbind(indices, list(idx[i], idx[j]))
>>>
>>> This works, but is very, very slow, in part because I'm
>>> using rbind(). I want to switch to using the built-in matrix,
>>> because rbind() should be much faster for them. Using
>>> the following line to create the matrix:
>>>
>>> indices <- matrix(nrow = 0, ncol = 2, dimnames =
>>> list(c(NULL),c("i","j")))
>>>
>>> results in the following error:
>>>
>>> Logical error. Type of column should have been checked by now
>>>
>>> Note that the values returned are always integers. Results are
>>> coerced via:
>>>
>>> data.table(indices)
>>>
>>> before returning from f(). If I don't explicitly coerce, I get the
>>> following error:
>>>
>>> j doesn't evaluate to the same number of columns for each group
>>>
>>> If someone could tell me what I'm doing wrong, or some other
>>> equivalent way to noticeably speed up the whole process, I'd
>>> be very grateful.
>>>
>>>
>>> -------
>>> Nathaniel Graham
>>> npgraham1 at gmail.com
>>> npgraham1 at uky.edu
>>>
>>> _______________________________________________
>>> datatable-help mailing list
>>> datatable-help at lists.r-forge.r-project.org
>>>
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>>
>>
>>
>