[datatable-help] What is going on with R 3.1 ?

Amy mathematical.coffee at gmail.com
Fri Jun 20 03:44:26 CEST 2014


Thanks for this. I knew that my not knowing how to do that join was a problem
with my understanding of data.table, not a problem with data.table itself.

Very good to know the 'bysubl' "error" is fixed in 1.9.3 (even if it is
brought about by users like me trying to do our joins wrongly :))

thanks,
Amy
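
For anyone reading this later, here is a minimal sketch of the two forms Arun
describes below. The tables are made up for illustration (they are not the
attached x.csv / y.csv), the `score` column is hypothetical, and I'm assuming
a data.table version that supports the `i.` prefix.

```r
library(data.table)

# Toy data (made up): x has a duplicated subject, y has one row per subject
# and covers subjects that never appear in x.
x <- data.table(subject = c("a", "a", "b", "c", "d"), score = 1:5)
y <- data.table(subject = letters[1:6],
                female  = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE))
setkey(x, subject)
setkey(y, subject)

# Update x by reference: only rows of y that match x are used, so x keeps
# its 5 rows and gains a `female` column.
x[y, female := i.female, nomatch = 0L]

# Keep every row of y instead: this builds a new table, so it may have more
# rows than x (unmatched subjects get NA in x's columns).
all_y <- x[y, allow.cartesian = TRUE]
```

Because `:=` modifies x in place, the first form can never add rows; the
second returns a fresh table and leaves x alone.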


On 20 June 2014 11:18, Arunkumar Srinivasan <aragorn168b at gmail.com> wrote:

> Hi Amy,
>
> Good to know that it’s not reproducible in 1.9.3. Matt already fixed it.
>
> X[Y, LHS := RHS] cannot exceed nrow(X) because this assignment is made *by
> reference*. If the join from X[Y] results in more than nrow(X) rows, then X
> would have to be re-allocated entirely.
>
> If you only want those that match with X, then you should do: X[Y, female
> := i.female, nomatch=0L].
>
> If instead you want all the rows from y, then you could do: x[y,
> allow.cartesian=TRUE].
>
>
> Arun
>
> From: Amy mathematical.coffee at gmail.com
> Reply: Amy mathematical.coffee at gmail.com
> Date: June 20, 2014 at 3:01:50 AM
> To: Arunkumar Srinivasan aragorn168b at gmail.com
> Cc: datatable-help at lists.r-forge.r-project.org
> datatable-help at lists.r-forge.r-project.org
>
> Subject:  Re: [datatable-help] What is going on with R 3.1 ?
>
>  Hi Arun,
>
> In 1.9.3 I get the "Error in vecseq(f__, len__, if (allow.cartesian) NULL
> else as.integer(max(nrow(x), : Join results in 33 rows; more than 28 =
> max(nrow(x),nrow(i))...." message and it doesn't assign the column (upon
> `x[y, female:=female]`), so no, the 'bysubl' error doesn't occur.
>
> But as an aside, shouldn't this command work?
> If I have x with subjects a, a, b, c, d; y with genders for subjects a--f,
> shouldn't x[y, female:=female] copy the female column from y to x,
> duplicating as necessary?
> Of course y[x] produces the table I'm after, but in the case that y has
> extra columns I /don't/ want in the output and x has extra columns I /do/,
> `y[x]` is then not the table I'm after. (But now we are straying into a
> different question, my limited understanding of how to use data.table, as
> opposed to the bug this thread is about).
>
> PS - typo on the data.table Readme in the "if you get latex errors during
> installation" bit:
>
> devtools:::install_github("datat.able", ...)
>
> "datat.able" --> "data.table".
>
> cheers
> Amy
>
>
> On 20 June 2014 10:51, Arunkumar Srinivasan <aragorn168b at gmail.com> wrote:
>
>>  Hi,
>>
>> Could you let us know if you’re able to reproduce it in the devel
>> version 1.9.3 <https://github.com/Rdatatable/data.table> as well?
>>
>>
>>  Arun
>>
>> From: mathematical.coffee mathematical.coffee at gmail.com
>> Reply: mathematical.coffee mathematical.coffee at gmail.com
>> Date: June 20, 2014 at 2:44:50 AM
>> To: datatable-help at lists.r-forge.r-project.org
>> datatable-help at lists.r-forge.r-project.org
>> Subject:  Re: [datatable-help] What is going on with R 3.1 ?
>>
>>  Hi all,
>>
>> Sorry to resurrect an old thread, but I've been experiencing these
>> problems too and have come up with a reproducible example (for me anyway).
>>
>> Data.table 1.9.2, R 3.1.0
>>
>> I was trying to join some tables and got the usual "rerun with
>> allow.cartesian=TRUE" message like Michele, and then got this error:
>>
>> Error in if (!is.null(lhs)) { : missing value where TRUE/FALSE needed
>>
>> However while I was trying to strip down my data to reproduce the error, I
>> now consistently get this one instead:
>>
>> Error in `[.data.table`(x, y, `:=`(female, female)) :
>> object 'bysubl' not found
>>
>>
>> rather than the TRUE/FALSE one. But they seem to be related.
>>
>> * x has a column of subjects, some duplicated
>> * y has a column of subjects, none duplicated, and some not present in x
>> (all subjects of x are in y though).
>> * y additionally has a binary column `female` that I wish to join into x
>>
>> (I know there are other ways to do this, but this is a stripped-down
>> example that seems to point to something going wrong in data.table, so it
>> is purely illustrative):
>>
>> ```
>> library(data.table)
>> x=fread('x.csv')
>> y=fread('y.csv')
>> setkey(x, subject)
>> setkey(y, subject)
>>
>> x[y]
>> # Error in vecseq(f__, len__, if (allow.cartesian) NULL else as.integer(max(nrow(x), :
>> #   Join results in 33 rows; more than 28 = max(nrow(x),nrow(i)). Check for
>> #   duplicate key values in i, each of which join to the same group in x over
>> #   and over again. If that's ok, try including `j` and dropping `by`
>> #   (by-without-by) so that j runs for each group to avoid the large
>> #   allocation. If you are sure you wish to proceed, rerun with
>> #   allow.cartesian=TRUE. Otherwise, please search for this error message in
>> #   the FAQ, Wiki, Stack Overflow and datatable-help for advice.
>>
>> x[y, female:=female]
>> # Error in `[.data.table`(x, y, `:=`(female, female)) :
>> #   object 'bysubl' not found
>> ```
>>
>> I get the above reproducibly with this dataset.
>>
>> From now onwards, if I type in 'x' or 'y' into the prompt I get nothing
>> printed at all. Additionally:
>>
>> ```
>> tables()
>> # Error in gettext(domain, unlist(args)) : invalid 'string' value
>> # Error: argument "finally" is missing, with no default
>> ```
>>
>> The only solution is to restart the R session.
>>
>> Note: this *doesn't* occur if the column I try to merge (`female` in this
>> case) is continuous, for example. I can only get it if it's logical.
>>
>> I've attached x.csv and y.csv to this email for you to play with.
>>
>> I think it might be possible to strip down the tables to fewer rows (x has
>> 28, y has 26) but in my (not exhaustive) attempts to do so, I didn't get
>> this particular error.
>>
>> x.csv <http://r.789695.n4.nabble.com/file/n4692401/x.csv>
>> y.csv <http://r.789695.n4.nabble.com/file/n4692401/y.csv>
>>
>>
>>