[datatable-help] Bug when Merging with nomatch=0 and roll=T?

Arunkumar Srinivasan aragorn168b at gmail.com
Fri Jun 20 14:24:22 CEST 2014


Awesome. Just got the email notification (from github). Thanks.

Arun

From: Michael Smith my.r.help at gmail.com
Reply: Michael Smith my.r.help at gmail.com
Date: June 20, 2014 at 2:23:32 PM
To: Arunkumar Srinivasan aragorn168b at gmail.com
Cc: datatable-help at lists.r-forge.r-project.org datatable-help at lists.r-forge.r-project.org
Subject:  Re: [datatable-help] Bug when Merging with nomatch=0 and roll=T?  

Arun,  

Thanks for your reply. The issue is here (if there's anything else I  
can do to help solve this problem, let me know):  

https://github.com/Rdatatable/data.table/issues/700  

Also thanks for mentioning rollends.  

M  


On 06/20/2014 07:41 PM, Arunkumar Srinivasan wrote:  
> Michael,  
>  
> Excellent example. Perfectly reproducible on 1.9.2 and 1.9.3. And it  
> works fine on 1.8.10. The answer should have only 3 rows.  
> It'd be great if you could file it as a bug report.  
>  
> PS: On another note, you may also be interested in `CS[SP, roll=TRUE,  
> rollends=TRUE]`  
> Arun  
>  
> From: Michael Smith my.r.help at gmail.com  
> Reply: Michael Smith my.r.help at gmail.com  
> Date: June 20, 2014 at 1:30:09 PM  
> To: Arunkumar Srinivasan aragorn168b at gmail.com  
> Cc: datatable-help at lists.r-forge.r-project.org datatable-help at lists.r-forge.r-project.org  
> Subject: Re: [datatable-help] Bug when Merging with nomatch=0 and roll=T?  
>  
>> OK, no problem, here's the code. If there are any problems pasting it  
>> into R let me know (I used parts of dput, so maybe the email line  
>> endings are messed up). If you want I can also file a bug report on  
>> github, just let me know.  
>>  
>> CS <-  
>> data.table(  
>> structure(list(LPERMCO = c(7L, 33L), datadate = structure(c(15912,  
>> 15912), class = "Date"), me = c(626550.35284, 7766.385)), .Names =  
>> c("LPERMCO",  
>> "datadate", "me"), class = "data.frame", row.names = c(NA, -2L  
>> )),  
>> key = "LPERMCO,datadate")  
>> SP <-  
>> data.table(  
>> structure(list(PERMCO = c(7L, 7L, 33L, 33L, 33L, 33L), date =  
>> structure(c(15884,  
>> 15917, 15884, 15884, 15917, 15917), class = "Date"), RET = c(-0.118303,  
>> 0.141225, -0.03137, -0.02533, 0.045967, 0.043694)), .Names = c("PERMCO",  
>> "date", "RET"), class = "data.frame", row.names = c(NA, -6L)),  
>> key = "PERMCO,date")  
>> sapply(CS[SP, nomatch = 0, roll = T], length)  
>>  
>>  
>> The relevant output looks like this, both in 1.9.2 and in dev-1.9.3, and  
>> for sapply, the "me" column should be 5 but it's 3:  
>>  
>> > CS  
>> LPERMCO datadate me  
>> 1: 7 2013-07-26 626550.353  
>> 2: 33 2013-07-26 7766.385  
>> > SP  
>> PERMCO date RET  
>> 1: 7 2013-06-28 -0.118303  
>> 2: 7 2013-07-31 0.141225  
>> 3: 33 2013-06-28 -0.031370  
>> 4: 33 2013-06-28 -0.025330  
>> 5: 33 2013-07-31 0.045967  
>> 6: 33 2013-07-31 0.043694  
>> > CS[SP, nomatch = 0, roll = T]  
>> LPERMCO datadate me RET  
>> 1: 7 2013-07-31 626550.353 0.141225  
>> 2: 33 2013-06-28 7766.385 -0.031370  
>> 3: 33 2013-06-28 7766.385 -0.025330  
>> 4: 33 2013-07-31 626550.353 0.045967  
>> 5: 33 2013-07-31 7766.385 0.043694  
>> Warning message:  
>> In cbind(LPERMCO = c(" 7", "33", "33", "33", "33"), datadate =  
>> c("2013-07-31", :  
>> number of rows of result is not a multiple of vector length (arg 3)  
>> > sapply(CS[SP, nomatch = 0, roll = T], length)  
>> LPERMCO datadate me RET  
>> 5 5 3 5  
>>  
>>  
>> Thanks,  
>> M  
>>  
>>  
>>  
>>  
>>  
>> On 06/20/2014 05:17 PM, Arunkumar Srinivasan wrote:  
>> >> For a given data.table, is there any condition … Ergo, it's a bug,  
>> >> right?  
>> >  
>> > Yes.  
>> >  
>> >> I'll be glad  
>> >> to try to boil this down to something that's reproducible.  
>> >  
>> > That'd be great.  
>> >  
>> >  
>> > Arun  
>> >  
>> > From: Michael Smith my.r.help at gmail.com  
>> > Reply: Michael Smith my.r.help at gmail.com  
>> > Date: June 20, 2014 at 5:37:24 AM  
>> > To: datatable-help at lists.r-forge.r-project.org datatable-help at lists.r-forge.r-project.org  
>> > Subject: Re: [datatable-help] Bug when Merging with nomatch=0 and roll=T?  
>> >  
>> >> So let me rephrase my question (haven't received an answer so far):  
>> >>  
>> >> For a given data.table, is there any condition under which the lengths  
>> >> of the vectors in each column may differ? Based on my understanding,  
>> >> each data.table is also a data.frame, and with a data frame this should  
>> >> not be possible. For example, it's not possible to have a data.frame  
>> >> where the first column is a vector of length eight, and the second  
>> >> column is a vector of length nine. Ergo, it's a bug, right?  
>> >>  
>> >> If my understanding is correct, please do let me know and I'll be glad  
>> >> to try to boil this down to something that's reproducible.  
>> >>  
>> >> Thanks,  
>> >> M  
>> >>  
>> >> On 06/19/2014 11:59 AM, Michael Smith wrote:  
>> >> > By the way, I know it's not reproducible with the code below. Before  
>> >> > going into further detail, I first wanted to ask whether this looks like  
>> >> > a bug, or whether I've overlooked something obvious and this is expected  
>> >> > behavior.  
>> >> >  
>> >> > Thanks,  
>> >> > M  
>> >> >  
>> >> > On 06/19/2014 11:51 AM, Michael Smith wrote:  
>> >> >> I got the following result on my keyed data tables `CS` and `SP`, which  
>> >> >> seems like a bug (in 1.9.2 and 1.9.3 dev version) to me, since all  
>> >> >> columns should have the _same_ length:  
>> >> >>  
>> >> >>> ## Works as expected:  
>> >> >>> all((l <- sapply(CS[SP, roll = TRUE], length)) == l[1])  
>> >> >> [1] TRUE  
>> >> >>> ## Works as expected:  
>> >> >>> all((l <- sapply(CS[SP, nomatch = 0], length)) == l[1])  
>> >> >> [1] TRUE  
>> >> >>> ## Here's the potential _bug_, when combining both:  
>> >> >>> all((l <- sapply(CS[SP, nomatch = 0, roll = TRUE], length)) == l[1])  
>> >> >> [1] FALSE  
>> >> >>  
>> >> >>  
>> >> >> Thanks,  
>> >> >>  
>> >> >> M  
>> >> >>  
>> >> _______________________________________________  
>> >> datatable-help mailing list  
>> >> datatable-help at lists.r-forge.r-project.org  
>> >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help  
>> >>  
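
A minimal base R sketch of the result Arun describes in the thread ("only 3 rows"), using the CS and SP data posted there. It does not use data.table and is not data.table's implementation: the matching rule (same company, latest datadate on or before the SP date, unmatched rows dropped) is an assumed reading of roll=TRUE with nomatch=0. The names `idx`, `keep`, and `expected` are hypothetical, introduced only for illustration.

```r
# Data copied from the reproducible example in the thread, as plain data.frames.
CS <- data.frame(
  LPERMCO  = c(7L, 33L),
  datadate = as.Date(c("2013-07-26", "2013-07-26")),
  me       = c(626550.35284, 7766.385)
)
SP <- data.frame(
  PERMCO = c(7L, 7L, 33L, 33L, 33L, 33L),
  date   = as.Date(c("2013-06-28", "2013-07-31", "2013-06-28",
                     "2013-06-28", "2013-07-31", "2013-07-31")),
  RET    = c(-0.118303, 0.141225, -0.03137, -0.02533, 0.045967, 0.043694)
)

# For each SP row, find the CS row of the same company with the latest
# datadate on or before the SP date (assumed meaning of roll = TRUE).
# NA marks SP rows with no such CS row.
idx <- vapply(seq_len(nrow(SP)), function(i) {
  ok <- which(CS$LPERMCO == SP$PERMCO[i] & CS$datadate <= SP$date[i])
  if (length(ok) == 0L) NA_integer_
  else ok[which.max(as.numeric(CS$datadate[ok]))]
}, integer(1))

# Assumed meaning of nomatch = 0: drop SP rows without a match.
keep <- !is.na(idx)

expected <- data.frame(
  LPERMCO  = SP$PERMCO[keep],
  datadate = SP$date[keep],
  me       = CS$me[idx[keep]],
  RET      = SP$RET[keep]
)

# Every column of a data.frame must have the same length; this is the
# invariant that the buggy join result violated.
col_lengths <- sapply(expected, length)
print(expected)
print(col_lengths)
```

Under these assumptions, the SP rows dated 2013-06-28 have no earlier CS observation to roll forward, so only the three rows dated 2013-07-31 remain, and all four columns have the same length.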