[datatable-help] Can crash R with a data.table query

mdowle at mdowle.plus.com mdowle at mdowle.plus.com
Thu Jul 1 18:40:05 CEST 2010


Interesting, but I don't follow. What is the real computation for which the
other methods won't work?  Could you give an example?
Matthew

> Tom and Matthew -- Thanks for confirming the issue.
>
> I had to pull out each number (i.e. A==85 and A==25) separately because
> the real computation I had to do is not associative -- involves division,
> etc.  So the other approaches you suggested won't quite work.
>
> I think that returning NA is quite acceptable and preferred; it is better
> than having the row missing.  It provides an opportunity for the person
> analyzing the data to realize that something was amiss (i.e. A==85 was
> missing for B=="b" in example).  It is also consistent with reshaping the
> data table by having the A's as columns where we would get NAs for missing
> data.  Then performing the same computation will give an NA.
>
> Regards,
> Harish
>
>
> --- On Thu, 7/1/10, mdowle at mdowle.plus.com <mdowle at mdowle.plus.com> wrote:
>
>> From: mdowle at mdowle.plus.com <mdowle at mdowle.plus.com>
>> Subject: RE: [datatable-help] Can crash R with a data.table query
>> To: "Short, Tom" <TShort at epri.com>
>> Cc: mdowle at mdowle.plus.com, "Harish" <harishv_99 at yahoo.com>,
>> datatable-help at lists.r-forge.r-project.org
>> Date: Thursday, July 1, 2010, 5:43 AM
>>
>> I see that too now. It'll be inside dogroups.c. Harish, can you add it to
>> the tracker as a bug, please? Good spot.  What should the result be, though:
>> no rows for group "b", or NA?  The way the j is constructed, it can't be 9.
>>
>> Other ways to do that :
>>
>> DT[A%in%c(25,85),sum(C),by=B]  # ok
>>      B  V1
>> [1,] a  67
>> [2,] b   9
>> [3,] c 905
>>
>> DT[,.SD[A%in%c(85,25),sum(C)],by=B]  # ok
>>      B  V1
>> [1,] a  67
>> [2,] b   9
>> [3,] c 905
>>
>> DT[,.SD[A==25,C]+.SD[A==85,C],by=B] # crash too
>>
>> > setkey(DT,A)
>> > DT[J(c(25,85)),sum(C),by=B,mult="all"]  # ok, likely fastest
>>      B  V1
>> [1,] a  67
>> [2,] b   9
>> [3,] c 905
>>
>>
>>
>> > That crashes R for me, too, somewhere in data.table.dll.
>> >
>> > - Tom
>> >
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: datatable-help-bounces at lists.r-forge.r-project.org
>> >> [mailto:datatable-help-bounces at lists.r-forge.r-project.org]
>> >> On Behalf Of mdowle at mdowle.plus.com
>> >> Sent: Thursday, July 01, 2010 05:33
>> >> To: Harish
>> >> Cc: datatable-help at lists.r-forge.r-project.org
>> >> Subject: Re: [datatable-help] Can crash R with a data.table query
>> >>
>> >> What do you mean by 'crash'? Does R simply stop, or is there a message?
>> >> Try a clean install of the latest 1.5, as per the recent reply on the
>> >> other thread, and we can go from there...
>> >>
>> >> > Hi,
>> >> >
>> >> > I am crashing R with the following code (and it might have something
>> >> > to do with data tables as well):
>> >> >
>> >> > =========
>> >> >
>> >> >
>> >> > DT <- structure(list(A = c(25L, 85L, 25L, 25L, 85L), B =
>> >> > structure(c(1L, 1L, 2L, 3L, 3L), .Label = c("a", "b", "c"),
>> >> > class = "factor"),
>> >> >     C = c(2L, 65L, 9L, 82L, 823L)), .Names = c("A", "B", "C"),
>> >> > class = c("data.table", "data.frame"), row.names = c(NA, -5L))
>> >> >
>> >> > DT[ , data.table( A, C )[ A==25, C ] + data.table( A, C )[ A==85, C ],
>> >> > by=B ]
>> >> >
>> >> > =========
>> >> >
>> >> > For every B, I am trying to sum the C's where A is 25 and 85.
>> >> >
>> >> > The crash has something to do with my row selection criteria.  First,
>> >> > note that for B=="b", I don't have A==85.  It looks like a numeric(0)
>> >> > is being returned in this case.
>> >> >
>> >> > In order to avoid the crash, I had to do something like:
>> >> >    if ( ! identical( DT[ blah ], numeric( 0 ) ) )
>> >> >
>> >> > It isn't just that R is unable to handle operations on numeric(0)
>> >> > because I don't get a crash when I just type in "numeric(0) + 2".  So,
>> >> > my guess is that it has something to do with data.table as well.
>> >> >
>> >> >
>> >> > Harish
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > datatable-help mailing list
>> >> > datatable-help at lists.r-forge.r-project.org
>> >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>> >> >
>> >>
>> >>
>> >>
>> >
>>
>>
>>
>
>
>
>

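For reference, here is a minimal sketch of the NA-for-a-missing-row behaviour Harish argues for above. It is written against current data.table syntax, not the 1.5 release discussed in this thread, so treat it as an illustration under those assumptions and not as the fix that went into dogroups.c. pick1 is a hypothetical helper (not part of data.table) that turns an empty selection into NA, and the division only stands in for the non-associative "real computation", which the thread does not show.

```r
library(data.table)

# Same data as in the original report.
DT <- data.table(
  A = c(25L, 85L, 25L, 25L, 85L),
  B = factor(c("a", "a", "b", "c", "c")),
  C = c(2L, 65L, 9L, 82L, 823L)
)

# Hypothetical helper: first element of x as a double, or NA if x is empty.
pick1 <- function(x) if (length(x) == 0L) NA_real_ else as.numeric(x[1L])

# Per group of B, combine the C for A==25 with the C for A==85.
# Division is only a stand-in for a non-associative computation.
res <- DT[, .(ratio = pick1(C[A == 25L]) / pick1(C[A == 85L])), by = B]
print(res)
```

Group "b" has no A==85 row, so its ratio comes out as NA instead of the row being dropped, which keeps the gap in the data visible to whoever analyses the result.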


