[datatable-help] Can crash R with a data.table query

Harish harishv_99 at yahoo.com
Thu Jul 1 18:21:36 CEST 2010


Tom and Matthew -- Thanks for confirming the issue.

I had to pull out each value (i.e. A==85 and A==25) separately because the real computation I have to do is not associative (it involves division, etc.), so the other approaches you suggested won't quite work.
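
A minimal sketch of pulling each value out separately, assuming the real computation were a ratio of the C at A==85 to the C at A==25. The ratio and the helper value_or_na are made up for illustration, and the syntax assumes a current data.table release rather than 1.5. Substituting NA for an empty selection keeps group "b" in the result:

```r
library(data.table)

# Same data as the example in the quoted thread below.
DT <- data.table(
  A = c(25L, 85L, 25L, 25L, 85L),
  B = factor(c("a", "a", "b", "c", "c")),
  C = c(2L, 65L, 9L, 82L, 823L)
)

# Hypothetical helper: return NA when a selection is empty,
# otherwise the first selected value.
value_or_na <- function(x) if (length(x) == 0L) NA else x[1L]

# Hypothetical non-associative computation: C at A==85 divided by C at A==25.
res <- DT[, .(ratio = value_or_na(C[A == 85L]) / value_or_na(C[A == 25L])), by = B]
print(res)  # group "b" has no A==85 row, so its ratio is NA
```

Because each operand is looked up on its own, the order of the operands is under the caller's control, which an aggregate such as sum(C) over A %in% c(25,85) cannot offer.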

I think that returning NA is quite acceptable, and preferable to having the row missing.  It gives the person analyzing the data an opportunity to realize that something was amiss (i.e. A==85 was missing for B=="b" in the example).  It is also consistent with reshaping the data table so that the A values become columns, where we would get NAs for missing data; performing the same computation on that layout would then give an NA.
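
The reshaping argument can be sketched with base R alone. This uses tapply on a plain data.frame copy of the example, so nothing here depends on data.table internals; the names wide and total are illustrative only:

```r
# Same data as the example in the quoted thread, as a plain data.frame.
DF <- data.frame(
  A = c(25L, 85L, 25L, 25L, 85L),
  B = factor(c("a", "a", "b", "c", "c")),
  C = c(2L, 65L, 9L, 82L, 823L)
)

# One row per B, one column per A; combinations with no data become NA.
wide <- tapply(DF$C, list(B = DF$B, A = DF$A), sum)
print(wide)

# The same per-group sum now propagates NA for group "b".
total <- wide[, "25"] + wide[, "85"]
print(total)  # a = 67, b = NA, c = 905
```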

Regards,
Harish


--- On Thu, 7/1/10, mdowle at mdowle.plus.com <mdowle at mdowle.plus.com> wrote:

> From: mdowle at mdowle.plus.com <mdowle at mdowle.plus.com>
> Subject: RE: [datatable-help] Can crash R with a data.table query
> To: "Short, Tom" <TShort at epri.com>
> Cc: mdowle at mdowle.plus.com, "Harish" <harishv_99 at yahoo.com>, datatable-help at lists.r-forge.r-project.org
> Date: Thursday, July 1, 2010, 5:43 AM
> 
> I see that too now. It'll be inside dogroups.c. Harish - can you add it
> as a bug to the tracker please, good spot.  What should the result be
> though?  No rows for group "b", or NA?  The way the j is constructed it
> can't be 9.
> 
> Other ways to do that:
> 
> DT[A%in%c(25,85),sum(C),by=B]  # ok
>      B  V1
> [1,] a  67
> [2,] b   9
> [3,] c 905
> 
> DT[,.SD[A%in%c(85,25),sum(C)],by=B]  # ok
>      B  V1
> [1,] a  67
> [2,] b   9
> [3,] c 905
> 
> DT[,.SD[A==25,C]+.SD[A==85,C],by=B] # crash too
> 
> > setkey(DT,A)
> > DT[J(c(25,85)),sum(C),by=B,mult="all"]  # ok, likely fastest
>      B  V1
> [1,] a  67
> [2,] b   9
> [3,] c 905
> 
> 
> 
> > That crashes R for me, too, somewhere in data.table.dll.
> >
> > - Tom
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: datatable-help-bounces at lists.r-forge.r-project.org
> >> [mailto:datatable-help-bounces at lists.r-forge.r-project.org]
> >> On Behalf Of mdowle at mdowle.plus.com
> >> Sent: Thursday, July 01, 2010 05:33
> >> To: Harish
> >> Cc: datatable-help at lists.r-forge.r-project.org
> >> Subject: Re: [datatable-help] Can crash R with a data.table query
> >>
> >> What do you mean by 'crash'? Does R simply stop, or is there a message?
> >> Try a clean install of the latest 1.5, as per the recent reply on the
> >> other thread, and we can go from there...
> >>
> >> > Hi,
> >> >
> >> > I am crashing R with the following code (and it might have something
> >> > to do with data tables as well):
> >> >
> >> > =========
> >> >
> >> >
> >> > DT <- structure(list(A = c(25L, 85L, 25L, 25L, 85L),
> >> >     B = structure(c(1L, 1L, 2L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor"),
> >> >     C = c(2L, 65L, 9L, 82L, 823L)), .Names = c("A", "B", "C"),
> >> >     class = c("data.table", "data.frame"), row.names = c(NA, -5L))
> >> >
> >> > DT[ , data.table( A, C )[ A==25, C ] + data.table( A, C )[ A==85, C ], by=B ]
> >> >
> >> > =========
> >> >
> >> > For every B, I am trying to sum the C's where A is 25 and 85.
> >> >
> >> > The crash has something to do with my row selection criteria.  First,
> >> > note that for B=="b", I don't have A==85.  It looks like a numeric(0)
> >> > is being returned in this case.
> >> >
> >> > In order to avoid the crash, I had to do something like:
> >> >    if ( ! identical( DT[ blah ], numeric( 0 ) ) )
> >> >
> >> > It isn't just that R is unable to handle operations on numeric(0)
> >> > because I don't get a crash when I just type in "numeric(0) + 2".  So,
> >> > my guess is that it has something to do with data.table as well.
> >> >
> >> >
> >> > Harish
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > _______________________________________________
> >> > datatable-help mailing list
> >> > datatable-help at lists.r-forge.r-project.org
> >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> >> >