[datatable-help] Random segfaults

Chris Neff caneff at gmail.com
Thu Dec 15 17:08:09 CET 2011


Internal build of R. Can't upgrade until they do.  I think we are
unlikely to see 2.14 any time soon.

On 15 December 2011 10:50, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> Hi,
>
> Out of curiosity, is it impossible for you to upgrade R to the latest, or?
>
> -steve
>
>
> On Thu, Dec 15, 2011 at 10:42 AM, Chris Neff <caneff at gmail.com> wrote:
>> I always use svn up. I'll reboot and reinstall just to make sure. As
>> for reproducible, it still doesn't seem to crash in any consistent
>> place but I'll give it a stronger try with a test data set.
>>
>> All 480 tests in test.data.table() completed ok in 7.395sec
>> R version 2.12.1 (2010-12-16)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>> LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>> LC_PAPER=en_US.utf8       LC_NAME=C
>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>> LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  grid
>> methods   base
>>
>> other attached packages:
>> [1] hexbin_1.26.0      lattice_0.19-33    RColorBrewer_1.0-5
>> data.table_1.7.8   ggplot2_0.8.9      reshape_0.8.4
>> [6] plyr_1.6
>>
>> On 15 December 2011 09:52, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>>
>>> And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz snapshot
>>> from R-Forge won't include the fix yet. So svn up, then R CMD build, then
>>> R CMD INSTALL, right? (Just checking quick basics first).
>>>
>>>> Result of test.data.table(), sessionInfo() and confirm it's a clean
>>>> install after a reboot to make sure no old .so is still knocking around
>>>> somehow please. Definitely installed to the right library? If it's
>>>> crashing a lot then it should be reproducible?
>>>> Still waiting for CRAN check results for 1.7.7 in old-rel. If it's not
>>>> fixed there either that'll help to know....
>>>>
>>>>> Latest SVN version, no alloccol set, still crashing a lot.  I don't
>>>>> use [<- or $<-, the only times I modify a data.table are with :=  or
>>>>> by doing DT=merge(DT,blah).
>>>>>
>>>>> Any more info I can provide?
>>>>>
>>>>> On 15 December 2011 08:32, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>>>>> Great fingers and toes crossed. If you could unset alloccol option
>>>>>> just to be sure please, that would be great. You're our best hope of
>>>>>> confirming it's fixed since it was biting you several times an hour. If
>>>>>> you use [<- or $<- syntax then R will copy via *tmp* and at that point
>>>>>> the *tmp* data.table is similar to a data.table loaded from disk in that
>>>>>> it isn't over-allocated anymore, I realised. Also a copy() will lose
>>>>>> over-allocation until the next column addition.  That 'should' all be
>>>>>> fine now in both <=2.13.2 and >=2.14.0, although the bug was something
>>>>>> simpler.
>>>>>>
>>>>>> 1.7.7 is on CRAN now and been built for windows so if CRAN check
>>>>>> results tick over from "ERROR" to "OK" later today (for both windows and
>>>>>> mac old-rel), and, you're ok too, then it's fixed.
>>>>>>
>>>>>>
>>>>>>> I've updated to the latest SVN version, and I'll be sure to let you
>>>>>>> know if it still crashes (however I do have the alloccol option set to
>>>>>>> 1000, so I shouldn't be bumping into reallocation very often). Thanks
>>>>>>> for finding the bug so fast!
>>>>>>>
>>>>>>> On 14 December 2011 19:56, Matthew Dowle <mdowle at mdowle.plus.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hm. Sounds like it could be a different problem then if it was in R
>>>>>>>> 2.14. There have been quite a few fixes since 1.7.4 so if you can
>>>>>>>> reproduce with 1.7.7 would be great.  Or, we've sometimes seen that
>>>>>>>> just after a package upgrade that a clean re-install can often fix
>>>>>>>> things. Perhaps if the .so was in use by another R process or a zombie,
>>>>>>>> or something. R seems to report data.table v1.7.4 (say) but it hasn't
>>>>>>>> fully installed it properly and is still (perhaps partially) at 1.7.3.
>>>>>>>> So quit all R (reboot to clear zombies too perhaps) and try
>>>>>>>> reinstalling using R CMD INSTALL. Next time it happens I mean. Can also
>>>>>>>> run test.data.table() to check the install.
>>>>>>>>
>>>>>>>> On Wed, 2011-12-14 at 17:40 +0000, Timothée Carayol wrote:
>>>>>>>>> Hi --
>>>>>>>>>
>>>>>>>>> I have been having many unreproducible bugs with R 2.14, data.table
>>>>>>>>> 1.7.4 and ubuntu 64 bits about 10 days ago. Data was getting
>>>>>>>>> corrupted, and then R crashed. I had to go back to data.frame for
>>>>>>>>> the bits of code affected. I was doing a lot of rather unsafe
>>>>>>>>> manipulations with row names, rbind and cbinds.
>>>>>>>>> I didn't file a report, nor signal it, as it was occurring seemingly
>>>>>>>>> at random, and I was doing operations which aren't really what
>>>>>>>>> data.table was made for (tons of little manipulations on small
>>>>>>>>> data); still I guess I should now signal that 2.14 didn't fix
>>>>>>>>> everything for me. I do not know whether bugs subsist on post-1.7.4
>>>>>>>>> versions.
>>>>>>>>>
>>>>>>>>> t
>>>>>>>>>
>>>>>>>>> On Wed, Dec 14, 2011 at 5:31 PM, Matthew Dowle
>>>>>>>>> <mdowle at mdowle.plus.com>
>>>>>>>>> wrote:
>>>>>>>>> >
>>>>>>>>> > Maybe, worth a try. Are you loading any data.table objects from
>>>>>>>>> > disk?
>>>>>>>>> >
>>>>>>>>> >> 64 bit 2.12.1 linux.
>>>>>>>>> >>
>>>>>>>>> >> Is there an option I can set in my session in order to work
>>>>>>>>> >> around the truelength issue? I don't care if I lose some of the
>>>>>>>>> >> over-allocation niceties if it stops things from crashing. Looking
>>>>>>>>> >> at the truelength help, would just doing:
>>>>>>>>> >>
>>>>>>>>> >> options(datatable.alloccol=quote(1000))
>>>>>>>>> >>
>>>>>>>>> >> stop this? I never have more than about 50 columns at a time.
>>>>>>>>> >>
>>>>>>>>> >> On 14 December 2011 11:43, Matthew Dowle <mdowle at mdowle.plus.com>
>>>>>>>>> >> wrote:
>>>>>>>>> >>>
>>>>>>>>> >>> You're R < 2.14.0, right?  I'm really struggling in R < 2.14.0 to
>>>>>>>>> >>> make over-allocation work because R only started to initialize
>>>>>>>>> >>> truelength to 0 in R 2.14.0+. Before that it's uninitialized
>>>>>>>>> >>> (random). Trouble is my attempts in R < 2.14.0 to work around that
>>>>>>>>> >>> work fine for me in linux 32bit when I test in R 2.13.2, and I even
>>>>>>>>> >>> test in 2.12.0 too. I test on 64bit too but just 2.14.0.  CRAN is
>>>>>>>>> >>> also showing errors on 2.13.2 (old-rel) for both mac and windows.
>>>>>>>>> >>>
>>>>>>>>> >>> So, this is a pre-2.14.0 (only) problem that I'll continue to try
>>>>>>>>> >>> and fix.
>>>>>>>>> >>>
>>>>>>>>> >>> Are you 64bit pre-2.14.0? Which OS?  If you are 64bit linux then it
>>>>>>>>> >>> adds weight to me installing pre-2.14.0 on my 64bit instance in an
>>>>>>>>> >>> effort to reproduce.
>>>>>>>>> >>>
>>>>>>>>> >>>
>>>>>>>>> >>>> This will be a crappy help request because I can't seem to
>>>>>>>>> >>>> reproduce it, but the past few days I've been getting a lot of
>>>>>>>>> >>>> segfaults.  The only common thing between every crash is that it
>>>>>>>>> >>>> happens when I do
>>>>>>>>> >>>>
>>>>>>>>> >>>> DT[, z := x]
>>>>>>>>> >>>>
>>>>>>>>> >>>> where z was not a column that existed in DT before, and x is either
>>>>>>>>> >>>> an existing column of DT or a separate variable, doesn't matter.
>>>>>>>>> >>>> Beyond that I can't reproduce a set of steps that gets R to crash.
>>>>>>>>> >>>> This is with the latest SVN version.
>>>>>>>>> >>>>
>>>>>>>>> >>>> Is there more information I can provide to help track this down?
>>>>>>>>> >>>> _______________________________________________
>>>>>>>>> >>>> datatable-help mailing list
>>>>>>>>> >>>> datatable-help at lists.r-forge.r-project.org
>>>>>>>>> >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>>>>>>>> >>>>
>>>>>>>>> >>>
>>>>>>>>> >>>
>>>>>>>>> >>
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>
>
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact

