[datatable-help] Random segfaults

Chris Neff caneff at gmail.com
Fri Dec 16 16:48:00 CET 2011


On 16 December 2011 10:43, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>
> Great, thanks. Have seen this quite a bit, see FAQ 4.3. It indicates an
> earlier memory corruption happened, could have been at any point. It's not
> anything to do with locale or CHARSXP. The next step is to follow all the
> steps in section 4.3 of R-exts: turn on gctorture, --use-gct,
> --enable-strict-barrier and, especially, valgrind. The goal is to detect
> where the earlier corruption is happening.
>
> On the tenterhook front, 1.7.7 is now passing CRAN checks for oldrel (both
> mac and windows) fully OK so that means the last fix definitely fixed the
> problem I found, so that's some progress.
>
> But, since 1.7.7+ doesn't fix it for you it means either :
>
>  i) you've found a new corruption that could happen in 2.14.0+, too.
>
> or,
>
>  ii) you've found a new problem in my workaround attempts for
> uninitialized truelength in <=2.13.2. That might lead to unexpected
> information that could lead to improvements in 2.14.0+ in unexpected
> ways.
>
> So either way it's worth following this trail, if you're ok to do so. Fast
> techniques to debug the corruptions (e.g. valgrind) might come in handy in
> future anyway.

Okay, maybe later today (or Monday) I will try this.
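
A minimal sketch of such a run, assuming valgrind is installed and
data.table is available in the default library. The script name crash.R
and its contents are hypothetical; the := pattern is the one reported
earlier in this thread.

```shell
# crash.R is a hypothetical script name. It repeats the operation that
# tends to precede the crash (adding a new column with :=) while
# gctorture forces a garbage collection on every allocation.
cat > crash.R <<'EOF'
library(data.table)
gctorture(TRUE)
for (i in 1:5) {
  DT <- data.table(x = 1:10)
  DT[, z := x]   # z does not exist in DT before this call
}
EOF

# Run the script under valgrind only if both tools are on the PATH.
if command -v R >/dev/null 2>&1 && command -v valgrind >/dev/null 2>&1; then
  R -d valgrind --vanilla < crash.R
else
  echo "R or valgrind not found; skipping the run"
fi
```

Any invalid read or write that valgrind reports before the eventual
crash is the interesting part, since that is where the earlier
corruption would be happening.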
>
> Only other thought ... your special internal build of R ... does it
> increase R_len_t on 64bit to allow longer vectors than 2^31, by any
> chance?  I've used R_len_t quite a bit in data.table to future proof for
> when that happens, but if you've done it already in your build then that
> would help to know since it's never been tested afaik when R_len_t != int
> on 64bit.  I'm also assuming R_len_t is signed. If your R has R_len_t as
> unsigned, I would need to know.
>

Asked the guy who would know. Is there any way I can find out through R?
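
As far as I know there is no pure-R way to query this, since R_len_t is
a C typedef from Rinternals.h. One option is to compile a tiny C
function against the installed R headers and call it with .C(). A
sketch, assuming a Linux build with a working compiler; the file name
lencheck.c and the function name lencheck are made up for illustration.

```shell
# lencheck.c is a hypothetical helper: it reports the size of R_len_t in
# bytes and whether the type is signed, as seen by this R installation.
cat > lencheck.c <<'EOF'
#include <R.h>
#include <Rinternals.h>

void lencheck(int *size, int *is_signed)
{
    *size = (int) sizeof(R_len_t);
    /* (R_len_t) -1 wraps to a large positive value if the type is unsigned */
    *is_signed = ((R_len_t) -1 < 0);
}
EOF

# Build and call it only if R is on the PATH.
if command -v R >/dev/null 2>&1; then
  R CMD SHLIB lencheck.c
  Rscript -e 'dyn.load("lencheck.so"); print(.C("lencheck", size = integer(1), is_signed = integer(1)))'
else
  echo "R not found; skipping the build"
fi
```

On a stock build I would expect size 4 and is_signed 1, but the point of
the check is to see what this particular internal build reports.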

>
>> On the current latest SVN build, with debugging enabled as listed
>> below, I get the following when trying to even print the contents of a
>> data.table:
>>
>> Error in do.call("cbind", lapply(x, format, justify = justify, ...)) :
>>   'getCharCE' must be called on a CHARSXP
>>
>> Never saw this error without debugging.  I tried printing a few times
>> in a row, got this same error, and then like the 4th time it
>> segfaulted.
>>
>> Having a hard time reproducing that, but at least it is something?
>>
>>
>> On 15 December 2011 15:05, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>>
>>> One thought ... how about turning on debugging. That way when it crashes
>>> at least you can report the file and line number. Btw, I've installed
>>> 2.12.0 on 64bit in case that managed to reproduce, but it still works
>>> for me ok as does 32bit 2.12.0, and both 32 and 64bit 2.14.0. So we're
>>> left with you debugging at your end, but should be fairly easy ...
>>>
>>> sudo MAKEFLAGS='CFLAGS=-O0\ -g\ -Wall\ -pedantic' R CMD INSTALL
>>> data.table_1.7.7.tar.gz
>>>
>>> R -d gdb
>>>
>>> run
>>>
>>> Do the stuff that crashes it.  Does it report a C file and line number?
>>>
>>> Just to rule out possible svn / R CMD build strangeness, please also use
>>> the data.table_1.7.7.tar.gz that's on CRAN.  It still hasn't run checks
>>> for 1.7.7 so on tenterhooks for that.
>>>
>>>
>>>
>>> On Thu, 2011-12-15 at 12:26 -0500, Chris Neff wrote:
>>>> Just to come back, it still crashes at seemingly random times.   I'm
>>>> reverting back to an earlier version (1.7.1) to see if that fixes my
>>>> problem.
>>>>
>>>> On 15 December 2011 11:08, Chris Neff <caneff at gmail.com> wrote:
>>>> > Internal build of R. Can't upgrade until they do.  I think it is
>>>> > unlikely to see 2.14 any time soon.
>>>> >
>>>> > On 15 December 2011 10:50, Steve Lianoglou
>>>> > <mailinglist.honeypot at gmail.com> wrote:
>>>> >> Hi,
>>>> >>
>>>> >> Out of curiosity, is it impossible for you to upgrade R to the
>>>> latest, or?
>>>> >>
>>>> >> -steve
>>>> >>
>>>> >>
>>>> >> On Thu, Dec 15, 2011 at 10:42 AM, Chris Neff <caneff at gmail.com>
>>>> wrote:
>>>> >>> I always use svn up. I'll reboot and reinstall just to make sure.
>>>> As
>>>> >>> for reproducible, it still doesn't seem to crash in any consistent
>>>> >>> place but I'll give it a stronger try with a test data set.
>>>> >>>
>>>> >>> All 480 tests in test.data.table() completed ok in 7.395sec
>>>> >>> R version 2.12.1 (2010-12-16)
>>>> >>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>> >>>
>>>> >>> locale:
>>>> >>>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>>>> >>> LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>>>> >>>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>>>> >>> LC_PAPER=en_US.utf8       LC_NAME=C
>>>> >>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>>>> >>> LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>>> >>>
>>>> >>> attached base packages:
>>>> >>> [1] stats     graphics  grDevices utils     datasets  grid
>>>> >>> methods   base
>>>> >>>
>>>> >>> other attached packages:
>>>> >>> [1] hexbin_1.26.0      lattice_0.19-33    RColorBrewer_1.0-5
>>>> >>> data.table_1.7.8   ggplot2_0.8.9      reshape_0.8.4
>>>> >>> [6] plyr_1.6
>>>> >>>
>>>> >>> On 15 December 2011 09:52, Matthew Dowle <mdowle at mdowle.plus.com>
>>>> wrote:
>>>> >>>>
>>>> >>>> And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz
>>>> snapshot
>>>> >>>> from R-Forge won't include the fix yet. So svn up, then R CMD
>>>> build, then
>>>> >>>> R CMD INSTALL, right? (Just checking quick basics first).
>>>> >>>>
>>>> >>>>> Result of test.data.table(), sessionInfo() and confirm it's a
>>>> clean
>>>> >>>>> install after a reboot to make sure no old .so is still knocking
>>>> around
>>>> >>>>> somehow please. Definitely installed to the right library? If
>>>> it's
>>>> >>>>> crashing a lot then it should be reproducible?
>>>> >>>>> Still waiting for CRAN check results for 1.7.7 in old-rel. If
>>>> it's not
>>>> >>>>> fixed there either that'll help to know....
>>>> >>>>>
>>>> >>>>>> Latest SVN version, no alloccol set, still crashing a lot.  I
>>>> don't
>>>> >>>>>> use [<- or $<-, the only times I modify a data.table are with :=
>>>>  or
>>>> >>>>>> by doing DT=merge(DT,blah).
>>>> >>>>>>
>>>> >>>>>> Any more info I can provide?
>>>> >>>>>>
>>>> >>>>>> On 15 December 2011 08:32, Matthew Dowle
>>>> <mdowle at mdowle.plus.com> wrote:
>>>> >>>>>>> Great fingers and toes crossed. If you could unset alloccol option
>>>> >>>>>>> just to be sure please, that would be great. You're our best hope
>>>> >>>>>>> of confirming it's fixed since it was biting you several times an
>>>> >>>>>>> hour. If you use [<- or $<- syntax then R will copy via *tmp* and
>>>> >>>>>>> at that point the *tmp* data.table is similar to a data.table
>>>> >>>>>>> loaded from disk in that it isn't over-allocated anymore, I
>>>> >>>>>>> realised. Also a copy() will lose over-allocation until the next
>>>> >>>>>>> column addition.  That 'should' all be fine now in both <=2.13.2
>>>> >>>>>>> and >=2.14.0, although the bug was something simpler.
>>>> >>>>>>>
>>>> >>>>>>> 1.7.7 is on CRAN now and been built for windows so if CRAN check
>>>> >>>>>>> results tick over from "ERROR" to "OK" later today (for both
>>>> >>>>>>> windows and mac old-rel), and, you're ok too, then it's fixed.
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>> I've updated to the latest SVN version, and I'll be sure to
>>>> let you
>>>> >>>>>>>> know if it still crashes (however I do have the alloccol
>>>> option set to
>>>> >>>>>>>> 1000, so I shouldn't be bumping into reallocation very often).
>>>> Thanks
>>>> >>>>>>>> for finding the bug so fast!
>>>> >>>>>>>>
>>>> >>>>>>>> On 14 December 2011 19:56, Matthew Dowle
>>>> <mdowle at mdowle.plus.com>
>>>> >>>>>>>> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>> Hm. Sounds like it could be a different problem then if it was in
>>>> >>>>>>>>> R 2.14. There have been quite a few fixes since 1.7.4 so if you
>>>> >>>>>>>>> can reproduce with 1.7.7 would be great.  Or, we've sometimes seen
>>>> >>>>>>>>> that just after a package upgrade that a clean re-install can
>>>> >>>>>>>>> often fix things. Perhaps if the .so was in use by another R
>>>> >>>>>>>>> process or a zombie, or something. R seems to report data.table
>>>> >>>>>>>>> v1.7.4 (say) but it hasn't fully installed it properly and is
>>>> >>>>>>>>> still (perhaps partially) at 1.7.3. So quit all R (reboot to clear
>>>> >>>>>>>>> zombies too perhaps) and try reinstalling using R CMD INSTALL.
>>>> >>>>>>>>> Next time it happens I mean. Can also run test.data.table() to
>>>> >>>>>>>>> check the install.
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Wed, 2011-12-14 at 17:40 +0000, Timothée Carayol wrote:
>>>> >>>>>>>>>> Hi --
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> I have been having many unreproducible bugs with R 2.14,
>>>> data.table
>>>> >>>>>>>>>> 1.7.4 and ubuntu 64 bits about 10 days ago. Data was getting
>>>> >>>>>>>>>> corrupted, and then R crashed. I had to go back to
>>>> data.frame for
>>>> >>>>>>>>>> the
>>>> >>>>>>>>>> bits of code affected. I was doing a lot of rather unsafe
>>>> >>>>>>>>>> manipulations with row names, rbind and cbinds.
>>>> >>>>>>>>>> I didn't file a report, nor signal it, as it was occurring
>>>> seemingly
>>>> >>>>>>>>>> at random, and I was doing operations which aren't really
>>>> what
>>>> >>>>>>>>>> data.table was made for (tons of little manipulations on
>>>> small
>>>> >>>>>>>>>> data);
>>>> >>>>>>>>>> still I guess I should now signal that 2.14 didn't fix
>>>> everything
>>>> >>>>>>>>>> for
>>>> >>>>>>>>>> me. I do not know whether bugs subsist on post-1.7.4
>>>> versions.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> t
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On Wed, Dec 14, 2011 at 5:31 PM, Matthew Dowle
>>>> >>>>>>>>>> <mdowle at mdowle.plus.com>
>>>> >>>>>>>>>> wrote:
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> > Maybe, worth a try. Are you loading any data.table objects
>>>> from
>>>> >>>>>>>>>> disk?
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> >> 64 bit 2.12.1 linux.
>>>> >>>>>>>>>> >>
>>>> >>>>>>>>>> >> Is there an option I can set in my session in order to
>>>> work
>>>> >>>>>>>>>> around
>>>> >>>>>>>>>> the
>>>> >>>>>>>>>> >> truelength issue? I don't care if I lose some of the
>>>> >>>>>>>>>> over-allocation
>>>> >>>>>>>>>> >> niceties if it stops things from crashing. Looking at the
>>>> >>>>>>>>>> truelength
>>>> >>>>>>>>>> >> help, would just doing:
>>>> >>>>>>>>>> >>
>>>> >>>>>>>>>> >> options(datatable.alloc=quote(1000))
>>>> >>>>>>>>>> >>
>>>> >>>>>>>>>> >> stop this? I never have more than about 50 columns at a
>>>> time.
>>>> >>>>>>>>>> >>
>>>> >>>>>>>>>> >> On 14 December 2011 11:43, Matthew Dowle
>>>> <mdowle at mdowle.plus.com>
>>>> >>>>>>>>>> wrote:
>>>> >>>>>>>>>> >>>
>>>> >>>>>>>>>> >>> You're on R < 2.14.0, right?  I'm really struggling in R < 2.14.0
>>>> >>>>>>>>>> >>> to make over-allocation work because R only started to initialize
>>>> >>>>>>>>>> >>> truelength to 0 in R 2.14.0+. Before that it's uninitialized
>>>> >>>>>>>>>> >>> (random). Trouble is my attempts in R < 2.14.0 to work around that
>>>> >>>>>>>>>> >>> work fine for me in linux 32bit when I test in R 2.13.2, and I
>>>> >>>>>>>>>> >>> even test in 2.12.0 too. I test on 64bit too but just 2.14.0.
>>>> >>>>>>>>>> >>> CRAN is also showing errors on 2.13.2 (old-rel) for both mac and
>>>> >>>>>>>>>> >>> windows.
>>>> >>>>>>>>>> >>>
>>>> >>>>>>>>>> >>> So, this is a pre-2.14.0 (only) problem that I'll continue to try
>>>> >>>>>>>>>> >>> and fix.
>>>> >>>>>>>>>> >>>
>>>> >>>>>>>>>> >>> Are you 64bit pre-2.14.0? Which OS?  If you are 64bit linux then
>>>> >>>>>>>>>> >>> it adds weight to me installing pre-2.14.0 on my 64bit instance in
>>>> >>>>>>>>>> >>> an effort to reproduce.
>>>> >>>>>>>>>> >>>
>>>> >>>>>>>>>> >>>
>>>> >>>>>>>>>> >>>> This will be a crappy help request because I can't seem to
>>>> >>>>>>>>>> >>>> reproduce it, but the past few days I've been getting a lot of
>>>> >>>>>>>>>> >>>> segfaults.  The only common thing between every crash is that it
>>>> >>>>>>>>>> >>>> happens when I do
>>>> >>>>>>>>>> >>>>
>>>> >>>>>>>>>> >>>> DT[, z := x]
>>>> >>>>>>>>>> >>>>
>>>> >>>>>>>>>> >>>> where z was not a column that existed in DT before, and x is
>>>> >>>>>>>>>> >>>> either an existing column of DT or a separate variable, doesn't
>>>> >>>>>>>>>> >>>> matter.  Beyond that I can't reproduce a set of steps that gets R
>>>> >>>>>>>>>> >>>> to crash.  This is with the latest SVN version.
>>>> >>>>>>>>>> >>>>
>>>> >>>>>>>>>> >>>> Is there more information I can provide to help track this down?
>>>> >>>>>>>>>> >>>> _______________________________________________
>>>> >>>>>>>>>> >>>> datatable-help mailing list
>>>> >>>>>>>>>> >>>> datatable-help at lists.r-forge.r-project.org
>>>> >>>>>>>>>> >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>>> >>>>>>>>>> >>>>
>>>> >>>>>>>>>> >>>
>>>> >>>>>>>>>> >>>
>>>> >>>>>>>>>> >>
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>> >
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Steve Lianoglou
>>>> >> Graduate Student: Computational Systems Biology
>>>> >>  | Memorial Sloan-Kettering Cancer Center
>>>> >>  | Weill Medical College of Cornell University
>>>> >> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>>
>>>
>>
>
>

