[datatable-help] Random segfaults

Chris Neff caneff at gmail.com
Fri Dec 16 16:20:20 CET 2011


Just posting things as I find them.  My script runs through with no
complaints, but when I then try to modify DT slightly further, like:

DT[, w := x*y]

where x,y are both integer columns of DT (and w doesn't previously
exist), and I get the following:

Error in match(as.vector(x), y, 0L) :
  'translateCharUTF8' must be called on a CHARSXP

If I then try to print DT, I get the same error as in my previous message:

Error in do.call("cbind", lapply(x, format, justify = justify, ...)) :
  'getCharCE' must be called on a CHARSXP


The problem is I can't reproduce this with simpler code, so I can only
report what I see when I see it.
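
For anyone trying to reproduce this, the failing pattern boils down to
something like the sketch below. The data is made up (a tiny table with two
integer columns named as in the call above), and nothing suggests that this
exact snippet triggers the error; it only shows the sequence of operations
involved: add a new column by reference with :=, then print.

```r
library(data.table)

# Made-up data: two integer columns, mirroring x and y in the failing call.
DT <- data.table(x = 1:5, y = 6:10)

# Add a column that does not exist yet, by reference. This is the step where
# the 'translateCharUTF8' error appeared in the real session.
DT[, w := x * y]

# Printing afterwards is the step where the 'getCharCE' error appeared.
print(DT)

# truelength() reports how many column slots are currently allocated, which
# is the over-allocation discussed in the quoted messages.
print(truelength(DT))
```

If a session does reach the broken state, running the same lines under
R -d gdb with a debug build of the package is what should give a C file and
line number.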





On 16 December 2011 09:38, Chris Neff <caneff at gmail.com> wrote:
> On the current latest SVN build, with debugging enabled as listed
> below, I get the following when trying to even print the contents of a
> data.table:
>
> Error in do.call("cbind", lapply(x, format, justify = justify, ...)) :
>   'getCharCE' must be called on a CHARSXP
>
> Never saw this error without debugging.  I tried printing a few times
> in a row, got this same error, and then like the 4th time it
> segfaulted.
>
> Having a hard time reproducing that, but at least it is something?
>
>
> On 15 December 2011 15:05, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>
>> One thought ... how about turning on debugging. That way when it crashes
>> at least you can report the file and line number. Btw, I've installed
>> 2.12.0 on 64bit in case that managed to reproduce, but it still works
>> for me ok as does 32bit 2.12.0, and both 32 and 64bit 2.14.0. So we're
>> left with you debugging at your end, but should be fairly easy ...
>>
>> sudo MAKEFLAGS='CFLAGS=-O0\ -g\ -Wall\ -pedantic' R CMD INSTALL data.table_1.7.7.tar.gz
>>
>> R -d gdb
>>
>> run
>>
>> Do the stuff that crashes it.  Does it report a C file and line number?
>>
>> Just to rule out possible svn / R CMD build strangeness, please also use
>> the data.table_1.7.7.tar.gz that's on CRAN.  It still hasn't run checks
>> for 1.7.7 so on tenterhooks for that.
>>
>>
>>
>> On Thu, 2011-12-15 at 12:26 -0500, Chris Neff wrote:
>>> Just to come back, it still crashes at seemingly random times.   I'm
>>> reverting back to an earlier version (1.7.1) to see if that fixes my
>>> problem.
>>>
>>> On 15 December 2011 11:08, Chris Neff <caneff at gmail.com> wrote:
>>> > Internal build of R. Can't upgrade until they do.  I think it is
>>> > unlikely to see 2.14 any time soon.
>>> >
>>> > On 15 December 2011 10:50, Steve Lianoglou
>>> > <mailinglist.honeypot at gmail.com> wrote:
>>> >> Hi,
>>> >>
>>> >> Out of curiosity, is it impossible for you to upgrade R to the latest, or?
>>> >>
>>> >> -steve
>>> >>
>>> >>
>>> >> On Thu, Dec 15, 2011 at 10:42 AM, Chris Neff <caneff at gmail.com> wrote:
>>> >>> I always use svn up. I'll reboot and reinstall just to make sure. As
>>> >>> for reproducible, it still doesn't seem to crash in any consistent
>>> >>> place but I'll give it a stronger try with a test data set.
>>> >>>
>>> >>> All 480 tests in test.data.table() completed ok in 7.395sec
>>> >>> R version 2.12.1 (2010-12-16)
>>> >>> Platform: x86_64-pc-linux-gnu (64-bit)
>>> >>>
>>> >>> locale:
>>> >>>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>>> >>> LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>>> >>>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>>> >>> LC_PAPER=en_US.utf8       LC_NAME=C
>>> >>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>>> >>> LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>> >>>
>>> >>> attached base packages:
>>> >>> [1] stats     graphics  grDevices utils     datasets  grid
>>> >>> methods   base
>>> >>>
>>> >>> other attached packages:
>>> >>> [1] hexbin_1.26.0      lattice_0.19-33    RColorBrewer_1.0-5
>>> >>> data.table_1.7.8   ggplot2_0.8.9      reshape_0.8.4
>>> >>> [6] plyr_1.6
>>> >>>
>>> >>> On 15 December 2011 09:52, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>> >>>>
>>> >>>> And you did an 'svn up' (or equivalent)? Grabbing daily tar.gz snapshot
>>> >>>> from R-Forge won't include the fix yet. So svn up, then R CMD build, then
>>> >>>> R CMD INSTALL, right? (Just checking quick basics first).
>>> >>>>
>>> >>>>> Result of test.data.table(), sessionInfo() and confirm it's a clean
>>> >>>>> install after a reboot to make sure no old .so is still knocking around
>>> >>>>> somehow please. Definitely installed to the right library? If it's
>>> >>>>> crashing a lot then it should be reproducible?
>>> >>>>> Still waiting for CRAN check results for 1.7.7 in old-rel. If it's not
>>> >>>>> fixed there either that'll help to know....
>>> >>>>>
>>> >>>>>> Latest SVN version, no alloccol set, still crashing a lot.  I don't
>>> >>>>>> use [<- or $<-, the only times I modify a data.table are with :=  or
>>> >>>>>> by doing DT=merge(DT,blah).
>>> >>>>>>
>>> >>>>>> Any more info I can provide?
>>> >>>>>>
>>> >>>>>> On 15 December 2011 08:32, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>> >>>>>>> Great fingers and toes crossed. If you could unset alloccol option just
>>> >>>>>>> to be sure please, that would be great. You're our best hope of
>>> >>>>>>> confirming it's fixed since it was biting you several times an hour.
>>> >>>>>>> If you use [<- or $<- syntax then R will copy via *tmp* and at that
>>> >>>>>>> point the *tmp* data.table is similar to a data.table loaded from disk
>>> >>>>>>> in that it isn't over-allocated anymore, I realised. Also a copy() will
>>> >>>>>>> lose over-allocation until the next column addition.  That 'should' all
>>> >>>>>>> be fine now in both <=2.13.2 and >=2.14.0, although the bug was
>>> >>>>>>> something simpler.
>>> >>>>>>>
>>> >>>>>>> 1.7.7 is on CRAN now and been built for windows so if CRAN check
>>> >>>>>>> results tick over from "ERROR" to "OK" later today (for both windows
>>> >>>>>>> and mac old-rel), and, you're ok too, then it's fixed.
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>> I've updated to the latest SVN version, and I'll be sure to let you
>>> >>>>>>>> know if it still crashes (however I do have the alloccol option set to
>>> >>>>>>>> 1000, so I shouldn't be bumping into reallocation very often). Thanks
>>> >>>>>>>> for finding the bug so fast!
>>> >>>>>>>>
>>> >>>>>>>> On 14 December 2011 19:56, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>> Hm. Sounds like it could be a different problem then if it was in R
>>> >>>>>>>>> 2.14. There have been quite a few fixes since 1.7.4 so if you can
>>> >>>>>>>>> reproduce with 1.7.7 would be great.  Or, we've sometimes seen that
>>> >>>>>>>>> just after a package upgrade that a clean re-install can often fix
>>> >>>>>>>>> things. Perhaps if the .so was in use by another R process or a
>>> >>>>>>>>> zombie, or something. R seems to report data.table v1.7.4 (say) but it
>>> >>>>>>>>> hasn't fully installed it properly and is still (perhaps partially) at
>>> >>>>>>>>> 1.7.3. So quit all R (reboot to clear zombies too perhaps) and try
>>> >>>>>>>>> reinstalling using R CMD INSTALL. Next time it happens I mean. Can
>>> >>>>>>>>> also run test.data.table() to check the install.
>>> >>>>>>>>>
>>> >>>>>>>>> On Wed, 2011-12-14 at 17:40 +0000, Timothée Carayol wrote:
>>> >>>>>>>>>> Hi --
>>> >>>>>>>>>>
>>> >>>>>>>>>> I was having many unreproducible bugs with R 2.14, data.table
>>> >>>>>>>>>> 1.7.4 and ubuntu 64 bits about 10 days ago. Data was getting
>>> >>>>>>>>>> corrupted, and then R crashed. I had to go back to data.frame for the
>>> >>>>>>>>>> bits of code affected. I was doing a lot of rather unsafe
>>> >>>>>>>>>> manipulations with row names, rbind and cbinds.
>>> >>>>>>>>>> I didn't file a report, nor signal it, as it was occurring seemingly
>>> >>>>>>>>>> at random, and I was doing operations which aren't really what
>>> >>>>>>>>>> data.table was made for (tons of little manipulations on small data);
>>> >>>>>>>>>> still I guess I should now signal that 2.14 didn't fix everything for
>>> >>>>>>>>>> me. I do not know whether bugs persist on post-1.7.4 versions.
>>> >>>>>>>>>>
>>> >>>>>>>>>> t
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Wed, Dec 14, 2011 at 5:31 PM, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>> >>>>>>>>>> >
>>> >>>>>>>>>> > Maybe, worth a try. Are you loading any data.table objects from disk?
>>> >>>>>>>>>> >
>>> >>>>>>>>>> >> 64 bit 2.12.1 linux.
>>> >>>>>>>>>> >>
>>> >>>>>>>>>> >> Is there an option I can set in my session in order to work around
>>> >>>>>>>>>> >> the truelength issue? I don't care if I lose some of the
>>> >>>>>>>>>> >> over-allocation niceties if it stops things from crashing. Looking at
>>> >>>>>>>>>> >> the truelength help, would just doing:
>>> >>>>>>>>>> >>
>>> >>>>>>>>>> >> options(datatable.alloc=quote(1000))
>>> >>>>>>>>>> >>
>>> >>>>>>>>>> >> stop this? I never have more than about 50 columns at a time.
>>> >>>>>>>>>> >>
>>> >>>>>>>>>> >> On 14 December 2011 11:43, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>> >>>>>>>>>> >>>
>>> >>>>>>>>>> >>> You're R < 2.14.0, right?  I'm really struggling in R < 2.14.0 to
>>> >>>>>>>>>> >>> make over-allocation work because R only started to initialize
>>> >>>>>>>>>> >>> truelength to 0 in R 2.14.0+. Before that it's uninitialized (random).
>>> >>>>>>>>>> >>> Trouble is my attempts in R < 2.14.0 to work around that work fine
>>> >>>>>>>>>> >>> for me in linux 32bit when I test in R 2.13.2, and I even test in
>>> >>>>>>>>>> >>> 2.12.0 too. I test on 64bit too but just 2.14.0.  CRAN is also
>>> >>>>>>>>>> >>> showing errors on 2.13.2 (old-rel) for both mac and windows.
>>> >>>>>>>>>> >>>
>>> >>>>>>>>>> >>> So, this is a pre-2.14.0 (only) problem that I'll continue to try and
>>> >>>>>>>>>> >>> fix.
>>> >>>>>>>>>> >>>
>>> >>>>>>>>>> >>> Are you 64bit pre-2.14.0? Which OS?  If you are 64bit linux then it
>>> >>>>>>>>>> >>> adds weight to me installing pre-2.14.0 on my 64bit instance in an
>>> >>>>>>>>>> >>> effort to reproduce.
>>> >>>>>>>>>> >>>
>>> >>>>>>>>>> >>>
>>> >>>>>>>>>> >>>> This will be a crappy help request because I can't seem to reproduce
>>> >>>>>>>>>> >>>> it, but the past few days I've been getting a lot of segfaults.  The
>>> >>>>>>>>>> >>>> only common thing between every crash is that it happens when I do
>>> >>>>>>>>>> >>>>
>>> >>>>>>>>>> >>>> DT[, z := x]
>>> >>>>>>>>>> >>>>
>>> >>>>>>>>>> >>>> where z was not a column that existed in DT before, and x is either
>>> >>>>>>>>>> >>>> an existing column of DT or a separate variable, doesn't matter.
>>> >>>>>>>>>> >>>> Beyond that I can't reproduce a set of steps that gets R to crash.
>>> >>>>>>>>>> >>>> This is with the latest SVN version.
>>> >>>>>>>>>> >>>>
>>> >>>>>>>>>> >>>> Is there more information I can provide to help track this down?
>>> >>>>>>>>>> >>>> _______________________________________________
>>> >>>>>>>>>> >>>> datatable-help mailing list
>>> >>>>>>>>>> >>>> datatable-help at lists.r-forge.r-project.org
>>> >>>>>>>>>> >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>> >>>>>>>>>> >>>>
>>> >>>>>>>>>> >>>
>>> >>>>>>>>>> >>>
>>> >>>>>>>>>> >>
>>> >>>>>>>>>> >
>>> >>>>>>>>>> >
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Steve Lianoglou
>>> >> Graduate Student: Computational Systems Biology
>>> >>  | Memorial Sloan-Kettering Cancer Center
>>> >>  | Weill Medical College of Cornell University
>>> >> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>>

