[datatable-help] data.table - grouping character values

Nicolas Servant Nicolas.Servant at curie.fr
Wed May 11 18:00:09 CEST 2011


My original data.table has 6251012 rows and 2 columns.
After grouping the reads, I have 223020 rows.
However, the error is not reproducible, so it does not seem to be linked to a
particular feature.
I think this is a memory issue; it mainly depends on the amount of
data loaded in my session.
The problem is that, with 6251012 rows, this is my smallest dataset ;)
Others can have up to 20 million reads.
Thanks again

Best,
Nicolas


Steve Lianoglou wrote:
> On Tue, May 10, 2011 at 1:52 PM, Nicolas Servant
> <Nicolas.Servant at curie.fr> wrote:
>   
>> Indeed, your example also works on my session ... it seems to be linked
>> to one of my reads features.
>> Because even with my data
>>
>>     
>>> g$V1
>>>       
>>  [1]Error: 'getCharCE' must be called on a CHARSXP
>>
>>     
>>> head(g)$V1
>>>       
>> [1] "Simple_repeat,LINE" "snRNA,snRNA"        "Simple_repeat"
>>
>>
>>
>> Finally I found it but data.table really doesn't like it :)
>>     
>>> dt["1335_868_1708_F3-ad"]
>>>       
>>  *** caught segfault ***
>> address (nil), cause 'unknown'
>>     
>
> Ouch ... here's where Matthew will likely have to step in ;-)
>
> Out of curiosity, how big is your data.table, ie. how many rows and columns?
>
> -steve
>
>   


-- 
Nicolas Servant
Equipe Bioinformatique
Institut Curie
26, rue d'Ulm - 75248 Paris Cedex 05 - FRANCE

Email: Nicolas.Servant at curie.fr
Tel: 01 56 24 69 85
http://bioinfo.curie.fr/


