[datatable-help] data.table - grouping character values

Nicolas Servant Nicolas.Servant at curie.fr
Tue May 10 18:29:11 CEST 2011


Thanks a lot. It works, but do you know how I can access the
result?
For instance:
>g$V1
 [1]Error: 'getCharCE' must be called on a CHARSXP

Regards,

Steve Lianoglou wrote:
> Hi,
>
> On Tue, May 10, 2011 at 7:56 AM, Nicolas Servant
> <Nicolas.Servant at curie.fr> wrote:
>   
>> Dear Datatable users,
>>
>> I'm interested in using the data.table package to reduce and group
>> high-dimensional data (from next-generation sequencing).
>> My data.table shows, for each read (feature), the associated
>> annotation.
>> Because one feature can have several annotations, I would like to
>> group these annotations per feature.
>> Numerical operations on the annotations work well (for instance
>> length()), but functions on string values do not seem to work (for
>> instance paste()).
>> In the end, I would like to have something like:
>>
>>                    reads         annot
>> [1,]  1279_1000_530_F3-ad Simple_repeat, LINE
>> [2,]  1279_1000_940_F3-ad         snRNA
>> ...
>> Do you have any suggestions for how to do that?
>>
>>     
>>> dt[1:20,]
>>>       
>>                     reads         annot
>>  [1,]  1279_1000_530_F3-ad Simple_repeat
>>  [2,]  1279_1000_530_F3-ad          LINE
>>  [3,]  1279_1000_940_F3-ad         snRNA
>>  [4,]  1279_1000_940_F3-ad         snRNA
>>  [5,] 1279_1018_1051_F3-ad Simple_repeat
>>     
>
> A similar question using paste came up a few days ago ... this should
> do the trick:
>
> R> key(dt) <- 'reads'
> R> dt[, paste(annot, collapse=','), by=reads]
>
> Hope that helps,
> -steve
>
>   
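
For readers trying the suggestion quoted above, here is a minimal sketch of the grouping step, assuming a current data.table release rather than the 2011 version discussed in this thread. The toy rows are copied from the dt[1:20,] listing; the result column name annot and the use of unique() (to get a single "snRNA" as in the desired output) are assumptions, not part of the original reply. Naming the column in j means the result does not have to be reached through the default V1.

```r
library(data.table)

# Toy data copied from the rows shown in the thread
dt <- data.table(
  reads = c("1279_1000_530_F3-ad", "1279_1000_530_F3-ad",
            "1279_1000_940_F3-ad", "1279_1000_940_F3-ad",
            "1279_1018_1051_F3-ad"),
  annot = c("Simple_repeat", "LINE", "snRNA", "snRNA", "Simple_repeat")
)

# setkey() is the current way to set a key; the thread uses the older
# key(dt) <- 'reads' form
setkey(dt, reads)

# Collapse the distinct annotations of each read into one string.
# The result column is named explicitly (assumption: "annot").
g <- dt[, list(annot = paste(unique(annot), collapse = ", ")), by = reads]

# Access the grouped strings by column name
g$annot
```

Whether this avoids the 'getCharCE' error on the version used in the thread is not known; the sketch only shows how the grouped column can be named and read back.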


-- 
Nicolas Servant
Equipe Bioinformatique
Institut Curie
26, rue d'Ulm - 75248 Paris Cedex 05 - FRANCE

Email: Nicolas.Servant at curie.fr
Tel: 01 56 24 69 85
http://bioinfo.curie.fr/


