[datatable-help] data.table - grouping character values
Steve Lianoglou
mailinglist.honeypot at gmail.com
Tue May 10 15:16:37 CEST 2011
Hi,
On Tue, May 10, 2011 at 7:56 AM, Nicolas Servant
<Nicolas.Servant at curie.fr> wrote:
> Dear Datatable users,
>
> I'm interested in using the data.table package to reduce and group
> high-dimensional data (from next-generation sequencing).
> My data.table shows, for each read (feature), the associated
> annotation.
> Because one feature can have several annotations, I would like to
> group these annotations per feature.
> Numerical operations on the annotations work well (for instance
> length()), but functions on string values do not seem to work (for
> instance paste()).
> In the end, I would like to have something like:
>
>      reads                 annot
> [1,] 1279_1000_530_F3-ad   Simple_repeat, LINE
> [2,] 1279_1000_940_F3-ad   snRNA
> ...
> Do you have any suggestions for how to do that?
>
> > dt[1:20,]
>
>      reads                 annot
> [1,] 1279_1000_530_F3-ad   Simple_repeat
> [2,] 1279_1000_530_F3-ad   LINE
> [3,] 1279_1000_940_F3-ad   snRNA
> [4,] 1279_1000_940_F3-ad   snRNA
> [5,] 1279_1018_1051_F3-ad  Simple_repeat
A similar question using paste came up a few days ago ... this should
do the trick:
R> key(dt) <- 'reads'
R> dt[, paste(annot, collapse=','), by=reads]
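For reference, here is a minimal self-contained sketch of the same grouping on the five rows quoted above. A few things in it are assumptions rather than part of the original question: the toy table is rebuilt by hand, setkey() is used because newer data.table releases recommend it for setting a key, the result column is named annot for readability, and unique() is added because the desired output lists snRNA only once for a read that carries it twice.

```r
library(data.table)

# Toy data rebuilt by hand from the rows quoted in the question
dt <- data.table(
  reads = c("1279_1000_530_F3-ad", "1279_1000_530_F3-ad",
            "1279_1000_940_F3-ad", "1279_1000_940_F3-ad",
            "1279_1018_1051_F3-ad"),
  annot = c("Simple_repeat", "LINE", "snRNA", "snRNA", "Simple_repeat")
)

# Sort and key the table by the grouping column
setkey(dt, reads)

# One row per read: collapse its distinct annotations into a single string.
# unique() is an assumption here, added so repeated annotations appear once.
res <- dt[, list(annot = paste(unique(annot), collapse = ", ")), by = reads]
print(res)
```

If repeated annotations should be kept, dropping unique() gives the plain paste() version.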
Hope that helps,
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact