[datatable-help] data.table - grouping character values
Nicolas Servant
Nicolas.Servant at curie.fr
Tue May 10 13:56:36 CEST 2011
Dear Datatable users,
I'm interested in using the data.table package to reduce and group
high-dimensional data (from next-generation sequencing).
I have a data.table which shows, for each read (feature), the
associated annotation.
Because one feature can have several annotations, I would like to
group these annotations per feature.
Numerical operations on the annotations work well (for instance
length()), but functions on string values do not seem to work (for
instance paste()).
In the end, I would like to have something like:
     reads               annot
[1,] 1279_1000_530_F3-ad Simple_repeat, LINE
[2,] 1279_1000_940_F3-ad snRNA
...
Do you have any suggestions on how to do that?
Thanks,
Regards,
Nicolas Servant
> dt[1:20,]
     reads                annot
[1,] 1279_1000_530_F3-ad  Simple_repeat
[2,] 1279_1000_530_F3-ad  LINE
[3,] 1279_1000_940_F3-ad  snRNA
[4,] 1279_1000_940_F3-ad  snRNA
[5,] 1279_1018_1051_F3-ad Simple_repeat
> g = dt[, length(annot), by=reads]
> head(g)
     reads                V1
[1,] 1279_1000_530_F3-ad   2
[2,] 1279_1000_940_F3-ad   2
[3,] 1279_1018_1051_F3-ad  1
[4,] 1279_1019_49_F3-ad   13
[5,] 1279_1019_571_F3-ad  14
[6,] 1279_1024_555_F3-ad   1
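For illustration, here is a minimal sketch of the kind of grouping asked about. It is only a guess at what might be going wrong: without a collapse argument, paste() returns one string per row rather than one string per group, which may be why it seems not to work here. The toy table only re-creates the five rows printed above, and the use of unique() to drop repeated annotations (such as snRNA appearing twice) is an assumption about the desired result.

```r
library(data.table)

# Toy data re-creating the five rows shown above (not the full data set)
dt <- data.table(
  reads = c("1279_1000_530_F3-ad", "1279_1000_530_F3-ad",
            "1279_1000_940_F3-ad", "1279_1000_940_F3-ad",
            "1279_1018_1051_F3-ad"),
  annot = c("Simple_repeat", "LINE", "snRNA", "snRNA", "Simple_repeat")
)

# collapse = ", " makes paste() return a single string per group.
# unique() is an assumption: it drops annotations repeated within a read.
g <- dt[, list(annot = paste(unique(annot), collapse = ", ")), by = reads]
print(g)
```

If repeated annotations should be kept, dropping unique() from the call gives one entry per original row, still joined into a single string for each read.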
--
Nicolas Servant
Equipe Bioinformatique
Institut Curie
26, rue d'Ulm - 75248 Paris Cedex 05 - FRANCE
Email: Nicolas.Servant at curie.fr
Tel: 01 56 24 69 85
http://bioinfo.curie.fr/