[datatable-help] Sum first 3 non zero elements of row

Gabor Grothendieck ggrothendieck at gmail.com
Tue May 21 14:28:56 CEST 2013


On Tue, May 21, 2013 at 5:10 AM, JNV <jose at memo2.nl> wrote:
> Hi there,
> I've got this matrix D with, say 10 rows and 20 columns. For each row I want
> to sum the first 3 non zero elements and put them in a vector z.
>
> So if the first row D[1,] is
> 0 3 5 0 8 9 3 2 4 0
>
> then I want z
> z<-D[1,2]+D[1,3]+D[1,5]
>
> But if there are less than 3 non zero elements, those should be summed. If
> there are no non zero elements, the result must be zero.
>
> So if the first row D[1,] is
> 0 0 3 0 1 0 0 0 0 0
>
> then I want z
> z<-D[1,3]+D[1,5]
>

Here is a matrix, D, with those two rows. The t(apply(...)) replaces
the first non-zero element in each row with 1, the 2nd with 2, etc.
(It puts garbage into the elements that are 0: they just repeat the
running count.) We then convert this to T/F according to whether each
element is less than or equal to 3, and multiply by the original data.
That multiplication zaps the garbage in the zero positions, zaps the
positions that hold a 4th or later non-zero in each row, and leaves
the correct values in the good positions. Finally we sum the rows
using what is left:

> D <- matrix( c(0, 0, 3, 0, 5, 3, 0, 0, 8, 1, 9, 0, 3,
+ 0, 2, 0, 4, 0, 0, 0), 2)
>
> D
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    3    5    0    8    9    3    2    4     0
[2,]    0    0    3    0    1    0    0    0    0     0
>
> library(data.table)
> as.data.table(D)[, rowSums((t(apply(.SD > 0, 1, cumsum)) <= 3) * .SD)]
[1] 16  4

Not sure if this really benefits from data.table as we could have
written this without data.table:

> rowSums((t(apply(D > 0, 1, cumsum)) <= 3) * D)
[1] 16  4
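
For anyone who wants to see the intermediate matrices, the one-liner
can be unpacked as in the sketch below. The names rank_nz, keep and z
are only illustrative. It also tests D != 0 rather than D > 0, which
is my assumption about what "non zero" should mean if D can hold
negative values; for the non-negative example here both give the same
answer.

```r
# Same two-row example matrix as above.
D <- matrix(c(0, 0, 3, 0, 5, 3, 0, 0, 8, 1, 9, 0, 3,
              0, 2, 0, 4, 0, 0, 0), 2)

# Step 1: running count of non-zero entries along each row.
# apply() returns one column per row of D, hence the t().
# (D != 0 is an assumption; the original post used D > 0.)
rank_nz <- t(apply(D != 0, 1, cumsum))

# Step 2: TRUE while at most 3 non-zero entries have been seen so far.
keep <- rank_nz <= 3

# Step 3: multiplying by D zeroes every other position, including the
# zero positions that merely carried the count forward; then sum rows.
z <- rowSums(keep * D)
z
```

With D > 0, a negative entry would not advance the running count, so
a row containing negatives could end up summing more than three
non-zero elements.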


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

