[datatable-help] Removing rows from a matrix based on column entries

ashruser ashokkrish at gmail.com
Mon Feb 22 06:33:46 CET 2016


Dear R Users,

I have a question about removing rows from a matrix. All matrix entries are
either a 0 or a 1. The rows are sorted according to the row sum.

Here is an example matrix

e1  <- c(0,0,0,1,0,0,0)
e2  <- c(1,0,0,0,0,0,0)
e3  <- c(0,1,0,0,0,0,0)
e4  <- c(0,0,1,0,1,0,0)
e5  <- c(1,1,0,0,0,0,0)
e6  <- c(1,0,0,0,1,0,0)
e7  <- c(0,0,1,0,1,1,0)
e8  <- c(0,0,1,0,1,0,1)
e9  <- c(1,1,0,1,1,0,0)
e10 <- c(0,0,1,1,0,1,1)

(E <- rbind(e1, e2, e3, e4, e5, e6, e7, e8, e9, e10))

Which prints 

> (E <- rbind(e1, e2, e3, e4, e5, e6, e7, e8, e9, e10))
    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
e1     0    0    0    1    0    0    0
e2     1    0    0    0    0    0    0
e3     0    1    0    0    0    0    0
e4     0    0    1    0    1    0    0
e5     1    1    0    0    0    0    0
e6     1    0    0    0    1    0    0
e7     0    0    1    0    1    1    0
e8     0    0    1    0    1    0    1
e9     1    1    0    1    1    0    0
e10    0    0    1    1    0    1    1

I want to remove rows in the following fashion. If a row contains a single 1,
then every row below it that has a 1 in the same column should be removed.
So rows e1, e2 and e3 between them remove rows e5, e6, e9 and e10, leaving
rows e1, e2, e3, e4, e7 and e8.
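As a minimal sketch of this first step (not a definitive solution), the rule can be applied without deleting anything inside the loop: mark the rows to drop in a logical vector and subset once at the end. The names unit_rows, remove and E_reduced are made up for illustration.

```r
# Rebuild the example matrix so the snippet runs on its own.
E <- rbind(
  e1  = c(0,0,0,1,0,0,0), e2  = c(1,0,0,0,0,0,0),
  e3  = c(0,1,0,0,0,0,0), e4  = c(0,0,1,0,1,0,0),
  e5  = c(1,1,0,0,0,0,0), e6  = c(1,0,0,0,1,0,0),
  e7  = c(0,0,1,0,1,1,0), e8  = c(0,0,1,0,1,0,1),
  e9  = c(1,1,0,1,1,0,0), e10 = c(0,0,1,1,0,1,1)
)

# Indices of rows that contain exactly one 1.
unit_rows <- which(rowSums(E) == 1)

# Flag every later row that has a 1 in the column of some unit row.
remove <- rep(FALSE, nrow(E))
for (i in unit_rows) {
  col    <- which(E[i, ] == 1)          # the single column holding the 1
  later  <- seq_len(nrow(E)) > i        # only rows below row i
  remove <- remove | (later & E[, col] == 1)
}

# Subset once; drop = FALSE keeps the result a matrix even with one row left.
E_reduced <- E[!remove, , drop = FALSE]
rownames(E_reduced)
```

On the example matrix this keeps rows e1, e2, e3, e4, e7 and e8.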


for (v in 2:dim(E)[1])
	{
		print(v)
		print(E[v, 4])

		if (E[v, 4] == 1) E <- E[-v,]	
	}

Removing rows inside a for-loop gives me an error. So I thought I would
first identify the rows (if any) that have a row sum of 1, and then remove
the following rows with a 1 in that position using a for-loop. Once again
I get an error.

UnitRowsum <- E[which(rowSums(E) == 1),]
UnitRowsum

for (v in 1:dim(UnitRowsum)[1])
	{
		print(which(UnitRowsum[v, ] == 1))
	}
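The first loop above most likely fails because 2:dim(E)[1] is evaluated only once, before the loop starts. After a row has been removed, E has fewer rows than the loop still expects, so a later E[v, 4] asks for a row that no longer exists and R reports "subscript out of bounds". A rough sketch of one way around this, assuming the intent is to drop every row from the second onward that has a 1 in column 4, is to build a logical index and subset once (keep and E2 are illustrative names):

```r
# Rebuild the example matrix so the snippet runs on its own.
E <- rbind(
  e1  = c(0,0,0,1,0,0,0), e2  = c(1,0,0,0,0,0,0),
  e3  = c(0,1,0,0,0,0,0), e4  = c(0,0,1,0,1,0,0),
  e5  = c(1,1,0,0,0,0,0), e6  = c(1,0,0,0,1,0,0),
  e7  = c(0,0,1,0,1,1,0), e8  = c(0,0,1,0,1,0,1),
  e9  = c(1,1,0,1,1,0,0), e10 = c(0,0,1,1,0,1,1)
)

# Row 1 is always kept; rows 2..n are kept unless column 4 holds a 1.
keep     <- rep(TRUE, nrow(E))
keep[-1] <- E[-1, 4] != 1

# One subsetting step instead of deleting rows while looping over them.
# drop = FALSE keeps the result a matrix even if a single row survives.
E2 <- E[keep, , drop = FALSE]
rownames(E2)
```

The same drop = FALSE argument may also matter for UnitRowsum: if only one row has a row sum of 1, E[which(rowSums(E) == 1), ] returns a plain vector, and dim(UnitRowsum)[1] is then NULL.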

Furthermore, I want to continue removing rows based on rows with a sum greater
than one: such a row removes all following rows that have a 1 in all of its
positions, and so on. To show what I mean, I now have the reduced matrix

    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
e1     0    0    0    1    0    0    0
e2     1    0    0    0    0    0    0
e3     0    1    0    0    0    0    0
e4     0    0    1    0    1    0    0
e7     0    0    1    0    1    1    0
e8     0    0    1    0    1    0    1

Row e4 dominates rows e7 and e8, so those two have to be removed as well.
This continues until no more rows can be removed.
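The full procedure can be sketched in a single top-to-bottom pass, under two assumptions that go beyond what is stated above: "row i dominates row j" is taken to mean that j lies below i and has a 1 in every column where i has a 1, and the matrix contains no all-zero row (such a row would dominate everything below it). A row that has already been removed is skipped, since whatever it dominates is also dominated by the row that removed it. The names keep and E_final are illustrative.

```r
# Rebuild the example matrix so the snippet runs on its own.
E <- rbind(
  e1  = c(0,0,0,1,0,0,0), e2  = c(1,0,0,0,0,0,0),
  e3  = c(0,1,0,0,0,0,0), e4  = c(0,0,1,0,1,0,0),
  e5  = c(1,1,0,0,0,0,0), e6  = c(1,0,0,0,1,0,0),
  e7  = c(0,0,1,0,1,1,0), e8  = c(0,0,1,0,1,0,1),
  e9  = c(1,1,0,1,1,0,0), e10 = c(0,0,1,1,0,1,1)
)

keep <- rep(TRUE, nrow(E))
for (i in seq_len(nrow(E))) {
  if (!keep[i]) next                     # removed rows do not remove others
  ones    <- E[i, ] == 1                 # columns where row i has a 1
  later   <- seq_len(nrow(E)) > i        # only rows below row i
  # A row is covered if it has a 1 in every column flagged in 'ones'.
  covered <- rowSums(E[, ones, drop = FALSE]) == sum(ones)
  keep[later & covered] <- FALSE
}

E_final <- E[keep, , drop = FALSE]
rownames(E_final)
```

On the example matrix this leaves rows e1, e2, e3 and e4. Note that under this reading a later row identical to an earlier one is also removed.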

Could you please help me?

Sincerely,
Ash


