Prradep - 1 year ago 63

R Question

I have a matrix and would like to subset its rows using a mapping and a function.

Example: a randomly populated matrix, generated using `runif` and `set.seed`:

    set.seed(1)
    exp.mat <- matrix(runif(9*6, 5.0, 10), nrow = 9, ncol = 6)
    rownames(exp.mat) <- c('a','b1','b2','b3','c','d1','d2','e1','e2')
    colnames(exp.mat) <- c('s1','s2','s3','s4','s5','s6')
    exp.mat

             s1       s2       s3       s4       s5       s6
    a  5.353395 6.661973 6.733417 8.562573 6.198147 8.024666
    b1 5.497331 8.254352 6.668875 6.999972 5.294672 8.273620
    b2 6.581359 6.290084 7.381756 6.626761 8.211441 6.765986
    b3 7.593171 7.392726 9.460992 8.785436 9.381346 6.351301
    c  8.310025 8.831553 9.321697 6.013461 8.894573 9.963420
    d1 7.034151 5.421235 6.949948 8.555606 8.986544 8.167466
    d2 9.564380 9.376607 8.886603 5.608460 7.276372 6.066041
    e1 6.468017 6.695365 9.803090 6.227443 7.050420 5.646862
    e2 7.295329 9.197202 7.173297 5.716522 9.054351 7.390590

Mappings, with column `rown` holding the `rownames` of the matrix and column `map` holding the names they map to:

    maps <- data.frame(rown = c('a','b1','b2','b3','c','d1','d2','e1','e1'),
                       map  = c('a','b','b','b','c','d','d','e','f'))
    maps

      rown map
    1    a   a
    2   b1   b
    3   b2   b
    4   b3   b
    5    c   c
    6   d1   d
    7   d2   d
    8   e1   e
    9   e1   f

Function: `mean`, applied to each row:

    apply(exp.mat, 1, mean)

           a       b1       b2       b3        c       d1       d2       e1       e2
    6.922362 6.831470 6.976231 8.160829 8.555789 7.519158 7.796410 6.981866 7.637882
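As a side note, base R's `rowMeans` computes the same per-row means as `apply(x, 1, mean)` and is usually the faster choice on large matrices. A minimal sketch, assuming a small made-up matrix `m` (not the data above):

```r
# Small illustrative matrix; the values are made up for this example
m <- matrix(c(1, 2, 3,
              4, 5, 6), nrow = 2, byrow = TRUE,
            dimnames = list(c('r1', 'r2'), c('s1', 's2', 's3')))

by_apply    <- apply(m, 1, mean)  # generic: calls mean() once per row
by_rowmeans <- rowMeans(m)        # vectorised equivalent

# Both are named numeric vectors with one entry per row
all.equal(by_apply, by_rowmeans)
```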

Based on the mappings:

- If only one value in `rown` maps to a given `map`, the entire row should be copied directly. E.g. `a` and `c` each have only one mapping.
- If more than one value in `rown` maps to the same `map`, the row with the highest value of the function above should be copied. E.g. `b1`, `b2`, `b3` map to `b`; `b3` has the highest `mean`, so `b3` has to be chosen, and likewise `d2`.
- If a value in `rown` maps to more than one value in `map`, those rows should be discarded. E.g. `e1` has more than one mapping value: `e`, `f`.
- If there is no mapping, the row should be discarded. E.g. `e2` has no corresponding mapping.
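The four rules above can be sketched in base R roughly as follows. This is only one possible reading of the rules, not a definitive implementation; the helper names `n.maps`, `keep` and `best` are introduced here for illustration, and rule 3 is applied before the per-group selection.

```r
set.seed(1)
exp.mat <- matrix(runif(9 * 6, 5.0, 10), nrow = 9, ncol = 6,
                  dimnames = list(c('a','b1','b2','b3','c','d1','d2','e1','e2'),
                                  paste0('s', 1:6)))
maps <- data.frame(rown = c('a','b1','b2','b3','c','d1','d2','e1','e1'),
                   map  = c('a','b','b','b','c','d','d','e','f'),
                   stringsAsFactors = FALSE)

# Rule 3: drop every rown that is mapped to more than one map value
n.maps <- table(maps$rown)
keep   <- maps[maps$rown %in% names(n.maps)[n.maps == 1], ]

# Rule 4: matrix rows without a mapping never appear in `keep`;
# additionally drop mappings that point at rows missing from the matrix
keep <- keep[keep$rown %in% rownames(exp.mat), ]

# Rules 1 and 2: within each map value, pick the rown with the highest row mean
# (a group with a single rown simply returns that rown)
keep$value <- rowMeans(exp.mat)[keep$rown]
best <- vapply(split(keep, keep$map),
               function(g) g$rown[which.max(g$value)],
               character(1))

# Subset by the selected row names and relabel with the map names
exp.mat.trans <- exp.mat[best, , drop = FALSE]
rownames(exp.mat.trans) <- names(best)
exp.mat.trans
```

With the seeded matrix this selects rows `a`, `b3`, `c` and `d2` and labels them `a`, `b`, `c`, `d`. `match(best, rownames(exp.mat))` would give the corresponding row indices if only those are needed.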

Expected output: the subsetted matrix

    > exp.mat.trans
            s1       s2       s3       s4       s5       s6
    a 5.353395 6.661973 6.733417 8.562573 6.198147 8.024666
    b 7.593171 7.392726 9.460992 8.785436 9.381346 6.351301
    c 8.310025 8.831553 9.321697 6.013461 8.894573 9.963420
    d 9.564380 9.376607 8.886603 5.608460 7.276372 6.066041

Please advise how to achieve this efficiently.

So far I have achieved this by eyeballing the data and using the code below:

    exp.mat.trans <- exp.mat[c(1,4,5,7),]
    rownames(exp.mat.trans) <- c('a','b','c','d')

It might be useful to identify just the indices as there is no transformation of the values?

    # Index Subsetting
    ind <- c(1,4,5,7)
    exp.mat.trans2 <- exp.mat[ind,]
    rownames(exp.mat.trans2) <- maps[ind, 'map']

`exp.mat.trans`

`exp.mat.trans2`


Answer Source

For an efficient solution, I think it is better to use a data.table for the mapping. Note that running your code gives me a different input matrix from the one you posted. I found the following solution to the problem:

```
library(data.table)

set.seed(1)
exp.mat <- matrix(runif(9*6, 5.0, 10), nrow = 9, ncol = 6)
rownames(exp.mat) <- c('a','b1','b2','b3','c','d1','d2','e1','e2')
colnames(exp.mat) <- c('s1','s2','s3','s4','s5','s6')
exp.mat
#          s1       s2       s3       s4       s5       s6
# a  6.327543 5.308931 6.900176 6.911940 8.971199 8.946781
# b1 6.860619 6.029873 8.887226 9.348454 5.539718 5.116656
# b2 7.864267 5.882784 9.673526 6.701745 8.618555 7.386150
# b3 9.541039 8.435114 6.060713 7.410401 7.056372 8.661569
# c  6.008410 6.920519 8.258369 7.997829 9.104731 8.463658
# d1 9.491948 8.849207 5.627775 7.467707 8.235301 7.388098
# d2 9.723376 7.488496 6.336103 5.931088 8.914664 9.306047
# e1 8.303989 8.588093 6.930570 9.136867 7.765182 7.190486
# e2 8.145570 9.959530 5.066952 8.342334 7.648598 6.223986

maps <- data.table(rown = c('a','b1','b2','b3','c','d1','d2','e1','e1'),
                   map  = c('a','b','b','b','c','d','d','e','f'))

# RULE 2: look up the mean of each mapped row by name (not by position,
# because maps and exp.mat do not list the same rows in the same order)
maps[, value := rowMeans(exp.mat)[rown]]
# RULE 2: for each map value, keep the rown with the highest mean (column V1)
maps <- maps[, rown[which.max(value)], by = map]
# RULE 3: count under how many map values each selected row appears
number_map <- maps[, .N, by = V1]
setkey(maps, "V1")
# RULE 3: keep only rows that were selected under exactly one map value
maps <- maps[number_map[N < 2, V1]]
# Subset the matrix; rows without a mapping (RULE 4) never appear in maps
exp.mat[maps$V1[maps$V1 %in% rownames(exp.mat)], ]
#          s1       s2       s3       s4       s5       s6
# a  6.327543 5.308931 6.900176 6.911940 8.971199 8.946781
# b3 9.541039 8.435114 6.060713 7.410401 7.056372 8.661569
# c  6.008410 6.920519 8.258369 7.997829 9.104731 8.463658
# d2 9.723376 7.488496 6.336103 5.931088 8.914664 9.306047
```
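The result above keeps the original row names (`b3`, `d2`), whereas the expected output in the question is labelled with the `map` names. One way to relabel is a named lookup vector from selected row name to map name. The sketch below uses a small stand-in matrix `res` and a hand-written `lookup`, both assumptions for illustration; with the filtered `maps` table from the code above, the lookup could presumably be built as `setNames(maps$map, maps$V1)`.

```r
# Stand-in for the subsetted matrix (values are illustrative only)
res <- matrix(1:8, nrow = 4,
              dimnames = list(c('a', 'b3', 'c', 'd2'), c('s1', 's2')))

# Hypothetical lookup: selected row name -> map name
lookup <- c(a = 'a', b3 = 'b', c = 'c', d2 = 'd')

# Replace each row name by the map name it stands for
rownames(res) <- unname(lookup[rownames(res)])
rownames(res)
```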
