user2117258 - 11 months ago 60

R Question

For each cluster in `temp3`, I want to calculate the centroid of its `X` and `Y` coordinates.

Data:

`> head(temp3)`

| | X | Y | Transcripts | Genes | Timepoint | Run | Cluster |
|---|---|---|---|---|---|---|---|
| 6B_0_GACCGCGATATT | -102.1425877 | 13.944831 | 134028 | 11269 | Day 0 | 6B | 2 |
| 6B_0_ATTGCGGAGACA | -38.6617527 | 0.600154 | 106849 | 10947 | Day 0 | 6B | 3 |
| 6B_0_ATGGTCACCACT | -23.3275424 | 34.178312 | 105817 | 10495 | Day 0 | 6B | 4 |
| 6B_0_ATATTGCTAATC | -0.6069128 | 52.449397 | 79920 | 9650 | Day 0 | 6B | 4 |
| 6B_0_ATCTAATCTACC | -0.4738788 | 54.756711 | 72912 | 9294 | Day 0 | 6B | 4 |
| 6B_0_CGCAGTGTGCCC | 108.5333675 | 76.637930 | 70132 | 9291 | Day 0 | 6B | 6 |

Code:

`library(dplyr)`

`temp3 %>% group_by(Cluster) %>% mutate(., Centroid=rowMeans(cbind(.$X, .$Y), na.rm = TRUE))`

Which returns:

Error: incompatible size (13792), expecting 198 (the group size) or 1

Another approach:

`library(cluster)`

`temp3 %>% group_by(Cluster) %>% mutate(., Centroid=pam(cbind(.$X, .$Y), 1)$medoids)`

Which returns:

Error: incompatible size (2), expecting 198 (the group size) or 1

Answer

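Inside the pipe, `.` is the whole data frame rather than the current group, so `.$X` and `.$Y` have one element per row of `temp3` (13792) instead of one per row of the group (198), hence the size error. Refer to the columns by name instead. How about just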

```
temp3 %>% group_by(Cluster) %>% mutate(meanX=mean(X), meanY=mean(Y))
```

if you want a result with the same dimensions as the input.

Or, if you just want one row per cluster (which seems more likely):

```
temp3 %>% group_by(Cluster) %>% summarise(meanX=mean(X), meanY=mean(Y))
```