user7189426 - 6 months ago 31

R Question

I have a matrix with two columns, x and v. Some entries of column x are the same, for example:

x[7]==x[8]== -0.11, x[14]==x[15]==x[16]==x[17]==0.01.

My question is: when entries of column x are the same, how can I compute the mean of the corresponding entries of v, and keep only one row per x value, holding that mean, in the matrix? For example, for x[7] and x[8], the corresponding mean is mean(c(v[7], v[8])). I need to keep a single -0.11 together with that mean.

              x            v
     [1,] -0.22 2.575144e-02
     [2,] -0.21 1.991324e-01
     [3,] -0.15 7.737715e-02
     [4,] -0.15 2.470678e-02
     [5,] -0.13 2.135258e-01
     [6,] -0.12 1.252464e-01
     [7,] -0.11 1.667752e-01
     [8,] -0.11 9.163501e-03
     [9,] -0.10 2.191712e-01
    [10,] -0.08 1.974091e-02
    [11,] -0.02 1.362226e-01
    [12,] -0.01 1.623944e-04
    [13,] -0.01 1.497634e-02
    [14,]  0.01 1.811620e-02
    [15,]  0.01 1.222637e-02
    [16,]  0.01 1.668605e-02
    [17,]  0.01 6.495694e-02
    [18,]  0.03 2.702536e-03
    [19,]  0.03 5.727469e-02

Thanks!

Answer

Group the rows by "x", then take the mean of the corresponding "v" values within each group.

```
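library(data.table)
# setDT() only accepts a data.frame or list, so a matrix has to be converted first.
# `m` is an assumed name for the two-column matrix from the question.
dt <- as.data.table(m)
# Group by x and take the mean of v within each group (unnamed result column is V1)
dt[, mean(v), by = "x"]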
# x V1
# 1: -0.22 0.025751440
# 2: -0.21 0.199132400
# 3: -0.15 0.051041965
# 4: -0.13 0.213525800
# 5: -0.12 0.125246400
# 6: -0.11 0.087969351
# 7: -0.10 0.219171200
# 8: -0.08 0.019740910
# 9: -0.02 0.136222600
# 10: -0.01 0.007569367
# 11: 0.01 0.027996390
# 12: 0.03 0.029988613
```
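
If adding a package is not an option, the same grouping can probably be done in base R. The sketch below is one possible way rather than the answer's method: it rebuilds the matrix from the values in the question under the assumed name `m`, uses `aggregate()` to average `v` within each distinct `x`, and converts the result back to a matrix (`res` and `res_m` are also illustrative names).

```r
# Rebuild the matrix from the question (values copied from the post).
# The name `m` is an assumption; the original post does not name the matrix.
m <- cbind(
  x = c(-0.22, -0.21, -0.15, -0.15, -0.13, -0.12, -0.11, -0.11, -0.10, -0.08,
        -0.02, -0.01, -0.01, 0.01, 0.01, 0.01, 0.01, 0.03, 0.03),
  v = c(2.575144e-02, 1.991324e-01, 7.737715e-02, 2.470678e-02, 2.135258e-01,
        1.252464e-01, 1.667752e-01, 9.163501e-03, 2.191712e-01, 1.974091e-02,
        1.362226e-01, 1.623944e-04, 1.497634e-02, 1.811620e-02, 1.222637e-02,
        1.668605e-02, 6.495694e-02, 2.702536e-03, 5.727469e-02)
)

# Mean of v within each distinct value of x.
# aggregate() needs a data.frame and returns one row per group, sorted by x.
res <- aggregate(v ~ x, data = as.data.frame(m), FUN = mean)

# Convert back to a matrix if a matrix is needed downstream.
res_m <- as.matrix(res)
print(res_m)
```

Because the grouping compares `x` values exactly, this relies on duplicates being stored as identical doubles, which holds here since they come from the same literals. If `x` were the result of arithmetic, rounding it first (for example with `round(x, 2)`) may be safer.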