user127886 - 1 month ago
R Question

Grouping columns into count in R dataframe

I have a data frame with these four columns:

   port           ip service numberOfTimes
1    22 11.11.79.100     ssh            16
2    80 11.11.79.100     www            19
3   111 11.13.79.110     ipw            21
4   123 11.13.79.110     ssh            50
5    22 64.50.80.140     cde            45
6    80 64.50.80.140     www            16
7    22 71.11.64.100     ssh           234
8    80 71.11.64.100     you            33
9    22  100.15.31.1     ssh            99
10   41 120.15.31.12     has            19


My question:

Is it possible to group this in R so that it becomes something like the following?

After

port  ip (count of distinct IPs)  service  numberOfTimes
22    4                           ssh      394 (#1+#5+#7+#9)
80    3                           www      68 (#2+#6+#8)


and so on for the rest of the ports.

Answer

Using dplyr, this is quite straightforward:

library(dplyr)

# One row per port/service pair: count distinct IPs and total the hits
testData %>%
  group_by(port, service) %>%
  summarise(`Number of IPs` = n_distinct(ip),
            `Total number of times` = sum(numberOfTimes))

For the sample data you included, this gives:

   port service `Number of IPs` `Total number of times`
  <int>   <chr>           <int>                   <int>
1    22     cde               1                      45
2    22     ssh               3                     349
3    41     has               1                      19
4    80     www               2                      35
5    80     you               1                      33
6   111     ipw               1                      21
7   123     ssh               1                      50
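
Note that this output has one row per port/service pair, while the layout sketched in the question has one row per port. If that is what is wanted, a minimal sketch follows, assuming the grouping should be by port alone. The sample data is rebuilt from the question under the name testData; the column names ipCount and services, and the choice to list every service seen on a port in one string, are my own assumptions rather than anything the question specifies.

```r
library(dplyr)

# Sample data rebuilt from the question (testData is the name used in the answer)
testData <- data.frame(
  port = c(22L, 80L, 111L, 123L, 22L, 80L, 22L, 80L, 22L, 41L),
  ip = c("11.11.79.100", "11.11.79.100", "11.13.79.110", "11.13.79.110",
         "64.50.80.140", "64.50.80.140", "71.11.64.100", "71.11.64.100",
         "100.15.31.1", "120.15.31.12"),
  service = c("ssh", "www", "ipw", "ssh", "cde", "www", "ssh", "you", "ssh", "has"),
  numberOfTimes = c(16L, 19L, 21L, 50L, 45L, 16L, 234L, 33L, 99L, 19L),
  stringsAsFactors = FALSE
)

# Group by port only (assumption); services is a hypothetical column that
# concatenates every distinct service observed on that port
byPort <- testData %>%
  group_by(port) %>%
  summarise(ipCount = n_distinct(ip),
            services = paste(sort(unique(service)), collapse = ", "),
            numberOfTimes = sum(numberOfTimes))

byPort
```

With this grouping, port 22 covers rows 1, 5, 7 and 9, so it has 4 distinct IPs and a total of 16 + 45 + 234 + 99 = 394; port 80 covers rows 2, 6 and 8, with 3 distinct IPs and a total of 68.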

If you are getting an error (as alluded to in a comment), you will need to provide data that actually reproduces it before people can help you.
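
One common way to share such data is base R's dput(), which prints an object as R code that others can paste straight into their session. A minimal sketch, using a made-up two-row data frame (problemData is a hypothetical name standing in for whatever data triggers the error):

```r
# Hypothetical stand-in for the data that triggers the error
problemData <- data.frame(
  port = c(22L, 80L),
  ip = c("11.11.79.100", "11.11.79.100"),
  service = c("ssh", "www"),
  numberOfTimes = c(16L, 19L),
  stringsAsFactors = FALSE
)

# Print the object as R code that can be pasted into a question
dput(problemData)

# Round trip: evaluating the printed code rebuilds the data frame
rebuilt <- eval(parse(text = capture.output(dput(problemData))))
```

For a large data frame, passing only the rows that reproduce the problem, for example dput(head(problemData, 20)), keeps the post readable.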