KHAN KHAN - 2 months ago 12
R Question

Calculate the sum of occupants per house

I have this dataset.

http://www.sharecsv.com/s/6431424362c152943f45f2dca8891ad4/examplecsv.csv

It contains a list of hotels by ID and the number of occupants. How can i calculate the occupants sum for each hotel id (group by hotel Id)?

Additionally how can i do the same except split it by the date either 1st or 2nd of the month.

IN SQL i would do this but how can i do this in R

SELECT houses, sum(occupants),
FROM mytable
GROUP BY houses;

SELECT houses, sum(occupants),
FROM mytable
GROUP BY houses;
where date = "1998-01-02"


(i dont believe this is a duplicate, the linked possible answers describes the difference of many things of which i know nothing of so it hasent been helpful in solving my small? problem))

Answer

you can do it with the dplyr package:

install.packages("dplyr")
library(dplyr)

then you read in the csv file:

lala = read.csv(file="C:/...YOUR FILE PATH.../examplecsv.csv", header=TRUE, sep=",")

and here is the code to calculate the occupants sum for each hotel id:

grp1 = group_by(lala,Houses)
a = summarise(grp1,su=sum(Occupants, na.rm = TRUE))

and the following code does the same, except that this is for the 1st of the month:

data0101 = filter(lala, Date == "01/01/1998")
grp2 = group_by(data0101,Houses)
b = summarise(grp2,su=sum(Occupants, na.rm = TRUE))

and this for the 2nd of the month:

data0201 = filter(lala, Date == "02/01/1998")
grp3 = group_by(data0201,Houses)
c = summarise(grp3,su=sum(Occupants, na.rm = TRUE))

I assumed you had only data of January 1998.