Masi - 1 month ago
R Question

How to select data by column name (header) in R from .CSV data?

I want to select data by column name (header) rather than by column position.
Here is an example that selects the IDs of males in .CSV data.
You can do

data[3] == "male"
with the following data, but I would like to do something like
data[Gender] == "male"
to avoid mistakes.
File data.csv

ID,Age,Gender
100,69,male
101,75,female
102,84,female
103,,male
104,66,female


Code, where the last lines are pseudocode

data = read.csv("/home/masi/data.csv",header = TRUE,sep = ",")
str(data)

# Pseudocode
#data.Gender == "male"
#data[Gender] == "male"


Eli



Now we have a logical vector marking which rows are male, and we want to return the IDs corresponding to those rows.

eliData <- data$Gender == "male"
# to return IDs corresponding to males
# Pseudocode
data$ID == eliData


The pseudocode returns FALSE for every row.
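
A likely explanation, sketched on an in-memory copy of the sample data (`read.csv(text = ...)` is used here only so the snippet runs on its own; the names `compared` and `maleIDs` are illustrative): the expression `data$ID == eliData` compares each ID number with `TRUE` or `FALSE`, which R coerces to 1 or 0, so no ID can match. Using the logical vector as an index, rather than as a value to compare against, returns the IDs.

```r
# In-memory copy of data.csv so the sketch is self-contained
data <- read.csv(text = "ID,Age,Gender
100,69,male
101,75,female
102,84,female
103,,male
104,66,female", header = TRUE, sep = ",")

eliData <- data$Gender == "male"   # logical vector, one entry per row

# Comparison: TRUE/FALSE are coerced to 1/0, and no ID equals 1 or 0
compared <- data$ID == eliData

# Indexing: keep the IDs at the positions where eliData is TRUE
maleIDs <- data$ID[eliData]

print(compared)
print(maleIDs)
```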

Motivation: to build correlation matrices of characteristics for different epidemiological groups, where each data point has many characteristics of its own.

OS: Debian 8.5

R: 3.1.1

Answer

You can use the $ notation in R to refer to a column by its name. data$Gender == "male" is what you want. To get the IDs from the rows where the gender is "male", you can do this:

males <- data$Gender == "male"
maleIDs <- data[which(males), ]$ID
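
For comparison, here is a minimal self-contained sketch of a few other ways to select by column name, assuming the sample data from the question (loaded from a string instead of the file path so that it runs anywhere). The variable names `ids_by_name`, `ids_subset`, `ids_var` and `col` are illustrative only.

```r
# In-memory copy of data.csv from the question
data <- read.csv(text = "ID,Age,Gender
100,69,male
101,75,female
102,84,female
103,,male
104,66,female", header = TRUE, sep = ",")

# Rows chosen by a condition on a named column, column chosen by its name
ids_by_name <- data[data$Gender == "male", "ID"]

# subset() lets the condition refer to column names directly
ids_subset <- subset(data, Gender == "male", select = ID)$ID

# [[ ]] takes the column name as a string, handy when it is held in a variable
col <- "Gender"
ids_var <- data$ID[data[[col]] == "male"]

print(ids_by_name)
print(ids_subset)
print(ids_var)
```

One difference worth keeping in mind: if Gender contained missing values, plain logical indexing would return NA entries for those rows, whereas which() and subset() drop them.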