Jingnan Lai - 23 days ago
R Question

Can R read irregular xlsx files?

[Image: screenshot of one of the worksheets]

I have many (about 1,000) xlsx files like the one in the picture above. I want to read every file and extract each candidate's name, number, and age, but I don't know how to read xlsx files with such an irregular layout.

Answer

I don't know whether any R Excel API is smart enough to handle your column formatting, but there is an easy workaround: save the worksheet in CSV format. Doing this for the data you showed above left me with the following three CSV lines:

Title,,,,,
name,mike,number,123214,age,28
,score,,ddd,aaa,bbb
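
Saving about 1,000 workbooks by hand would be tedious, so the export can probably be scripted. Below is a minimal sketch, assuming the readxl package is installed and that the data sits on the first sheet of each workbook. The names xlsx_to_csv and convert_dir are hypothetical, and merged cells may come through as blanks, so compare one converted file against Excel's own CSV export before trusting the rest.

```r
# Assumption: the readxl package is installed (install.packages("readxl")).
library(readxl)

# Hypothetical helper: write the first sheet of one workbook to a CSV file
# with the same base name, without treating any row as a header.
xlsx_to_csv <- function(path) {
  # Read every cell as text so mixed columns (e.g. 123214 and "ddd") survive.
  sheet <- read_excel(path, col_names = FALSE, col_types = "text")
  out <- paste0(tools::file_path_sans_ext(path), ".csv")
  # Empty cells are NA in R; write them back out as empty fields.
  write.table(sheet, out, sep = ",", na = "",
              row.names = FALSE, col.names = FALSE)
  out
}

# Hypothetical wrapper: convert every .xlsx file in a directory and
# return the paths of the CSV files that were written.
convert_dir <- function(dir) {
  paths <- list.files(dir, pattern = "\\.xlsx$", full.names = TRUE)
  vapply(paths, xlsx_to_csv, character(1))
}
```

Calling convert_dir("path/to/your/dir") should leave one CSV next to each workbook.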

You can try the following code:

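# stringsAsFactors=FALSE keeps text cells as plain strings (the default before R 4.0 was factors)
df <- read.csv(file="path/to/your/file.csv", header=FALSE, stringsAsFactors=FALSE)
df <- df[-1, ]      # drop the first (title) row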

To get the name, number, and age for Mike:

name   <- df[1, 2]
number <- df[1, 4]
age    <- df[1, 6]
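
Since you have about 1,000 files, the same indexing can be wrapped in a function and applied to every exported CSV. Here is a minimal sketch in base R, assuming every file has exactly the layout of the three-line sample above. The names read_candidate and read_all_candidates are made up for illustration.

```r
# Hypothetical helper: pull name, number, and age out of one exported CSV.
# Assumes the layout of the sample: a title row, then the row with the values.
read_candidate <- function(path) {
  df <- read.csv(path, header = FALSE, stringsAsFactors = FALSE)
  data.frame(
    file   = basename(path),
    name   = df[2, 2],              # row 2 of the raw file holds the values
    number = as.integer(df[2, 4]),  # read as text because of "ddd" below it
    age    = as.integer(df[2, 6]),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: apply the helper to every CSV in a directory
# and stack the results into one data frame, one row per file.
read_all_candidates <- function(dir) {
  paths <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  do.call(rbind, lapply(paths, read_candidate))
}
```

Calling read_all_candidates("path/to/your/dir") should return one row per file with the columns file, name, number, and age. as.integer() gives NA with a warning when a cell is not numeric, which is a handy flag for files that do not match the expected layout.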