user2972129 - 10 months ago 59
R Question

Efficiency with repetitive code - For loops with variable names

Consider the following dataframe in R,

df <- data.frame(ID = 1:7, Group = c(rep(1,2), rep(2, 3), rep(3,2)), Year = c(rep(2011, 4), rep(2012, 3)), X = rnorm(7))

I am working in base R, and would like to achieve the following task in a more efficient way.

Group1 <- df[df$Group == 1,]
Group2 <- df[df$Group == 2,]
Group3 <- df[df$Group == 3,]

Here I am producing three separate dataframes, one per group, and naming the variables systematically. This code is repetitive, and I would like a better way to do it (I usually have many more "groups", so these repetitive lines take up a lot of space).

For my own learning, I would also love to see an example of this working with a for loop, even though I'm sure there are better ways - something along the lines of:

for (i in 1:3){
Groupi <- df[df$Group == i,] }

Though this is obviously incorrect, hopefully you can see the intuition.

Any examples of a more efficient approach would be appreciated, thank you.

Answer Source

I think you would be better served with split as described in comments. However you can achieve what you're after with a loop using assign.

for (i in 1:3) {
  # Create Group1, Group2, Group3 in the current environment
  assign(paste0("Group", i), df[df$Group == i, ])
}

Also, be careful with your indexing: you need the comma after the row condition, as in df[df$Group == i, ], to indicate "all columns".
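For comparison, here is a minimal sketch of the split approach mentioned at the start of this answer. It assumes the same df as in the question; the list name groups and the set.seed call are only for illustration. split returns a named list with one dataframe per group, which is usually easier to work with than many separate variables.

```r
set.seed(1)  # illustrative only, makes the random X column reproducible
df <- data.frame(ID = 1:7,
                 Group = c(rep(1, 2), rep(2, 3), rep(3, 2)),
                 Year = c(rep(2011, 4), rep(2012, 3)),
                 X = rnorm(7))

# One dataframe per distinct value of Group, stored in a list
groups <- split(df, df$Group)

# split names the elements "1", "2", "3"; prefix them to match the naming scheme
names(groups) <- paste0("Group", names(groups))

# Access a single group by name
groups$Group2

# Apply the same operation to every group without writing a loop
group_means <- sapply(groups, function(g) mean(g$X))
```

If you really do want separate variables rather than a list, list2env(groups, envir = globalenv()) should create Group1, Group2 and Group3 from the named list, though keeping them in the list tends to scale better when there are many groups.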