Pablo Báez - 8 months ago 42

R Question

I'm trying to build a data frame from another one by repeating certain values (a, b, c and d in my example) a given number of times, where the counts are the values in each cell of my first data frame. To illustrate, here is the data:

```
df <- data.frame(replicate(4, sample(20:50, 10, rep = TRUE)))
a <- 0
b <- 1
c <- 2
d <- 9
```

I tried first:

```
for (i in 1:10) {
  print(rep(a, df[i, 1]))
}
```

But when I tried to save the output, I only got the result for a single row:

```
for (i in 1:10) {
  output <- print(rep(a, df[i, 1]))
}
```

Then I tried something more complex:

```
myfunc <- function(n) {
  a <- 0
  b <- 1
  c <- 2
  d <- 9
  IDs <- matrix(n[, 1]) # A new column with the IDs for each row (rownames)
  w = NULL
  x = NULL
  y = NULL
  z = NULL
  for (i in 1:nrow(n)) {
    w <- rbind(t(as.matrix(rep(a, n[i, 1]))))
    x <- rbind(t(as.matrix(rep(b, n[i, 2]))))
    y <- rbind(t(as.matrix(rep(c, n[i, 3]))))
    z <- rbind(t(as.matrix(rep(d, n[i, 4]))))
  }
  output <- cbind(IDs, w, x, y, z)
  return(output <- as.data.frame(output))
}
```

But I do not get what I need.

For a matrix like this:

The expected output will be:

First row: 0 repeated 21 times, 1 repeated 46 times, 2 repeated 25 times and 9 repeated 28 times, all in 120 columns, and so on for the other rows.

I would really appreciate any help in solving this issue.

Answer

If I'm understanding correctly, moving from a `for` loop to `lapply` should get you what you want.

```
lapply(1:10, function(i) rep(a, df[i, 1]))
```
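Unlike the `for` loop in the question, where `output` is overwritten on every pass, `lapply` returns all results at once as a list, so a single assignment stores them. A minimal sketch, assuming `df` and `a` are defined as in the question (the name `out` and the `set.seed` call are only for illustration):

```r
set.seed(1)  # illustrative only: makes the random data reproducible
df <- data.frame(replicate(4, sample(20:50, 10, replace = TRUE)))
a <- 0

# One list element per row of df; element i holds `a` repeated df[i, 1] times
out <- lapply(1:10, function(i) rep(a, df[i, 1]))

length(out)   # number of list elements, one per row of df
lengths(out)  # length of each element, matching df[, 1]
```

Each element can then be pulled out with `out[[i]]`.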

You can then generalize the `lapply` call to all columns:

```
l <- list(a = 0, b = 1, c = 2, d = 9)
lapply(seq_along(l), function(i) lapply(1:10, function(j) rep(l[[i]], df[j, i])))
```

Which gives you a nested list and (I think) your desired output.
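If one flat vector per row is more convenient than a nested list, `rep()` also accepts a vector of counts through its `times` argument. The following is a sketch of that alternative, not part of the original answer; `vals`, `row_vec` and `rows` are illustrative names, and `df` is assumed to be built as in the question:

```r
set.seed(1)  # illustrative only: makes the random data reproducible
df <- data.frame(replicate(4, sample(20:50, 10, replace = TRUE)))
vals <- c(0, 1, 2, 9)  # the values a, b, c and d from the question

# Repeat each value by the matching count found in row i of df
row_vec <- function(i) {
  rep(vals, times = unlist(df[i, ], use.names = FALSE))
}

rows <- lapply(seq_len(nrow(df)), row_vec)
lengths(rows)  # equals rowSums(df): one entry per repetition
```

Here `rows[[1]]` contains the zeros, ones, twos and nines for the first row, in that order.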

Now that I understand better what you want, I think I can help more. It seems to me that you have an issue here: you want a matrix but, at least in the example you've provided, each row of the matrix would be of a different length. Rather than padding these with `NA`, I just created a fifth column that evens things out. See if the code below gets at what you want.

```
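# Add a fifth count column so that every row sums to the same total
df$X5 <- (max(rowSums(df)) + 5) - rowSums(df)
l <- list(a = 0, b = 1, c = 2, d = 9, e = 5)
# tmp[[i]][[j]] holds value i repeated df[j, i] times
tmp <- lapply(seq_along(l), function(i) {
  lapply(1:nrow(df), function(j) rep(l[[i]], df[j, i]))
})
max_col <- max(rowSums(df))
# One matrix row per row of df (not per value in l)
m <- matrix(NA, nrow = nrow(df), ncol = max_col)
for (i in seq_len(nrow(df))) {
  m[i, ] <- unlist(lapply(tmp, "[[", i))
}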
```
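For comparison, the `NA`-padding route mentioned above could look roughly like the sketch below. This is an assumption about how such padding might be done, not code from the original answer; `vals`, `rows`, `max_len` and `padded` are illustrative names, and the example starts from the original four-column `df`:

```r
set.seed(1)  # illustrative only: makes the random data reproducible
df <- data.frame(replicate(4, sample(20:50, 10, replace = TRUE)))
vals <- c(0, 1, 2, 9)

# Build one vector per row: each value repeated by that row's counts
rows <- lapply(seq_len(nrow(df)), function(i) {
  rep(vals, times = unlist(df[i, ], use.names = FALSE))
})

# Pad every vector with NA up to the longest row, then stack them as matrix rows
max_len <- max(lengths(rows))
padded <- t(sapply(rows, function(r) c(r, rep(NA, max_len - length(r)))))

dim(padded)  # nrow(df) rows and max(rowSums(df)) columns
```

The trade-off is that shorter rows end in `NA` instead of a filler value, so later code has to handle missing values.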

Source (Stackoverflow)