Erdogan CEVHER - 1 year ago 61

R Question

(reproducible example given) How to pass the additional argument `nrow` to `as.data.frame`?

`?as.data.frame` says:

`...`: additional arguments to be passed to or from methods.

With the helper call `matrix(..., nrow)`:

```
set.seed(1)
df <- as.data.frame(matrix(c(rnorm(5), rnorm(5), rnorm(5)), nrow = 5, byrow = TRUE))
df
#           V1         V2         V3
# 1 -0.6264538  0.1836433 -0.8356286
# 2  1.5952808  0.3295078 -0.8204684
# 3  0.4874291  0.7383247  0.5757814
# 4 -0.3053884  1.5117812  0.3898432
# 5 -0.6212406 -2.2146999  1.1249309
```

Without `matrix(..., nrow)`:

```
set.seed(1)
df <- as.data.frame(c(rnorm(5), rnorm(5), rnorm(5)))
df
#    c(rnorm(5), rnorm(5), rnorm(5))
# 1                       -0.6264538
# 2                        0.1836433
# ..................................
# 15                       1.1249309
```

I want to pass `nrow` directly to `as.data.frame`, without wrapping the input in `matrix(..., nrow)`. Is that possible with `as.data.frame`?

Answer Source

`c(rnorm(5), rnorm(5), rnorm(5))` is just a vector. (And, btw, it would be simpler to write it as `rnorm(15)`.) When you call `as.data.frame` on a vector, S3 dispatch will end up using `as.data.frame.vector`. Your question assumes that internally `as.data.frame.vector` converts the input to a `matrix` before putting it into a data frame. **This is an incorrect assumption.**

Because `as.data.frame.vector` would only ever be called on a single vector, it knows it has only one column to deal with, so it has a relatively simple job. You can look at the code by typing `as.data.frame.vector` at the console, and you will see that **no matrices are used** and that, in this method, `...` is not used in the function body either.
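A quick way to check this yourself is sketched here (the variable names `x` and `ignored` are only illustrative). Since `nrow` is not a formal argument of `as.data.frame.vector`, it is absorbed by `...` and has no effect on the result:

```r
# Is nrow a named argument of the vector method? (It is not.)
"nrow" %in% names(formals(as.data.frame.vector))

set.seed(1)
x <- rnorm(15)

# nrow = 5 falls into `...` and is silently ignored
ignored <- as.data.frame(x, nrow = 5)
dim(ignored)  # 15 rows, 1 column
```

No error or warning is raised, which is why the call looks as if it should work even though `nrow` does nothing here.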

You have code that works, `as.data.frame(matrix(your_vector, nrow = your_nrow))`. It's a good solution. Be content.
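If the nested call feels clumsy, one option is to wrap it in a small helper of your own. The function below is hypothetical (`vec_to_df` is not part of base R); it is just a thin wrapper around the working solution, assuming you want `nrow` and `byrow` forwarded to `matrix`:

```r
# Hypothetical helper: reshape a vector with matrix(), then convert to a data frame
vec_to_df <- function(x, nrow, byrow = FALSE) {
  as.data.frame(matrix(x, nrow = nrow, byrow = byrow))
}

set.seed(1)
x <- rnorm(15)

# Same layout as the data frame in the question (filled row by row)
df <- vec_to_df(x, nrow = 5, byrow = TRUE)
dim(df)  # 5 rows, 3 columns
```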

It makes sense for `matrix` to have an `nrow` argument because all elements of a matrix must have the **same type**. Thus it is common for a vector (in which all elements must also have the same type) to be turned into a matrix with rows and columns. A `data.frame` allows each column to be of a different type, so "wrapping" input data from one column to the next is unusual: it's not assumed that the next column is a continuation of the previous one. Given your example, it's worth asking whether you even *want* a data frame. Computations with matrices are much faster because a matrix is a simpler data structure.
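The type difference can be illustrated with a small sketch (the object names `m` and `d` and the column names `id` and `label` are made up for this example). Combining a numeric and a character vector into a matrix coerces everything to one type, while a data frame keeps each column's own type:

```r
# A matrix holds a single type: the integers are coerced to character
m <- cbind(1:3, c("a", "b", "c"))
typeof(m)  # "character"

# A data frame keeps a separate type per column
d <- data.frame(id = 1:3, label = c("a", "b", "c"))
sapply(d, class)
```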

There are *many* ways to create the data frame you want. The following will all work (only the column names will differ, the data values are the same). How you generate the input vector is up to you.

```
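# 1. Reshape one long vector with matrix(), then convert
set.seed(1)
d1 = as.data.frame(matrix(rnorm(15), nrow = 5))
# 2. replicate() returns a 5 x 3 matrix, one column per draw
set.seed(1)
d2 = data.frame(replicate(3, rnorm(5)))
# 3. Pass each column to data.frame() separately
set.seed(1)
d3 = data.frame(rnorm(5), rnorm(5), rnorm(5))
# 4. Bind a list of vectors column-wise, then convert
set.seed(1)
my_vectors = list(rnorm(5), rnorm(5), rnorm(5))
d4 = as.data.frame(do.call(cbind, my_vectors))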
```
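To confirm that the four approaches agree, you can strip the column names and compare the values. This is a self-contained sketch; `values_of` is a hypothetical helper introduced only for the comparison:

```r
# Hypothetical helper: data values only, without row or column names
values_of <- function(d) unname(as.matrix(d))

set.seed(1)
d1 <- as.data.frame(matrix(rnorm(15), nrow = 5))
set.seed(1)
d2 <- data.frame(replicate(3, rnorm(5)))
set.seed(1)
d3 <- data.frame(rnorm(5), rnorm(5), rnorm(5))
set.seed(1)
d4 <- as.data.frame(do.call(cbind, list(rnorm(5), rnorm(5), rnorm(5))))

# Each comparison should be TRUE: only the column names differ
same_values <- c(
  isTRUE(all.equal(values_of(d1), values_of(d2))),
  isTRUE(all.equal(values_of(d1), values_of(d3))),
  isTRUE(all.equal(values_of(d1), values_of(d4)))
)
same_values
```

Note that these data frames are filled column by column, whereas the question used `byrow = TRUE`, so the same 15 numbers land in different cells there.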