user3552144 - 3 months ago
R Question

Attempting to randomize a data frame twice and add both of those samples to a new data frame

So I have a data frame:

> MLSpredictions
        fit    se.fit residual.scale      upr      lwr
1  1.392213 0.1476321              1 1.681572 1.102854
2  1.448370 0.1709856              1 1.783501 1.113238
3  1.392213 0.1476321              1 1.681572 1.102854
4  1.448370 0.1709856              1 1.783501 1.113238
5  1.448370 0.1709856              1 1.783501 1.113238
6  1.448370 0.1709856              1 1.783501 1.113238
7  1.506792 0.1969097              1 1.892734 1.120849
8  1.506792 0.1969097              1 1.892734 1.120849
9  1.567570 0.2253572              1 2.009270 1.125870
10 1.567570 0.2253572              1 2.009270 1.125870
11 1.630800 0.2563338              1 2.133214 1.128386
12 1.448370 0.1709856              1 1.783501 1.113238
13 1.448370 0.1709856              1 1.783501 1.113238
14 1.448370 0.1709856              1 1.783501 1.113238
15 1.506792 0.1969097              1 1.892734 1.120849
16 1.567570 0.2253572              1 2.009270 1.125870
17 1.567570 0.2253572              1 2.009270 1.125870
18 1.567570 0.2253572              1 2.009270 1.125870
19 1.567570 0.2253572              1 2.009270 1.125870


I would like to sample this entire data frame TWICE and add both of those samples to a new data frame, MLSSeason:

My attempt was:

MLSSeason[1:19] = sample(MLSpredictions)
MLSSeason[20:38] = sample(MLSpredictions)


but that does not give me the right solution. Ideally, MLSSeason will have 38 rows with two of each MLSprediction sampled inside.

Answer

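You can't shuffle rows by feeding a data frame to sample. It won't give you any error, but sample treats the data frame as a list of columns, so it permutes the columns and leaves the rows in their original order. Instead, you should generate the row index.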

MLSSeason <- MLSpredictions[c(sample(nrow(MLSpredictions)), sample(nrow(MLSpredictions))), ]

Note that this is not equivalent to a single permutation:

MLSpredictions[sample(nrow(MLSpredictions)), ]

which returns each row exactly once (19 rows, no duplicates) rather than each row twice.
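
To see the difference for yourself, here is a minimal sketch on a toy data frame. The names df, shuffled_cols, idx and season and the four-row values are made up for illustration and stand in for MLSpredictions; set.seed is only there to make the run repeatable.

```r
# Toy stand-in for MLSpredictions (hypothetical values, 4 rows instead of 19)
set.seed(42)  # only so the run is reproducible
df <- data.frame(fit = c(1.39, 1.45, 1.51, 1.57),
                 upr = c(1.68, 1.78, 1.89, 2.01))

# sample() on a data frame permutes the columns; row order is untouched
shuffled_cols <- sample(df)
identical(shuffled_cols$fit, df$fit)  # TRUE

# Two independent row permutations, stacked one after the other
idx <- c(sample(nrow(df)), sample(nrow(df)))
season <- df[idx, ]

nrow(season)          # 8, i.e. 2 * nrow(df)
all(table(idx) == 2)  # TRUE: every original row appears exactly twice
```

Because each row is selected twice, R makes the row names of the result unique (for example "3" and "3.1"). If that bothers you, rownames(season) <- NULL resets them to 1, 2, 3, and so on.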
