LUSAQX - 2 months ago 12

Python Question

I used numpy.random.permutation() to generate a random row order for an original data frame X, and I want to fill X_perm with all the rows of X in that random order.

```
X_perm=X
y_perm=y
perm = np.random.permutation(X.shape[0])
for i in range(len(perm)):
    X_perm.loc[i]=(X.loc[perm[i]])
    y_perm.loc[i]=(y.loc[perm[i]])
```

I just found that after running the code, the first record of X, given by X[0:1], has changed compared with what it was before running.

Strange. I didn't perform any operation on X other than assigning its values to a new data frame. How did that alter the values in X?

Cheers

Answer

The reason for this unexpected behavior is that X_perm is not independent of X. The assignment X_perm = X does not copy any data; it only makes X_perm a second name for the same object that X refers to. So any modification made through X_perm is also a modification of X.

To demonstrate this:

```
import numpy as np

# Plain assignment: b is a second name for the same array
a = np.arange(16)
print(a)
b = a  # like your X_perm = X
print(b)  # same as print(a) above
b[0] = -999
print(a)  # has been modified
print(b)  # has been modified
a[-1] = -999
print(a)  # has been modified
print(b)  # has been modified

# Using copy: b is an independent array
a = np.arange(16)
print(a)
b = a.copy()  # b refers to a separate copy of the array
print(b)  # same as print(a) above
b[0] = -999
print(a)  # has NOT been modified
print(b)  # has been modified
a[-1] = -999
print(a)  # has been modified
print(b)  # has NOT been modified
```
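The same holds for a pandas DataFrame, which is what the question actually uses. Here is a minimal sketch; the frame contents and the names `X_alias` and `X_copy` are made up for illustration and are not from the question.

```python
import pandas as pd

# Made-up data standing in for the X in the question.
X = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

X_alias = X          # second name for the same DataFrame, no data copied
X_copy = X.copy()    # independent DataFrame with its own data

same_object = X_alias is X       # both names point at one object
copy_is_same = X_copy is X       # the copy is a distinct object

X_alias.loc[0, "a"] = -999       # write through the alias
after_alias = X.loc[0, "a"]      # X sees the change

X_copy.loc[1, "a"] = -555        # write to the copy
after_copy = X.loc[1, "a"]       # X is unaffected

print(same_object, copy_is_same, after_alias, after_copy)
```

The `is` check is a quick way to tell whether two names refer to the same object before you start writing to either of them.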

To do what you want, you need X_perm to be a copy of X (and, likewise, y_perm to be a copy of y).

```
X_perm = X.copy()
y_perm = y.copy()
```
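As a side note, the row-by-row loop may not be needed at all. Here is a sketch, assuming X is a DataFrame and y is a Series with the same number of rows (the data below is made up): indexing by position with the permutation array returns a new object, so the originals are left alone.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for X and y from the question.
X = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})
y = pd.Series([0, 1, 0, 1])

perm = np.random.permutation(X.shape[0])

# Positional indexing with an integer array builds a new object,
# so X and y themselves are not modified.
X_perm = X.iloc[perm].reset_index(drop=True)
y_perm = y.iloc[perm].reset_index(drop=True)

print(X_perm)
print(y_perm)
```

The `reset_index(drop=True)` call renumbers the rows 0, 1, 2, ... as the loop in the question would; drop it if you would rather keep the original row labels.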

See also the relevant NumPy documentation on copies and views.