Suppose I have something like the code below:
    for idx in xrange(0, 10):
        # randomSplit returns a list of new RDDs, one per weight
        train_test_split = training.randomSplit(weights=[0.75, 0.25])
        train_cv = train_test_split[0]
        test_cv = train_test_split[1]
        # scale train_cv and test_cv
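RDDs are immutable, so it's not actually possible to 'change' an RDD; you can only transform it into a new one. randomSplit returns new RDDs, and anything you do to train_cv and test_cv afterwards produces further new RDDs. So, no, the original data will not be affected.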
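To make the idea concrete without needing a Spark cluster, here is a minimal plain-Python sketch. It is not Spark's implementation: `random_split` is a hypothetical stand-in for `RDD.randomSplit`, and a tuple plays the role of the immutable `training` RDD. It only illustrates that splitting and scaling build new collections and leave the source alone.

```python
import random


def random_split(data, weights, seed=None):
    """Hypothetical stand-in for RDD.randomSplit (not Spark code).

    Assigns each item to one of len(weights) new lists with probability
    proportional to its weight. `data` itself is never modified.
    """
    rng = random.Random(seed)
    total = float(sum(weights))

    # Cumulative boundaries, e.g. roughly [0.75, 1.0] for weights [0.75, 0.25].
    bounds = []
    running = 0.0
    for w in weights:
        running += w / total
        bounds.append(running)

    splits = [[] for _ in weights]
    last = len(bounds) - 1
    for item in data:
        r = rng.random()
        for i, b in enumerate(bounds):
            # The last bucket catches any floating-point rounding in bounds.
            if r < b or i == last:
                splits[i].append(item)
                break
    return splits


# A tuple is immutable, loosely mirroring an RDD.
training = tuple(range(100))

for idx in range(10):
    train_cv, test_cv = random_split(training, [0.75, 0.25], seed=idx)
    # "Scaling" creates new lists; neither the splits' source nor
    # `training` is changed by this step.
    train_scaled = [x / 100.0 for x in train_cv]
    test_scaled = [x / 100.0 for x in test_cv]
```

After the loop, `training` still holds the same 100 values it started with, which is the same guarantee you get from the RDD version on every iteration.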