Just wondering if a better solution exists for this sort of problem.
We know that for an X/Y percentage split of an even number we can get an exact split of the data, for example for data size 10:
10 * .6 = 6
10 * .4 = 4
For an odd data size such as 11, however, the split is not exact:
11 * .6 = 6.6
11 * .4 = 4.4
i = 6.6
First set = 0..6
Second set = 6..10
First set = 0..7
Second set = 7..12
First of all, notice that your problem is not limited to odd-sized arrays as you claim; it applies to arrays of any size. How would you make a 56%-44% split of a 10-element array? Or a 60%-40% split of a 4-element array?
There is no standard procedure. In many cases, programmers do not care much about an exact split: they floor or round one quantity (the size of the first set) and take the complement (array length minus the rounded size) as the size of the second set.
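As a rough sketch of that round-and-complement approach (the function name `split_by_ratio` is just something I made up for illustration, and I am assuming a plain Python list as input):

```python
def split_by_ratio(items, first_ratio):
    """Split items in two: round the first size, give the rest to the second."""
    # round() is used rather than int() so that float noise such as
    # 28.999999999999996 does not silently drop an element.
    # Note that Python's round() sends exact halves to the nearest even integer.
    first_size = round(len(items) * first_ratio)
    return items[:first_size], items[first_size:]


first, second = split_by_ratio(list(range(11)), 0.6)
print(len(first), len(second))
```

For 11 elements and a 60%-40% ratio this gives sets of 7 and 4 elements, so the realised split is roughly 64%-36% rather than the requested one.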
This might be OK in most cases, when this is a one-off calculation and accuracy is not required. You have to ask yourself what your requirements are. For example: are you taking thousands of 10-element arrays, splitting each of them 56%-44%, doing some calculations and returning a result? Then you have to ask yourself what accuracy you want. Do you care if your result ends up being the 60%-40% split or the 50%-50% split?
As another example, imagine that you are doing a 4-way equal split of 25%-25%-25%-25%. If you have 10 elements and you apply the rounding technique (2.5 rounded up to 3 for each of the first three sets, with the remainder going to the last), you end up with 3, 3, 3 and 1 elements. Surely this will mess up your results.
If you do care about all these inaccuracies, then the first step is to consider whether you can adjust the array size and/or the split ratio(s).
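To make the 4-way example concrete, here is a small sketch of the naive technique, assuming halves are rounded up and the last set simply takes whatever is left. The name `naive_multiway_sizes` is hypothetical.

```python
import math


def naive_multiway_sizes(array_size, ratios):
    """Round each set size half-up; the last set gets the leftover elements."""
    sizes = []
    remaining = array_size
    for ratio in ratios[:-1]:
        # floor(x + 0.5) rounds halves up, unlike Python's built-in round()
        size = min(remaining, math.floor(array_size * ratio + 0.5))
        sizes.append(size)
        remaining -= size
    sizes.append(remaining)
    return sizes


print(naive_multiway_sizes(10, [0.25, 0.25, 0.25, 0.25]))  # [3, 3, 3, 1]
```

The rounding error of each of the first three sets piles up in the last one, which is why it ends up with a single element instead of two or three.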
If these are set in stone, then the only way to get an accurate split for any ratio and any array size is to make it probabilistic. You have to split multiple arrays for this to work (meaning you have to apply the same split ratio to same-sized arrays multiple times). The more arrays, the better.
So imagine that you have to make a 56%-44% split of a 10-element array. This means that you need to split it into 5.6 elements and 4.4 elements on average.
There are many ways you can achieve a 5.6 element average. The easiest one (and the one with the smallest variance in the sequence of tries) is to have 60% of the time a set with 6 elements and 40% of the time a set that has 5 elements.
0.6*6 + 0.4*5 = 5.6
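If you want to check this arithmetic for other sizes and ratios, a sketch along these lines might help. It uses exact fractions to avoid float noise; `size_distribution` and `expected_size` are names I invented, and the ratio is passed as a string only so that `Fraction` can parse it exactly.

```python
import math
from fractions import Fraction


def size_distribution(array_size, ratio):
    """Map each possible first-set size to the probability of choosing it."""
    target = array_size * Fraction(ratio)   # e.g. 10 * 14/25 = 28/5
    low = math.floor(target)
    p_high = target - low                   # fractional part of the target
    if p_high == 0:
        return {low: Fraction(1)}           # exact split, nothing to randomise
    return {low: 1 - p_high, low + 1: p_high}


def expected_size(distribution):
    """Average first-set size over many splits."""
    return sum(size * prob for size, prob in distribution.items())


dist = size_distribution(10, "0.56")
print(dist)                 # sizes 5 and 6 with probabilities 2/5 and 3/5
print(expected_size(dist))  # 28/5, i.e. 5.6
```

The point is that rounding up with a probability equal to the fractional part always brings the average back to the exact target.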
In terms of code, this is what you can do to decide on the size of the first set each time:
import random

arraySize = 10
firstSplit = 0.56

avgSplitSize = arraySize * firstSplit    # about 5.6 elements on average
flooredSplitSize = int(avgSplitSize)     # 5

if avgSplitSize > flooredSplitSize:
    # Round up with probability equal to the fractional part (0.6 here),
    # round down otherwise
    if random.uniform(0, 1) > avgSplitSize - flooredSplitSize:
        thisSplitSize = flooredSplitSize
    else:
        thisSplitSize = flooredSplitSize + 1
else:
    # The split is already exact; keep the size as an integer
    thisSplitSize = flooredSplitSize
You could make the code more compact; I just made an outline here so you get the idea. I hope this helps.
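For what it is worth, one possible compact version, together with a quick simulation to see the average settle near 5.6, could look like the sketch below. The function name `random_split_size`, the optional `rng` parameter and the fixed seed are my own additions for illustration.

```python
import random


def random_split_size(array_size, first_ratio, rng=random):
    """Return floor(target) or floor(target) + 1, averaging to the target."""
    target = array_size * first_ratio
    floored = int(target)
    # The comparison is True with probability equal to the fractional part;
    # adding a bool to an int adds 0 or 1.
    return floored + (rng.random() < target - floored)


rng = random.Random(0)  # seeded only to make the run repeatable
sizes = [random_split_size(10, 0.56, rng) for _ in range(100_000)]
average = sum(sizes) / len(sizes)
print(sorted(set(sizes)), average)
```

Every individual split still has 5 or 6 elements in the first set; only the average over many splits approaches the requested 5.6.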