A. Ewaiss - 1 year ago 65

Python Question

In the 5-star rating system, I have a known number of ratings N (Votes.)

I also have the final (weighted) average of all those N ratings, let's say it is R (float to two decimal places).

I would like to know the best approach (Algorithm) to generate all possible combinations (Total of weighted averages) and print out only the one(s) which result in R. Printing "ALL" possible combinations is not what I am looking for, because it will run in the tens of billions for a large N and small R. I am one step removed from being a python newbie, but it will be the language of choice, and this very exercise is my introduction to the language. What is the best approach to this problem? Approach is my question, but any hint of code is greatly appreciated.

**Example:**

N= 20 customers rated a product

R = 3.85 is the average rating

The Output:

[14, 0, 0, 1, 5]

is one possible combination out of 146.

"14 five-stars, 0 four-stars, 0 three-stars, 1 two-stars, and 5 one-star"

And the combination:

[487, 0, 1, 0, 12]

is one possible combination out of 1154 for N=500 and R=4.90 etc.,.

Answer Source

The total number of stars is N*R (20 * 3.85 = 77 in your example). Now what you have similar to the change making problem except that your total number of coins is fixed.

An efficient solution might be to start with as many large coins (5 star ratings) as would not exceed the total, and decrease until you have ratings that would not make the total. You still end up checking solutions that don't work, but it will be significantly faster than checking all solutions, especially for large problem sizes.

Here is my solution: (EDIT: Solution debugged. I don't think its the optimal solution, but its better than brute force. 1931 recursive calls for the sample case of N=20 R=3.85)

```
def distribution(total, maxRating, N, solution):
if total == 0 and N == 0:
return [solution + [0] * maxRating] #we found a solution
if total == 0 or N == 0:
return [] # no solution possible
largestUpperLimit = min(total // maxRating, N) # an upper limit for the number of reviews with the largest rating
largestLowerLimit = max((total - N * (maxRating -1)) // maxRating, 0) # a lower limit for the number of reviews with the largest rating
if N < largestLowerLimit:
return [] # there aren't enough ratings to make any solutions
else:
solutions = []
for i in range(largestLowerLimit, largestUpperLimit + 1): # plus 1 to include the upper limit
solutions.extend(distribution(total - i * maxRating, maxRating - 1, N - i, solution + [i]))
return solutions
# Using the function example:
solutions = distribution(N * R, 5, N, [])
```