Jake Kalcher Jake Kalcher - 7 months ago 13
Python Question

Sorting data from a csv alphabetically, highest to lowest and average

This is the next step in my currently unresolved question in which I am attempting to sort the scores from 3 different teams. I have very limited knowledge of python because I am new to programming so my problem solving of this current project is quite difficult.

To begin I will need the example data (shown below) which are split over two cells to be sorted alphabetically according to the names, I will have this for 3 different teams in 3 different files. I am also trying to sort it out from highest to lowest depending on the score, this has proven of much difficulty to me so far.

Jake,5
Jake,3
Jake,7
Jeff,6
Jeff,4
Fred,5


The third and final way to sort I am trying to do is by average. For this I had attempted to make it so if the user had there name 2 or 3 times (as the program will store the last 3 scores for each user, this is a currently unresolved problem) then it would add their scores then divide by how many of them it had there. Unfortunately this was very difficult for me and i struggled to be able to get any output, though I had an idea that this will print their average scores to a separate file then re-read the scores.

The current layout I have so far is shown below:

admin_data = []
team_choice = input("Choose a team to sort")
if team_choice == 'Team 1':
path = 'team1scores.csv'

elif team_choice == 'Team 2':
path = 'team2scores.csv'

elif team_choice == 'Team 3':
path = 'team3scores.csv'

else:
print("--Error Defining File Path--")

print("As an admin you have access to sorting the data")
print("1 - Alpahbetical")
print("2 - Highest to Lowest")
print("3 - Average Score")

admin_int = int(input("Choose either 1, 2 or 3?"))

if sort_int == 1 and team_choice == 'Team 1':
do things

elif sort_int == 2 and team_choice == 'Team 1':
do things

elif sort_int == 3 and team_choice == 'Team 1':
do things


This part of the program will be used for each file, but have had no luck producing any solutions for each of the different sorting ways I need. I will also appreciate if the answer for the first part of my project is answered too.

EDIT (16:43):
I have managed to complete the highest to lowest part of the program but is printing:

[['Fred', '9'], ['George', '7'], ['Jake', '5'], ['Jake', '4'], ['Derek', '4'], ['Jake', '2']]


So if this is the formatting I read the data as, how will I be able to read the file for duplicate names and add the scores if they are in arrays like this?

Answer

The first step would be to break down the problem into small steps:

  1. How to open and handle the file (using the with statement at the bottom of that section)
  2. How to traverse a csv file
  3. How to sort the entries
  4. How to sort by the second value of each row
  5. How to print each element of a list on a separate line
  6. How to count total scores

Expanding on the last one you can total up the scores as well as the number of entries for each name like this:

import csv
import collections
...
with open(path) as f:
    entries = collections.Counter()
    total_scores = collections.Counter()
    for name,score in csv.reader(f):
        total_scores[name] += int(score)
        entries[name] += 1

Then you can calculate the average score for each person with total_scores[name] / entries[name]

for name in sorted(entries):
    ave_score = total_scores[name] / entries[name]
    print(name,ave_score) #sep=", ")

the other two actions are quite simple with a few of the steps listed above.

import csv
import collections
from operator import itemgetter

...

if sort_int == 1:
    with open(path) as f:
        reader = csv.reader(f)
        for name, score in sorted(reader):
            print(name,score)

elif sort_int == 2:
    with open(path) as f:
        entries = sorted(csv.reader(f), 
                         key=itemgetter(1), 
                         reverse=True)
        for name,score in entries:
            print(name,score)

elif sort_int == 3:
    with open(path) as f:
        entries = collections.Counter()
        total_scores = collections.Counter()
        for name,score in csv.reader(f):
            score = int(score)
            total_scores[name] += score
            entries[name] += 1

        for name in sorted(entries):
            ave_score = total_scores[name] / entries[name]
            print(name,ave_score)

If you want to apply the highest to lowest to the average scores then you will need to make a reference to all the averages such as a dict:

ave_scores = {}
for name in sorted(entries):
    ave_score = total_scores[name] / entries[name]
    ave_scores[name] = ave_score

for name,ave_score in sorted(ave_scores.items(), key = itemgetter(1), reversed=True):
    print(name,ave_score)