DiamondDogs95 DiamondDogs95 - 7 months ago 104
SQL Question

Add column from one .csv to another .csv file using Python

I am currently writing a script where I am creating a csv file ('tableau_input.csv') composed both of other csv files columns and columns created by myself. I tried the following code:

def make_tableau_file(mp, current_season = 2016):
# Produces a csv file containing predicted and actual game results for the current season
# Tableau uses the contents of the file to produce visualization

game_data_filename = 'game_data' + str(current_season) + '.csv'
datetime_filename = 'datetime' + str(current_season) + '.csv'

with open('tableau_input.csv', 'wb') as writefile:
tableau_write = csv.writer(writefile)
tableau_write.writerow(['Visitor_Team', 'V_Team_PTS', 'Home_Team', 'H_Team_PTS', 'True_Result', 'Predicted_Results', 'Confidence', 'Date'])

with open(game_data_filename, 'rb') as readfile:
scores = csv.reader(readfile)
scores.next()

for score in scores:
tableau_content = score[1::]
# Append True_Result
if int(tableau_content[3]) > int(tableau_content[1]):
tableau_content.append(1)
else:
tableau_content.append(0)
# Append 'Predicted_Result' and 'Confidence'
prediction_results = mp.make_predictions(tableau_content[0], tableau_content[2])
tableau_content += list(prediction_results)

tableau_write.writerow(tableau_content)

with open(datetime_filename, 'rb') as readfile2:
days = csv.reader(readfile2)
days.next()

for day in days:
tableau_write.writerow(day)


'tableau_input.csv' is the file I am creating. The columns 'Visitor_Team', 'V_Team_PTS', 'Home_Team', 'H_Team_PTS' come from 'game_data_filename'(e.g tableau_content = score[1::]). The columns 'True_Result', 'Predicted_Results', 'Confidence' are columns created in the first for loop.
So far, everything works but finally I tried to add to the 'Date' column data from the 'datetime_filename' using the same structure as above but when I open my 'tableau_input' file, there is no data in my 'Date' column. Can someone solve this problem?

For info, below are screenshots of csv files respectively for 'game_data_filename' and 'datetime_filename' (nb: datetime values are in datetime format)
enter image description here

enter image description here

Answer

It's hard to test this as I don't really know what the input should look like, but try something like this:

def make_tableau_file(mp, current_season=2016):
    # Produces a csv file containing predicted and actual game results for the current season
    # Tableau uses the contents of the file to produce visualization

    game_data_filename = 'game_data' + str(current_season) + '.csv'
    datetime_filename = 'datetime' + str(current_season) + '.csv'

    with open('tableau_input.csv', 'wb') as writefile:
        tableau_write = csv.writer(writefile)
        tableau_write.writerow(
            ['Visitor_Team', 'V_Team_PTS', 'Home_Team', 'H_Team_PTS', 'True_Result', 'Predicted_Results', 'Confidence', 'Date'])

        with open(game_data_filename, 'rb') as readfile, open(datetime_filename, 'rb') as readfile2:
        scoreReader = csv.reader(readfile)
        scores = [row for row in scoreReader]
        scores = scores[1::]
        daysReader = csv.reader(readfile2)
        days = [day for day in daysReader]
        if(len(scores) != len(days)):
            print("File lengths do not match")
        else:
            for i in range(len(days)):
                tableau_content = scores[i][1::]
                tableau_date = days[i]
                # Append True_Result
                if int(tableau_content[3]) > int(tableau_content[1]):
                    tableau_content.append(1)
                else:
                    tableau_content.append(0)
                # Append 'Predicted_Result' and 'Confidence'
                prediction_results = mp.make_predictions(tableau_content[0], tableau_content[2])
                tableau_content += list(prediction_results)
                tableau_content += tableau_date

                tableau_write.writerow(tableau_content)

This combines both of the file reading parts into one.

As per your questions below:

scoreReader = csv.reader(readfile)
scores = [row for row in scoreReader]
scores = scores[1::]

This uses list comprehension to create a list called scores, with every element being one of the rows from scoreReader. As scorereader is a generator, every time we ask it for a row, it spits one out for us, until there are no more.

The second line scores = scores[1::] just chops off the first element of the list, as you don't want the header.

For more info try these:

Generators on Wiki
List Comprehensions

Good luck!

Comments