in code veritas in code veritas - 29 days ago 6
Python Question

Simple way to create multiple columns with one row from multiple rows sharing the same key in pandas

I'm trying to create a dataframe from vote data that is in the following format:

Name,StateCode,GeoStratum,CountyCode,fips,Precinct,PrecinctReport,TotalVotes,FullName,VoteCount,ElectoralVote,Percent
Hawaii,HI,2,1,15001,43,43,64865,Hillary Clinton,64
Hawaii,HI,2,1,15001,43,43,64865,Donald Trump,27
Hawaii,HI,2,1,15001,43,43,64865,Gary Johnson,4
Hawaii,HI,2,1,15001,43,43,64865,Jill Stein,4


I'd like to convert this data into a format like this:

Name,StateCode,GeoStratum,CountyCode,fips,Precinct,PrecinctReport,TotalVotes,FullName,VoteCount,ElectoralVote,Clinton,Trump,Johnson,Stein
Hawaii,HI,2,1,15001,43,43,64865,64,27,4,4


Is there a simple way to take the
fips
column as a key, then use the values from Percent where 'Hillary Clinton' or 'Donald Trump' etc.. are the values in
FullName
to fill the
Trump
,
Clinton
etc columns?

Of course, a couple nested loops will do this. Hoping to find a nice way.

Answer

Use pivot_table and declare the index, columns, and values you want to get in your pivoted data:

df.pivot_table(index=['Name', 'StateCode', 'GeoStratum', 'CountyCode', 'fips', 'Precinct',
       'PrecinctReport', 'TotalVotes'], columns='FullName', values='VoteCount')

Eventually use reset_index to get the table you need and drop useless columns that can remain from this pivot.