Rafael - 2 months ago
Python Question

Creating User-PageView Matrix from CSV Table

I need to create a User vs Page View matrix for our web application.

The data is in the form:

Page Name  UserName  Count of Page Views by The User
Home       David     12
Home       Minerva   56
Home       Michael   1112
Buy        David     2
Buy        Mike      12


I want to create a User vs Page View matrix where each entry in the matrix is the Count.

I am using the Python stack. Is there a way to create the matrix (as a numpy array) automatically?

Parsing the data case by case would be very tedious, and since this is a common use case I assume a function for it already exists, but I couldn't find one.

Thanks for your help.

Answer Source

It seems you need pivot or unstack:

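# Option 1: pivot (raises an error if a Page Name/UserName pair appears more than once)
df1 = df.pivot(index='Page Name', columns='UserName', values='Count of Page Views by The User')

# Option 2: set_index + unstack (gives the same result)
df1 = df.set_index(['Page Name', 'UserName'])['Count of Page Views by The User'].unstack()
print(df1)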
UserName   David  Michael  Mike  Minerva
Page Name                               
Buy          2.0      NaN  12.0      NaN
Home        12.0   1112.0   NaN     56.0
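
Since the question asks for a numpy matrix with users against pages, here is a minimal end-to-end sketch. It is not the only way to do it. It assumes the data is comma-separated with the three column headers shown in the question; the inline `csv_text` string is a hypothetical stand-in for the real file. It uses `pivot_table` rather than `pivot`, so repeated user/page rows are summed instead of raising an error, and it fills missing pairs with 0 instead of NaN.

```python
import io

import pandas as pd

# Hypothetical CSV contents mirroring the table in the question;
# in practice you would pass your own file path to pd.read_csv.
csv_text = """Page Name,UserName,Count of Page Views by The User
Home,David,12
Home,Minerva,56
Home,Michael,1112
Buy,David,2
Buy,Mike,12
"""

df = pd.read_csv(io.StringIO(csv_text))

# Users as rows, pages as columns. Duplicate (user, page) rows are summed,
# and combinations that never occur are filled with 0.
matrix_df = df.pivot_table(
    index="UserName",
    columns="Page Name",
    values="Count of Page Views by The User",
    aggfunc="sum",
    fill_value=0,
)

# Keep the labels so rows and columns of the bare array can still be identified.
users = matrix_df.index.tolist()
pages = matrix_df.columns.tolist()

# Plain 2-D numpy array of integer counts.
matrix = matrix_df.to_numpy(dtype=int)

print(users)
print(pages)
print(matrix)
```

Here `matrix[i, j]` is the view count of `users[i]` on `pages[j]`. If you prefer pages as rows, as in the output above, swap the `index` and `columns` arguments.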