RDJ RDJ - 7 months ago 8
Python Question

Pandas: How to reference and print multiple dataframes as HTML tables

I'm trying split out individual dataframes from a groupby to print them as pandas HTML tables. I need to reference and render them individually as tables so I can screenshot them for a presentation.

This is my existing code:

import pandas as pd

df = pd.DataFrame(
{'area': [5, 42, 20, 20, 43, 78, 89, 30, 46, 78],
'cost': [52300, 52000, 25000, 61600, 43000, 23400, 52300, 62000, 62000, 73000],
'grade': [1, 3, 2, 1, 2, 2, 2, 4, 1, 2], 'size': [1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005],
'team': ['man utd', 'chelsea', 'arsenal', 'man utd', 'man utd', 'arsenal', 'man utd', 'chelsea', 'arsenal', 'arsenal']})

result = df.groupby(['team', 'grade']).agg({'cost':'mean', 'area':'mean', 'size':'sum'}).rename(columns={'cost':'mean_cost', 'area':'mean_area'})

dfs = {team:grp.drop('team', axis=1)
for team, grp in result.reset_index().groupby('team')}

for team, grp in dfs.items():
print('{}:\n{}\n'.format(team, gap))


Which prints (as non HTML tables):

chelsea:
grade mean_cost mean_area size
2 3 52000 42 957
3 4 62000 30 849

arsenal:
grade mean_cost mean_area size
0 1 62000.000000 46.000000 973
1 2 40466.666667 58.666667 3110

man utd:
grade mean_cost mean_area size
4 1 56950 12.5 2445
5 2 47650 66.0 2579


Is it possible to get these dataframes one by one as HTML tables? For the avoidance of doubt, I don't need an iterative method to return them all as HTML tables - am happy to reference each one individually.

Answer

You can use df.to_html() to generate HTML. And per Harmon's answer, HTML can be displayed in the IPython notebook by using

from IPython.core import display as ICD
html = df.to_html()
ICD.display(ICD.HTML(html))

For example, in an IPython notebook,

import pandas as pd
from IPython.core import display as ICD


df = pd.DataFrame(
    {'area': [5, 42, 20, 20, 43, 78, 89, 30, 46, 78],
     'cost': [52300, 52000, 25000, 61600, 43000, 23400, 52300, 62000, 62000, 73000], 
     'grade': [1, 3, 2, 1, 2, 2, 2, 4, 1, 2], 'size': [1045, 957, 1099, 1400, 1592, 1006, 987, 849, 973, 1005], 
     'team': ['man utd', 'chelsea', 'arsenal', 'man utd', 'man utd', 'arsenal', 'man utd', 'chelsea', 'arsenal', 'arsenal']})

result =  df.groupby(['team', 'grade']).agg({'cost':'mean', 'area':'mean', 'size':'sum'}).rename(columns={'cost':'mean_cost', 'area':'mean_area'})

dfs = {team:grp.drop('team', axis=1) 
       for team, grp in result.reset_index().groupby('team')}

for team, grp in dfs.items():
    print(team)
    ICD.display(ICD.HTML(grp.to_html()))

generates
enter image description here