codingknob codingknob - 4 months ago 12
Python Question

code to regenerate a dataframe in pandas for Stackoverflow/SO questions

Let's say I have the following dataframe. I want to ask a question on Stackoverflow/SO regarding a type of manipulation I am trying to do. To help users on SO, it's generally best practice to supply the code that regenerates the dataframe in question.

                sunlight
                     sum count
city date
SFO  2014-05-31 -1805.04    31
SFO  2014-06-30  -579.52    30
SFO  2014-07-31  1025.51    31
SFO  2014-08-31  -705.18    31
SFO  2014-09-30 -1214.33    30


I don't want to manually type in all the text required to supply the code that generates the above dataframe. Is there a pandas function/command I can invoke that would output the dataframe in some sort of structure that someone can easily copy and paste into their python/ipython command line in order to generate the dataframe object? Something like
df.head().to_clipboard()
but instead of copying the display of the df, it would copy the code required to produce the df.

The above dataframe is fairly simple, but for complicated dataframes it's extremely cumbersome to manually type in the code required to generate the dataframe in a SO question.

Answer

Use to_dict()

Let's say you have this df

import numpy as np
import pandas as pd

# 4x4 frame: rows a-d, two-level columns (A/B) x (One/Two)
df = pd.DataFrame(np.arange(16).reshape(4, 4), list('abcd'),
                  pd.MultiIndex.from_product([list('AB'), ['One', 'Two']]))
df

print(df)

    A       B    
  One Two One Two
a   0   1   2   3
b   4   5   6   7
c   8   9  10  11
d  12  13  14  15

I'd first print df.to_dict()

print(df.to_dict())

{('A', 'One'): {'a': 0, 'b': 4, 'c': 8, 'd': 12}, ('A', 'Two'): {'a': 1, 'b': 5, 'c': 9, 'd': 13}, ('B', 'One'): {'a': 2, 'b': 6, 'c': 10, 'd': 14}, ('B', 'Two'): {'a': 3, 'b': 7, 'c': 11, 'd': 15}}

Then I'd copy that and paste it into a pd.DataFrame(). You can slightly format the copied text for readability.

df = pd.DataFrame({('A', 'One'): {'a': 0, 'b': 4, 'c': 8, 'd': 12},
                   ('A', 'Two'): {'a': 1, 'b': 5, 'c': 9, 'd': 13},
                   ('B', 'One'): {'a': 2, 'b': 6, 'c': 10, 'd': 14},
                   ('B', 'Two'): {'a': 3, 'b': 7, 'c': 11, 'd': 15}})

df

print(df)

    A       B    
  One Two One Two
a   0   1   2   3
b   4   5   6   7
c   8   9  10  11
d  12  13  14  15
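
If you do this often, the copy-and-paste step can be scripted. The following is only a sketch: df_to_code is a hypothetical helper name, not a pandas function, and it assumes the frame holds values whose repr() is valid Python (plain ints, floats, strings).

```python
import numpy as np
import pandas as pd


def df_to_code(df, name="df"):
    """Return a one-line statement that rebuilds ``df`` from df.to_dict().

    Hypothetical helper, not part of pandas. It relies on repr() of the
    dict being valid Python source, which is an assumption about the data.
    """
    return "{} = pd.DataFrame({!r})".format(name, df.to_dict())


# Same example frame as in the answer above
df = pd.DataFrame(np.arange(16).reshape(4, 4), list('abcd'),
                  pd.MultiIndex.from_product([list('AB'), ['One', 'Two']]))

snippet = df_to_code(df)
print(snippet)  # paste this line into the question

# Round-trip check: run the generated statement in a separate namespace
namespace = {"pd": pd, "np": np}
exec(snippet, namespace)
rebuilt = namespace["df"]
print(rebuilt)
```

Two caveats with this approach: to_dict() does not carry index or column level names, and values such as timestamps print as Timestamp('...'), so whoever pastes the snippet would also need from pandas import Timestamp.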