Claire - 9 days ago
Python Question

Add a row to a pandas DataFrame without knowing the number of columns

I'm new to Python and would appreciate an answer as soon as possible.

I'm processing a file of product reviews in which a product can belong to more than one category. I need to group the review ratings by category and by date at the same time. Since I don't know the exact number of categories or dates in advance, I need to add rows and columns as I process the review data (a 50 GB file).

I've seen how to add columns, but my trouble is adding a row without knowing how many columns are currently in the DataFrame.

Here is my code:

import pandas

list1 = ['Movies & TV', 'Books']  # categories so far
dfMain = pandas.DataFrame(index=list1, columns=['2002-09'])  # only one column at the beginning
print(dfMain)


This is what dfMain looks like:

             2002-09
Movies & TV      NaN
Books            NaN

If I want to add a column, I simply do this:

dfMain.insert(0, date, 0)  # where date is a string in a format like '2002-09'

But what if I want to add a new category (row) and fill all the dates (columns) with zeros? How do I do that? I've tried the append method, but it asks for all the columns as parameters. The insert method doesn't seem to work either.

Answer

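Here's a possible solution (pd is the usual alias from import pandas as pd). Note that append returns a new DataFrame rather than modifying dfMain in place, so assign the result back:

dfMain = dfMain.append(pd.Series(index=dfMain.columns, name='NewRow').fillna(0))
print(dfMain)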

             2002-09
Movies & TV      NaN
Books            NaN
NewRow           0.0
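
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on a recent release the line above will fail. As a rough sketch of an alternative (not the only way to do it), a row can be added without listing the columns by assigning a scalar to a new label with .loc, or by building a one-row frame from dfMain.columns and using pd.concat. The labels 'NewRow' and 'AnotherRow' below are placeholders for illustration only.

```python
import pandas as pd

categories = ['Movies & TV', 'Books']  # categories so far
dfMain = pd.DataFrame(index=categories, columns=['2002-09'])

# Assigning a scalar to a label that does not exist yet adds a row
# and broadcasts the value across every existing column.
dfMain.loc['NewRow'] = 0

# Equivalent with pd.concat: build a one-row frame that reuses the
# current columns, then concatenate (this returns a new DataFrame).
new_row = pd.DataFrame(0, index=['AnotherRow'], columns=dfMain.columns)
dfMain = pd.concat([dfMain, new_row])

print(dfMain)
```

Either form works regardless of how many date columns exist at that point. For a file as large as 50 GB, growing the frame one row at a time tends to be slow because data is copied repeatedly; collecting the counts in a plain dict and building the DataFrame once at the end is usually cheaper.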