I'm working with pandas and have a dataframe that looks like this:
import pandas as pd
df = pd.DataFrame({'AAA': [4, 5, 6, 7], 'BBB': [100, 100, 30, 40], 'CCC': [100, 100, 30, -50]})
I'm using .groupby() and .size() to find rows that are duplicated in the 'BBB' and 'CCC' columns only, and then turning the result into a dataframe like this:
duplicates = df.groupby(['BBB', 'CCC']).size().to_frame('num')
I find the format of this new dataframe, duplicates, hard to work with, even though it contains all the data I need. This is how it looks in the Variable Explorer in Spyder:
Index      num
(30,30)    1
(40,-50)   1
(100,100)  2
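In case it helps to reproduce this, here is a minimal self-contained snippet that builds the same frame and prints what the index holds. The names index_names and index_values are just helper variables I made up for this post, and I'm assuming a reasonably recent pandas version.

```python
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [100, 100, 30, 40],
                   'CCC': [100, 100, 30, -50]})
duplicates = df.groupby(['BBB', 'CCC']).size().to_frame('num')

# Hypothetical helper variables, only used to inspect the index
index_names = list(duplicates.index.names)   # names of the index levels
index_values = duplicates.index.tolist()     # one tuple per row of the index

print(type(duplicates.index).__name__)       # class of the index object
print(index_names)
print(index_values)
print(duplicates.columns.tolist())           # the only real column is 'num'
```

So as far as I can tell, 'BBB' and 'CCC' are not ordinary columns of duplicates; they only exist as levels of the index.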
So the index contains each combination of 'BBB' and 'CCC' values, and num contains how many times that combination occurs. I don't know how to access the data in the index and split it into individual columns, so the index is the hardest part to work with. I would really like the output to look like this instead:
Index  'BBB'  'CCC'  num
0      30     30     1
1      40     -50    1
2      100    100    2
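To make the target unambiguous, this is the desired frame built by hand. The name wanted is made up for this post, and the frame is typed in manually rather than derived from duplicates; it only shows the shape I'm after (a plain 0..n-1 index with 'BBB', 'CCC' and num as ordinary columns).

```python
import pandas as pd

# Hand-written version of the desired result (hypothetical name `wanted`);
# this is NOT computed from `duplicates`, it only illustrates the target layout.
wanted = pd.DataFrame({'BBB': [30, 40, 100],
                       'CCC': [30, -50, 100],
                       'num': [1, 1, 2]})

print(wanted)
```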
P.S.
Sorry if the formatting is bad; I still haven't figured out how to post well on this site.