user2604307 - 14 days ago 5
Python Question

Pandas itertuples: first row matches `False in row` even when no False is present

I have two CSV files with exactly the same rows, shown below:


asas,asafdfdd,fgffgdvnufg,rterrtrrtr,wewewtyuhe,yuuiiyuyuy,uiuiui9u
absas,a2assafdfdd,fgffgedkfg,rtdfrtrrtr,wewewuikjhe,yuuiuiyouyuy,ui7u8iuiu
asbas,asasdfdfdd,fgffgfpoftg,rtrjktrrtr,wewewuyihe,yuyuyyupuy,uiu7iuiu
asabs,asafddffdd,fgffg2floig,rtrtrcxcrtr,weweyjunwe,yuyuyumy,uiui6uiu
asasbb,asafddfdd,fgffgdfkfg,rtrtrjkhrtr,wewewdfxe,yuyuyuny,uiui5uiu
absbas,asafdrtfdd,fgffgvbfg,rtrt3rrcxvtr,wewedfcwe,yuycuyuy,uiu4iuiu


I read these two CSV files into two DataFrames named df1 and df2 respectively. When I do
result = (df1 == df2)
I get another DataFrame, result, holding True/False values for each match (in this case, True everywhere).

Now, with the code below, the first row is printed even though there is no False value in that tuple.

for row in result.itertuples():
    if False in row:
        print(row)


Why is this? Do I need to do something different here?

The whole code is here for reference:

import pandas as pd

df1 = pd.read_csv('test3.csv', header=None)
df2 = pd.read_csv('test4.csv', header=None)

result = (df1 == df2)
print(result)
for row in result.itertuples():
    if False in row:
        print(row)

Answer

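It's because the first row's tuple has a zero in it. itertuples() puts the index value first in each tuple, and the first index value is zero. In Python, False == 0, so False in [0] is True, and the check succeeds for that row even though every comparison value is True.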
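The effect can be reproduced without the CSV files. The sketch below is only an illustration: the small all-True DataFrame is a hypothetical stand-in for the asker's result, not the original data.

```python
import pandas as pd

# Hypothetical stand-in for the comparison result: two rows, every value True.
result = pd.DataFrame([[True, True], [True, True]])

rows = list(result.itertuples())
first, second = rows[0], rows[1]

# itertuples() puts the index label first: 0 for the first row, 1 for the second.
first_index = first[0]
second_index = second[0]

# The `in` operator compares with ==, and False == 0 in Python,
# so the index label 0 counts as a match for False.
first_has_false = False in first    # matches because of the index label 0
second_has_false = False in second  # no match: neither 1 nor True equals False

print(first_has_false, second_has_false)
```

Only the row whose index label is 0 is affected, which is why the loop prints just the first row.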

If you shift your index values

result.index += 1

and then run your loop, it won't print.


That explains why this happens, but I wouldn't do it this way.

I'd do this instead:

for i, row in result.iterrows():
    if not row.all():
        print(row)
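Two other ways to avoid the index problem are sketched below, offered as possibilities rather than as the answerer's method. The sample frames are made up for illustration, with one deliberate mismatch in the second row. The first option keeps the original loop but asks itertuples() to leave the index out; the second skips the loop and selects the mismatched rows with a boolean mask.

```python
import pandas as pd

# Hypothetical sample data with one deliberate mismatch in the second row.
df1 = pd.DataFrame([["a", "b"], ["c", "d"], ["e", "f"]])
df2 = pd.DataFrame([["a", "b"], ["c", "X"], ["e", "f"]])
result = (df1 == df2)

# Option 1: drop the index from each tuple so only comparison values are checked.
mismatched_tuples = [
    row for row in result.itertuples(index=False) if False in row
]

# Option 2: no explicit loop; keep the rows where not every column matched.
mismatched_rows = result[~result.all(axis=1)]

print(mismatched_tuples)
print(mismatched_rows)
```

The mask version also keeps the original index labels, so it shows which rows differ.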