R Mahmood - 1 year ago
Python Question

Match identical words in two prints

I am using os to list the filenames in a directory and pandas to read one column of a CSV file. I have printed both results, and now I want to find the names that appear in both prints and the names that appear in only one of them. Below is the code that gets the filenames and the contents of the CSV column.

import os, sys
import pandas as pd

path = "/mydir/csvfile"
dirs = os.listdir( path )

for file in dirs:
    print file

fields = ['Column']

df = pd.read_csv('/mydir/csv_file', skipinitialspace=True, usecols=fields)

print df.Column

Answer

Instead of

for file in dirs:
    print file

keep the names in a list (os.listdir already returns a list, so this is just a copy):

files = [file for file in dirs]

Then use the DataFrame to check:

df.Column.isin(files)  # this will check elementwise
0    True
1    True
2    True
3    True
Name: Column, dtype: bool
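
The boolean Series can also be used as a mask to pull out the matching and non-matching values. This is only a minimal sketch: the filenames and the in-memory DataFrame below are made-up stand-ins for the real directory listing and CSV file, and it is written with print() so it runs on Python 3.

```python
import pandas as pd

# Hypothetical stand-ins for os.listdir(path) and the CSV column
files = ["a.txt", "b.txt", "c.txt"]
df = pd.DataFrame({"Column": ["a.txt", "b.txt", "d.txt"]})

mask = df.Column.isin(files)   # True where the value is also a filename

in_both = df.Column[mask]      # values found in the CSV and in the directory
csv_only = df.Column[~mask]    # values found only in the CSV

print(in_both.tolist())
print(csv_only.tolist())
```

Note that the mask only covers one direction: it says nothing about filenames that are missing from the CSV.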


df.Column.isin(files).all()  # True only if every value in Column is also a filename
Out: True
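
To get the names that are exclusive to one side as well, plain Python sets are one option. The helper name compare_names and the sample data are hypothetical and not part of the original code; in the question's setting the two arguments would be the os.listdir result and df.Column. Treat this as a sketch that assumes exact string matches (same case, same extension).

```python
def compare_names(filenames, column_values):
    """Return (in_both, files_only, csv_only) as sorted lists."""
    file_set = set(filenames)
    csv_set = set(column_values)
    in_both = sorted(file_set & csv_set)     # intersection: present in both
    files_only = sorted(file_set - csv_set)  # in the directory, not in the CSV
    csv_only = sorted(csv_set - file_set)    # in the CSV, not in the directory
    return in_both, files_only, csv_only


# Hypothetical sample data standing in for os.listdir(path) and df.Column
in_both, files_only, csv_only = compare_names(
    ["a.txt", "b.txt", "c.txt"],
    ["a.txt", "b.txt", "d.txt"],
)

print(in_both)
print(files_only)
print(csv_only)
```

Sets drop duplicates and ordering, so if the CSV column repeats a name it will be reported once.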
