R Mahmood - 4 months ago
Python Question

Match identical words in two prints

I am using os to list the filenames in a directory, and pandas to read one column of a CSV file. I have printed both results, and now I want to find the names that appear in both prints and also identify the names that appear in only one of them. Below is my code, which lists the filenames and reads the CSV column.

import os, sys
import pandas as pd


path = "/mydir/csvfile"
dirs = os.listdir( path )

for file in dirs:
    print file

fields = ['Column']

df = pd.read_csv('/mydir/csv_file', skipinitialspace=True, usecols=fields)

print df.Column

Answer

Instead of printing each file name:

for file in dirs:
    print file

build a list of them:

files = [file for file in dirs]

Then use the DataFrame to check:

df.Column.isin(files)  # this will check elementwise
Out: 
0    True
1    True
2    True
3    True
Name: Column, dtype: bool

Or, to check whether every value matches:

df.Column.isin(files).all()  # True only if every value in Column is also a file name
Out: True
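
The question also asks which names are exclusive to one side, which isin alone does not report for the directory listing. One possible way to get all three groups is plain set arithmetic. This is only a minimal sketch: compare_names is a hypothetical helper, not part of os or pandas, and the sample names are made up to stand in for os.listdir(path) and df.Column.

```python
def compare_names(dir_names, csv_names):
    """Split two collections of names into shared and exclusive groups."""
    dir_set, csv_set = set(dir_names), set(csv_names)
    return {
        "both": sorted(dir_set & csv_set),         # present in both
        "only_in_dir": sorted(dir_set - csv_set),  # on disk, not in the CSV
        "only_in_csv": sorted(csv_set - dir_set),  # in the CSV, not on disk
    }


# Made-up sample data standing in for os.listdir(path) and df.Column
result = compare_names(["a.txt", "b.txt", "c.txt"], ["b.txt", "c.txt", "d.txt"])
print(result["both"])         # ['b.txt', 'c.txt']
print(result["only_in_dir"])  # ['a.txt']
print(result["only_in_csv"])  # ['d.txt']
```

With the variables from the question this would presumably be called as compare_names(dirs, df.Column). Sets drop duplicates and ignore order, so this fits when only the distinct names matter. The comparison is exact, so differences in case or stray whitespace count as different names.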