I am using os to list the filenames in a directory and pandas to read one column of a CSV file. I have printed both results, and now I want to find the names that appear in both and identify the names that appear in only one of them. Below is my code, which gets the filenames and the contents of the CSV column.
import os
import pandas as pd
path = "/mydir/csvfile"
dirs = os.listdir(path)  # names of the entries in the directory
fields = ['Column']
# read only the column of interest from the CSV file
df = pd.read_csv('/mydir/csv_file', skipinitialspace=True, usecols=fields)
for file in dirs:
    print(file)
Build a list:
files = [file for file in dirs]
Then use the DataFrame to check:
df.Column.isin(files)  # checks each value in Column against the filenames
Out:
0    True
1    True
2    True
3    True
Name: Column, dtype: bool
df.Column.isin(files).all()  # True only if every value in Column is also a filename
Out:
True
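Note that isin only tells you which CSV values are present in the directory. To also get the names that are exclusive to one side, set operations are one option. Below is a minimal sketch, not a definitive implementation: the helper compare_names, the temporary directory, and the sample filenames are all made up for illustration, and the small DataFrame stands in for the one you read from your CSV file.

```python
import os
import tempfile

import pandas as pd


def compare_names(dir_names, column):
    """Split names into those in both sources and those exclusive to one.

    Hypothetical helper: dir_names is a list such as os.listdir() returns,
    column is a pandas Series such as df['Column'].
    """
    dir_set = set(dir_names)
    csv_set = set(column.dropna())  # ignore missing values in the column
    return {
        "both": sorted(dir_set & csv_set),               # intersection
        "only_in_directory": sorted(dir_set - csv_set),  # files not in the CSV
        "only_in_csv": sorted(csv_set - dir_set),        # CSV names with no file
    }


# Illustrative data only: a throwaway directory with three empty files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.txt", "b.txt", "c.txt"]:
        open(os.path.join(tmp, name), "w").close()
    dirs = os.listdir(tmp)

# Stand-in for the DataFrame produced by pd.read_csv(..., usecols=['Column']).
df = pd.DataFrame({"Column": ["b.txt", "c.txt", "d.txt"]})

result = compare_names(dirs, df["Column"])
print(result)
```

With your own data you would pass the dirs list and df['Column'] from the question directly. Sets drop duplicates and ordering, so if you need to keep the original rows, filtering with df[~df.Column.isin(files)] gives the CSV rows that have no matching file.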