Michael Molter - 5 months ago
Python Question

"not in" comparison not working as expected

I am having trouble with the 'not in' comparison operator in Python 2.7. I have a list of US state abbreviations, and I want to check if a given abbreviation is not in that list, so I use:

'IL' not in states['Abbreviation']

Unexpectedly, I got True; however, when I do the following, I also get True:

'IL' == states['Abbreviation'][13]

'IL' is the 14th item on the list of abbreviations, and when I use == I can prove that it is in the list; however, when I use the 'not in' comparison, it doesn't see it in the list. What gives?

I am a little new to Python, so hopefully the answer isn't too embarrassing.



EDIT: And yes, I did my best to 'google' an answer before posting, but searching Google for the terms 'not in' is a futile endeavor, and the behavior I described above does not seem to be consistent with how the comparison is said to work in the documentation.

EDIT2: The list

In [89]: states['Abbreviation']


0 AL
1 AK
2 AZ
3 AR
4 CA
5 CO
6 CT
7 DE
8 DC
9 FL
10 GA
11 HI
12 ID
13 IL
14 IN
15 IA
16 KS
17 KY
18 LA
19 ME
20 MT
21 NE
22 NV
23 NH
24 NJ
25 NM
26 NY
27 NC
28 ND
29 OH
30 OK
31 OR
32 MD
33 MA
34 MI
35 MN
36 MS
37 MO
38 PA
39 RI
40 SC
41 SD
42 TN
43 TX
44 UT
45 VT
46 VA
47 WA
48 WV
49 WI
50 WY
Name: Abbreviation, dtype: object


I defined the list using pandas in an IPython notebook:

import pandas as pd
states = pd.read_table('states.csv', sep=',')

states.csv is a file containing the state name in the first column and the abbreviation in the second. That's pretty much all of it. What confused me is why I could use == in one line to show it was in the list, and then not have 'not in' give the right answer.


As requested by a reply,

In [92]: type(states['Abbreviation'])
Out[92]: pandas.core.series.Series


It looks like your states is not a list, but a pandas DataFrame, and states['Abbreviation'] is one of its columns (a pandas Series). Using in on a Series checks whether the value is in the index, not the values. Try 'IL' in states['Abbreviation'].values.
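
To make the difference concrete, here is a minimal sketch. The short Series below is a hypothetical stand-in for states['Abbreviation'] (only the first four rows, built by hand instead of read from states.csv), and the variable names are made up for illustration. It shows that in looks at the index labels, which is also why states['Abbreviation'][13] works: it looks up the label 13, not the value.

```python
import pandas as pd

# Hypothetical stand-in for states['Abbreviation'] (first four rows only).
abbreviations = pd.Series(['AL', 'AK', 'AZ', 'AR'], name='Abbreviation')

# `in` on a Series tests the index labels (0, 1, 2, 3 here), not the values.
label_hit = 2 in abbreviations        # True: 2 is an index label
value_miss = 'AZ' in abbreviations    # False: 'AZ' is a value, not a label

# Ways to test membership against the values instead.
in_values = 'AZ' in abbreviations.values      # True: checks the underlying array
in_any = (abbreviations == 'AZ').any()        # True: elementwise comparison
in_isin = abbreviations.isin(['AZ']).any()    # True: handy for several candidates

print(label_hit, value_miss, in_values, bool(in_any), bool(in_isin))
```

So for the original check, 'IL' not in states['Abbreviation'].values should give the expected False, assuming the column really contains 'IL' with no stray whitespace.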