Michael Molter - 6 months ago
Python Question

"not in" comparison not working as expected

I am having trouble with the not in comparison operator in Python 2.7. I have a list of US state abbreviations, and I want to check whether a given abbreviation is not in that list, so I use:

'IL' not in states['Abbreviation']


Unexpectedly, I get True; however, when I do the following, I also get True.

'IL' == states['Abbreviation'][13]


'IL' is the 14th item on the list of abbreviations, and when I use == I can prove that it is in the list; however, when I use the not in comparison, it doesn't see it in the list. What gives?

I am a little new to Python, so hopefully the answer isn't too embarrassing.

Thanks,

Michael

EDIT: And yes, I did my best to 'google' an answer before posting, but searching Google for the terms 'not in' is a futile endeavor, and the behavior I described above does not seem to be consistent with how the comparison is said to work in the documentation.

EDIT2: The list

In [89]: states['Abbreviation']

Out[89]:

0 AL
1 AK
2 AZ
3 AR
4 CA
5 CO
6 CT
7 DE
8 DC
9 FL
10 GA
11 HI
12 ID
13 IL
14 IN
15 IA
16 KS
17 KY
18 LA
19 ME
20 MT
21 NE
22 NV
23 NH
24 NJ
25 NM
26 NY
27 NC
28 ND
29 OH
30 OK
31 OR
32 MD
33 MA
34 MI
35 MN
36 MS
37 MO
38 PA
39 RI
40 SC
41 SD
42 TN
43 TX
44 UT
45 VT
46 VA
47 WA
48 WV
49 WI
50 WY
Name: Abbreviation, dtype: object


EDIT3:

I defined the list using pandas in an IPython notebook:

import pandas as pd
states = pd.read_table('states.csv', sep=',')


states.csv is a file containing the state name in the first column and the abbreviation in the second. That's pretty much all of it. What confused me is why I could use == in one line to show it was in the list, and then not have not in give the right answer.

EDIT4:

As requested by a reply,

In [92]: type(states['Abbreviation'])
Out[92]: pandas.core.series.Series

Answer

It looks like your states is not a list but a pandas DataFrame, and states['Abbreviation'] is one of its columns (a pandas Series). Using in on a Series checks whether the value is among the index labels, not the values. Here the index holds the integers 0 to 50, so 'IL' is not found and 'IL' not in states['Abbreviation'] is True. Try 'IL' in states['Abbreviation'].values.
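To make the index-versus-values distinction concrete, here is a minimal sketch. The Series below is a hypothetical four-item stand-in for states['Abbreviation'] (the real one is read from states.csv), and the variable names are made up for illustration. The isin and == variants are alternatives not mentioned above that should give the same result.

```python
import pandas as pd

# Hypothetical stand-in for states['Abbreviation']; the real column comes from states.csv.
abbreviations = pd.Series(['AL', 'AK', 'AZ', 'IL'], name='Abbreviation')

# `in` on a Series tests the index labels (0, 1, 2, 3 here), not the stored values.
label_found = 3 in abbreviations        # 3 is an index label
value_found = 'IL' in abbreviations     # 'IL' is a value, not an index label

# Testing against the values instead, as suggested in the answer.
found_via_values = 'IL' in abbreviations.values

# Alternative spellings (assumed equivalent for this case).
found_via_isin = bool(abbreviations.isin(['IL']).any())
found_via_eq = bool((abbreviations == 'IL').any())

print(label_found, value_found, found_via_values, found_via_isin, found_via_eq)
```

With the default integer index, only the second check should come back False, which is why 'IL' not in states['Abbreviation'] evaluated to True in the question.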