duckertito - 12 days ago
Python Question

Reading the column of DataFrame fails

I am using Python 2.7 with Anaconda2. I read a tab-separated text file into a DataFrame with

df = pd.read_table("/home/testtab.txt",sep='\t',index_col=False)

and then select one of the columns with

df["col1"].head()

which raises a KeyError (see below).
The row index is also still present, although I expected
index_col=False
to disable it.
The output of
df.columns
is the following:

Index([u'col1', u'col2'],dtype='object')


Error:

KeyError Traceback (most recent call last)
<ipython-input-3-c472f01c3482> in <module>()
----> 1 df["col1"].head()

/home/gooo/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
1995 return self._getitem_multilevel(key)
1996 else:
-> 1997 return self._getitem_column(key)
1998
1999 def _getitem_column(self, key):

/home/gooo/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py in _getitem_column(self, key)
2002 # get column
2003 if self.columns.is_unique:
-> 2004 return self._get_item_cache(key)
2005
2006 # duplicate columns & possible reduce dimensionality

/home/gooo/anaconda2/lib/python2.7/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
1348 res = cache.get(item)
1349 if res is None:
-> 1350 values = self._data.get(item)
1351 res = self._box_item_values(item, values)
1352 cache[item] = res

/home/gooo/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py in get(self, item, fastpath)
3288
3289 if not isnull(item):
-> 3290 loc = self.items.get_loc(item)
3291 else:
3292 indexer = np.arange(len(self.items))[isnull(self.items)]

/home/gooo/anaconda2/lib/python2.7/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
1945 return self._engine.get_loc(key)
1946 except KeyError:
-> 1947 return self._engine.get_loc(self._maybe_cast_indexer(key))
1948
1949 indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)()

KeyError: 'col1'


EDIT:

Output of
df.info()
:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7499900 entries, 0 to 7499899
Data columns (total 18 columns):
col1 object
col2 float64
dtypes: float64(1), object(1)
memory usage: 1.0+ GB
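
On the row index: a pandas DataFrame always carries an index, and index_col=False only tells the parser not to use the first column of the file as that index, so a default integer index (the RangeIndex in the df.info() output above) is expected. The sketch below illustrates this on a small in-memory sample. The sample data is hypothetical and merely stands in for /home/testtab.txt; it is not the real file.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for the real tab-separated file.
sample = u"col1\tcol2\na\t1.5\nb\t2.5\n"

# Same call shape as in the question, reading from memory instead of disk.
df = pd.read_table(io.StringIO(sample), sep="\t", index_col=False)

# The default integer index is still present after index_col=False.
print(df.index)

# To hide the index in printed output only, ask to_string to omit it.
print(df.to_string(index=False))

# Column selection by label works as usual on this sample.
col1_values = df["col1"].tolist()
print(col1_values)
```

If the goal is only to keep the index out of a written file, passing index=False to df.to_csv has the same effect on output without changing the DataFrame itself.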

Answer

The problem occurred with pandas 0.18.1. After upgrading to 0.19.1, it works fine.

It was probably a bug in that version.
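
To see whether an environment is still on the older release, the installed version string can be compared against 0.19.1. This is a minimal sketch: the helper names version_tuple and older_than_known_good are made up for illustration, and 0.19.1 is only the version reported as working here, not a confirmed fix version.

```python
import re


def version_tuple(version):
    """Turn a release string such as '0.18.1' into a tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits so '0rc1' is read as 0.
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def older_than_known_good(installed, known_good="0.19.1"):
    """Return True if 'installed' predates the version reported as working."""
    return version_tuple(installed) < version_tuple(known_good)


print(older_than_known_good("0.18.1"))
print(older_than_known_good("0.19.1"))
```

In a real session the installed version is available as pd.__version__ after import pandas as pd, so older_than_known_good(pd.__version__) would perform the check. With Anaconda, the upgrade itself is typically done with conda update pandas.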
