ResMar ResMar - 27 days ago 16
Python Question

Pandas multi-indexing succeeds when given a tuple, fails with a list

I have data in the form of an array of lists, of the form

[['Manhattan', 142, 42], [...]]
. I have a
pd.DataFrame
with a multi-index, one containing, among other things, a column called
VAC
.

The following raises a
ValueError
:

for vac_bbl in vac_bbls:
property_profiles['VAC'][vac_bbl] = None


The traceback:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-98-e8edfc85d7ba> in <module>()
1 for vac_bbl in vac_bbls:
----> 2 property_profiles['VAC'][vac_bbl] = None

C:\Anaconda3\envs\test\lib\site-packages\pandas\core\series.py in __setitem__(self, key, value)
751 # do the setitem
752 cacher_needs_updating = self._check_is_chained_assignment_possible()
--> 753 setitem(key, value)
754 if cacher_needs_updating:
755 self._maybe_update_cacher()

C:\Anaconda3\envs\test\lib\site-packages\pandas\core\series.py in setitem(key, value)
747 pass
748
--> 749 self._set_with(key, value)
750
751 # do the setitem

C:\Anaconda3\envs\test\lib\site-packages\pandas\core\series.py in _set_with(self, key, value)
795 self._set_values(key.astype(np.bool_), value)
796 else:
--> 797 self._set_labels(key, value)
798
799 def _set_labels(self, key, value):

C:\Anaconda3\envs\test\lib\site-packages\pandas\core\series.py in _set_labels(self, key, value)
805 mask = indexer == -1
806 if mask.any():
--> 807 raise ValueError('%s not contained in the index' % str(key[mask]))
808 self._set_values(indexer, value)
809

ValueError: ['Manhattan' 1750.0 53.0] not contained in the index


However, the following works fine:

for vac_bbl in vac_bbls:
property_profiles['VAC'][tuple(vac_bbl)] = None


Why is this?

Answer

pandas uses a list in this context to return a dataframe of columns in which each column is indexed with each specific item in the list. A tuple is used to represent the multiple layers for that particular column in the multiindex. This makes perfect sense and is as expected.

You can also pass a list of tuples that will return a dataframe of columns, one column for each tuple.

Comments