oliversm - 4 months ago
Python Question

Pandas.read_excel: Accessing the home directory

[Solution Found]

I have encountered some unexpected behavior when trying to access my home directory using pandas.read_excel.

My home directory is

/users/isys/orsheridanmeth

which is where cd ~/ takes me. The file I would like to access is

'~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx'


The following works to read in the Excel file (using import pandas as pd):

df = pd.read_excel('workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx', 'Sheet1')


whereas

df = pd.read_excel('~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx', 'Sheet1')


gives me the following error:

df = pd.read_excel('~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx', 'Sheet1')
Traceback (most recent call last):
  File "/users/is/ahlpypi/egg_cache/i/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/core/interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-397-4412a9e7c128>", line 1, in <module>
    df = pd.read_excel('~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx', 'Sheet1')
  File "/users/is/ahlpypi/egg_cache/p/pandas-0.16.2_ahl1-py2.7-linux-x86_64.egg/pandas/io/excel.py", line 151, in read_excel
    return ExcelFile(io, engine=engine).parse(sheetname=sheetname, **kwds)
  File "/users/is/ahlpypi/egg_cache/p/pandas-0.16.2_ahl1-py2.7-linux-x86_64.egg/pandas/io/excel.py", line 188, in __init__
    self.book = xlrd.open_workbook(io)
  File "/users/is/ahlpypi/egg_cache/x/xlrd-0.9.2-py2.7.egg/xlrd/__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
IOError: [Errno 2] No such file or directory: '~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx'


pandas.read_csv, however, worked when I used pd.read_csv('~/workspace/tip_rank/data/example_TRESS-2016-7-5.csv').

I would like to continue using these paths relative to my home directory. Is there any explanation for why this doesn't work with pandas.read_excel?

Using xlrd

When using xlrd I get a similar error:

import xlrd
xl = xlrd.open_workbook('~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx')
Traceback (most recent call last):
  File "/users/is/ahlpypi/egg_cache/i/ipython-3.2.0_1_ahl1-py2.7.egg/IPython/core/interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-403-90af31feff4b>", line 1, in <module>
    xl = xlrd.open_workbook('~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx')
  File "/users/is/ahlpypi/egg_cache/x/xlrd-0.9.2-py2.7.egg/xlrd/__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
IOError: [Errno 2] No such file or directory: '~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx'


[SOLUTION]

import os.path
import pandas as pd
# expanduser replaces the leading ~ with the home directory before pandas opens the file
df = pd.read_excel(os.path.expanduser('~/workspace/tip_rank/data/example_TipRanksBloggersRawDataFeed.xlsx'), 'Sheet1')

Answer

I believe ~ is expanded by the shell - in which case your code is literally trying to open a path starting with ~. Oddly enough this doesn't work. :-)
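
To see this without pandas or xlrd in the picture, here is a minimal sketch using only the built-in open(). The path is made up for illustration, and the sketch assumes there is no directory literally named ~ in your current working directory. open() receives the string unchanged, so it looks for a folder called ~ relative to where you are.

```python
import errno

# Hypothetical path, for illustration only. open() performs no tilde
# expansion, so "~" is treated as an ordinary (relative) directory name.
literal_path = '~/no_such_dir_for_demo/example.xlsx'


def try_open(path):
    """Return None if the file opens, otherwise the errno of the failure."""
    try:
        with open(path, 'rb'):
            return None
    except IOError as exc:  # IOError is an alias of OSError on Python 3
        return exc.errno


result = try_open(literal_path)
# ENOENT is the "[Errno 2] No such file or directory" seen in the traceback.
print(result == errno.ENOENT)
```

This is the same failure as the one at the bottom of the traceback in the question, where xlrd calls open(filename, "rb") directly.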

Try running the path through os.path.expanduser() first; that should expand the leading ~ to your actual home directory.
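
As a rough sketch of what os.path.expanduser() does to the string (the file names here are shortened, hypothetical ones, and the printed home directory will depend on your system):

```python
import os.path

# Hypothetical file name, used only to show the string transformation.
raw = '~/workspace/data/example.xlsx'
expanded = os.path.expanduser(raw)

# Only a leading ~ is expanded; a ~ elsewhere in the path is left alone.
untouched = os.path.expanduser('workspace/~/example.xlsx')

print(expanded)
print(untouched)
```

The result is an ordinary string, so it can be passed to anything that ends up calling open(), including pd.read_excel and xlrd.open_workbook.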

You may also want to look into os.path.expandvars(), which expands environment variables such as $HOME in a path.
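
A small sketch of os.path.expandvars(), in case it is useful. PROJECT is a made-up variable name that the example sets itself so it runs anywhere; nothing in the question implies it exists in your environment.

```python
import os
import os.path

# PROJECT is a hypothetical variable, set here only to keep the example
# self-contained.
os.environ['PROJECT'] = 'tip_rank'

with_var = os.path.expandvars('workspace/$PROJECT/data')

# References to variables that are not defined are left unchanged.
unknown = os.path.expandvars('$NOT_DEFINED_ANYWHERE_12345/data')

# expandvars does not touch ~, so the two calls are often chained.
chained = os.path.expanduser(os.path.expandvars('~/workspace/$PROJECT'))

print(with_var)
print(unknown)
print(chained)
```

Note that expandvars() on its own would not have fixed the original problem, since ~ is not an environment variable.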

Hope that helps