Maryam Pashmi Maryam Pashmi - 1 year ago 153
Python Question

Pandas python, what is the workbook encoding type?

I am new to python and also to the pandas library in Python. The documentation is not well described and they didn't explain it well. I want to save the dataframe as an excel format and in the memory and I have found the following explanation:
[Pandas excel to the memory]

I need an explanation about

. The value of this variable is encoded, how can I see the real value of this variable? How to decode it? what should be return value for it?


How to pass it into the content of attachment in Mandrill api.

This is my attachment part for excel extension :

'attachments': [
'content': content,
'name': 'fraud_report.xlsx',
'type': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'

I am not able to open the excel file and all the time i'am getteing an error from Microsoft excel which said
the file format is not valid!...

Any help can be helpful. Thank you

Answer Source

For the explanation i paste the example from your link here again:

# Safe import for either Python 2.x or 3.x
    from io import BytesIO
except ImportError:
    from cStringIO import StringIO as BytesIO

bio = BytesIO()

# By setting the 'engine' in the ExcelWriter constructor.
writer = ExcelWriter(bio, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')

# Save the workbook

# Seek to the beginning and read to copy the workbook to a variable in memory
workbook =

The method saves the data in BytesIO (bio) instead of Excel file. It means, that the variable bio stores the bytes codes of the excel file.

The method sets the current position (for reading, writing,...) of bio to 0. So that you can read the data of bio from beginning with the next method

The variable workbook store the bytes string of the excel file (or excel workbook). If you read a excel file in byte mode, you will get the same data. Or you can write it in a excel file:

with open("my_excel_file.xlsx", "wb") as f:

To read data from bio and store in a DataFrame you don't need
df = pd.read_excel(bio, "Sheet1", engine="xlrd")

For your question about using of mandrill:

In the example of mandrill you see:

{'attachments': [{'content': 'ZXhhbXBsZSBmaWxl',
                      'name': 'myfile.txt',
                      'type': 'text/plain'}],...

The documentation writes about it too:

content: the content of the attachment as a base64-encoded string

You should encode workbook in base64 and use it for sending

import base64
content = base64.b64encode(workbook)

P/S: workbook and content have the type bytes. May be you need to convert content to str before sending.

{'attachments': [{'content': content.decode('utf-8'),
                          'name': 'myfile.xlsx',
                          'type': 'text/plain'}],...

Add: if file is excel then you should change type to application/vnd.openxmlformats-officedocument.spreadsheetml.sheet