normic normic - 3 months ago 16
Python Question

csv import with Python's QUOTE_NONNUMERIC not working as expected

I can't figure this out, maybe I'm getting blind for looking to long on the same stuff...

I have this kind of lines in a CSV file:

""BIN"",""Afg"",""SONIC/SONIC JET/"",1,8.9095,""Due to the dynamic nature of the exemptions granted to many operators, the Contract Price does not reflect V.A.T. / G.S.T., Mineral Oil Taxes, Federal Excise Taxes or other taxes to which an operator may be exempt. Please contact your salesperson or World Fuel Services if you require assistance in generating a fuel price estimate."",""N/A"",""01-NOV-2013"


Which I'm trying to import like this:

data = csv.DictReader(open(newdatafile), delimiter=',', quoting=csv.QUOTE_NONNUMERIC)
data.fieldnames = [
'iata', 'country', 'fbo', 'quantity', 'price', 'remarks', 'special', 'validdate'
]

for row in data:
fuelentry = FuelPriceImport()
fuelentry.iata = row['iata']
fuelentry.fbo = row['fbo']
fuelentry.min_quantity = row['quantity']
fuelentry.net_price_liter = row['price']
fuelentry.remarks = row['remarks']
fuelentry.save()


When I run this piece of code, it always complains about:

could not convert string to float: the Contract Price does not reflect V.A.T. / G.S.T.


Which is obviously directly after the comma within the double quoted string.

Shouldn't
QUOTE_NONNUMERIC
avoid exactly this, as the whole text is within double quotes?

Answer

Your input format uses doubled quotes, which is the CSV equivalent of escaping the quotes.

You'll have to replace doubled quotes with single quotes; you can do this on-the-fly with a wrapper generator:

def undoublequotes(fobject):
    for line in fobject:
        yield line.replace('""', '"')

This does assume that the column data itself does not contain doubled quotes.

Demo:

>>> import csv
>>> from pprint import pprint
>>> def undoublequotes(fobject):
...     for line in fobject:
...         yield line.replace('""', '"')
... 
>>> sample = '''\
... ""BIN"",""Afg"",""SONIC/SONIC JET/"",1,8.9095,""Due to the dynamic nature of the exemptions granted to many operators, the Contract Price does not reflect V.A.T. / G.S.T., Mineral Oil Taxes, Federal Excise Taxes or other taxes to which an operator may be exempt.  Please contact your salesperson or World Fuel Services if you require assistance in generating a fuel price estimate."",""N/A"",""01-NOV-2013"
... '''
>>> reader = csv.reader(undoublequotes(sample.splitlines(True)),
...                     quoting=csv.QUOTE_NONNUMERIC)
>>> pprint(next(reader))
['BIN',
 'Afg',
 'SONIC/SONIC JET/',
 1.0,
 8.9095,
 'Due to the dynamic nature of the exemptions granted to many operators, the Contract Price does not reflect V.A.T. / G.S.T., Mineral Oil Taxes, Federal Excise Taxes or other taxes to which an operator may be exempt.  Please contact your salesperson or World Fuel Services if you require assistance in generating a fuel price estimate.',
 'N/A',
 '01-NOV-2013']