dream_machine - 6 months ago
MySQL Question

Influxdb-python: error inserting utf8 data

I want to insert some data from MySQL into InfluxDB. The data in MySQL is UTF-8 encoded, and I am using Python 2.6.6 on Vagrant.

$] python
Python 2.6.6 (r266:84292, Jul 23 2015, 15:22:56)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2


Here is the table structure and some sample data from MySQL.

mysql> show create table countries;
+-----------+----------------------+
| Table     | Create Table         |
+-----------+----------------------+
| countries | CREATE TABLE `countries` (
  `id` smallint(5) unsigned DEFAULT NULL,
  `name` varchar(100) DEFAULT NULL,
  KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-----------+----------------------+

mysql> select id,name from countries;
+------+---------------+
| id   | name          |
+------+---------------+
|   11 | Afghanistan   |
|   12 | Åland Islands |
|   13 | Albania       |
|   14 | Côte d’Ivoire |
+------+---------------+


Some country names contain special characters. I am using the Python code below to fetch the data from MySQL and then insert it into InfluxDB.

#!/usr/bin/python

import MySQLdb
import json
from influxdb import InfluxDBClient

# Open database connection
db = MySQLdb.connect("localhost","root","","platform" )

# prepare a cursor object using cursor() method
cursor = db.cursor()

# execute SQL query using execute() method.
cursor.execute("SELECT id,name from countries")

# Fetch all rows using fetchall() method.
data = cursor.fetchall()

json_body = []

for id,name in data:
    print id,name
    json_1 = {
        "measurement": "cpu_load_short",
        "tags": {
            "host": "server01",
            "region": "us-west"
        },
        "time": id,
        "fields": {
            "value": name.decode('utf8')  # ERROR: UnicodeDecodeError is raised here
        }
    }
    #json_1 = json.dumps(json_1).encode('utf8')
    json_body.append(json_1)

#json_body = json.dumps(json_body, ensure_ascii=False).encode('utf8')

client = InfluxDBClient('localhost', 8086, 'root', 'root', 'example')

client.create_database('example')

client.write_points(json_body)

result = client.query('select * from cpu_load_short;')

print("Result: {0}".format(result))

# disconnect from server
db.close()


I get an error while decoding the data:

$] python test.py
11 Afghanistan
12 Åland Islands
Traceback (most recent call last):
  File "test.py", line 31, in <module>
    "value": name.decode('utf8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc5 in position 0: invalid continuation byte


If I remove the decoding from name:

"fields": {
    "value": name
}


I still get an error:

Traceback (most recent call last):
  File "test.py", line 43, in <module>
    client.write_points(json_body)
  File "/usr/lib/python2.6/site-packages/influxdb/client.py", line 391, in write_points
    tags=tags)
  File "/usr/lib/python2.6/site-packages/influxdb/client.py", line 436, in _write_points
    expected_response_code=204
  File "/usr/lib/python2.6/site-packages/influxdb/client.py", line 276, in write
    data=make_lines(data, precision).encode('utf-8'),
  File "/usr/lib/python2.6/site-packages/influxdb/line_protocol.py", line 119, in make_lines
    value = _escape_value(point['fields'][field_key])
  File "/usr/lib/python2.6/site-packages/influxdb/line_protocol.py", line 53, in _escape_value
    value = _get_unicode(value)
  File "/usr/lib/python2.6/site-packages/influxdb/line_protocol.py", line 73, in _get_unicode
    return data.decode('utf-8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc5 in position 0: invalid continuation byte


Is there a solution to this problem?

Answer

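Use the unidecode package to transliterate the non-ASCII characters to plain ASCII text. Note that this replaces accented letters with their closest ASCII equivalents (for example, "Å" becomes "A") rather than preserving them.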

import unidecode
name = unidecode.unidecode_expect_nonascii(name)
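
If the accented characters should be kept, it may help to look at the failing byte first. 0xc5 is how Latin-1 encodes "Å" (UTF-8 would use the two bytes 0xc3 0x85), which suggests the connection is handing back Latin-1 bytes even though the table is declared as utf8. The following is only a minimal sketch under that assumption: `to_text` and its `fallback` parameter are hypothetical names, not part of MySQLdb or influxdb, and the sample byte strings are built by hand rather than read from the database.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def to_text(raw, fallback="latin-1"):
    """Return text for `raw`, trying UTF-8 first and `fallback` second.

    Hypothetical helper for illustration only.
    """
    if not isinstance(raw, bytes):
        # Already text (unicode in Python 2, str in Python 3).
        return raw
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode(fallback)


# Bytes as a Latin-1 connection would deliver them (assumption).
latin1_name = u"Åland Islands".encode("latin-1")
# Bytes as a UTF-8 connection would deliver them.
utf8_name = u"Åland Islands".encode("utf-8")

# The first byte of the Latin-1 form is the 0xc5 named in the traceback.
print(repr(latin1_name[:1]))

# Both byte strings decode to the same text, accents included.
print(to_text(latin1_name) == to_text(utf8_name))
```

In the script from the question, the field would then be set with "value": to_text(name). The fallback is a heuristic, since a few Latin-1 byte sequences also happen to be valid UTF-8. With MySQLdb, passing charset='utf8' and use_unicode=True to MySQLdb.connect() should make the cursor return unicode strings directly, in which case neither the fallback nor unidecode would be needed.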