Fitzerbirth Fitzerbirth - 6 months ago 99
SQL Question

Not able to add a column from a pandas data frame to mysql in python

I have connected to mysql from python and I can add a whole data frame to sql by using df.to_sql command. When I am adding/updating a single column from pd.DataFrame, not able udate/add.

Here is the information about dataset, result,

In [221]: result.shape
Out[221]: (226, 5)

In [223]: result.columns
Out[223]: Index([u'id', u'name', u'height', u'weight', u'categories'], dtype='object')


I have the table already in the database with all the columns except categories, so I just need to add the column to the table. From these,

Python MYSQL update statement

ProgrammingError: (1064, 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax

cursor.execute("ALTER TABLE content_detail ADD category VARCHAR(255)" % result["categories"])


This can be successfully add the column but with all NULL values,
and when I was trying this

cursor.execute("ALTER TABLE content_detail ADD category=%s VARCHAR(255)" % result["categories"])


ends with following error

ProgrammingError Traceback (most recent call last)
<ipython-input-227-ab21171eee50> in <module>()
----> 1 cur.execute("ALTER TABLE content_detail ADD category=%s VARCHAR(255)" % result["categories"])

/usr/lib/python2.7/dist-packages/mysql/connector/cursor.pyc in execute(self, operation, params, multi)
505 self._executed = stmt
506 try:
--> 507 self._handle_result(self._connection.cmd_query(stmt))
508 except errors.InterfaceError:
509 if self._connection._have_next_result: # pylint: disable=W0212

/usr/lib/python2.7/dist-packages/mysql/connector/connection.pyc in cmd_query(self, query)
720 if not isinstance(query, bytes):
721 query = query.encode('utf-8')
--> 722 result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
723
724 if self._have_next_result:

/usr/lib/python2.7/dist-packages/mysql/connector/connection.pyc in _handle_result(self, packet)
638 return self._handle_eof(packet)
639 elif packet[4] == 255:
--> 640 raise errors.get_exception(packet)
641
642 # We have a text result set

ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '=0 corporate
1 corporate


I think there is something I am missing with datatype, please help me to sort this out, thanks.

Air Air
Answer

You cannot add a column to your table with data in it all in one step. You must use at least two separate statements to perform the DDL first (ALTER TABLE) and the DML second (UPDATE or INSERT ... ON DUPLICATE KEY UPDATE).

This means that to add a column with a NOT NULL constraint requires three steps:

  1. Add nullable column
  2. Populate column with values in every row
  3. Add the NOT NULL constraint to the column

Alternatively, by using a "dummy" default value, you can do it in two steps (just be careful not to leave any "dummy" values floating around, or use values that are meaningful/well-documented):

  1. Add column as NOT NULL DEFAULT '' (or use e.g. 0 for numeric types)
  2. Populate column with values in every row

You can optionally alter the table again to remove the DEFAULT value. Personally, I prefer the first method because it doesn't introduce meaningless values into your table and it's more likely to throw an error if the second step has a problem. I might go with the second method when a column lends itself to a certain natural DEFAULT value and I plan to keep that in the final table definition.

Additionally, you are not parameterizing your query correctly; you should pass the parameter values to the method rather than formatting the string argument inside the method call. In other words:

cursor.execute("Query with %s, %s, ...", iterable_with_values)  # Do this!
cursor.execute("Query with %s, %s, ..." % iterable_with_values)  # NOT this!
Comments