Steven G Steven G - 2 months ago 22
SQL Question

how to INSERT but IGNORE same date tag with pymssql python

I am having trouble ignoring a duplicate 'date' values when I am importing data

import pymssql
import datetime as dt

server = 'xx'
user = 'xx'
password = 'xx'
db = 'xx'
conn = pymssql.connect(server, user, password, db, as_dict=True)

cur = conn.cursor()
Table_create = 'CREATE TABLE table1 (t_date date, price FLOAT)'
cur.execute(Table_create)

today = str(dt.datetime.now().date())
yesterday = str(dt.datetime.now().date() - dt.timedelta(days=1))
nextday = str(dt.datetime.now().date() + dt.timedelta(days=1))
cur.executemany("INSERT INTO table1(t_date, price) VALUES(%s,%d)", [(today, 100), (yesterday, 200)])
conn.commit()


I am getting

In[78]: x
Out[78]:
[{u'price': 100.0, u't_date': u'2016-09-25'},
{u'price': 200.0, u't_date': u'2016-09-24'}]


I want to ignore a same date record if I try to insert it again such has:

cur.executemany("INSERT INTO table1(t_date, price) VALUES(%s,%d)", [(today, 100), (nextday, 200)])
conn.commit()


but I get duplicate such has:

In[80]: x
Out[80]:
[{u'price': 100.0, u't_date': u'2016-09-25'},
{u'price': 200.0, u't_date': u'2016-09-24'},
{u'price': 100.0, u't_date': u'2016-09-25'},
{u'price': 200.0, u't_date': u'2016-09-26'}]


Thanks for help!

Answer

You have a few options, you can make date your primary key, then on any duplicate entries just update the price using ON DUPLICATE KEY UPDATE:

Table_create = '''CREATE TABLE table1 (t_date date PRIMARY KEY, price FLOAT )'''
cur.execute(Table_create)

today = str(dt.datetime.now().date())
yesterday = str(dt.datetime.now().date() - dt.timedelta(days=1))
nextday = str(dt.datetime.now().date() + dt.timedelta(days=1))
cur.executemany("INSERT  INTO table1(t_date, price) VALUES(%s, %s) on DUPLICATE KEY UPDATE price=price",([(today, 100), (yesterday, 200)]))

That would only update the price if you happened to get a different/new price for an existing row or do an insert for a new date

You could also just IGNORE any duplicates:

"INSERT INSERT IGNORE INTO table1(t_date, price) VALUES(%s, %s)"

But IGNORE will ignore more than an IntegrityError based on the duplicate entry so I would personally prefer the former.

Another option if you wanted to have an auto_increment id and to use both t_date and price or just the t_date would be to create a UNIQUE CONSTRAINT

  Table_create = '''CREATE TABLE table1 (
                            ID int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
                            t_date date,
                            price FLOAT,
                            CONSTRAINT no_dupes UNIQUE  (t_date, price))''' # CONSTRAINT no_dupes UNIQUE (t_date) for just the t_date
cur.execute(Table_create)

The same logic would apply when inserting.