Radical_Activity - 4 months ago
SQL Question

How to avoid inserting duplicates into MySQL?

I have an analytics platform with lots of users and hundreds of clicks inserted per minute.

Sometimes I see the exact same click inserted into the database twice within the same second, so one row becomes a duplicate of the other.

I have a check that looks for an existing row with the same values and skips the insert if it finds one.

However, in this case it looks like both rows are inserted in the exact same millisecond.

What can I do here?

Answer

My favorite: INSERT IGNORE INTO myTable (col1, col2, ...) ...

where one or more unique keys are set up beforehand to reject the duplicate insert. It appears that you care less about whether the row already existed than about ending up with no dupes.

Note: the unique keys can be multi-column (composite) keys.
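
As a minimal sketch of the idea, assuming a hypothetical clicks table (the table, column, and key names below are illustrative and not taken from the question), a composite unique key combined with INSERT IGNORE might look like this:

```sql
-- Hypothetical schema: all names here are assumptions for illustration.
CREATE TABLE clicks (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id    INT UNSIGNED    NOT NULL,
    link_id    INT UNSIGNED    NOT NULL,
    clicked_at DATETIME        NOT NULL,
    PRIMARY KEY (id),
    -- Composite unique key: this column combination defines "the same click".
    UNIQUE KEY uq_click (user_id, link_id, clicked_at)
) ENGINE=InnoDB;

-- The first insert adds the row.
INSERT IGNORE INTO clicks (user_id, link_id, clicked_at)
VALUES (42, 7, '2016-09-01 12:00:00');

-- The same values again: uq_click rejects the row, and IGNORE turns the
-- duplicate-key error into a warning, so no second row is added.
INSERT IGNORE INTO clicks (user_id, link_id, clicked_at)
VALUES (42, 7, '2016-09-01 12:00:00');

-- The table now holds a single row for this click.
SELECT COUNT(*) AS click_rows FROM clicks;
```

Because the database enforces the unique key itself, two requests arriving at the same instant cannot both succeed, which an application-level check cannot guarantee. Which columns belong in the key depends on what counts as "the same click" in your data.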

Short of that, look into intention locks, crafted for your particular use case. Steer toward InnoDB row-level locking, which is fast, and certainly away from table locks. Most things come with a trade-off: the downside of locking is diminished concurrency.
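
One possible shape for the locking route, sketched under assumptions: every click belongs to a row in a hypothetical users table, and each writer locks that parent row before checking and inserting. The table and column names are illustrative, and this is only one of several ways to serialize the writers.

```sql
-- Hypothetical tables: all names are assumptions for illustration.
CREATE TABLE IF NOT EXISTS users (
    id INT UNSIGNED NOT NULL PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS clicks (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id    INT UNSIGNED    NOT NULL,
    link_id    INT UNSIGNED    NOT NULL,
    clicked_at DATETIME        NOT NULL
) ENGINE=InnoDB;

-- Make sure the parent row exists so there is something to lock.
INSERT IGNORE INTO users (id) VALUES (42);

START TRANSACTION;

-- Take an exclusive row lock on the parent row. A second session running
-- this statement for user 42 waits here until this transaction ends.
SELECT id FROM users WHERE id = 42 FOR UPDATE;

-- Insert the click only if no matching row exists yet.
INSERT INTO clicks (user_id, link_id, clicked_at)
SELECT 42, 7, '2016-09-01 12:00:00'
FROM DUAL
WHERE NOT EXISTS (
    SELECT 1 FROM clicks
    WHERE user_id = 42
      AND link_id = 7
      AND clicked_at = '2016-09-01 12:00:00'
);

-- Committing releases the row lock and lets the next writer proceed.
COMMIT;
```

Only writers for the same user wait on each other here, so concurrency drops less than it would with a table lock, but each insert still pays for an extra locking read.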

A word of warning about INSERT IGNORE: do not adopt it without careful thought about its ramifications in sensitive systems that need to know whether the row was truly already there. It is ideal for "make sure it is there" cases.
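
If the application does need to know whether the row was new, one option is to read ROW_COUNT() immediately after the insert, in the same session. The sketch below assumes the same hypothetical clicks table and unique key as an illustration only:

```sql
-- Hypothetical schema: all names are assumptions for illustration.
CREATE TABLE IF NOT EXISTS clicks (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id    INT UNSIGNED    NOT NULL,
    link_id    INT UNSIGNED    NOT NULL,
    clicked_at DATETIME        NOT NULL,
    UNIQUE KEY uq_click (user_id, link_id, clicked_at)
) ENGINE=InnoDB;

INSERT IGNORE INTO clicks (user_id, link_id, clicked_at)
VALUES (42, 7, '2016-09-01 12:00:00');

-- Must run directly after the INSERT, on the same connection:
-- 1 means the row was added, 0 means it was skipped as a duplicate.
SELECT ROW_COUNT() AS rows_inserted;
```

Keep in mind that IGNORE downgrades other errors to warnings as well, not just duplicate-key errors, so a skipped row is not always a duplicate.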

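Your system probably ended up with the duplicate row because of highly concurrent inserts without locking: two requests can both run the existence check before either has inserted, so both pass the check and both insert.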