pangpang pangpang - 1 year ago 49
SQL Question

Concurrent cause duplicate records inserted

For my rails app, I found the duplicate records from mysql database table.

user_product table

id | user_id | product_id | amount | created_at | update_at

181115 | 811115 | 1800 | 2 | 2014-10-16 03:00:13 | 2014-10-16 03:03:58

181116 | 811115 | 1800 | 2 | 2014-10-16 03:00:13 | 2014-10-16 03:03:58

but I have added the validates_uniqueness_of in user_product model.

class UserProduct < ActiveRecord::Base
validates_uniqueness_of :product_id, scope: :user_id

I found some useful information from rails doc.

Using this validation method in conjunction with ActiveRecord::Validations#save does not guarantee the absence of duplicate record insertions, because uniqueness checks on the application level are inherently prone to race conditions. This could even happen if you use transactions with the 'serializable' isolation level.

I checked the database isolation level, it is "REPEATABLE-READ", not "serializable".

SELECT @@global.tx_isolation;

I can add composite unique index to solve this problem.

add_index :user_product, [:product_id, :user_id], unique: true

but I hope find the root cause, anyone can help me. Thanks a lot!

Answer Source

The root cause may be a race condition in the app. Rails validates_uniqueness_of doesn't really work fully, because it can't guarantee uniqueness at the database level when you've got race conditions such as multiple threads, multiple web requests, etc.

In your particular case, the sequential id numbers could be caused by, for example, a user double-clicking a "Save" button that triggers two Ajax requests (instead of just one), and a Rails controller that does a model find_or_create.

Your database isolation of REPEATABLE-READ doesn't matter for the race condition. REPEATABLE-READ means every lock acquired during a transaction is held for the duration of the transaction; you can still have phantom reads. What the Rails tutorial was trying to explain is that the validates_uniqueness_of race condition happens even if you're using SERIALIZABLE, which prevents phantom reads and is thus more protected than REPEATABLE-READ.

To fix this in your app, stop relying exclusively on validates_uniqueness_of in Rails, and start using uniqueness guarantees that are built into the database, such as a unique primary key, or your solution of a unique composite index. This ensures that even when you have a race in Rails, your database will block the race.

Eliminating the race in Rails (not the DB) can be accomplished, but it's probably not a smart approach for a typical Rails web app. For example, you could eliminate the race in Rails by using a web server that allows only one request at a time, and only one Rails app connects to the database at a time.

If you discover that your race is caused by something like an Ajax button double-click, you can be very helpful to your user by putting some smarts into the button, so it doesn't send the same data twice in quick succession. This will eliminate an Ajax race for many common use cases.