I've got this table (generated by Django):
CREATE TABLE feeds_person (
id serial PRIMARY KEY,
created timestamp with time zone NOT NULL,
modified timestamp with time zone NOT NULL,
name character varying(4000) NOT NULL,
url character varying(1000) NOT NULL,
email character varying(254) NOT NULL,
CONSTRAINT feeds_person_name_ad8c7469_uniq UNIQUE (name, url, email)
INSERT INTO feeds_person (created, modified, name, url, email)
VALUES blah blah blah
ON CONFLICT (name, url, email) DO UPDATE SET url = feeds_person.url
django.db.utils.ProgrammingError: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
The error you get:
ON CONFLICT DO UPDATE command cannot affect row a second time
indicates you are trying to upsert the same row more than once in a single command. In other words: you have dupes on
(name, url, email) in your
VALUES list. Fold duplicates (if that's an option) and it should work. But you will have to decide which row to pick from each set of dupes.
INSERT INTO feeds_person (created, modified, name, url, email) SELECT DISTINCT ON (name, url, email) * FROM (VALUES blah blah blah) v(created, modified, name, url, email) -- match column list ON CONFLICT (name, url, email) DO UPDATE SET url = feeds_person.url RETURNING id;
Since we use a free-standing
VALUES expression now, you have to add explicit type casts for non-default types. Like:
VALUES (timestamptz '2016-03-12 02:47:56+01' , timestamptz '2016-03-12 02:47:56+01' , 'n3', 'u3', 'e3') ...
timestamptz columns need an explicit type cast, while the string types can operate with default
text. (You could still cast to
varchar(n) right away.)
There are ways to determine which row to pick from each set of dupes:
You are right, there is (currently) no way to get excluded rows in the
RETURNING clause. I quote the Postgres Wiki:
RETURNINGdoes not make visible the "
EXCLUDED.*" alias from the
UPDATE(just the generic "
TARGET.*" alias is visible there). Doing so is thought to create annoying ambiguity for the simple, common cases  for little to no benefit. At some point in the future, we may pursue a way of exposing if
RETURNING-projected tuples were inserted and updated, but this probably doesn't need to make it into the first committed iteration of the feature .
However, you shouldn't be updating rows that are not supposed to be updated. Empty updates are almost as expensive as regular updates - and might have unintended side effects. You don't strictly need UPSERT to begin with, your case looks more like "SELECT or INSERT". Related:
One cleaner way to insert a set of rows would be with data-modifying CTEs:
WITH val AS ( SELECT DISTINCT ON (name, url, email) * FROM ( VALUES (timestamptz '2016-1-1 0:0+1', timestamptz '2016-1-1 0:0+1', 'n', 'u', 'e') , ('2016-03-12 02:47:56+01', '2016-03-12 02:47:56+01', 'n1', 'u3', 'e3') -- more (type cast only needed in 1st row) ) v(created, modified, name, url, email) ) , ins AS ( INSERT INTO feeds_person (created, modified, name, url, email) SELECT created, modified, name, url, email FROM val ON CONFLICT (name, url, email) DO NOTHING RETURNING id, name, url, email ) SELECT 'inserted' AS how, id FROM ins -- inserted UNION ALL SELECT 'selected' AS how, f.id -- not inserted FROM val v JOIN feeds_person f USING (name, url, email) WHERE NOT EXISTS ( SELECT 1 FROM ins WHERE name = v.name AND url = v.url AND email = v.email );
This is ignoring the order of rows, since it's not well defined while you still did not define how to fold duplicates exactly. You can have any order you want ...
The complexity should pay for big tables where
INSERT is the rule and
SELECT the exception. Else you might just
INSERT .. ON CONFLICT DO NOTHING, followed by a
SELECT for all rows - within the same transaction.
Either solution leaves a tiny window for a race condition if a concurrent transaction changes existing rows between
To close this window for good, use
DO UPDATE on excluded rows, but add a condition to the update to not actually update any rows that don't need an update:
ON CONFLICT(tag) DO UPDATE SET name = name WHERE FALSE -- never executed, only locks rows
Only rows for which this expression returns
truewill be updated, although all rows will be locked when the
ON CONFLICT DO UPDATEaction is taken. Note that condition is evaluated last, after a conflict has been identified as a candidate to update.
Bold emphasis mine.