newbie_girl newbie_girl - 1 month ago 6
SQL Question

How to make query that will return differences between two tables in PostgreSQL

I need to return differences between two tables.

CREATE TEMPORARY TABLES

CREATE TEMP TABLE first(
zoom smallint NOT NULL,
x integer NOT NULL,
y integer NOT NULL
);

CREATE TEMP TABLE second(
zoom smallint NOT NULL,
x integer NOT NULL,
y integer NOT NULL
);


INSERT DATA

INSERT INTO first(zoom,x,y) VALUES(5,2,25);
INSERT INTO first(zoom,x,y) VALUES(5,4,45);
INSERT INTO first(zoom,x,y) VALUES(5,7,34);
INSERT INTO first(zoom,x,y) VALUES(5,45,40);
INSERT INTO first(zoom,x,y) VALUES(5,72,63);
INSERT INTO second(zoom,x,y) VALUES(5,2,25);
INSERT INTO second(zoom,x,y) VALUES(5,4,45);
INSERT INTO second(zoom,x,y) VALUES(5,7,34);


Wanted result:

In table first there are extra rows:
5,45,40
5,72,63


Edit

Sorry about this, but I have figured out now that my original data is pretty much complexed then the sample I provided. So in the original data, table first consists of 900 rows, and table second of 935 rows. I assumed that the rows are distinct in each table, however as I am not sure now, so I would like to include this condition in a query. I assumed query would return 35 rows as difference, because I was pretty much convinced that all zoom/x/y would be same except this 35 one. However, it might now be the case. So basically what I need to know is what are differences between two tables, whatever approach is the best to solve it.

Can I get something like this:

zoom | x | y | first |second
------+----+--- +-------+------
5 | 45 | 40 | yes | no |


order by first yes, second no

zoom | x | y | first |second
------+----+--- +-------+------
5 | 45 | 40 | yes | no |
5 | 45 | 40 | yes | no |
5 | 45 | 40 | yes | no |


then first no, second yes

zoom | x | y | first |second
------+----+--- +-------+------
5 | 45 | 40 | no | yes |
5 | 45 | 40 | no | yes |
5 | 45 | 40 | no | yes |

Answer

As you want to compare all columns of both tables, you can use a full outer join on all columns and the check if one of them is:

select case 
          when f.zoom is null then 'missing in first'
          when s.zoom is null then 'missing in second'
       end as status, 
       zoom, x, y
from "first" f
  full outer join second s using (zoom, x, y)
where f.zoom is null or s.zoom is null;

The join based on using() will return those columns that are not null (and only those columns - removing the duplicate columns from the result)

When using your sample data from the question, the result would be:

status            | zoom | x  | y 
------------------+------+----+---
missing in second |    5 | 45 | 40
missing in second |    5 | 72 | 63

If a row in the second table is added that doesn't exist in the first, e.g.:

 INSERT INTO second(zoom,x,y) VALUES(15,7,34);

then the result will be:

status            | zoom | x  | y 
------------------+------+----+---
missing in second |    5 | 45 | 40
missing in second |    5 | 72 | 63
missing in first  |   15 |  7 | 34
Comments