user1935987 user1935987 - 3 months ago 9x
Java Question

PostgreSQL multiple 'WHERE' conditions (1000+) request

I'm not a pro in SQL at all :)
Having a very critical performance issue.
Here is the info directly related to problem.

I have 2 tables in my DB- table

and table

have the fields:

  • id (PK)

  • name

  • city

  • country

table items:

  • id (PK)

  • name

  • multiple fields not related to issue

  • condo_id (FK)

I have 1000+ entities in
table and 1000+ in

The problem is how i perform items search

currently it is:

For example, i want to get all the items for
city = Sydney

  1. Perform a
    SELECT condos.condo_id FROM WHERE city = 'Sydney'

  2. Make a
    SELECT * FROM public.items WHERE item.condo_id = ?
    for each
    i get in step 1.

The issue is that once i get 1000+ entities in
table, the request is performed 1000+ times for each
belongs to 'Sydney'. And the execution of this request takes more then a 2 minutes which is a critical performance issue.

So, the questions is:

What is the best way for me to perform such search ? should i put a 1000+ id's in single
request? or?

For add info, i use PostgreSQL 9.4 and Spring MVC.


Essentially what you need is to eliminate the N+1 query and at the same time ensure that your City field is indexed. You have 3 mechanisms to go. One is already stated in one of the other answers you have received this is the SUBSELECT approach. Beyond this approach you have another two.

You can use what you have stated :

SELECT condos.condo_id FROM WHERE city = 'Sydney'

FROM   public.items 
WHERE  items.condo_id IN (up to 1000 ids here) 

the reason why I am stating up to 1000 is because some SQL providers have limitations.

You also can do join as a way to eliminate the N+1 selects

FROM   public.items join on items.condo_id=condos.condo_id  and'Sydney'

Now what is the difference in between the 3 queries.

Pros of Subselect query is that you get everything at once. The Cons is that if you have too many elements the performance may suffer:

Pros of simple In clause. Effectivly solves the N+1 problem, Cons may lead to some extra queries compared to the Subselect

Joined query pros, you can initialize in one go both Condo and Item. Cons leads to some data duplication on Condo side

If we have a look into a framework like Hibernate, we can find there that in most of the cases as a fetch strategy is used either Joined either IN strategies. Subselect is used rarely.

Also if you have critical performance you may consider reading everything In Memory and serving it from there. Judging from the content of these two tables it should be fairly easy to just upload it into a Map.

Effectively everything that solves your N+1 query problem is a solution in your case if we are talking of just 2 times 1000 queries. All three options are solutions.