Anonymous-SOS Anonymous-SOS - 1 year ago 52
MySQL Question

How can I optimize the 'IN' query?

**** EDIT ****

14ms may not seem a lot, however as you can see below in the "PostgresSQL Explain", PostgreSQL is doing a Seq Scan on 80,000 rows. There must be a way to avoid this Scan and do a couple of Index lookups instead.

**** EDIT END ****

I am playing around with the schemaless idea and I have the following three tables:

The tables are populated with 100,000 random entries.

entities(_primary_key SERIAL PRIMARY KEY, _id CHAR(32) UNIQUE,
data BYTEA)

index_username_profile_names(_id CHARE(32) PRIMARY KEY,

index_username_email(_id CHAR(32) PRIMARY KEY, key VARCHAR)

with a non-unique index on

My SQL query is:

SELECT data FROM entities WHERE
_id IN (SELECT _id FROM index_users_email WHERE key = 'test')
_id in (SELECT _id FROM index_users_profile_name WHERE key = 'test')

This takes a whooping 14ms although 'test' doesn't exits in either of the 'index' tables, no matter if I use PostgreSQL or MySQL, so it must be something that I am doing wrong.

Any idea how I can optimized it, or what I am doing wrong?


Postgres explain:

Seq Scan on entities (cost=16.88..4776.15 rows=80414 width=163) (actual time=15.169..15.169 rows=0 loops=1)
Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
Rows Removed by Filter: 107218
SubPlan 1
-> Index Scan using index_users_email_key_idx1 on index_users_email (cost=0.42..8.44 rows=1 width=33) (actual time=0.039..0.039 rows=0 loops=1)
Index Cond: ((key)::text = 'test'::text)
SubPlan 2
-> Index Scan using index_users_profile_name_key_idx1 on index_users_profile_name (cost=0.42..8.44 rows=1 width=33) (actual time=0.071..0.071 rows=0 loops=1)
Index Cond: ((key)::text = 'test'::text)
Planning time: 0.202 ms
Execution time: 15.216 ms

Answer Source

ORed (join-) conditions are usually bad, try UNION instead:

SELECT data FROM entities 
 ( SELECT _id 
   FROM index_users_email 
   WHERE key = 'test'
SELECT data FROM entities
WHERE _id in 
 ( SELECT _id 
   FROM index_users_profile_name 
   WHERE key = 'test'