Tchopane Tchopane - 16 days ago 5
MySQL Question

Is adding and dropping indexes everyday on huge tables a good practice?

I'm building a Web Application that is connected to a MySQL database.
I've got two huge tables containing each about 40 millions rows at the moment, and they are receiving new rows everyday (which adds ~ 500 000-1000 000 rows everyday).

The process to add new rows runs during the night, while no one can use the application, and the new rows' content depends on the result of some basic

SELECT
queries on the current database.
In order to get the result of those
SELECT
statement fast enough, I'm using simple indexes (one column per index) on each column that appears at least once in a
WHERE
clause.

The thing is, during the day, some totally different queries are run against those tables, including some "range WHERE clause" (
SELECT * FROM t1 WHERE a = a1 AND b = b1 AND (date BETWEEN d1 AND d2)
).
I found on stack this very helpful mini-cookbook that advises you on which INDEXes you should use depending on how the database is queried: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
They advice to use compound index: in my example query above it would give INDEX(a, b, date).

It indeed increased the speed of the queries run during the day (from 1 minute to 8 seconds so I was truly happy).

However, with those compound indexes, the required time to add new rows during the night totally explode (it would take more than one day to add the daily content).

Here is my question: would that be ok to drop all the indexes every night, add the new content, and set back up the daily indexes?
Or would that be dangerous since indexes are not meant to be rebuilt every day, especially on such big tables?
I know such an operation would take approximately two hours in total (drop and recreate INDEXes).

I am aware of the existence of
ALTER TABLE table_name DISABLE KEYS;
but I'm using InnoDB and I believe it is not made to work on InnoDB table.

Any senior advice would be welcome!
Thanks in advance.

Answer

I believe you have answered your own question: You need the indexes during the day, but not at night. Given what you describe, you should drop the indexes for the bulk inserts at night and re-create them afterwards. Dropping indexes for data loads is not unheard of, and seems appropriate in your case.

I would ask about how you are inserting new data. One method is to insert the values one row at a time. Another is to put the values into a temporary table (with no index) and do a bulk insert:

insert into bigtable( . . .)
    select . . .
    from smalltable;

These have different performance characteristics. You might find that using a single insert (if you are not already doing so) is fast enough for your purposes.