EmmanuelG - 10 months ago
PHP Question

Is mysqli::multi_query more efficient than several single queries?

Someone brought up the MySQLi multi_query function in an answer, claiming that it would be better than looping through 3 separate queries. I tried to Google an answer to this but didn't find anything that satisfied my curiosity, so I hope you guys may have better insight into the reasons for using it beyond saving a few lines of code.

So here's what I am wondering:

  1. What is multi_query doing under the hood?

  2. Does multi_query simply hit the server x number of times and aggregate the results?

  3. Is there a case where single queries may be more efficient than multiple queries?

I know that hitting the database 3 times for a million items each and smashing it into one huge object isn't good for memory usage, but I know that there has to be a reason for its existence and I am also sure there are times when it should be avoided. I hope to better understand it so that I can put that into my bag of tricks when the need arises.

Thanks for your time!

Answer
  • What is multi_query doing under the hood? - It just sends all the queries to the server at once instead of one at a time, and retrieves all the results in one go; you then step through the individual result sets with next_result(). Nothing more complicated than that.

  • Does multi_query simply hit the server x number of times and aggregate the results? - No. It "hits" the server twice - once to send the queries and once to retrieve the results - rather than once per query.

  • Is there a case where single queries may be more efficient than multiple queries? - That depends on how you define "efficient". multi_query() is light on the network but heavy on memory; running query() in a loop is the reverse.

    For many SELECT statements that return large result sets, the cost in memory consumption is likely to vastly outweigh the gain on the network, and most of the time you'd be better off issuing the queries and processing the result sets one at a time - although this depends on exactly what you are doing with the data. But if you need to run many UPDATE statements, multi_query() is likely to be the better choice, since the return values are just success/fail and the memory consumption will be light.

    You would have to weigh up all the factors, such as what you are doing, how long you expect it to take, the network latency between the database server and the client, and the resources (mostly memory) available on the server and the client, and take it on a case-by-case basis.
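To make the trade-off above concrete, here is a minimal sketch of the two approaches side by side. It assumes `$db` is an already-open mysqli connection; `runBatch()` and `runOneByOne()` are hypothetical helper names invented for this example, and the statements are assumed to be trusted SQL strings without trailing semicolons.

```php
<?php
// Sketch only. $db is assumed to be an open mysqli connection; runBatch()
// and runOneByOne() are hypothetical helper names, not part of mysqli.

/**
 * One request: send every statement with multi_query(), then step through
 * the result sets in order. $onResult receives the rows of each SELECT, or
 * the affected-row count for statements that return no result set.
 * Statements are assumed to be trusted SQL without trailing semicolons.
 */
function runBatch(mysqli $db, array $statements, callable $onResult): void
{
    if (!$db->multi_query(implode(';', $statements))) {
        throw new RuntimeException($db->error);
    }
    do {
        $result = $db->store_result(); // false when there is no result set
        if ($result instanceof mysqli_result) {
            $onResult($result->fetch_all(MYSQLI_ASSOC));
            $result->free();
        } elseif ($db->errno !== 0) {
            throw new RuntimeException($db->error);
        } else {
            $onResult($db->affected_rows);
        }
    } while ($db->more_results() && $db->next_result());

    if ($db->errno !== 0) { // a later statement in the batch failed
        throw new RuntimeException($db->error);
    }
}

/**
 * One request per statement: each result is handled and freed before the
 * next query is sent.
 */
function runOneByOne(mysqli $db, array $statements, callable $onResult): void
{
    foreach ($statements as $sql) {
        $result = $db->query($sql);
        if ($result === false) {
            throw new RuntimeException($db->error);
        }
        if ($result instanceof mysqli_result) {
            $onResult($result->fetch_all(MYSQLI_ASSOC));
            $result->free(); // released before the next query is sent
        } else {
            $onResult($db->affected_rows); // query() returned true
        }
    }
}
```

Both helpers take the same arguments, so swapping one for the other is a one-line change when you want to compare them on your own workload.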

I found this record of some performance testing done a while ago, where the conclusion is that there is an overall efficiency gain to be found from using multi_query(). However, the test case simply ran 4 queries, each one SELECTing a single result, and its definition of "more efficient" is simply "faster". There is no testing of larger numbers of queries or larger result sets and, while speed is important, it is not the be-all and end-all - I can make anything run incredibly fast if I give it an unlimited amount of memory, but an attempt to do anything concurrently will then fail miserably. It is also not a real-world test, since the end result could be achieved with a single JOINed query. It does make for interesting reading, though.
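If you want to run your own comparison and look at memory as well as speed, something along these lines might be a starting point. `measure()` is a hypothetical helper written for this answer, not a library function, and it is only a rough sketch: it reports the peak memory PHP itself has allocated, which may not include buffers held by the underlying client library.

```php
<?php
// Sketch only: measure() is a hypothetical helper, not a library function.

/**
 * Run $work once and report wall-clock time plus the process's peak memory.
 * Peak memory never goes down within a process, so compare variants in
 * separate runs rather than back to back in the same script.
 */
function measure(callable $work): array
{
    $start = hrtime(true);            // monotonic clock, nanoseconds
    $result = $work();
    $elapsedNs = hrtime(true) - $start;

    return [
        'result'     => $result,
        'seconds'    => $elapsedNs / 1e9,
        'peak_bytes' => memory_get_peak_usage(true),
    ];
}
```

You would wrap each variant (the multi_query() version and the looped query() version) in a closure, pass it to `measure()` in its own script run, and compare both the seconds and the peak bytes with realistic result-set sizes.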

Personally, I feel this is somewhat academic, because if you are running a large set of statements at once, 90% of the time they will vary only in the data being passed while the query structure remains the same - which makes them an obvious candidate for prepared statements.
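As a rough illustration of the prepared-statement approach, the sketch below prepares one UPDATE and executes it repeatedly with different values. The `items` table, its `price` and `id` columns, and the `updatePrices()` function are all hypothetical names made up for the example; `$db` is assumed to be an open mysqli connection.

```php
<?php
// Sketch only: $db is assumed to be an open mysqli connection. The `items`
// table, its columns and updatePrices() are hypothetical illustration names.

/**
 * Apply many price changes with a single prepared statement.
 * $priceById maps an integer id to its new price.
 * Returns the total number of rows changed.
 */
function updatePrices(mysqli $db, array $priceById): int
{
    // The query structure is sent and parsed once.
    $stmt = $db->prepare('UPDATE items SET price = ? WHERE id = ?');
    if ($stmt === false) {
        throw new RuntimeException($db->error);
    }

    $price = 0.0;
    $id = 0;
    // bind_param() binds by reference: 'd' = double, 'i' = integer.
    $stmt->bind_param('di', $price, $id);

    $changed = 0;
    foreach ($priceById as $itemId => $newPrice) {
        // Only the data changes between executions.
        $id = $itemId;
        $price = $newPrice;
        if (!$stmt->execute()) {
            throw new RuntimeException($stmt->error);
        }
        $changed += $stmt->affected_rows;
    }

    $stmt->close();
    return $changed;
}
```

This still makes one round trip per execution, but the values are sent separately from the SQL text, so you also avoid building query strings by hand.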