DaiBu DaiBu - 7 months ago 6
SQL Question

Please help me optimize this MySQL SELECT statement

I have a query that takes roughly four minutes to run on a high powered SSD server with no other notable processes running. I'd like to make it faster if possible.

The database stores a match history for a popular video game called Dota 2. In this game, ten players (five on each team) each select a "hero" and battle it out.

The intention of my query is to create a list of past matches along with how much of a "XP dependence" each team had, based on the heroes used. With 200,000 matches (and a 2,000,000 row matches-to-heroes relationship table) the query takes about four minutes. With 1,000,000 matches, it takes roughly 15.

I have full control of the server, so any configuration suggestions are also appreciated. Thanks for any help guys. Here are the details...

CREATE TABLE matches (
* match_id BIGINT UNSIGNED NOT NULL,
start_time INT UNSIGNED NOT NULL,
skill_level TINYINT NOT NULL DEFAULT -1,
* winning_team TINYINT UNSIGNED NOT NULL,
PRIMARY KEY (match_id),
KEY start_time (start_time),
KEY skill_level (skill_level),
KEY winning_team (winning_team));

CREATE TABLE heroes (
* hero_id SMALLINT UNSIGNED NOT NULL,
name CHAR(40) NOT NULL DEFAULT '',
faction TINYINT NOT NULL DEFAULT -1,
primary_attribute TINYINT NOT NULL DEFAULT -1,
group_index TINYINT NOT NULL DEFAULT -1,
match_count BIGINT UNSIGNED NOT NULL DEFAULT 0,
win_count BIGINT UNSIGNED NOT NULL DEFAULT 0,
* xp_from_wins BIGINT UNSIGNED NOT NULL DEFAULT 0,
* team_xp_from_wins BIGINT UNSIGNED NOT NULL DEFAULT 0,
xp_from_losses BIGINT UNSIGNED NOT NULL DEFAULT 0,
team_xp_from_losses BIGINT UNSIGNED NOT NULL DEFAULT 0,
gold_from_wins BIGINT UNSIGNED NOT NULL DEFAULT 0,
team_gold_from_wins BIGINT UNSIGNED NOT NULL DEFAULT 0,
gold_from_losses BIGINT UNSIGNED NOT NULL DEFAULT 0,
team_gold_from_losses BIGINT UNSIGNED NOT NULL DEFAULT 0,
included TINYINT UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY (hero_id));

CREATE TABLE matches_heroes (
* match_id BIGINT UNSIGNED NOT NULL,
player_id INT UNSIGNED NOT NULL,
* hero_id SMALLINT UNSIGNED NOT NULL,
xp_per_min SMALLINT UNSIGNED NOT NULL,
gold_per_min SMALLINT UNSIGNED NOT NULL,
position TINYINT UNSIGNED NOT NULL,
PRIMARY KEY (match_id, hero_id),
KEY match_id (match_id),
KEY player_id (player_id),
KEY hero_id (hero_id),
KEY xp_per_min (xp_per_min),
KEY gold_per_min (gold_per_min),
KEY position (position));


Query

SELECT
matches.match_id,
SUM(CASE
WHEN position < 5 THEN xp_from_wins / team_xp_from_wins
ELSE 0
END) AS radiant_xp_dependence,
SUM(CASE
WHEN position >= 5 THEN xp_from_wins / team_xp_from_wins
ELSE 0
END) AS dire_xp_dependence,
winning_team
FROM
matches
INNER JOIN
matches_heroes
ON matches.match_id = matches_heroes.match_id
INNER JOIN
heroes
ON matches_heroes.hero_id = heroes.hero_id
GROUP BY
matches.match_id


Sample Results

match_id | radiant_xp_dependence | dire_xp_dependence | winning_team

2298874871 | 1.0164 | 0.9689 | 1
2298884079 | 0.9932 | 1.0390 | 0
2298885606 | 0.9877 | 1.0015 | 1


EXPLAIN

id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra

1 | SIMPLE | heroes | ALL | PRIMARY | NULL | NULL | NULL | 111 | Using temporary; Using filesort
1 | SIMPLE | matches_heroes | ref | PRIMARY,match_id,hero_id | hero_id | 2 | dota_2.heroes.hero_id | 3213 |
1 | SIMPLE | matches | eq_ref | PRIMARY | PRIMARY | 8 | dota_2.matches_heroes.match_id | 1 |


Machine Specs


  • Intel Xeon E5

  • E5-1630v3 4/8t

  • 3.7 / 3.8 GHz

  • 64 GB of RAM

  • DDR4 ECC 2133 MHz

  • 2 x 480GB of SSD SOFT



Database


  • MariaDB 10.0

  • InnoDB


Answer

In all likelihood, the main performance driver is the GROUP BY. Sometimes, in MySQL, it can be faster to use correlated subuqeries. So, try writing the query like this:

SELECT m.match_id,
       (SELECT SUM(h.xp_from_wins / h.team_xp_from_wins)
        FROM matches_heroes mh INNER JOIN
             heroes h   
             ON mh.hero_id = h.hero_id
        WHERE m.match_id = mh.match_id AND mh.position < 5
       ) AS radiant_xp_dependence,
       (SELECT SUM(h.xp_from_wins / h.team_xp_from_wins)
        FROM matches_heroes mh INNER JOIN
             heroes h   
             ON mh.hero_id = h.hero_id
        WHERE m.match_id = mh.match_id AND mh.position >= 5
       ) AS dire_xp_dependence,
       m.winning_team   
FROM matches m;

Then, you want indexes on:

  • matches_heroes(match_id, position)
  • heroes(hero_id, xp_from_wins, team_xp_from_wins)

For completeness, you might want this index as well:

  • matches(match_id, winning_team)

This would be more important if you added order by match_id to the query.

Comments