sh0ber sh0ber - 4 months ago 11
SQL Question

Left Join ON LIKE needs something like LIMIT

(EDIT: A trailing

LIMIT
will not solve. Please read carefully.)

I have a table that looks up text in another table by
LEFT JOIN
and
LIKE
and I would like to limit the amount of results per-left-row.


  • Table
    usergames AS u
    contains substrings of user-inputted strings representing game names.

  • Table
    games AS g
    contains a master list of game names.

  • When the user enters a game name like "Hotline Miami", the input is split and 2 entries are created in
    substrs
    : one containing
    "hotline"
    & one containing
    "miami"
    (in the
    piece
    column), and both containing the full game name (in the
    name
    column).

  • In the event that the full game name does not match anything exactly in the master list, (ex. user mistakenly enters "Haatline Miami"), I check the master list for suggestions, such as any game with
    "Haatline"
    or
    "Miami"
    .



This is my query and it is working great:

SELECT u.name,g.name AS suggestion,count(g.name)
FROM usergames u
LEFT JOIN games g
ON CONCAT(' ', g.name, ' ') LIKE CONCAT('% ', u.piece, ' %')
GROUP BY g.name, u.name
ORDER BY u.name ASC, count(g.name) DESC


The problem I have is that sometimes there are too many suggestions. Maybe there are 1000 games that have the word "Miami". How can I limit the many rows created by the join "per left row"? See the data below, some results are from "hotline" and some from "miami". How could I limit to X results per substring.

SAMPLE DATA

substrs


enter image description here

games


enter image description here

result


enter image description here

(Note: A trailing
LIMIT
is not what I am looking for, as I don't want to limit the total # of results but only the results-per-joined-item that result from the
LIKE
join)


(Note: The
CONCAT
w/ spaces are there to ensure that only full word matches are returned. Mentally remove them if they are confusing, the question would be the same if it was
g.name LIKE s.piece
)

Answer

So, here is what it took: 2 subqueries and 2 cross joins with a tally table. I STILL don't know how to limit the number of subquery results but this solution makes it work so much faster now that the need to reduce is greatly diminished. (limit is still not applicable.)

First I learned about CROSS JOINs with tally tables, thanks to my brother. With that I was able to come up with this crazy solution. If anyone knows of a better way after reviewing my solution, please do post:

SELECT us.name, gs.idGame, suggestion, count(suggestion)
FROM (
    SELECT u.idGame, u.name, SUBSTRING_INDEX(SUBSTRING_INDEX(u.name, ' ', n.n), ' ', -1) AS user_substr
    FROM usergames u
    CROSS JOIN (
        SELECT N FROM _tally
    ) n
    WHERE u.idGame IS NULL
    AND n.n <= 1 + (LENGTH(u.name) - LENGTH(REPLACE(u.name, ' ', '')))
    HAVING LENGTH(user_substr) > 1
) us, 
(
    SELECT idGame, g.name suggestion, SUBSTRING_INDEX(SUBSTRING_INDEX(g.name, ' ', n.n), ' ', -1) AS game_substr
    FROM games g
    CROSS JOIN (
        SELECT N FROM _tally
    ) n
    WHERE n.n <= 1 + (LENGTH(g.name) - LENGTH(REPLACE(g.name, ' ', '')))
    HAVING LENGTH(game_substr) > 1
) gs
WHERE user_substr NOT IN (
    SELECT piece from _piecesrestricted   
)
AND game_substr NOT IN (
    SELECT piece from _piecesrestricted   
)
AND (
    LENGTH(user_substr)>3 OR
    user_substr = suggestion
)
AND user_substr = game_substr
GROUP BY suggestion, us.name
ORDER BY us.name ASC, count(suggestion) DESC