dave dave - 6 months ago 46
SQL Question

How would I implement a ranking algorithm in my website to sort database data?

I want to implement a ranking system on a website I've been working on and have decided to go with the Hacker News algorithm. My reasoning for choosing this algorithm is simply because it's been described here.

I was looking at this Python code (the language I'm using to build my site) and couldn't figure out how I would implement it.

def calculate_score(votes, item_hour_age, gravity=1.8):
return (votes - 1) / pow((item_hour_age+2), gravity)


Given the tables:

posts:
id | title | time_submitted

votes:
id | postid | userid | score


How would I pull the data from the database? The ideal solution (most efficient) would be to construct a MySQL query to retrieve the top 10 posts ranked using the algorithm. But given that Hacker News has it implemented in Arc, it makes me think they pull out all the posts then run them through the algorithm to rank them.

Reddit also comes to mind for this... They use a non-relational database schema so I would assume they, like Hacker News, perform the rankings in their code - not the database.

How would you implement this?

EDIT: one post can have many votes as I would like to log which user votes on which post.

Answer

You can use the data you need in the ORDER BY clause.

SELECT p.id, p.title, p.time_submitted, SUM(v.score) as num_votes 
  FROM posts p, votes v
 WHERE v.postid = p.id
GROUP BY p.id
ORDER BY 
   (SUM(v.score) - 1) / POW(TIMESTAMPDIFF(HOUR,p.time_submitted,NOW()) + INTERVAL 2 HOUR, 1.8) DESC
LIMIT 100
Comments