Saqib Ali Saqib Ali - 11 months ago 46
SQL Question

How to find top-X highest values in column using Django Queryset without cutting off ties at the bottom?

I have the following Django Model:

class myModel(models.Model):
name = models.CharField(max_length=255, unique=True)
score = models.FloatField()

There are thousands of values in the DB for this model. I would like to efficiently and elegantly use that QuerySets alone to get the top-ten highest scores and display the names with their scores in descending order of score. So far it is relatively easy.

Here is where the wrinkle is: If there are multiple myModels who are tied for tenth place, I want to show them all. I don't want to only see some of them. That would unduly give some names an arbitrary advantage over others. If absolutely necessary, I can do some post-DB list processing outside of Querysets. However, the main problem I see is that there is no way I can know apriori to limit my DB query to the top 10 elements since for all I know there may be a million records all tied for tenth place.

Do I need to get all the myModels sorted by score and then do one pass over them to calculate the score-threshold? And then use that calculated score-threshold as a filter in another Queryset?

If I wanted to write this in straight-SQL could I even do it in a single query?

Answer Source

Of course you can do it in one SQL query. Generating this query using django ORM is also easily achievable.

top_scores = (myModel.objects
                     .values_list('score', flat=True)
top_records = (myModel.objects

This should generate single SQL query (with subquery).