Boris Gaganelov Boris Gaganelov - 15 days ago 4
Python Question

Django weighted query (annotated values)

I am attempting to created a query and sort it based on a custom calculation of weights.

I require some help as the solution I have come to does indeed work but the performance is not where I'd like it to be

What I have is a Media object. It has related Comments, Likes and Orders.

What currently works but is a complete hacky mess is the following query:

products = Media.objects \
.select_related(
'image',
'currency',
'user',
'user__image',
) \
.prefetch_related('category', 'tags') \
.exclude(is_deleted=1) \
.filter(Q(category__category__in=categories) | Q(tags__tag__title=query)) \
.annotate(order_count = Count('orders', distinct=True)) \
.annotate(comment_count = Count('comments', distinct=True)) \
.annotate(like_count = Count('likes', distinct=True)) \
.annotate(weight = Count(0)) \
.distinct()

for m in products.iterator():
initial_weight = int(m.order_count)*40 + int(m.comment_count)*4 + int(m.like_count)*4 + int(m.clicks)
m.weight = float(float(initial_weight) - float(m.views/50))


As you can see I am separately annotating all the parameters I'll be using and then doing a stupid iteration full of arithmetic operations for EVERY item in the queryset which is very sub optimal.

One thing I attempted doing was the following:

products = Media.objects \
.select_related(
'image',
'currency',
'user',
'user__image',
) \
.prefetch_related('category', 'tags') \
.exclude(is_deleted=1) \
.filter(Q(category__category__in=categories) | Q(tags__tag__title=query)) \
.annotate(weight = Count('orders', distinct=True) * 40 + Count('comments', distinct=True) * 4 + Count('likes', distinct=True) - F('views')/50 + F('clicks'))


But similar operations in the annotation were impossible (tried a few variations with and without Sum() - Django always complained that the annotated values were of different type.

By the way we're using django 1.8 for this project.

Is there a good single-query aproach to getting my desired sorting weights?

EDIT:

My final solution (thanks to Назар)

media_list = Media.objects \
.select_related('user', 'user__image', 'image', 'currency') \
.filter(user__in=suggested_users) \
.filter(created_at__gt=seven_days_ago) \
.exclude(sold=True) \
.annotate(order_count = Count('orders', distinct=True)) \
.annotate(comment_count = Count('comments', distinct=True)) \
.annotate(like_count = Count('likes', distinct=True)) \
.annotate(
initial_weight=ExpressionWrapper((F('order_count') * 40 + F('comment_count') * 4 + \
F('like_count') * 4 + F('clicks')), output_field=FloatField())
) \
.annotate(
views_divided=ExpressionWrapper(F('views') / Decimal(50.0), output_field=FloatField())
) \
.annotate(weight=F('initial_weight') - F('views_divided')) \
.distinct()

Answer

First, you would need to make sure that division would yield floats (without rounding). You would need something like this (disgracefully stolen here):

ExpressionWrapper(
    (F('views') / Decimal(50.0), 
    output_field=FloatField()),
)

So, query would look like this:

products = Media.objects \
    .exclude(is_deleted=1) \
    .filter(Q(category__category__in=categories) | Q(tags__tag__title=query)) \
    .annotate(order_count = Count('orders', distinct=True)) \
    .annotate(comment_count = Count('comments', distinct=True)) \
    .annotate(like_count = Count('likes', distinct=True)) \
    .annotate(weight = Count(0)) \
    .annotate(
        initial_weight=F('order_count') * 40 + F('comment_count') * 4 + \
                       F('like_count') * 4 + F('clicks')
     ) \
    .annotate(
        views_divided=ExpressionWrapper((F('views') / Decimal(50.0), 
                                        output_field=FloatField()))
     ) \
    .annotate(weight=F('initial_weight') - F('views_divided')) \
    .distinct()

Looks ugly, but should work (I think).

On the side note - if you only need to calculate weight, you don't actually have to use prefetch_related and select_realted, django will take care of that stuff itself (however, that's just my speculation - if you actually use those foreign keys later in the code, then it's justified).