Ajoy Ajoy - 2 months ago 6
SQL Question

Django - Group by column and where on a different column

These are my django models (simplified):

class Status(models.Model):
status = models.CharField(max_length=20)

class Project(models.Model):
client = models.ForiegnKey(Client)

class TicketRequest(models.Model):
status = models.ForiegnKey(Status, related_name='ticket_requests')
project = models.ForiegnKey(Project)
created = models.DateTimeField()


Required result



status.value | client.id | count
--------------+-------------+---------
is_assigned | 4 | 2
is_closed | 4 | 66
is_open | 4 | 7


Queryset annotate



Status.objects.filter(
ticket_requests__project__client_id=4
).values('value').annotate(
count=Count('ticket_requests__project__client')
)


returns

[{'count': 2, 'value': u'is_assigned'}, {'count': 66, 'value': u'is_closed'}, {'count': 7, 'value': u'is_open'}]


which is exactly what is required.

But I need to filter the queryset on
TicketRequest.created
with filters such as today, this week and month.


Since these filters need to be reused a lot, I created a handy helper:

def qs_time_range(qs, time_range, field_name):
now = timezone.now()

if time_range == 'month':
past = now - timedelta(days=30)

return qs.filter(
**{field_name + '__date__range': [past, now]}
)


Problem



When I use the helper to filter the ticket requests, the results are not what I expect.

data = filters.qs_time_range(
Status.objects.filter(ticket_requests__project__client_id=4),
'month',
'ticket_requests__created'
).values('value').annotate(
count=Count('ticket_requests__project__client')
)


Result

[{'count': 27, 'value': u'is_assigned'}]


When instead this should have been the result:

[{'count': 3, 'value': u'is_assigned'}, {'count': 3, 'value': u'is_closed'}, {'count': 3, 'value': u'is_open'}]


Question: Is the filtering messing up the SQL? What should I try instead?




Additional Info - Raw SQL



data.query
reveals this SQL (modified for Postgres)

SELECT "tickets_status"."value",
COUNT("projects_project"."client_id") AS "count"
FROM "tickets_status"
INNER JOIN "tickets_ticketrequest" ON ("tickets_status"."id" = "tickets_ticketrequest"."status_id")
INNER JOIN "projects_project" ON ("tickets_ticketrequest"."project_id" = "projects_project"."id")
INNER JOIN "tickets_ticketrequest" T5 ON ("tickets_status"."id" = T5."status_id")
WHERE ("projects_project"."client_id" = 4
AND T5."created"::date BETWEEN '2016-09-02' AND '2016-10-02')
GROUP BY "tickets_status"."value",
"tickets_status"."name"
ORDER BY "tickets_status"."name" ASC;


which returns

value | count
-------------+-------
is_assigned | 27
(1 row)


But this is the SQL that I want generated:

SELECT "tickets_status"."value",
COUNT("projects_project"."client_id") AS "count"
FROM "tickets_status"
INNER JOIN "tickets_ticketrequest" ON ("tickets_status"."id" = "tickets_ticketrequest"."status_id")
INNER JOIN "projects_project" ON ("tickets_ticketrequest"."project_id" = "projects_project"."id")
WHERE ("projects_project"."client_id" = 4
AND "tickets_ticketrequest"."created"::date BETWEEN '2016-09-02' AND '2016-10-02')
GROUP BY "tickets_status"."value",
"tickets_status"."name"
ORDER BY "tickets_status"."name" ASC;


which returns the right result

value | count
-------------+-------
is_assigned | 3
(3 rows)

Answer

Lovely to hear that you got your stuff working, but this answer is 1 explanation short of a reasoning behind this magic. I remembered that this works, but if someone comes up with an explanation WHY it works, then that is the real correct answer. But yeah, if django.db.models.Count doesn't work as expected, you can replace it with a combination of Sum(Case(When(field='value'), then=1), default=0, output_field=models.IntegerField()) (in Django >= 1.9).

The original comment that solved the problem:

First of all have to state the obvious: are you sure you have any closed or open ticket requests created in the last month? That would be the most obvious cause. But if you already double- and triple-checked that, then I have a vague memory of having a similar problem somewhere, but can't remember why. But I fixed it by changing Count() to Sum(Case(When(set__the__condition='set_the_value'), then=1), default=0, output_field=models.IntegerField())).