Dean Christian Armada Dean Christian Armada - 2 months ago 8
Python Question

Execute a code at some point of time

I am creating a web app made of python flask.. What I want is if I created an invoice it automatically send an e-mail on its renewal date to notify that the invoice is renewed and so on.. Please take note that renewal occurs every 3 months The code is something like:

def some_method(): // An API Enpoint that adds an invoice from a request
// Syntax that inserts the details to the database
// Let's assume that the the start of the renewal_date is December 15, 2016

How will the automatic email execution be achieved? Without putting too much stress in the backend? Because I'm guessing that if there are 300 invoices then the server might be stressed out


"Write it yourself." No, really.

Keep a database table of invoices to be sent. Every invoice has a status (values such as pending, sent, paid, ...) and an invoice date (which may be in the future). Use cron or similar to periodically run (e.g. 1/hour, 1/day) a program that queries the table for pending invoices for which the invoice date/time has arrived/passed, yet no invoice has yet been sent. This invoicing process sends the invoice, updates the status, and finishes. This invoicer utility will not be integral to your Flask web app, but live beside it as a support program.

Why? It's simple and direct approach. It keeps the invoicing code in Python, close to your chosen app language and database. It doesn't require much excursion into external systems or middleware. It's straightforward to debug and monitor, using the same database, queries, and skills as writing the app itself. Simple, direct, reliable, done. What's not to love?

Now, I fully understand that a "write it yourself" recommendation runs contrary to typical "buy not build" doctrine. But I have tried all the main alternatives such as cron and Celery; my experience with a revenue-producing web app says they're not the way to go for a few hundred long-term invoicing events.

The TL;DR--Why Not Cron and Celery?

cron and its latter-day equivalents (e.g. launchd or Heroku Scheduler) run recurring events. Every 10 minutes, every hour, once a day, every other Tuesday at 3:10am UTC. They generally don't solve the "run this once, at time and date X in the future" problem, but they are great for periodic work.

Now that last statement isn't strictly true. It describes cron and some of its replacements. But even traditional Unix provides at as side-car to cron, and some cron follow-ons (e.g. launchd, systemd) bundle recurring and future event scheduling together (along with other kitchen appliances and the proverbial sink). Evens so, there are some issues:

  1. You're relying on external scheduling systems. That means another interface to learn, monitor, and debug if something goes wrong. There's significant impedance mismatch between those system-level schedulers and your Python app. Even if you farm out "run event at time X," you still need to write the Python code to send the invoice and properly move it along your business workflow.
  2. Those systems, beautiful for a handful of events, generally lack interfaces that make reviewing, modifying, monitoring, or debugging hundreds of outstanding events straightforward. Debugging production app errors amidst an ocean of scheduled events is harrowing. You're talking about committing 300+ pending events to this external system. You must also consider how you'll monitor and debug that use.
  3. Those schedulers are designed for "regular" not "high value" or "highly available" operations. As just one gotcha, what if an event is scheduled, but then you take downtime (planned or unplanned)? If the event time passes before the system is back up, what then? Most of the cron-like schedulers lack provisions for robustly handling "missed" events or "making them up at the earliest opportunity." That can be, in technical terms, "a bummer, man." Say the event triggered money collection--or in your case, invoice issuance. You have hundreds of invoices, and issuing those invoices is presumably business-critical. The capability gaps between system-level schedulers and your operational needs can be genuinely painful, especially as you scale.

Okay, what about driving those events into an external event scheduler like Celery? This is a much better idea. Celery is designed to run large number of app events. It supports various backend engines (e.g. RabbitMQ) proven in practice to handle thousands upon untold thousands of events, and it has user interface options to help deal with event multitudes. So far so good! But:

  1. You will find yourself dealing with the complexities of installing, configuring, and operating external middleware (e.g. RabbitMQ). The effort yields very high achievable scale, but the startup and operational costs are real. This is true even if you farm much of it to a cloud service like Heroku.
  2. More important, while great as a job dispatcher for near-term events, Celery is not ideal as a long-wait-time scheduler. In production I've seen serious issues with "long throw" events (those posted a month, or in your case three months, in the future). While the problems aren't identical, just like cron etc., Celery long-throw events intersect ungracefully with normal app update and restart cycles. This is environment-dependent, but happens on popular cloud services like Heroku.

The Celery issues are not entirely unsolvable or fatal, but long-delay events don't have the same "Wow! Celery made everything work so much better!" magic for long-delay events that it does with swarms of near-term events. And you must become a bit of a Celery, RabbitMQ, etc. engineer and caretaker. That's a high price and a lot of work for just scheduling a few hundred invoices.

In summary: While future invoice scheduling may seem like something to farm out, in practice it will be easier, faster, and more immediately robust to keep that function in your primary app code (not directly in your Flask web app, but as an associated utility), and just farm out the "remind me to run this N times a day" low-level tickler to a system-level job scheduler.