
Import celery task without importing dependencies

I have two modules:

alpha.py
beta.py

beta.py can only be run on beta.server because it requires a licensed solver that only exists on beta.server.

Within alpha.py, there's a portion of code that calls:

beta_task.apply_async(kwargs={...})

As such, it requires

from beta import beta_task

which in turn requires the magical proprietary module that is only available on beta.server.

I need to enable alpha_task to run on alpha.server, with the ability to call beta_task without having the beta_task code on that server.

Is this possible?

UPDATE



Also, can I prevent beta_task from running on alpha.server?

Since alpha.py imports beta.py, the daemon finds beta_task and listens for tasks of this type:

- ** ---------- [config]
- ** ---------- .> app: app_app
- ** ---------- .> transport: asdfasdfasd
- ** ---------- .> results: adfasdfasdf
- *** --- * --- .> concurrency: 12 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery

[tasks]
. alpha.alpha_task
. beta.beta_task

Answer

I ran into this before but never got it to work "right". I used a hacky workaround instead.

You can put the import proprietary statement inside the beta_task def itself. Your alpha file doesn't actually run the beta def; it only uses the task object created by celery's task decorator to dispatch a message about it.

While PEP 8 dictates that imports go at the top of the module in the outermost scope, it's actually common practice for widely used PyPI packages to place an import within a registration or called function, so that uninstalled dependencies for unused features won't break the package. For example, a caching library will import the redis/memcached modules within the backend activation, so those third-party modules aren't needed unless that backend is used.

alpha.py

from beta import beta_task

beta_task.apply_async(kwargs={...})

beta.py

from celery import task  # on newer celery, use app.task / shared_task instead

@task
def beta_task(args):
    # deferred import: this only runs when the task executes on a worker,
    # so alpha.server can import beta.py without having `proprietary`
    import proprietary
    proprietary.foo()
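
Because the call is just a message, you don't even strictly need the import on alpha.server. As a minimal sketch (the broker URL and kwargs here are made-up placeholders, not from the question), celery can also dispatch by registered task name alone:

from celery import Celery

# point this at the same broker both servers share (hypothetical URL)
app = Celery('app_app', broker='amqp://guest@broker.host//')

# no `from beta import beta_task` required: the consuming worker looks
# the task up by the name it was registered under ("beta.beta_task")
app.send_task('beta.beta_task', kwargs={'x': 1})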

For the Update about running different tasks on each server: that is all covered in the "routing" chapter of the celery docs: http://docs.celeryproject.org/en/latest/userguide/routing.html

You basically configure different queues (one for alpha, one for beta); start each worker so it only handles the queues you specify; and either name the queue in the call to apply_async or configure celery to match a task to a queue (there are several ways to do that, all explained in that chapter with examples). A sketch of that setup follows below.
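
As a hedged sketch, assuming the task names from the [tasks] listing above (the app name and everything else here is a placeholder):

# celeryconfig.py, shared by both servers
CELERY_ROUTES = {  # spelled task_routes on celery 4+
    'alpha.alpha_task': {'queue': 'alpha'},
    'beta.beta_task': {'queue': 'beta'},
}

# then start each worker so it only consumes its own queue:
#   on alpha.server:  celery -A app worker -Q alpha
#   on beta.server:   celery -A app worker -Q beta

# or skip the config and name the queue per call:
beta_task.apply_async(kwargs={...}, queue='beta')

With that in place, the worker on alpha.server never consumes from the beta queue, so beta_task messages wait on the broker until a beta.server worker picks them up.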
