Python for the JVM Engineer

Background Jobs

Ravinder · 5 min read

Every production service eventually needs to move work off the request path: send an email after signup, resize an image after upload, generate a report on a schedule. Java engineers reach for Quartz Scheduler, Spring @Scheduled, Spring Batch, or an MQ-backed @JmsListener. Python's equivalent ecosystem is centred on Redis-backed task queues — and Celery is its Quartz plus Spring Batch rolled into one.

The Architecture Pattern

flowchart LR
    WebApp["Web app\n(FastAPI / Django)"] -->|"enqueue task"| Broker["Message broker\n(Redis / RabbitMQ)"]
    Broker --> W1["Worker process 1"]
    Broker --> W2["Worker process 2"]
    Broker --> W3["Worker process 3"]
    W1 -->|"store result"| Backend["Result backend\n(Redis / DB)"]
    W2 --> Backend
    W3 --> Backend
    WebApp -->|"poll result"| Backend

This is the same producer-consumer pattern as Spring's @JmsListener or @KafkaListener, but the broker is typically Redis (not a full JMS broker) and the workers are separate OS processes — each with its own GIL, bypassing the threading limitation discussed in post 3.

Celery — The Full-Featured Default

Celery is the most widely deployed Python task queue. It supports multiple brokers (Redis, RabbitMQ), multiple result backends, periodic tasks (the Beat scheduler), retries, rate limiting, and task routing.

# tasks.py
from celery import Celery
 
app = Celery(
    "myproject",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/1",
)
 
app.conf.update(
    task_serializer="json",
    result_serializer="json",
    accept_content=["json"],
    task_track_started=True,
)
 
@app.task(bind=True, max_retries=3, default_retry_delay=60)
def send_email(self, user_id: int, template: str) -> dict:
    try:
        # ... actual email logic
        return {"status": "sent", "user_id": user_id}
    except Exception as exc:
        raise self.retry(exc=exc)
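A worker pool is then started from the command line — the `-A` flag points at the module that defines the Celery app:

```shell
# Start a worker pool reading from the Redis broker configured in tasks.py
celery -A tasks worker --loglevel=info --concurrency=4
```

This is the Celery analogue of the thread pool behind Spring's `@Async`, except each worker slot is an OS process.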

Enqueuing from the web layer:

# In a FastAPI route
from fastapi import FastAPI
from celery.result import AsyncResult

from tasks import send_email

app = FastAPI()

@app.post("/users/{user_id}/welcome")
async def welcome_user(user_id: int):
    task = send_email.delay(user_id, "welcome")     # enqueue; returns immediately
    return {"task_id": task.id}

@app.get("/tasks/{task_id}")
async def get_task_status(task_id: str):
    result = AsyncResult(task_id)
    return {"status": result.status, "result": result.result}

Java analogue — Spring with @Async and a queue:

@Service
public class EmailService {
    @Async
    public CompletableFuture<Void> sendEmail(int userId, String template) {
        // email logic
        return CompletableFuture.completedFuture(null);
    }
}

The difference: Celery tasks run in separate worker processes and sit in a durable queue, so they survive an app restart; Spring @Async runs in an in-process thread pool, so queued work is lost if the JVM dies.

Periodic Tasks with Celery Beat

Celery Beat is the Quartz Scheduler equivalent — it triggers tasks on a cron-like schedule:

from celery.schedules import crontab
 
app.conf.beat_schedule = {
    "generate-daily-report": {
        "task": "tasks.generate_report",
        "schedule": crontab(hour=2, minute=0),   # 02:00 every day
    },
    "cleanup-stale-sessions": {
        "task": "tasks.cleanup_sessions",
        "schedule": 300.0,   # every 5 minutes
    },
}

Compare to Quartz:

JobDetail job = JobBuilder.newJob(GenerateReportJob.class).build();
CronTrigger trigger = TriggerBuilder.newTrigger()
    .withSchedule(CronScheduleBuilder.dailyAtHourAndMinute(2, 0))
    .build();
scheduler.scheduleJob(job, trigger);

Both are declarative cron schedulers. Celery Beat requires a dedicated beat process running alongside workers.
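That beat process is started separately (for development, the `-B` flag embeds it in a worker, but only one beat instance should ever run or tasks fire twice):

```shell
# Production: one dedicated beat process alongside the workers
celery -A tasks beat --loglevel=info

# Development convenience: embed beat in a single worker
celery -A tasks worker -B --loglevel=info
```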

RQ (Redis Queue) — Simpler Alternative

RQ is a lighter alternative to Celery — just Redis, no broker configuration complexity:

from rq import Queue
from redis import Redis
from mymodule import long_running_task
 
redis_conn = Redis(host="localhost", port=6379)
q = Queue(connection=redis_conn)
 
job = q.enqueue(long_running_task, arg1, arg2, job_timeout=120)
print(job.id, job.get_status())

Workers are started with a single command:

rq worker --with-scheduler

RQ's simplicity makes it the right choice for small-to-medium workloads where Celery's configuration surface is overhead. The trade-off: Redis is the only supported broker, and there are fewer built-in features (rate limiting, canvas primitives, complex routing).
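"Limited" retries does not mean none: recent RQ versions ship a `Retry` helper for simple per-job retry schedules. A sketch, reusing the queue and task module from above (requires a running Redis):

```python
from redis import Redis
from rq import Queue, Retry

from mymodule import long_running_task  # same task module as above

q = Queue(connection=Redis(host="localhost", port=6379))

# Retry up to 3 times, waiting 10s, 30s, then 60s between attempts
q.enqueue(long_running_task, retry=Retry(max=3, interval=[10, 30, 60]))
```

What you don't get is Celery-style per-task rate limiting or exponential backoff out of the box — that's where Dramatiq comes in.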

Dramatiq — The Modern Contender

Dramatiq is a newer library that prioritises reliability and simplicity over feature completeness:

import dramatiq
from dramatiq.brokers.redis import RedisBroker
 
broker = RedisBroker(host="localhost")
dramatiq.set_broker(broker)
 
@dramatiq.actor(max_retries=5, min_backoff=1000, max_backoff=60_000)
def process_payment(order_id: int, amount: float) -> None:
    # payment logic
    pass
 
# Enqueue
process_payment.send(order_id=123, amount=99.99)
 
# Schedule for later
process_payment.send_with_options(
    args=(123, 99.99),
    delay=30_000   # 30 seconds
)

Dramatiq uses exponential backoff by default and has a cleaner middleware system than Celery. It lacks a built-in beat scheduler — use APScheduler or rocketry for periodic tasks.
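One common pairing for periodic work is APScheduler driving Dramatiq actors. A minimal sketch, assuming the `process_payment` actor above lives in a `tasks` module with its broker already configured (module name is illustrative):

```python
from apscheduler.schedulers.blocking import BlockingScheduler

from tasks import process_payment  # the Dramatiq actor defined above

scheduler = BlockingScheduler()

# Enqueue the actor every night at 02:00; Dramatiq workers do the actual work
scheduler.add_job(
    lambda: process_payment.send(123, 99.99),
    trigger="cron", hour=2, minute=0,
)

scheduler.start()  # blocks; run this as its own process, like Celery Beat
```

The division of labour mirrors Celery Beat: the scheduler process only enqueues; the worker fleet executes.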

Choosing the Right Tool

flowchart TD
    Q1{"Need periodic\nscheduled tasks?"}
    Q1 -- Yes --> Celery["Celery + Beat\n(full-featured scheduler)"]
    Q1 -- No --> Q2{"Need multiple\nbrokers or complex routing?"}
    Q2 -- Yes --> Celery
    Q2 -- No --> Q3{"Team prefers\nsimplicity over features?"}
    Q3 -- Yes --> RQ["RQ\n(Redis only, simple API)"]
    Q3 -- No --> Dramatiq["Dramatiq\n(modern, reliable, clean API)"]
| Feature | Celery | RQ | Dramatiq |
| --- | --- | --- | --- |
| Broker support | Redis, RabbitMQ | Redis only | Redis, RabbitMQ |
| Periodic tasks | Built-in (Beat) | Via rq-scheduler | External (APScheduler, Periodiq) |
| Retry / backoff | Manual (`self.retry`) | Limited (`Retry` helper) | Built-in (exponential) |
| Monitoring UI | Flower | RQ Dashboard | dramatiq_dashboard |
| Configuration surface | High | Low | Medium |

Key Takeaways

  • Python task queues (Celery, RQ, Dramatiq) map to Spring @Async + JMS/Kafka listener, but workers run in separate processes — inherently bypassing the GIL.
  • Celery is the Quartz + Spring Batch equivalent: periodic tasks, complex routing, multiple brokers, result tracking.
  • RQ is the simplest option for Redis-backed queues — correct choice when you want to be up in 30 minutes.
  • Dramatiq offers modern retry semantics and a clean API without Celery's configuration complexity.
  • Always use a separate result backend (Redis DB index 1, or a SQL table) from the broker — avoid mixing task metadata with queue data.
  • Monitor workers in production: Celery Flower, RQ Dashboard, or Prometheus metrics via broker exporters — equivalent to JMX metrics on Spring Batch jobs.