February 3, 2026
Gunicorn Worker Timeouts and Sentry Cleanup
I encountered a traceback in my logs during a period of high traffic. The application is running on Gunicorn with Sentry for error monitoring. The logs showed a WORKER TIMEOUT, followed by an attempt by Sentry to send events, ultimately resulting in a SystemExit: 1 and a SIGKILL.
Here is what the logs looked like:
[CRITICAL] WORKER TIMEOUT (pid:551)
Exception ignored in atexit callback ...:
Traceback (most recent call last):
File ".../sentry_sdk/integrations/atexit.py", line 52, in _shutdown
client.close(callback=integration.callback)
...
File ".../threading.py", line 373, in wait
gotit = waiter.acquire(True, timeout)
File ".../gunicorn/workers/base.py", line 204, in handle_abort
sys.exit(1)
SystemExit: 1
[ERROR] Worker (pid:551) was sent SIGKILL! Perhaps out of memory?
At first glance, it looks like Sentry caused the crash, or maybe an OOM (out-of-memory) error, as Gunicorn suggests. But the real issue is a race condition during shutdown.
The Breakdown
- The Timeout: The worker was restarting because it reached the max_requests limit. This is expected behavior, designed to prevent memory leaks by recycling workers after a set number of requests (see the config sketch below). The “timeout” message is misleading; it just means the worker didn’t complete its shutdown within the configured timeout window.
- The Signal: The master process considers the worker “frozen” and sends a signal (usually SIGABRT or SIGTERM) to kill it.
- The Cleanup Conflict: The worker catches the signal and begins to exit. Sentry’s SDK hooks into the atexit handler to ensure pending events are flushed to Sentry’s servers before the process dies, so it pauses the exit to send those network requests.
- The Force Kill: Gunicorn doesn’t wait forever. Since the worker didn’t die immediately (because it was busy flushing Sentry events), the master process followed up with a SIGKILL, effectively pulling the plug.
This results in the traceback: the process was killed while Sentry was still waiting on its internal queue.
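For reference, these are the Gunicorn settings involved in that dance. A minimal gunicorn.conf.py sketch; the values are illustrative, not the ones from my deployment.
# gunicorn.conf.py -- illustrative values, not tuned recommendations
max_requests = 1000        # recycle a worker after this many requests
max_requests_jitter = 50   # add jitter so workers don't all recycle at once
timeout = 30               # master aborts a worker that is silent this long
graceful_timeout = 30      # grace period for a worker to finish shutting down
The recycling itself is healthy; the problem is only what happens in the few seconds between the abort signal and the SIGKILL.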
The Solution
There are two parts to fixing this.
First, if your workers are legitimately busy (as in my case with high traffic), you might need a higher timeout to give them enough time to shut down gracefully.
# Increase the worker timeout (default is 30s)
gunicorn --timeout 60 ...
Second, and more importantly for the traceback, configure Sentry to stop blocking the shutdown process. You can limit how long Sentry waits to flush events.
import sentry_sdk
sentry_sdk.init(
# ...
shutdown_timeout=2, # Give Sentry max 2 seconds to flush on exit
)
By setting shutdown_timeout, you ensure that if Sentry can’t get the data out quickly, it abandons the effort and lets the worker die cleanly, avoiding the SIGKILL and the noise in your logs.
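If you want belt-and-braces control, the SDK also exposes an explicit flush you can call yourself with a bounded wait. A minimal sketch, assuming sentry_sdk.flush fits your shutdown path (the two-second value is just an example):
import sentry_sdk

# Push any queued events with a bounded wait (value is illustrative).
# If the transport can't drain in time, the call returns and shutdown continues.
sentry_sdk.flush(timeout=2)
Either way, the principle is the same: keep Sentry’s shutdown work comfortably shorter than Gunicorn’s kill window, so the worker can exit before the master pulls the plug.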