InterfaceError in outbox process_shard causes 500 on /oauth/token/ #118610

Description

@macuzi

Summary

POST /oauth/token/ (grant_type=refresh) returns a 500 HTML error page to integrators when the DB connection drops during hybrid cloud outbox processing. The token refresh itself succeeds (the new token is written and replicated), but an unhandled InterfaceError: connection already closed raised in process_shard propagates to the HTTP response.
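The ordering described above can be illustrated with a minimal, self-contained sketch. It does not use Sentry's code: the InterfaceError class, the in-memory tokens store, and the refresh_token, process_shard, and handle_token_request functions are all hypothetical stand-ins, chosen only to show how a failure in a post-write cleanup step can turn an already successful refresh into a 500.

```python
class InterfaceError(Exception):
    """Stand-in for the database driver's InterfaceError (hypothetical)."""


tokens = {}  # hypothetical in-memory token store


def refresh_token(old):
    """Write the new token; in this sketch the write always succeeds."""
    new = old + "-refreshed"
    tokens[old] = new
    return new


def process_shard():
    """Simulate the cleanup step hitting an already closed connection."""
    raise InterfaceError("connection already closed")


def handle_token_request(old):
    """Return (status_code, body), mirroring the reported ordering."""
    try:
        new = refresh_token(old)
        process_shard()  # runs after the token has already been written
    except Exception:
        # Stand-in for the framework's generic unhandled-exception path.
        return 500, None
    return 200, new


status, body = handle_token_request("abc")
print(status, tokens["abc"])  # prints: 500 abc-refreshed
```

The caller sees a 500 with no body, while the store already holds the new token, which is the mismatch integrators run into.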

Root cause

After refresh_token.refresh() completes and the updated token is replicated via an internal RPC call, sentry/hybridcloud/models/outbox.py::process_shard attempts outbox coalescing/cleanup. At that point the PostgreSQL connection drops (OperationalError: server closed the connection unexpectedly). The @auto_reconnect_cursor decorator in sentry/db/postgres/decorators.py only catches reconnectable errors. This InterfaceError (connection already fully closed) is not among them, so it slips through, surfaces as an unhandled exception, and Django returns a 500.

  • Culprit: sentry/hybridcloud/models/outbox.py::process_shard via /oauth/token/
  • Environment: control plane (getsentry-control-web-default-common-production-*)
  • Sentry issue: SENTRY-42W9 — 130k+ events, first seen June 2025, ongoing
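To make the "slips through" behaviour concrete, here is a rough sketch of a reconnect decorator that retries only on a configured set of exception types. It is not the actual @auto_reconnect_cursor implementation: reconnect_on, FakeConnection, and the two exception classes are hypothetical stand-ins, and whether retrying on InterfaceError is the appropriate fix for process_shard is an assumption, not something this issue establishes.

```python
import functools


class OperationalError(Exception):
    """Stand-in for the driver's OperationalError (hypothetical)."""


class InterfaceError(Exception):
    """Stand-in for the driver's InterfaceError (hypothetical)."""


def reconnect_on(*retryable):
    """Retry the wrapped call once, after reconnecting, for listed errors."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(conn, *args, **kwargs):
            try:
                return func(conn, *args, **kwargs)
            except retryable:
                conn.reconnect()
                return func(conn, *args, **kwargs)
        return wrapper
    return decorator


class FakeConnection:
    """Raises first_error on execute() until reconnect() is called."""

    def __init__(self, first_error):
        self.first_error = first_error
        self.reconnects = 0

    def reconnect(self):
        self.reconnects += 1
        self.first_error = None

    def execute(self):
        if self.first_error is not None:
            raise self.first_error
        return "ok"


@reconnect_on(OperationalError)
def narrow_cleanup(conn):
    # Only OperationalError triggers a reconnect; InterfaceError propagates.
    return conn.execute()


@reconnect_on(OperationalError, InterfaceError)
def wide_cleanup(conn):
    # InterfaceError is also treated as reconnectable in this variant.
    return conn.execute()
```

With a connection that raises InterfaceError("connection already closed") on first use, narrow_cleanup lets the exception propagate to the caller, while wide_cleanup reconnects once and returns normally.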

Impact

Integrators receive a 500 on token refresh even though the new token was already written. They cannot distinguish this from a true failure, so their integration breaks. This is the underlying cause behind reports in #94190.

Workaround

The /authorizations/ endpoint now supports a manual refresh grant type (added Nov 2025, see docs). This lets integrators recover from the broken state, but it does not prevent the 500 from occurring.

Ticket for reference: #215474829333534

Metadata

    Status
    Waiting for: Support
