SIGSEGV: NULL pointer in rb_llhttp_data_callback_call / rb_llhttp_on_body when underlying connection is reset mid-body #36

@TestardR

Summary

rb_llhttp_data_callback_call (and by extension rb_llhttp_on_body, rb_llhttp_on_url, rb_llhttp_on_status, rb_llhttp_on_header_field, rb_llhttp_on_header_value) passes the raw data pointer received from Node's llhttp parser straight into rb_str_new(data, length) with no NULL check. In production, when an HTTP response body is being parsed and the underlying TCP connection is reset/closed mid-body, Node's llhttp invokes the body callback with data == NULL (and typically length == 0). rb_str_new(NULL, 0) dereferences the NULL pointer and the entire Ruby process is killed by SIGSEGV.

Affected source (current main, mri/ext/llhttp/llhttp_ext.c):

https://github.com/bryanp/llhttp/blob/main/mri/ext/llhttp/llhttp_ext.c#L58-L60

void rb_llhttp_data_callback_call(VALUE delegate, ID method, char *data, size_t length) {
  rb_funcall(delegate, method, 1, rb_str_new(data, length));
}

https://github.com/bryanp/llhttp/blob/main/mri/ext/llhttp/llhttp_ext.c#L118-L124

int rb_llhttp_on_body(llhttp_t *parser, char *data, size_t length) {
  rb_llhttp_parser_data *parserData = (rb_llhttp_parser_data*) parser->data;

  rb_llhttp_data_callback_call(parserData->delegate, parserData->on_body, data, length);

  return 0;
}

Real-world crash

Observed on production Ruby 4.0 / http gem 6.0.3 / llhttp 0.6.1, in a Sidekiq worker process whose http gem connection was a long-lived ld-eventsource (LaunchDarkly streaming) SSE stream. When the SSE connection is reset by the upstream LB or a network blip, on_body is invoked with data = NULL and the worker dies with SIGSEGV.

Crash signature

[BUG] Segmentation fault at 0x0000000000000000

-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.4.0(rb_print_backtrace+0x12)
/usr/local/lib/libruby.so.4.0(rb_vm_bugreport)
/usr/local/lib/libruby.so.4.0(rb_bug_for_fatal_signal+0xf4)
/usr/local/lib/libruby.so.4.0(sigsegv+0x42)
/app/vendor/bundle/ruby/4.0.0/gems/llhttp-0.6.1/ext/llhttp/llhttp.c:15624
                                          ↑ llhttp__internal__run dereferences NULL
rb_llhttp_data_callback_call (null):0
rb_llhttp_on_body

Ruby stack at crash

http-6.0.3/lib/http/response/parser.rb:57:in '<<'
http-6.0.3/lib/http/connection/internals.rb:134:in 'read_more'
http-6.0.3/lib/http/connection.rb:133:in 'readpartial'
http-6.0.3/lib/http/response/body.rb:61:in 'readpartial'

Impact

  • Single SIGSEGV kills the entire Ruby process.
  • Any long-running consumer of the http gem on an unstable upstream (SSE streams, long-poll, HTTP/1.1 keepalive with idle timeout) is affected.
  • Cannot be rescued at the Ruby level (signal is delivered before Ruby control returns).
  • In our environment: ~1 segfault/day per active streaming connection across a Sidekiq + Puma fleet. Sidekiq + K8s auto-restart papered over customer impact, but the alerts and pod churn are real operational cost.

Proposed fix (one line per callback, or one line in the shared helper)

In the shared helper, guard before calling rb_str_new:

 void rb_llhttp_data_callback_call(VALUE delegate, ID method, char *data, size_t length) {
+  if (data == NULL) return;
   rb_funcall(delegate, method, 1, rb_str_new(data, length));
 }

A data == NULL callback with length == 0 semantically conveys no new bytes for that field, so dropping the callback is the correct behavior — equivalent to what Node's llhttp consumers do (Node itself null-guards before invoking JS callbacks).

If preferred, the guard can live in each rb_llhttp_on_* instead, returning 0 (continue parsing) when data == NULL. Functionally equivalent for this bug class.
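
For illustration, the per-callback placement might look roughly like the sketch below. It is not the gem's code: mock_delegate, mock_parser, mock_data_callback_call and mock_on_body are hypothetical stand-ins for the Ruby delegate, llhttp_t, rb_llhttp_data_callback_call and rb_llhttp_on_body, chosen so the snippet compiles without ruby.h or llhttp.h. It only shows the shape of the guard and the "return 0 to keep parsing" behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the Ruby delegate: records what it was given. */
typedef struct {
  char   body[256];  /* bytes delivered to the delegate so far */
  size_t body_len;   /* number of valid bytes in body */
  int    calls;      /* how many times the delegate was invoked */
} mock_delegate;

/* Hypothetical stand-in for llhttp_t: only the user-data pointer matters here. */
typedef struct {
  void *data;
} mock_parser;

/* Stand-in for rb_llhttp_data_callback_call: hands the bytes to the delegate.
 * Like the real helper, it assumes data is a valid pointer. */
void mock_data_callback_call(mock_delegate *delegate, const char *data, size_t length) {
  size_t room = sizeof(delegate->body) - delegate->body_len;

  if (length > room) {
    length = room;  /* clamp so the mock buffer cannot overflow */
  }

  memcpy(delegate->body + delegate->body_len, data, length);
  delegate->body_len += length;
  delegate->calls++;
}

/* Stand-in for rb_llhttp_on_body with the guard placed in the callback. */
int mock_on_body(mock_parser *parser, const char *data, size_t length) {
  assert(parser != NULL && parser->data != NULL);

  if (data == NULL) {
    return 0;  /* nothing to deliver; 0 lets the parser continue */
  }

  mock_data_callback_call((mock_delegate *) parser->data, data, length);

  return 0;
}
```

Calling mock_on_body with a real buffer and then with NULL leaves the delegate with exactly one recorded call, which is the behavior a regression spec would assert on the Ruby side.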

Reproduction outline

Not trivial to reproduce deterministically because it requires Node's llhttp to land in the specific state where it calls the body callback with NULL. Easiest paths:

  1. Open a streaming HTTP response with the http gem against a server that sends Transfer-Encoding: chunked, then have the server send a chunk header followed by a TCP RST before any body bytes. Some load balancers will produce this on health-check failover.
  2. Synthetic: drive llhttp_execute(parser, "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n0\r\n\r\n", N) with N varying around chunk boundaries, then call again with data = NULL, length = 0 to flush — the on_body callback is reached with NULL.

(Happy to put together a failing spec against the gem's existing test suite if a maintainer confirms the fix direction.)

Why we're filing instead of PR-ing directly

We hit this in production at Qonto via LaunchDarkly's Ruby SDK → ld-eventsource → http gem → llhttp. Our local resolution is to accept the transient (Sidekiq retry + K8s pod restart handles it for us), but the bug is general enough that other Ruby projects using the http gem on long-lived connections are very likely hitting it too. Filing here so the broader community benefits, and so we can drop our internal "accept and move on" note once an llhttp 0.6.2 ships.

If you'd like a PR with the NULL guard + a regression spec, happy to send one.

— Romain (Qonto, App Systems backend)
