SIGSEGV: NULL pointer in rb_llhttp_data_callback_call / rb_llhttp_on_body when underlying connection is reset mid-body

## Summary

`rb_llhttp_data_callback_call` (and by extension `rb_llhttp_on_body`, `rb_llhttp_on_url`, `rb_llhttp_on_status`, `rb_llhttp_on_header_field`, `rb_llhttp_on_header_value`) passes the raw `data` pointer received from Node's llhttp parser straight into `rb_str_new(data, length)` with no NULL check. In production, when an HTTP response body is being parsed and the underlying TCP connection is reset/closed mid-body, Node's llhttp invokes the body callback with `data == NULL` (and typically `length == 0`). `rb_str_new(NULL, 0)` dereferences the NULL pointer and the entire Ruby process is killed by SIGSEGV.

Affected source (current `main`, `mri/ext/llhttp/llhttp_ext.c`):

https://github.com/bryanp/llhttp/blob/main/mri/ext/llhttp/llhttp_ext.c#L58-L60
```c
void rb_llhttp_data_callback_call(VALUE delegate, ID method, char *data, size_t length) {
  rb_funcall(delegate, method, 1, rb_str_new(data, length));
}
```

https://github.com/bryanp/llhttp/blob/main/mri/ext/llhttp/llhttp_ext.c#L118-L124
```c
int rb_llhttp_on_body(llhttp_t *parser, char *data, size_t length) {
  rb_llhttp_parser_data *parserData = (rb_llhttp_parser_data*) parser->data;

  rb_llhttp_data_callback_call(parserData->delegate, parserData->on_body, data, length);

  return 0;
}
```

## Real-world crash

Observed on production Ruby 4.0 / `http` gem 6.0.3 / `llhttp` 0.6.1, in a Sidekiq worker process whose `http` gem connection was a long-lived `ld-eventsource` (LaunchDarkly streaming) SSE stream. When the SSE connection is reset by the upstream LB or a network blip, `on_body` is invoked with `data = NULL` and the worker dies with SIGSEGV.

### Crash signature

```
[BUG] Segmentation fault at 0x0000000000000000

-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.4.0(rb_print_backtrace+0x12)
/usr/local/lib/libruby.so.4.0(rb_vm_bugreport)
/usr/local/lib/libruby.so.4.0(rb_bug_for_fatal_signal+0xf4)
/usr/local/lib/libruby.so.4.0(sigsegv+0x42)
/app/vendor/bundle/ruby/4.0.0/gems/llhttp-0.6.1/ext/llhttp/llhttp.c:15624
                                          ↑ llhttp__internal__run dereferences NULL
rb_llhttp_data_callback_call (null):0
rb_llhttp_on_body
```

### Ruby stack at crash

```
http-6.0.3/lib/http/response/parser.rb:57:in '<<'
http-6.0.3/lib/http/connection/internals.rb:134:in 'read_more'
http-6.0.3/lib/http/connection.rb:133:in 'readpartial'
http-6.0.3/lib/http/response/body.rb:61:in 'readpartial'
```

## Impact

- A single SIGSEGV kills the entire Ruby process.
- Any long-running consumer of the `http` gem on an unstable upstream (SSE streams, long-poll, HTTP/1.1 keepalive with idle timeout) is affected.
- The crash cannot be rescued at the Ruby level (the signal is delivered before control returns to Ruby).
- In our environment: ~1 segfault/day per active streaming connection across a Sidekiq + Puma fleet. Sidekiq + K8s auto-restart papered over customer impact, but the alerts and pod churn are a real operational cost.

## Proposed fix (one line per callback, or one line in the shared helper)

In the shared helper, guard before calling `rb_str_new`:

```diff
 void rb_llhttp_data_callback_call(VALUE delegate, ID method, char *data, size_t length) {
+  if (data == NULL) return;
   rb_funcall(delegate, method, 1, rb_str_new(data, length));
 }
```

A `data == NULL` callback with `length == 0` semantically conveys no new bytes for that field, so dropping the callback is the correct behavior. This is equivalent to what Node's llhttp consumers do (Node itself null-guards before invoking JS callbacks).

If preferred, the guard can live in each `rb_llhttp_on_*` callback instead, returning `0` (continue parsing) when `data == NULL`. The two placements are functionally equivalent for this bug class.
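
To make the two placements concrete, here is a minimal standalone sketch that compiles without the Ruby headers. `sink_t`, `sink_deliver`, `guarded_data_callback_call` and `guarded_on_body` are hypothetical names invented for illustration; `sink_deliver` stands in for the `rb_funcall(..., rb_str_new(data, length))` call and is not part of the gem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Ruby delegate: counts what it receives. */
typedef struct {
  size_t calls; /* number of callbacks delivered */
  size_t bytes; /* total payload bytes delivered */
} sink_t;

/* Stand-in for rb_funcall(delegate, method, 1, rb_str_new(data, length)). */
static void sink_deliver(sink_t *sink, const char *data, size_t length) {
  (void) data; /* the payload itself is not inspected in this sketch */
  sink->calls++;
  sink->bytes += length;
}

/* Option A: guard once in the shared helper. */
static void guarded_data_callback_call(sink_t *sink, const char *data, size_t length) {
  assert(sink != NULL);
  if (data == NULL) return; /* no bytes to report, so skip the delegate */
  sink_deliver(sink, data, length);
}

/* Option B: guard inside an individual callback and return 0,
 * which tells the parser to continue. */
static int guarded_on_body(sink_t *sink, const char *data, size_t length) {
  assert(sink != NULL);
  if (data == NULL) return 0;
  sink_deliver(sink, data, length);
  return 0;
}
```

Option A covers every data callback with a single check, which is why it is the variant proposed in the diff above. Option B only helps the callbacks that are individually patched.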

## Reproduction outline

This is not trivial to reproduce deterministically, because it requires Node's llhttp to land in the specific state where it calls the body callback with NULL. The easiest paths:

1. Open a streaming HTTP response with the `http` gem against a server that sends `Transfer-Encoding: chunked`, then have the server send a chunk header followed by a TCP RST before any body bytes. Some load balancers will produce this on health-check failover.
2. Synthetic: drive `llhttp_execute(parser, "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n0\r\n\r\n", N)` with `N` varying around chunk boundaries, then call it again with `data = NULL, length = 0` to flush; the `on_body` callback is then reached with NULL.
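
The synthetic path could be driven by a small harness along the following lines. This is only a sketch under stated assumptions: `feed_fn`, `recorder_t`, `recording_feed` and `feed_split_then_flush` are hypothetical names, and the recording stub takes the place of a wrapper around `llhttp_execute` so that the sketch compiles on its own. It shows the call sequence only; it does not by itself demonstrate that the parser reaches `on_body` with NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feed function. In a real harness this would wrap
 * llhttp_execute(parser, data, len) and return its error code. */
typedef int (*feed_fn)(void *ctx, const char *data, size_t len);

/* Stub context that records the calls it sees (illustration only). */
typedef struct {
  int calls;      /* total feed calls */
  int null_calls; /* feed calls made with data == NULL */
  size_t bytes;   /* total bytes fed */
} recorder_t;

static int recording_feed(void *ctx, const char *data, size_t len) {
  recorder_t *rec = (recorder_t *) ctx;
  rec->calls++;
  if (data == NULL) rec->null_calls++;
  rec->bytes += len;
  return 0; /* always report success in this stub */
}

/* Feed buf in two pieces split at `split`, then issue the final
 * NULL/0 call from step 2. Returns the first nonzero result,
 * or the result of the final call if the earlier ones succeeded. */
static int feed_split_then_flush(feed_fn feed, void *ctx,
                                 const char *buf, size_t len, size_t split) {
  int rc;
  assert(feed != NULL);
  if (split > len) split = len;
  if (split > 0 && (rc = feed(ctx, buf, split)) != 0) return rc;
  if (split < len && (rc = feed(ctx, buf + split, len - split)) != 0) return rc;
  return feed(ctx, NULL, 0);
}
```

A caller would loop `split` from `0` to `len` over the response bytes, using a freshly initialized parser for each iteration so that state from one split does not leak into the next.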

(Happy to put together a failing spec against the gem's existing test suite if a maintainer confirms the fix direction.)

## Why we're filing instead of PR-ing directly

We hit this in production at Qonto via LaunchDarkly's Ruby SDK → `ld-eventsource` → `http` gem → `llhttp`. Our local resolution is to accept the transient failure (Sidekiq retry + K8s pod restart handles it for us), but the bug is general enough that other Ruby projects using the `http` gem on long-lived connections are very likely hitting it too. We are filing here so the broader community benefits, and so we can drop our internal "accept and move on" note once an `llhttp 0.6.2` ships.

If you'd like a PR with the NULL guard + a regression spec, happy to send one.

— Romain (Qonto, App Systems backend)

