Run the RBS parser on JRuby via WebAssembly (JRuby support, step 3) #3000

Open
soutaro wants to merge 3 commits into claude/practical-mendel-hnqws8-2 from claude/practical-mendel-hnqws8-3

soutaro commented Jun 16, 2026

Member

Stacked on #2999 (step 2), which is stacked on #2998 (step 1). Review/merge those first; this PR's diff is against #2999.

This is the payoff: RBS runs on JRuby, with the parser in WebAssembly and the AST rebuilt in pure Ruby. lib/rbs.rb branches on RUBY_ENGINE, so CRuby is completely unaffected.
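As a rough illustration of the engine branch, the selection could look like the sketch below. The helper name `parser_backend_for` and the returned symbols are hypothetical and only stand in for whatever `lib/rbs.rb` actually requires on each engine.

```ruby
# Hypothetical helper: decides which parser backend applies to an engine.
# The name and the symbols are illustrative, not taken from this PR.
def parser_backend_for(engine = RUBY_ENGINE)
  case engine
  when "jruby"
    :wasm         # parser runs in WebAssembly, AST is rebuilt in Ruby
  else
    :c_extension  # CRuby keeps loading the native extension
  end
end

puts parser_backend_for("jruby")  # prints "wasm"
puts parser_backend_for("ruby")   # prints "c_extension"
```

Keeping the decision in one place means every engine other than JRuby follows the existing code path untouched.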

JRuby ── Chicory (pure-JVM Wasm) ── rbs_parser.wasm ── serialize ─┐
                                                                  │ identical bytes
RBS::WASM::Deserializer ◄── error blob / serialized AST ◄─────────┘

What's here

WebAssembly ABI (wasm/rbs_wasm.c): rbs_wasm_parse_signature / _parse_type / _parse_method_type parse a character range of a source buffer and leave the serialized AST — or, on a parse error, an error blob (positions + token type + message) — in linear memory, read back via rbs_wasm_result_ptr/_len.
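The pointer-plus-length readback can be sketched with a plain binary string standing in for linear memory. Everything about the blob layout here (three little-endian uint32 fields followed by the message bytes) is an assumption made for illustration; the PR does not spell out the actual layout, and `read_result` and `decode_error` are hypothetical names.

```ruby
# Stand-in for Wasm linear memory: one binary string. A real host would read
# the same byte range through the Wasm runtime's memory API instead.
memory = ("\x00" * 16).b

# ASSUMED blob layout (not from the PR): start position, end position and
# token type as little-endian uint32, followed by the message bytes.
blob = [3, 7, 42].pack("V3") + "unexpected token".b
result_ptr = memory.bytesize
memory << blob
result_len = blob.bytesize

# Read the result the way a host would: slice the range [ptr, ptr + len).
def read_result(memory, ptr, len)
  memory.byteslice(ptr, len)
end

# Decode the assumed layout into a hash.
def decode_error(bytes)
  start_pos, end_pos, token_type = bytes.unpack("V3")
  message = bytes.byteslice(12, bytes.bytesize - 12).force_encoding(Encoding::UTF_8)
  { start: start_pos, end: end_pos, token_type: token_type, message: message }
end

error = decode_error(read_result(memory, result_ptr, result_len))
puts error[:message]
```

The same readback serves both outcomes: the host slices the bytes once, then hands them either to the deserializer or to the error decoder.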

Ruby side (lib/rbs/wasm, loaded only on JRuby):

  • Runtime — loads rbs_parser.wasm into Chicory, wires up WASI, and drives the parse functions. Chicory is pure Java, so there's no native dependency — only the .wasm and the jars ship.
  • Parser — implements RBS::Parser._parse_signature/_parse_type/_parse_method_type on top of Runtime + the step-2 Deserializer, raising RBS::ParsingError just like the C extension. (_lex, _parse_type_params, and the inline-annotation entries raise NotImplementedError for now — follow-up work.)
  • Location — a pure-Ruby implementation of the primitives behind RBS::Location (the C extension's legacy_location.c), so rbs/location_aux.rb works unchanged.

Packaging & CI:

  • rake wasm:jruby_setup assembles lib/rbs/wasm/ (the .wasm plus the Chicory jars from Maven Central). The gemspec ships these in the java-platform gem and skips the C extension there.
  • A JRuby CI job assembles the runtime and runs test/rbs/wasm/jruby_parser_test.rb (which parses the whole bundled corpus + checks types, method types, variables, and error handling).

Validation

Verified locally with JRuby 10.0.6.0 + Chicory 1.7.5 that JRuby and CRuby produce byte-identical ASTs across the entire bundled corpus: a SHA-256 over the JSON of every declaration in core + stdlib + sig (342 files) matches exactly between the two engines. parse_type/parse_method_type (including type variables) and ParsingError reporting were also verified.
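A digest of this kind can be sketched as follows. `corpus_digest` is a hypothetical helper, and the sample hashes stand in for parsed declarations; the actual script used for the check is not part of this page.

```ruby
require "digest"
require "json"

# Hypothetical helper: folds the JSON of each declaration into one SHA-256.
# `declarations` is any enumerable whose elements respond to #to_json; in the
# real check these would be the declarations parsed from each signature file.
def corpus_digest(declarations)
  digest = Digest::SHA256.new
  declarations.each do |decl|
    digest << decl.to_json
    digest << "\n" # separator so adjacent documents cannot run together
  end
  digest.hexdigest
end

sample = [{ "name" => "String" }, { "name" => "Integer" }]
puts corpus_digest(sample)
```

Running the same script on both engines and comparing the two hex strings is enough to detect any difference in the serialized ASTs, provided both runs visit the files in the same order.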

Scope / follow-ups

  • _lex, _parse_type_params, and inline Ruby annotations aren't wired through the WASM ABI yet — they raise a clear NotImplementedError on JRuby. The main signature-loading path is complete.
  • Encoding is fixed to UTF-8 in the WASM parser (fine for UTF-8/ASCII RBS, which is the convention).

https://claude.ai/code/session_01LTveMt3NLbYHEboXuzAKpA


Generated by Claude Code

claude added 3 commits June 16, 2026 15:09
JRuby cannot load the MRI C extension, so on JRuby RBS now runs the parser
inside WebAssembly (Chicory, a pure-Java runtime) and rebuilds the AST in
pure Ruby. `lib/rbs.rb` branches on RUBY_ENGINE.

WebAssembly ABI (wasm/rbs_wasm.c):
- rbs_wasm_parse_signature / _parse_type / _parse_method_type parse a
  character range of a source buffer and leave the serialized AST (or, on a
  parse error, an error blob) in linear memory for the host to read via
  rbs_wasm_result_ptr / _len.

Ruby side (lib/rbs/wasm, loaded only on JRuby):
- Runtime loads rbs_parser.wasm into Chicory, wires up WASI, and drives the
  parse functions.
- Parser implements RBS::Parser._parse_signature/_parse_type/
  _parse_method_type on top of the runtime and RBS::WASM::Deserializer,
  raising RBS::ParsingError on failure just like the C extension. _lex,
  _parse_type_params and the inline annotation entries are not supported yet.
- Location is a pure-Ruby implementation of the primitives behind
  RBS::Location (the C extension's legacy_location.c), so rbs/location_aux.rb
  works unchanged.

Packaging and CI:
- `rake wasm:jruby_setup` assembles lib/rbs/wasm/ (the .wasm plus the Chicory
  jars from Maven Central); the gemspec ships them in the `java` platform gem
  and skips the C extension there.
- A JRuby CI job parses the whole bundled corpus and runs
  test/rbs/wasm/jruby_parser_test.rb.

Verified that JRuby and CRuby produce byte-identical ASTs across the entire
bundled corpus (core + stdlib + sig).

https://claude.ai/code/session_01LTveMt3NLbYHEboXuzAKpA

Add `omit_on_jruby!` (class- and instance-level), mirroring
`omit_on_truffle_ruby!`, for tests that depend on the C extension or on parser
features not yet wired through the WebAssembly bridge.

- parser_test: omit `test__lex` and `test_parse_type_params` on JRuby (those
  primitives raise NotImplementedError there for now).
- serialization_test: omit the class on JRuby; its round-trip is driven by the
  C extension's `_parse_*_to_bytes`.
- jruby_parser_test: qualify Test::Unit as ::Test::Unit so the file also loads
  under the full suite, where RBS::Test would otherwise shadow it.

https://claude.ai/code/session_01LTveMt3NLbYHEboXuzAKpA
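
An instance-level helper of the kind this commit describes might look like the sketch below. It assumes the including class provides `omit(message)`, as test-unit test cases do; the module name and the `engine:` keyword are hypothetical and exist only to make the sketch easy to exercise. The class-level variant is not shown.

```ruby
# Hypothetical sketch of an engine-specific omit helper.
# Assumes the host class defines `omit(message)` (test-unit provides one).
module OmitOnJRuby
  def omit_on_jruby!(message = "not supported on JRuby yet", engine: RUBY_ENGINE)
    # Only skip on JRuby; every other engine runs the test normally.
    omit(message) if engine == "jruby"
  end
end
```

A test would call `omit_on_jruby!` as its first statement so that nothing depending on the C extension runs before the omission takes effect.
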
The WASM parser was hardcoded to UTF-8. Pass the buffer's Ruby encoding name
through the ABI and resolve it with rbs_encoding_find (falling back to UTF-8),
so non-UTF-8 sources (EUC-JP, Windows-31J, ...) lex correctly — matching the C
extension, which uses the buffer's encoding.

Verified that an EUC-JP signature produces byte-identical locations and a
correctly-encoded comment string on JRuby and CRuby, and that the UTF-8 corpus
digest is unchanged.

https://claude.ai/code/session_01LTveMt3NLbYHEboXuzAKpA
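
The name-based lookup with a UTF-8 fallback can be mirrored in Ruby for illustration. In the PR the lookup happens inside the Wasm module via `rbs_encoding_find`; `resolve_encoding` below is a hypothetical Ruby analogue, not code from this change.

```ruby
# Hypothetical Ruby mirror of "look up by name, fall back to UTF-8".
def resolve_encoding(name)
  Encoding.find(name)
rescue ArgumentError
  Encoding::UTF_8 # unknown names fall back to UTF-8
end

# The host side only needs the buffer's encoding name to pass along.
source = "class Foo\nend\n".encode(Encoding::EUC_JP)
puts source.encoding.name                   # prints "EUC-JP"
puts resolve_encoding(source.encoding.name) # prints "EUC-JP"
puts resolve_encoding("no-such-encoding")   # prints "UTF-8"
```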