[Rust] Improve Object API codegen: idiomatic types, Eq/Hash derives, String shadowing fix by pittengermdp · Pull Request #8931 · google/flatbuffers

pittengermdp · 2026-02-19T14:29:05Z

Summary

Use idiomatic short type names (String, Vec, Box) instead of fully-qualified alloc:: paths in generated Rust Object API code, with use alloc::* imports added for no_std compatibility
Derive Eq and Hash on Object API types (tables, structs, unions, bitflags) when no floating-point fields are present in the type hierarchy
Fix String type shadowing: when a schema defines a table named String, generated code now emits std::string::String for stdlib string fields (wires up existing unused NamespaceShadowsString/ObjectStringType helpers)
Add #[allow(unused_imports)] to suppress warnings on generated imports
Fix serde Deserialize impl for enums to use fully-qualified std::string::String

Test plan

C++ tests pass (flattests)
Rust tests pass (usage, no_std, serde, reflection, outdir, clippy)
Go tests pass (TestAll, TestTextParsing, benchmarks)
Python tests pass (multi-file, one-file, flexbuffers)

google-cla · 2026-02-19T14:29:11Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

csmulhern · 2026-02-20T02:53:54Z

The use of fully qualified names for Vec, String, etc, was done intentionally in #8564. The point is to deal with the situation you cover in bullet 3. If you were to name a struct / table Vec or String without using fully-qualified path names in the generated code when referencing the stdlib types, then your code generation would be broken. The use of fully-qualified paths for the stdlib types is the most robust and precise approach.

This can't be done for alloc because of how alloc works, but all current uses should be using alloc for no_std compatibility.

…2.19 Rebuilds the fork cleanly on google/flatbuffers master (81edeb1) instead of the broken 2016-rooted lineage. Replays all hand-written source (generators, runtimes, schemas, hand-written tests) via 3-way merge; regenerable artifacts will be rebuilt in a follow-up commit via scripts/generate_code.py. Source features preserved: Rust Object API (Eq/Hash, OrderedFloat, serde, bare Box, rustdoc/#[must_use]), --bfbs-gen-embed, flatc native map<K,V>/set<V> syntax + JSON/FBS round-trip, Go/TS/Rust reflection + verifier runtimes, and map/set + cross-ns tests. Conflicts resolved (4): idl_gen_rust.cpp (54 hunks, upstream reindent + fork features), idl_gen_fbs.cpp (map/set re-emit + upstream string-escaping), builder.rs (upstream ? operator), test.cpp (kept both upstream CrossNamespacePackTest and fork map/set tests). Verified: flatc builds and links cleanly (v25.12.19).

…t Object-API Regenerates all generated code (Rust/Go/TS/C++/samples/grpc) with the merged flatc (v25.12.19), and completes the fork's OrderedFloat-in-Object-API feature so the full Rust suite compiles and passes. Generator fixes (src/idl_gen_rust.cpp): - vector-of-float unpack: OrderedFloat(*v) -> OrderedFloat(v) (.iter() yields owned f64) - optional scalar float: accessor .map(OrderedFloat); pack .map(|v| v.into_inner()) - struct array-of-float: array_init(|i| OrderedFloat(field.get(i))) (Array isn't indexable) - struct array float Default: [OrderedFloat(0.0); N] - AllFieldsHaveNaturalDefault: string fields with a schema default now force a manual Default impl (was silently emitting empty string) Rust tests adapted to the OrderedFloat API (no assertions weakened): optional_scalars_test, arrays_test, integration_test. Verified: flatc builds; Rust suite green (276 integration + 21 arrays + 13 optional + samples); Go runtime builds clean; TS runtime typechecks clean. Note: js/mjs compiled output follows upstream (gitignored); re-add force-commit + .npmignore if the frontend needs GitHub npm-install artifacts.

… behind flag (google#5) google#3 (idl_gen_cpp.cpp): generated C++ static_assert now pins only MAJOR.MINOR, not REVISION, so patch-version bumps stop churning every generated header. google#5 (idl.h, flatc.cpp, idl_gen_rust.cpp): OrderedFloat-in-Object-API is now opt-in via --rust-object-api-hashable-floats (default OFF). Default codegen emits plain f32/f64 (upstream-compatible ergonomics, no .into_inner() tax); the flag re-enables OrderedFloat wrapping for Eq/Hash on float-bearing *T types. TypeIsHashable made transitive (nested structs/unions/vectors) so Eq/Hash derives are sound in both modes. Restored upstream's optional_scalars/arrays/integration Rust tests (plain-float API). Verified: cargo test green (276+21+13+2+samples); default corpus has zero OrderedFloat; flag ON re-emits OrderedFloat (12 hits on optional_scalars).

generate_code.py now runs each language's canonical formatter over the generated corpus after codegen: clang-format (C++), rustfmt (Rust), gofmt (Go), prettier (TS), and black (Python, if installed). Each is optional and skipped with a note when the tool is absent, so the script still runs in minimal environments. This makes regenerations idempotent and eliminates whitespace-only churn — the root cause of the 54-conflict generator merge in this branch's history. Applied once here: clang-format 27, rustfmt 90, gofmt 55, prettier 167 files. Verified: Rust suite still green after reformat (formatters are semantics-preserving).

…google#1) Upstream already runs a regen-diff gate (scripts/check_generate_code.py in the 'Check Generated Code' job), which reruns generate_code.py and fails on any diff. Since generate_code.py now formats generated code (google#2), that gate is only deterministic when the same formatters are present. Changes: - build.yml 'Check Generated Code' job now sets up Go (gofmt), Rust (rustfmt), and Node + pnpm@10 + pnpm i (prettier) before the check, so regen output matches. - Pinned prettier to 3.8.4 (package.json + pnpm-lock.yaml) for cross-env determinism. - Dropped clang-format from the format step: its output varies sharply across major versions (local 22 vs CI 18), which would make the gate flaky. flatc already emits clean C++, so generated headers revert to flatc's deterministic output. Verified: two consecutive regens produce byte-identical output (0 unstaged diff), so the gate passes. gofmt/rustfmt confirmed stable (stable==nightly rustfmt output).

Scheduled GitHub Actions workflow (1st of each month + manual dispatch) that fetches google/flatbuffers master, merges it into a dated upstream-sync/ branch, and opens a PR (draft if conflicts) so CI runs and a human approves. Keeps the fork from ever drifting to a stale merge-base again — the root cause of this branch's painful rebuild.

Five verified correctness fixes across the generators and runtime: - rust runtime: Verifier::reset() now clears apparent_size instead of assigning num_tables twice, so a reused Verifier no longer inherits the previous buffer's apparent size. - rust serde: enums serialize unknown discriminants as their numeric value instead of variant_name().unwrap() (forward-compat); union serialize skips unknown variants instead of unimplemented!(). - go: nested struct fields are nil-guarded before Pack flattens them into CreateX(...), preventing a nil dereference on hand-built *T objects. - ts: map/set pack sorts keys with a relational comparator, fixing bigint key precision loss above 2^53 that corrupted the sorted-vector order. - rust: map/set float keys sort via total_cmp (total order including NaN). Also repairs rust_serialize_test, which did not compile: - serde dependency enables the `derive` feature (the generated code uses #[derive(serde::Serialize, serde::Deserialize)]). - enum serialize uses `self.0 as u32` so signed/64-bit underlying types (byte/long) compile. - bit-flags enums emit a serde Deserialize impl and serialize as u64, so *T structs containing bit-flags fields can derive Deserialize. - builder test asserts the preallocated vecs don't reallocate (capacity stability) rather than zero global allocations, which is unsatisfiable now that strings_pool is a HashMap<String, _> with owned String keys. Regenerated Go and Rust serde corpus accordingly.

…able-floats - build-generator-windows now installs Go/Rust/Node+pnpm before generating, so format_generated_code() actually runs. Previously the Windows job regenerated unformatted output and diffed it against the formatted (Linux) committed corpus, making the regen-diff gate silently non-deterministic on Windows. - Add compile + behaviour coverage for --rust-object-api-hashable-floats: generate_code.py now emits an OrderedFloat-wrapped optional_scalars variant, and optional_scalars_ordered_floats_test.rs proves the *T struct derives Eq/Hash and round-trips. The flag was previously exercised by no test, so it could silently break on an upstream merge. - Fix a pre-existing no_std build regression: the fork's flatbuffers::HashMap / HashSet aliases used std::collections unconditionally under #![no_std]. Gate them behind the `std` feature (map/set Object API fields require std). - generate_code.py: drop the dead `import glob` (shadowed by the local glob() helper) and the redundant inner `import shutil`.

64-bit (int64/uint64) field defaults and accessor fallbacks now use native bigint literals (e.g. 0n, 42n) instead of BigInt('0') / BigInt(0) calls. The values are identical at runtime; the literal form is a native expression rather than a runtime constructor call and reads cleaner. Runtime BigInt(expr) conversions (operand is a variable) are left unchanged. All consuming tsconfigs target ES2020, which supports bigint literals (verified via tsc --noEmit). Part of the non-breaking TypeScript improvements. Two other candidates were intentionally not applied: - readonly bb/bb_pos: not feasible — __init() assigns them outside the constructor, which TypeScript's readonly forbids; forcing it would require a breaking constructor-signature change. - field-name/type-name clash (idl_gen_ts.cpp FIXME): latent only — no schema in the test corpus triggers it, so a fix would be unverifiable defensive code. The breaking AnyT discriminated-union *T field retyping was deferred (per review) to keep the generated Object API stable for downstream consumers.

…constants - KeyCompare and LookupByKey now declare their temporary accessors as `var obj T` instead of `obj := &T{}`. No pointer escapes either function, so Go keeps them on the stack — removing two heap allocations per comparator call during a sort, and one per step of every O(log n) keyed binary search. - Generate exported `<Type>Field<Field>` string constants (value = the schema field name) and point UnpackFields' doc at them. Callers now select fields with compile-checked constants (e.g. MonsterFieldHp) instead of guessing raw schema-name strings; a misspelled name fails to compile rather than silently populating nothing. Non-breaking: existing string callers still work because each constant's value is the schema name UnpackFields already matches on. Regenerated the Go corpus; Go tests pass.

Run rustfmt over the fork's additions to the flatbuffers runtime crate so it is `cargo fmt`-clean: the shared-string pool binary search in builder.rs and the map/set HashMap/HashSet type aliases in lib.rs were authored with manual line breaks that rustfmt collapses. No behavioural change. Keeps the first-party library crates clean so a per-crate `cargo fmt --check` (pre-push) passes.

The 25.12.19 replay lost the root go.mod (module github.com/google/flatbuffers, go 1.22). Without it Go compiles the runtime at go1.16, so the reflect_*.go files' use of the predeclared 'any' identifier fails to build for downstream Go consumers. Restore it at go 1.24 (also satisfies testing.Loop in verifier_bench_test.go). Validated: go build ./go/ and go vet ./go/ both pass.

The upstream-25.12.19 replay dropped the committed js/mjs build output that the fork ships for direct github-npm consumption (v0.3.0 shipped it; v0.4.0 did not), so package.json main/module pointed at nonexistent files and liftNet_frontend's 'import flatbuffers' failed to resolve. Rebuild via npm run compile and force-add js/ + mjs/ (both are .gitignored, matching the fork's prior practice).

pittengermdp requested a review from dbaileychess as a code owner February 19, 2026 14:29

github-actions Bot added python c++ rust codegen Involving generating code from schema labels Feb 19, 2026

jtdavis777 added the waiting-for-update This PR is waiting for a change from the author or contributors before it is ready for merge label Mar 5, 2026

pittengermdp force-pushed the rust-object-api-improvements branch from 819b32e to 65e4333 Compare March 18, 2026 05:26

github-actions Bot added golang java c# javascript grpc typescript swift documentation Documentation lua dart lobster php json kotlin CI Continuous Integration nim labels Mar 18, 2026

pittengermdp added 5 commits June 25, 2026 15:40

pittengermdp force-pushed the rust-object-api-improvements branch from 41f3b73 to 8c84ab9 Compare June 25, 2026 22:22

pittengermdp added 7 commits June 25, 2026 18:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Rust] Improve Object API codegen: idiomatic types, Eq/Hash derives, String shadowing fix#8931

[Rust] Improve Object API codegen: idiomatic types, Eq/Hash derives, String shadowing fix#8931
pittengermdp wants to merge 13 commits into
google:masterfrom
pittengermdp:rust-object-api-improvements

pittengermdp commented Feb 19, 2026

Uh oh!

google-cla Bot commented Feb 19, 2026

Uh oh!

csmulhern commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

pittengermdp commented Feb 19, 2026

Summary

Test plan

Uh oh!

google-cla Bot commented Feb 19, 2026

Uh oh!

csmulhern commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants