Add Wireshark Lua dissector backend#264
Draft
AaronWebster wants to merge 1 commit into
Adds a parallel back end at compiler/back_end/lua/ that turns an Emboss
.emb into a runnable Wireshark Lua dissector. Mirrors the C++ backend's
shape: a py_binary driver, a starlark rule (lua_emboss_library) exposed
from the root build_defs.bzl, a (wireshark)-qualified attribute set, and
golden tests parallel to cpp_golden_test.
Generator highlights:
* One Proto per .emb, one local function per struct/bits, one value
strings table per enum.
* Nested structs dissected via forward-declared dispatch.
* Bit-addressable (`bits`) blocks emitted as masked ProtoFields against
a single container read.
* `--` doc comments become the ProtoField description; `#` hash
comments are ignored.
* Endianness honored via `subtree:add` vs `subtree:add_le`.
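The masked-field approach for `bits` blocks can be sketched in a few lines of Python. This is only an illustration, assuming Emboss's convention that bit 0 is the least significant bit of the container; `field_mask` and `lua_protofield` are hypothetical names, not the generator's actual functions, and the emitted line relies on the `mask` argument of Wireshark's `ProtoField.uintN` constructors.

```python
def field_mask(bit_offset: int, bit_width: int, container_bits: int) -> int:
    """Return the mask selecting bits [bit_offset, bit_offset + bit_width)."""
    if bit_width <= 0 or bit_offset < 0 or bit_offset + bit_width > container_bits:
        raise ValueError("field does not fit in container")
    return ((1 << bit_width) - 1) << bit_offset


def lua_protofield(var: str, abbr: str, name: str, bit_offset: int,
                   bit_width: int, container_bits: int) -> str:
    """Render one masked ProtoField declaration as a line of Lua (hypothetical layout)."""
    mask = field_mask(bit_offset, bit_width, container_bits)
    hex_digits = container_bits // 4  # one hex digit per 4 bits of container
    return (f'local {var} = ProtoField.uint{container_bits}('
            f'"{abbr}", "{name}", base.DEC, nil, 0x{mask:0{hex_digits}X})')
```

Because every field in the block carries its own mask, the dissector can read the container once and let Wireshark extract each sub-field from the same bytes.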
Module-level attributes:
* `[(wireshark) protocol: "name"]`: name of the generated Proto
* `[(wireshark) root: "Struct"]`: which struct dispatches the top
* `[(wireshark) register_on: "..."]`: Wireshark-display-filter-style
  string of `<table> == <pattern>` terms separated by `or` / `||`.
  Each term becomes a `DissectorTable.get(...):add(...)` call so
  Wireshark routes packets from Ethernet/IP/UDP/TCP layers into the
  generated dissector.
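One way the `register_on` handling described above might work is sketched below in Python. The function names, the regular expression, and the error handling are assumptions made for illustration; only the term syntax and the shape of the emitted `DissectorTable.get(...):add(...)` call come from this description.

```python
import re

# One term: <table> == <decimal or 0x-hex integer>
_TERM = re.compile(r"^\s*([A-Za-z_][\w.]*)\s*==\s*(0[xX][0-9A-Fa-f]+|\d+)\s*$")


def parse_register_on(value: str) -> list[tuple[str, int]]:
    """Split a register_on string into (table, pattern) pairs."""
    # Terms are separated by the word `or` or by `||`.
    terms = re.split(r"\s+or\s+|\s*\|\|\s*", value.strip())
    pairs = []
    for term in terms:
        match = _TERM.match(term)
        if match is None:
            raise ValueError(f"bad register_on term: {term!r}")
        table, number = match.groups()
        base = 16 if number.lower().startswith("0x") else 10
        pairs.append((table, int(number, base)))
    return pairs


def lua_registrations(value: str, proto_var: str) -> list[str]:
    """Render one DissectorTable registration line of Lua per term."""
    return [f'DissectorTable.get("{table}"):add({pattern}, {proto_var})'
            for table, pattern in parse_register_on(value)]
```

For example, `lua_registrations("udp.port == 5000 || ethertype == 0x88B5", "proto")` yields two registration lines, one for each dissector table.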
Struct- and field-level:
* `[(wireshark) filter: "name"]` overrides the auto-generated
Wireshark filter-name segment.
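As a rough sketch of how a `filter` override could slot into filter-name composition: the Python below joins one segment per nesting level and prefers an explicit override where one is given. The sanitization rule (lowercase, non-alphanumeric runs collapsed to `_`) and both function names are assumptions for illustration; the generator's real rules may differ.

```python
import re
from typing import Optional


def sanitize_segment(name: str) -> str:
    """Lowercase a name and replace runs of disallowed characters with '_'."""
    return re.sub(r"[^a-z0-9_]+", "_", name.lower()).strip("_")


def compose_filter(protocol: str,
                   path: list[tuple[str, Optional[str]]]) -> str:
    """Join protocol and per-level segments with '.', preferring overrides.

    Each path entry is (emboss_name, override); override is the value of a
    `[(wireshark) filter: "..."]` attribute, or None when absent.
    """
    segments = [sanitize_segment(protocol)]
    for emboss_name, override in path:
        segments.append(override if override is not None
                        else sanitize_segment(emboss_name))
    return ".".join(segments)
```

With these assumed rules, `compose_filter("MyProto", [("Header", None), ("message_id", "id")])` gives `myproto.header.id`: the struct keeps its auto-generated segment while the field uses its override.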
Plumbing:
* New `emboss_lua_library` macro + `lua_emboss_library` rule + aspect
in the root build_defs.bzl, modelled on cc_emboss_library.
* `embossc --generate lua` (in addition to the existing `cc`).
* scripts/regenerate_goldens.py also refreshes the Lua goldens.
Tests:
* compiler/back_end/lua/dissector_generator_test.py — 27 unit tests
covering identifier sanitization, integer-width mapping, register_on
parsing, enum value-strings emission, filter composition, doc-text
extraction, attribute validation, root-struct selection, and nested
struct dispatch.
* lua_golden_test targets in compiler/back_end/lua/BUILD covering
enum, nested_structure, uint_sizes, int_sizes, and the new
wireshark.emb fixture.
Summary
Adds a parallel back end at `compiler/back_end/lua/` that turns an
Emboss `.emb` definition into a runnable Wireshark Lua dissector.
Mirrors the C++ backend's shape (driver, starlark rule, golden tests)
so the new backend is invoked exactly like its C++ sibling:
How layered protocols work
Wireshark already dissects Ethernet → IP → UDP/TCP using its built-in
dissectors. The user only needs to define their payload in `.emb`;
declaring `[(wireshark) register_on: "..."]` plugs the generated
dissector into the correct Wireshark dissector table at load time.
The `register_on` value uses Wireshark-display-filter syntax: one or
more `<table> == <integer>` terms joined by `or` / `||`, with decimal or
`0x`-hex patterns. Each term becomes a
`DissectorTable.get("<table>"):add(<pattern>, <proto>)` call.
Generator features
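* One `Proto` per `.emb`, one `local function` per `struct`/`bits`, one
  value-strings table per `enum`.
* Nested structs dissected via forward-declared dispatch, so any
  reference order works.
* Bit-addressable (`bits`) blocks emitted as masked `ProtoField`s
  against a single container read.
* `--` doc comments become each `ProtoField`'s description; `#`
  comments are ignored.
* Endianness honored (`subtree:add` vs `subtree:add_le`).
* Struct- and field-level `[(wireshark) filter: "..."]` attributes
  override the auto-generated filter-name segment.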
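The per-enum value-strings table mentioned in the list above could be emitted along these lines. This Python sketch is an assumption about the output shape (a Lua table mapping each numeric value to its name, which is the form Wireshark's `ProtoField` constructors accept as a value string); `lua_value_strings` and the table naming are hypothetical.

```python
def lua_value_strings(table_name: str, values: dict[str, int]) -> str:
    """Render an enum as a Lua table mapping numeric value to name."""
    lines = [f"local {table_name} = {{"]
    # Sort by numeric value so the generated output is stable.
    for name, number in sorted(values.items(), key=lambda item: item[1]):
        lines.append(f'  [{number}] = "{name}",')
    lines.append("}")
    return "\n".join(lines)
```

Sorting by value keeps the generated Lua deterministic, which matters when the output is compared against golden files.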
Explicit non-goals (for this initial cut)
The generator emits valid Lua even when it can't fully describe a
field: it emits a `-- skipped …` comment and moves on. Future work
can extend coverage:
* Conditional fields (`if`).
* `let`/virtual fields.
Tests
* `compiler/back_end/lua/dissector_generator_test.py`: 27 unit tests
  covering sanitization, integer-width mapping, `register_on` parsing,
  enum tables, filter composition, doc extraction, attribute
  validation, root-struct selection, and nested-struct dispatch.
* `lua_golden_test` targets in `compiler/back_end/lua/BUILD` for
  `enum.emb`, `nested_structure.emb`, `uint_sizes.emb`, `int_sizes.emb`,
  and the new `testdata/wireshark.emb` fixture.
* `scripts/regenerate_goldens.py` also refreshes the Lua goldens.
Test plan
* `bazel test //compiler/back_end/lua:dissector_generator_test`
* `bazel test //compiler/back_end/lua/...` (golden tests)
* `bazel build //testdata:wireshark_lua_emboss`
* Load `bazel-bin/testdata/wireshark.emb.lua` into Wireshark via
  `wireshark -X lua_script:...` and feed a synthetic packet.