Background β what is expression_types?
The LSP keeps a side table called expression_types in ndc_lsp/src/features/inlay_hints.rs:
pub expression_types: HashMap<usize, StaticType>,
It maps byte offsets in the source file to the static type of an expression that ends at that offset. Dot-completion uses it: when the user types expr., the LSP looks up the byte just after expr and asks "what's the type sitting here?" so it can suggest methods that accept that type. See completion.rs:22:
s.expression_types.get(&dot_offset).cloned().or_else(...)
The map is built by HintCollector::on_expression, which is called by walk_expression in ndc_lsp/src/visitor.rs as it walks every node in the AST. For each node, it inserts (node.span.end() β node.type).
The map's implicit invariant should be: at any byte offset, the value is the type of the largest enclosing expression that ends at that offset (because that's what the user means when they type . right after it).
The problem
Two patterns in the AST cause multiple expressions to end at the same byte:
1. Binary operator calls
For 1 < 2, the parser lowers < to a function call (parser.rs:196):
let new_span = left.span.merge(right.span);
// ...
Expression::Call { function: <op>, arguments: vec![left, right] }.to_location(new_span);
So the outer Call's span ends at the same byte as its right argument. Three expressions all end at byte 5 in 1 < 2:
Call(<, [1, 2]) β type Bool
- the right argument
2 β type Int
- (and the
Statement wrapper, see below)
2. The Statement wrapper
Top-level expressions are wrapped in Expression::Statement, which copies the inner expression's span verbatim (expression.rs:235-241):
pub fn to_statement(self) -> Self {
Self { id: NodeId::next(), span: self.span, expression: Expression::Statement(Box::new(self)) }
}
So Statement(Call(<, [1, 2])) has the same span.end() as the inner Call, with a different type: () (unit), the statement type.
What actually happens for 1 < 2;
walk_expression walks parent-before-children (pre-order) and inserts with HashMap::insert (last-writer-wins). Trace:
on_expression(Statement) β inserts unit at byte 5.
- recurse into
Call. on_expression(Call) β overwrites byte 5 with Bool.
- recurse into arguments.
on_expression(<literal 2>) β overwrites byte 5 with Int.
Final map: {1: Int, 5: Int}. The expected invariant says byte 5 should hold Bool (the Call).
Why no user-facing bug today
To actually trigger dot-completion on a binary operator result, you have to write something like (1 < 2).method. The parens are mandatory because 1 < 2.method would lex as 1 < (2.0) (2. starts a float literal). With parens, the outer node is Grouping, whose span includes the parens β so Grouping.span.end() is one byte after the inner Call.span.end(). No collision, no wrong lookup.
So today's only consumer (dot-completion) never queries the corrupted byte in any source a user can actually write. The map data is technically wrong, but no feature reads from the wrong slot.
It will start mattering as soon as any LSP feature queries expression_types more broadly β hover-on-expression is the obvious next one (e.g. hovering on 2 in 1 < 2 would surface Int instead of any contextual type), and any future feature that scans the map.
Why the obvious fix doesn't quite work
The natural fix is to switch insert to entry().or_insert_with(...) so the first writer wins. Since the visitor is pre-order, the largest enclosing expression gets there first.
But because Statement shares its inner's span, Statement(unit) would now win the byte-5 slot over the Call(Bool). Result: unit instead of Bool β still wrong, just wrong in a different way.
A correct fix needs both:
- Switch to
entry().or_insert_with(...) (first-writer-wins).
- Skip
Statement in on_expression (and audit any other "wrapper" expressions that copy their inner's span). Statement is a parse-time wrapper with no meaningful type of its own β its type is the inner's type for any consumer that cares.
Both pieces are needed: without (2), Statement steals the slot; without (1), the right operand steals it after the Call.
Reproduction
let info = collect_hints("1 < 2;");
assert_eq!(info.expression_types.get(&5), Some(&StaticType::Bool)); // fails today
(See the test pattern in ndc_lsp/src/features/inlay_hints.rs:107-177.)
Background
Flagged by Codex in review of #136. That PR added recursion into Call arguments to fix a separate inlay-hint bug, which exposed the latent map-overwrite issue. Pre-PR the map happened to read Bool at byte 5 β but only because the visitor was missing arms entirely and processing Call after Statement overwrote it, not because of any principled design.
Background β what is
expression_types?The LSP keeps a side table called
expression_typesinndc_lsp/src/features/inlay_hints.rs:It maps byte offsets in the source file to the static type of an expression that ends at that offset. Dot-completion uses it: when the user types
expr., the LSP looks up the byte just afterexprand asks "what's the type sitting here?" so it can suggest methods that accept that type. Seecompletion.rs:22:The map is built by
HintCollector::on_expression, which is called bywalk_expressioninndc_lsp/src/visitor.rsas it walks every node in the AST. For each node, it inserts(node.span.end() β node.type).The map's implicit invariant should be: at any byte offset, the value is the type of the largest enclosing expression that ends at that offset (because that's what the user means when they type
.right after it).The problem
Two patterns in the AST cause multiple expressions to end at the same byte:
1. Binary operator calls
For
1 < 2, the parser lowers<to a function call (parser.rs:196):So the outer
Call's span ends at the same byte as its right argument. Three expressions all end at byte 5 in1 < 2:Call(<, [1, 2])β typeBool2β typeIntStatementwrapper, see below)2. The
StatementwrapperTop-level expressions are wrapped in
Expression::Statement, which copies the inner expression's span verbatim (expression.rs:235-241):So
Statement(Call(<, [1, 2]))has the samespan.end()as the innerCall, with a different type:()(unit), the statement type.What actually happens for
1 < 2;walk_expressionwalks parent-before-children (pre-order) and inserts withHashMap::insert(last-writer-wins). Trace:on_expression(Statement)β insertsunitat byte 5.Call.on_expression(Call)β overwrites byte 5 withBool.on_expression(<literal 2>)β overwrites byte 5 withInt.Final map:
{1: Int, 5: Int}. The expected invariant says byte 5 should holdBool(theCall).Why no user-facing bug today
To actually trigger dot-completion on a binary operator result, you have to write something like
(1 < 2).method. The parens are mandatory because1 < 2.methodwould lex as1 < (2.0)(2.starts a float literal). With parens, the outer node isGrouping, whose span includes the parens β soGrouping.span.end()is one byte after the innerCall.span.end(). No collision, no wrong lookup.So today's only consumer (dot-completion) never queries the corrupted byte in any source a user can actually write. The map data is technically wrong, but no feature reads from the wrong slot.
It will start mattering as soon as any LSP feature queries
expression_typesmore broadly β hover-on-expression is the obvious next one (e.g. hovering on2in1 < 2would surfaceIntinstead of any contextual type), and any future feature that scans the map.Why the obvious fix doesn't quite work
The natural fix is to switch
inserttoentry().or_insert_with(...)so the first writer wins. Since the visitor is pre-order, the largest enclosing expression gets there first.But because
Statementshares its inner's span,Statement(unit)would now win the byte-5 slot over theCall(Bool). Result:unitinstead ofBoolβ still wrong, just wrong in a different way.A correct fix needs both:
entry().or_insert_with(...)(first-writer-wins).Statementinon_expression(and audit any other "wrapper" expressions that copy their inner's span). Statement is a parse-time wrapper with no meaningful type of its own β its type is the inner's type for any consumer that cares.Both pieces are needed: without (2), Statement steals the slot; without (1), the right operand steals it after the Call.
Reproduction
(See the test pattern in
ndc_lsp/src/features/inlay_hints.rs:107-177.)Background
Flagged by Codex in review of #136. That PR added recursion into
Callarguments to fix a separate inlay-hint bug, which exposed the latent map-overwrite issue. Pre-PR the map happened to readBoolat byte 5 β but only because the visitor was missing arms entirely and processingCallafterStatementoverwrote it, not because of any principled design.