Summary
Ship the tiktoken-backed token counter that the packaging already promises: a
counter implementing the existing firewall/token_counting.py seam, installed via
pip install weaver-kernel[tiktoken], so context budgets count real model tokens
instead of character/byte heuristics.
Why this matters
Budgets are the firewall's scalability promise to the LLM context window, and
context windows are measured in tokens. The extra is already declared in
pyproject.toml (tiktoken = ["tiktoken>=0.6"]) and the protocol seam exists with
a docstring naming tiktoken as the intended example — but installing the extra today
buys nothing, which is a small broken promise to anyone who reads the metadata.
Accurate counting makes budget downgrades fire at the right thresholds for real
deployments.
Current evidence
- pyproject.toml:68: tiktoken = ["tiktoken>=0.6"] is declared but unused anywhere in src/.
- firewall/token_counting.py:4: docstring says the module exists to plug "token counters (for example, a tiktoken-based one)" into the firewall.
- firewall/budgets.py and firewall/transform.py currently drive decisions off byte-size estimates (see ISSUE 37).
External context
tiktoken is the de-facto tokenizer for OpenAI-family models; Anthropic counting
differs — the design should name the encoding explicitly rather than guess per
model.
Proposed implementation
- Add firewall/token_counting_tiktoken.py (lazy import; helpful ImportError message naming the extra) implementing the existing counter protocol with a configurable encoding (default cl100k_base or o200k_base; decide and document).
- Wire selection: Firewall(token_counter=...) is an already-shaped seam; document how to construct a Firewall with the tiktoken counter.
- Cache encoder instances (expensive to build); counting must stay deterministic.
- Bare-install safety: the module must not import tiktoken at package import time
(pairs with ISSUE 19's no-extras CI job).
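The bullets above could be sketched roughly as follows. This is a minimal illustration, not the repository's code: the class name TiktokenCounter, the count method, the load parameter, UnknownEncodingError, and the error wording are all assumptions, since the actual protocol in firewall/token_counting.py is not shown here. The only tiktoken calls used are tiktoken.get_encoding and Encoding.encode_ordinary. A stand-in loader is used at the end so the sketch runs without the extra installed.

```python
from __future__ import annotations

import importlib
from typing import Any, Callable

# Hypothetical wording; the extra name is taken from the install command.
EXTRA_HINT = "tiktoken is not installed; run: pip install weaver-kernel[tiktoken]"


class UnknownEncodingError(ValueError):
    """Hypothetical typed error for an unrecognised encoding name."""


def load_tiktoken_encoding(name: str) -> Any:
    """Import tiktoken only when an encoder is first needed."""
    try:
        tiktoken = importlib.import_module("tiktoken")
    except ImportError as exc:
        raise ImportError(EXTRA_HINT) from exc
    try:
        # tiktoken.get_encoding raises ValueError for names it does not know.
        return tiktoken.get_encoding(name)
    except ValueError as exc:
        raise UnknownEncodingError(f"unknown tiktoken encoding: {name!r}") from exc


class TiktokenCounter:
    """Counts tokens with an explicitly named encoding (method name assumed)."""

    def __init__(
        self,
        encoding: str = "cl100k_base",
        load: Callable[[str], Any] = load_tiktoken_encoding,
    ) -> None:
        self.encoding = encoding
        self._load = load
        self._encoder: Any = None  # built once, on the first count() call

    def count(self, text: str) -> int:
        if self._encoder is None:
            self._encoder = self._load(self.encoding)
        # encode_ordinary treats special-token text as ordinary text,
        # so arbitrary input cannot trigger a special-token error.
        return len(self._encoder.encode_ordinary(text))


# Stand-in encoder so the sketch runs without tiktoken installed.
class _WhitespaceEncoder:
    def encode_ordinary(self, text: str) -> list[str]:
        return text.split()


loads: list[str] = []


def _stub_load(name: str) -> _WhitespaceEncoder:
    loads.append(name)
    return _WhitespaceEncoder()


counter = TiktokenCounter("o200k_base", load=_stub_load)
first = counter.count("budget the context window")
second = counter.count("budget the context window")
```

Because nothing at module level imports tiktoken, importing this module stays safe on a bare install; the ImportError only surfaces when a counter built with the default loader is first asked to count.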
AI-agent execution notes
- Inspect first: firewall/token_counting.py (protocol), firewall/transform.py and firewall/budgets.py (consumption), otel.py (pattern for optional-extra lazy import), tests/test_firewall.py.
- Edge cases: non-string data (count serialized form? document), very large strings (chunked counting), unknown encoding names (typed error).
- Follow the existing optional-dependency seam pattern exactly (mcp/otel precedents).
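For the non-string edge case, one option (among several the implementer might choose) is to count a canonical serialized form so that equal data always yields the same count. The helper name text_for_counting and the choice of compact, key-sorted JSON are assumptions made for illustration only.

```python
import json
from typing import Any


def text_for_counting(data: Any) -> str:
    """Return the text whose tokens would be counted for `data`."""
    if isinstance(data, str):
        return data
    # Sorted keys and fixed separators keep the output stable across runs;
    # default=str is a fallback for values json cannot encode natively.
    return json.dumps(
        data,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        default=str,
    )


same_a = text_for_counting({"b": 1, "a": 2})
same_b = text_for_counting({"a": 2, "b": 1})
```

Whatever form is chosen, documenting it matters more than the specific choice, since the count only means something relative to what is actually sent to the model.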
Acceptance criteria
- With the extra installed, budgets use real token counts (tested with known strings and expected counts).
- Without the extra, behavior is unchanged and importing the public API does not require tiktoken.
- Helpful error when explicitly requesting the tiktoken counter without the extra.
Test plan
Marked tests that skip without the extra; CI job (dev extras include it or add it);
bare-install job asserts no import-time dependency. Run make ci.
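The bare-install assertion could be approximated even on a machine that has the extra installed by blocking the import in a fresh interpreter. This is a sketch under assumptions: the helper name imports_without is hypothetical, and the package's import name (written as weaver_kernel in the comment) is a guess based on the distribution name.

```python
import subprocess
import sys


def imports_without(module: str, blocked: str) -> bool:
    """True if `module` imports in a fresh interpreter where `blocked` cannot be imported."""
    script = (
        "import sys, importlib\n"
        # A None entry in sys.modules makes any import of that name fail.
        f"sys.modules[{blocked!r}] = None\n"
        f"importlib.import_module({module!r})\n"
    )
    result = subprocess.run([sys.executable, "-c", script], capture_output=True)
    return result.returncode == 0


# In the real suite this might read:
#   assert imports_without("weaver_kernel", "tiktoken")  # import name assumed
stdlib_ok = imports_without("json", "tiktoken")
blocked_fails = imports_without("tiktoken", "tiktoken")
```

Tests that need real token counts can be skipped without the extra using pytest.importorskip("tiktoken") at the top of the test module.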
Documentation plan
docs/context_firewall.md budgets section; README extras table; CHANGELOG Added.
Migration and compatibility notes
Opt-in; defaults unchanged. Budget thresholds calibrated for byte counts may need
retuning when switching to token counts — document the difference.
Risks and tradeoffs
tiktoken is a heavyweight binary dependency (hence extra-only); encoding choice can
mislead for non-OpenAI models — explicit configuration and docs mitigate.
Suggested labels
ai, llm, product, performance