fix: reject unallocated AL postal-code prefixes (#118) #134
Merged
…ing (#118) The block resolver used a continuous bisect, so any 4-digit code >=1000 resolved to its enclosing district block — including codes in prefixes that belong to no district (e.g. 1900, 9999), which were returned as high-confidence regions despite not existing. Key on the allocated 2-digit district prefix instead: real codes and every #118 gap code still resolve identically (the golden test is unchanged), but non-existent codes now return not-found rather than a fabricated region. Flagged by Codex review on #133.
Follow-up to #133, addressing a Codex review finding.
Problem
The Albania block resolver used a continuous `bisect`, so every 4-digit code ≥ 1000 resolved to its enclosing district block, including codes whose 2-digit prefix belongs to no district (e.g. 1900, 2100, 9999). Those non-existent codes were returned as `match_type="estimated"`, confidence 0.9, indistinguishable from a real answer. Associating a confident region with a code that doesn't exist is a false signal.
Fix
Key on the allocated 2-digit district prefix instead of a continuous range. A code resolves only when its prefix maps to a real district; otherwise it returns not-found.
- Codes in unallocated prefixes now return not-found: 1900 → None, 9999 → None.
- Codes inside an allocated prefix still resolve to that district (e.g. 1099 → AL022): the block scheme is authoritative at district (2-digit) granularity, which is the finest our data supports.
This aligns AL with the rest of the service's quality-over-coverage stance (0.85 geocode threshold, cross-border guard): prefer no answer over a confident wrong one.
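For reviewers who want the difference in concrete terms, here is a minimal sketch of the two lookup strategies. It is not the service's actual code: the function names, the table layout, and the `PLACEHOLDER_REGION` entry are assumptions made for illustration. Only the `1099 → AL022` mapping and the not-found results for 1900 and 9999 come from this PR.

```python
from bisect import bisect_right
from typing import Optional

# Hypothetical tables. Only prefix "10" -> "AL022" is taken from the PR;
# the second entry is a placeholder so the bisect has more than one block.
BLOCK_STARTS = [1000, 3000]
BLOCK_REGIONS = ["AL022", "PLACEHOLDER_REGION"]
PREFIX_TO_REGION = {"10": "AL022", "30": "PLACEHOLDER_REGION"}


def _is_four_digit(code: str) -> bool:
    """True for exactly four ASCII digits."""
    return len(code) == 4 and code.isascii() and code.isdigit()


def resolve_by_bisect(code: str) -> Optional[str]:
    """Old behaviour (sketch): continuous ranges, so gaps inherit a region."""
    if not _is_four_digit(code):
        return None
    index = bisect_right(BLOCK_STARTS, int(code)) - 1
    return BLOCK_REGIONS[index] if index >= 0 else None


def resolve_by_prefix(code: str) -> Optional[str]:
    """New behaviour (sketch): only allocated 2-digit prefixes resolve."""
    if not _is_four_digit(code):
        return None
    return PREFIX_TO_REGION.get(code[:2])


if __name__ == "__main__":
    for code in ("1099", "1900", "9999"):
        print(code, resolve_by_bisect(code), resolve_by_prefix(code))
```

In this sketch the bisect lookup assigns 1900 to the block that starts at 1000, while the prefix lookup returns `None` because "19" is not a key in the table. A real code such as 1099 resolves identically under both.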
Tests: 49 passed across the AL + data_loader suites, ruff clean. README + CHANGELOG updated.