fix: normalize percent-encoding in query and fragment (RFC 3986 §6.2.2) #183
Open
spokodev wants to merge 1 commit into
RFC 3986 §6.2.2 case/percent normalization was applied to the path but not to the query or fragment:

- the query was passed through verbatim, so `?a=%2a` was not uppercased to `?a=%2A`, `%7e`/`%2e` were not decoded, and a raw space or non-ASCII byte was left unencoded;
- the fragment went through `encodeURI(decodeURIComponent(...))`, which decodes reserved characters too: `#a%2Fb` became `#a/b`, changing the value.

Add `normalizeQueryFragmentEncoding` (the path normalizer with the query/fragment character set, which also allows `?` and decodes `.`) and route both components through it, so all three are normalized the same way.

A fragment whose decoded bytes are not valid UTF-8 (e.g. `#%E0%A4A`) is still valid percent-encoding (RFC 3986 §2.1 is byte-level) and is now preserved instead of being flagged as malformed.
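As a rough illustration of the normalization described above, here is a minimal standalone sketch. It is not the code in this PR: the function name `sketchNormalizeQueryFragment`, the two character-class constants, and the choice to leave a stray `%` untouched are all assumptions made for the example. It takes a query or fragment without its leading `?` or `#`.

```javascript
// Hypothetical sketch, NOT the PR's normalizeQueryFragmentEncoding.
// unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"  (RFC 3986 §2.3)
const UNRESERVED = /[A-Za-z0-9\-._~]/;
// Characters allowed literally in a query or fragment:
// unreserved / sub-delims / ":" / "@" / "/" / "?"  (RFC 3986 §3.4, §3.5)
const ALLOWED = /[A-Za-z0-9\-._~!$&'()*+,;=:@\/?]/;

function sketchNormalizeQueryFragment(component) {
  const encoder = new TextEncoder();
  let out = '';
  for (let i = 0; i < component.length; ) {
    const triplet = component.slice(i, i + 3);
    if (/^%[0-9A-Fa-f]{2}$/.test(triplet)) {
      // Valid escape: decode it only if the byte is an unreserved character,
      // otherwise keep it encoded and uppercase the hex digits.
      const ch = String.fromCharCode(parseInt(triplet.slice(1), 16));
      out += UNRESERVED.test(ch) ? ch : triplet.toUpperCase();
      i += 3;
      continue;
    }
    // Read a whole code point so surrogate pairs are encoded together.
    const cp = String.fromCodePoint(component.codePointAt(i));
    if (cp === '%' || ALLOWED.test(cp)) {
      // Assumption: a stray "%" (not followed by two hex digits) is left as-is.
      out += cp;
    } else {
      // Anything else (space, non-ASCII, ...) is encoded as its UTF-8 bytes.
      for (const byte of encoder.encode(cp)) {
        out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
      }
    }
    i += cp.length;
  }
  return out;
}

console.log(sketchNormalizeQueryFragment('a=%2a'));    // a=%2A
console.log(sketchNormalizeQueryFragment('a b'));      // a%20b
console.log(sketchNormalizeQueryFragment('a%2Fb%2A')); // a%2Fb%2A
console.log(sketchNormalizeQueryFragment('%E0%A4A'));  // %E0%A4A
```

Because the sketch works on individual escapes and never decodes a reserved byte, `%2F` stays `%2F`, and an escape sequence that is not valid UTF-8 passes through unchanged rather than throwing.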
What
RFC 3986 §6.2.2 (case + percent-encoding normalization) is applied to the path but not to the query or fragment.
Why these are bugs
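- Query: passed through verbatim, so `?a=%2a` was not uppercased to `?a=%2A`, `%7e`/`%2e` were not decoded, and a raw space or non-ASCII byte was left unencoded.
- Fragment: went through `encodeURI(decodeURIComponent(fragment))`, which decodes reserved characters too, so `%2F` → `/` and `%2A` → `*`, which is a value change.

`uri-js` (the iso-compat target) normalizes both: `?a=%2a` → `?a=%2A`, and keeps `#a%2Fb` as `#a%2Fb`.

Change
Add `normalizeQueryFragmentEncoding`, which is the existing `normalizePathEncoding` with the query/fragment character set (which additionally permits `?` and decodes `.`, since there are no dot-segments outside a path), and route the query and fragment through it. All three components are now normalized consistently:

| Input | Before | After |
| --- | --- | --- |
| `?a=%2a` | `?a=%2a` | `?a=%2A` |
| `?a b` | `?a b` | `?a%20b` |
| `#a%2Fb%2A` | `#a/b*` | `#a%2Fb%2A` |
| `#f?x/y` | `#f?x/y` | `#f?x/y` (unchanged) |

One behavior change worth calling out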
`encodeURI(decodeURIComponent(...))` threw on a fragment whose decoded bytes are not valid UTF-8 (e.g. `#%E0%A4A`), which the old code caught and flagged as `error: 'URI malformed'`. Such a fragment is still valid percent-encoding (RFC 3986 §2.1 is byte-level and does not require UTF-8), so it is now preserved without an error. The `'tolerates malformed fragments'` resolve/equal tests still pass; I updated the one parse assertion that checked for the error flag (the fragment value is unchanged).

Tests
Added `test/query-fragment-normalization.test.js`. Full `npm run test:unit` passes (884), `eslint` clean.
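
For anyone who wants to reproduce the old fragment round trip discussed above without the library, the following sketch uses only the standard `decodeURIComponent` and `encodeURI` built-ins. The wrapper name `oldFragmentRoundTrip` and the shape of the returned object are hypothetical, chosen for the example; they are not taken from the library's source.

```javascript
// Hypothetical reproduction of the previous fragment handling using
// built-ins only. The fragment is passed without its leading "#".
function oldFragmentRoundTrip(fragment) {
  try {
    // decodeURIComponent decodes every escape, including reserved
    // characters; encodeURI then leaves characters such as "/" and "*"
    // unencoded, so the original escapes are lost.
    return { fragment: encodeURI(decodeURIComponent(fragment)) };
  } catch (err) {
    // decodeURIComponent throws a URIError when the escaped bytes are
    // not valid UTF-8. Assumption: the value is kept and an error recorded.
    return { fragment, error: err.message };
  }
}

console.log(oldFragmentRoundTrip('a%2Fb%2A').fragment); // a/b*
console.log(oldFragmentRoundTrip('%E0%A4A'));            // has an `error` field
```

The first call shows the value change (`%2F` and `%2A` are decoded and not re-encoded); the second shows the `URIError` path that the old code reported as a malformed fragment.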