[MWPW-197392] fix: encode non-ASCII alt text header and handle empty image upload responses#180

Open: sharmeebuilds wants to merge 7 commits into dev from mwpw-197392

sharmeebuilds commented Jun 12, 2026


Summary

  • Encode non-ASCII characters in the x-image-alt-text header before sending to the API to prevent malformed request headers
  • Handle empty/null responses from API endpoints gracefully

Why encoding is required

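XHR's `setRequestHeader` takes the header value as a byte string, so it only accepts characters in the Latin-1 (ISO-8859-1) range. Passing raw text that contains characters above U+00FF (e.g. Japanese) throws a `TypeError` at the browser level, and Latin-1 characters such as accented letters are sent as single bytes rather than as UTF-8. Any encoding chosen must therefore produce ASCII-safe output.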
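
To make the constraint above concrete, here is a minimal sketch of the two checks involved. The helper names `isByteStringSafe` and `isAscii` are hypothetical and do not come from this PR; they only illustrate why percent-encoded output is always acceptable to `setRequestHeader`.

```typescript
// Sketch only: helper names are hypothetical, not part of the PR.

/** True when every UTF-16 code unit is <= U+00FF, i.e. the value fits a byte string. */
function isByteStringSafe(value: string): boolean {
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) > 0xff) return false;
  }
  return true;
}

/** True when the value contains only ASCII characters (<= U+007F). */
function isAscii(value: string): boolean {
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) > 0x7f) return false;
  }
  return true;
}

const rawAltText = "あ";
// encodeURIComponent percent-encodes the UTF-8 bytes of each non-ASCII character.
const encodedAltText = encodeURIComponent(rawAltText); // "%E3%81%82"
```

With these helpers, `isByteStringSafe(rawAltText)` is `false` (the raw value would be rejected), while `isAscii(encodedAltText)` is `true`.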

RFC / encoding options considered

Approach comparison

| Approach | Wire format for `あ` | Standard for HTTP headers | Charset declared | Recommended |
|---|---|---|---|---|
| Raw (no encoding) | `あ` | ❌ throws TypeError in XHR | | |
| `encodeURIComponent` | `%E3%81%82` | ✅ common practice | | ✅ simple, internal APIs |
| RFC 5987 (`UTF-8''` prefix) | `UTF-8''%E3%81%82` | ✅ formal HTTP standard | | ⚠️ requires server to strip prefix |
| RFC 2047 (MIME) | `=?utf-8?Q?=E3=81=82?=` | ❌ designed for email | | ❌ wrong standard for HTTP |
| Base64 | `44GC` | ❌ not a header standard | | |

RFC 5987 — why it was considered and rejected

RFC 5987 is the formal HTTP standard for encoding non-ASCII header values. It prefixes the percent-encoded value with `UTF-8''` so the receiver knows the charset explicitly.

// RFC 5987 format
x-image-alt-text: UTF-8''%E3%81%82%E3%81%84%E3%81%82%E3%81%84%E3%81%93

Why rejected: ESP's `ImageService.getImageHeaders()` reads the header raw with no decoding — the server would store the literal string `UTF-8''%E3%81%82...` in DynamoDB. Using RFC 5987 would require a server-side change to strip the prefix before decoding. For an internal API, that complexity is unnecessary.
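
For reference, this is roughly what the rejected RFC 5987 option would involve on both sides. It is a sketch under assumptions: the function names are hypothetical, and the decoder shows the kind of prefix stripping ESP would have to add, not any code that exists in ESP today.

```typescript
// Sketch only: hypothetical helpers illustrating the RFC 5987 ext-value format.

/** Percent-encode a value, also escaping the characters ' ( ) * that
 *  encodeURIComponent leaves alone but RFC 5987 value-chars forbid. */
function encodeRfc5987ValueChars(value: string): string {
  return encodeURIComponent(value).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase(),
  );
}

/** Client side: build the ext-value with an explicit UTF-8 charset prefix. */
function encodeRfc5987ExtValue(value: string): string {
  return "UTF-8''" + encodeRfc5987ValueChars(value);
}

/** Server side: strip the charset'language' prefix, then percent-decode.
 *  Returns null when the value does not carry a UTF-8 prefix. */
function decodeRfc5987ExtValue(headerValue: string): string | null {
  const match = /^UTF-8'[^']*'(.*)$/i.exec(headerValue);
  if (match === null) return null;
  return decodeURIComponent(match[1]); // throws URIError on malformed escapes
}
```

The extra decode step on the receiver is the server-side change referred to above; without it the prefix ends up in storage verbatim.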

RFC 2047 — already used internally by ESP

ESP uses the `rfc2047` package in `ImageService` to encode alt text when writing to S3 metadata:

// ESP ImageService.js
[ImageService.ImageHeaders.ALT_TEXT]: rfc2047.encode(altText)
// → "=?utf-8?Q?=E3=81=82?="

This is an email/MIME encoding standard (not HTTP) that ESP applies on its own side, after it has received the header value. It is unrelated to how the header travels from EMC to ESP.

npm packages evaluated

| Package | Maintainer | Health | Verdict |
|---|---|---|---|
| `rfc5987-value-chars` | Unknown | ❌ 1 version, abandoned since 2018 | Rejected |
| `rfc2047` | Andreas Lindpetersen | ✅ 9 versions, actively maintained | Already in ESP; not needed on client |
| `content-disposition` | blakeembrey | ✅ actively maintained | Overkill: wraps RFC 5987 for Content-Disposition headers specifically |

We found no well-maintained npm package dedicated to percent-encoding HTTP header values, and none is needed: `encodeURIComponent` is built in and always produces ASCII-only output, which is all the header transport requires.

Decision

Use plain `encodeURIComponent` on the client. It is:

  • ASCII-safe, which XHR requires (raw characters outside Latin-1 throw)
  • Sufficient for this internal API where both sides are Adobe-controlled
  • Reversible server-side with `decodeURIComponent`, which ESP needs to call before passing the value to `rfc2047.encode()` for S3 storage

Note: ESP's `ImageService.getImageHeaders()` currently reads the header with no decoding, meaning percent-encoded text is stored as-is in DynamoDB. A follow-up ESP fix is needed to call `decodeURIComponent` before the value reaches `rfc2047.encode()`.
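
A possible shape for that follow-up is sketched below. It assumes ESP can run a small helper before the value reaches `rfc2047.encode()`; the name `decodeAltTextHeader` and the fallback behaviour are assumptions for illustration, not existing ESP code.

```typescript
// Sketch only: hypothetical server-side helper, not existing ESP code.

/**
 * Decode a percent-encoded x-image-alt-text header value.
 * Falls back to the raw value when it is not valid percent-encoding,
 * so older clients that send plain ASCII containing a stray "%" keep working.
 */
function decodeAltTextHeader(rawHeader: string | undefined): string {
  if (rawHeader === undefined || rawHeader === "") return "";
  try {
    return decodeURIComponent(rawHeader);
  } catch (err) {
    // decodeURIComponent throws URIError on malformed escape sequences.
    if (err instanceof URIError) return rawHeader;
    throw err;
  }
}
```

One caveat with this fallback: a legacy unencoded value that happens to contain a valid escape sequence (for example a literal `%20`) would be decoded as well, so the rollout order between client and server still matters.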

Test plan

  • Upload an image with a non-ASCII alt text (e.g. accented characters, Japanese) and confirm the upload succeeds without header errors
  • Simulate an empty response from BE APIs and confirm no unhandled errors
  • Lint + type-check pass (`npm run check`)
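
The empty-response case can be illustrated with a small sketch. The names `parseJsonBody` and `isPlainObject` are hypothetical stand-ins for the guards added in `safeFetch` and `getConfig`; the actual implementation in this PR may differ.

```typescript
// Sketch only: hypothetical helpers, not the PR's actual safeFetch/getConfig code.

/** Parse a response body as JSON, treating an empty or missing body as null
 *  instead of letting JSON.parse throw on "". */
function parseJsonBody(text: string | null | undefined): unknown {
  if (text == null || text.trim() === "") return null;
  return JSON.parse(text); // malformed, non-empty JSON still throws on purpose
}

/** Narrow an unknown parse result to a plain object before reading fields from it. */
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}
```

Only the empty case is swallowed here; a non-empty body that fails to parse still surfaces as an error, which keeps genuine backend problems visible.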

🤖 Generated with Claude Code

…esponses

- encodeURIComponent on x-image-alt-text header so non-ASCII characters
  (e.g. Japanese) are not silently dropped by the browser
- Parse response body safely in safeFetch to avoid JSON.parse errors on
  empty 204-style responses; guard getConfig against non-object results

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sharmeebuilds changed the title fix: encode non-ASCII alt text header and handle empty image upload responses [MWPW-197392] fix: encode non-ASCII alt text header and handle empty image upload responses Jun 15, 2026
sharmeebuilds requested a review from qiyundai June 15, 2026 09:53

qiyundai left a comment

LGTM. One soft flag before this ships: make sure the server decodes the encodeURIComponent-encoded value in the x-image-alt-text header before storing or displaying it — otherwise non-ASCII alt text will surface as percent-encoded strings (e.g. %E6%97%A5%E6%9C%AC%E8%AA%9E) on the display side. Encoding is the right transport fix; just confirming the backend is decoding on receipt.

sharmeebuilds commented Jun 16, 2026

> LGTM. One soft flag before this ships: make sure the server decodes the encodeURIComponent-encoded value in the x-image-alt-text header before storing or displaying it — otherwise non-ASCII alt text will surface as percent-encoded strings (e.g. %E6%97%A5%E6%9C%AC%E8%AA%9E) on the display side. Encoding is the right transport fix; just confirming the backend is decoding on receipt.

Thanks for flagging, this is already in discussion with BE team.

Also, api.ts:1827 (uploadImage()) is a second XHR upload path that still sent the header unencoded;
encodeURIComponent is added there too.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
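
Since two upload paths now need the same encoding, one way to keep them from drifting apart is a shared helper. This is a sketch, not the code in this PR: `setAltTextHeader`, `HeaderSink`, and the choice to skip the header when the alt text is empty are all assumptions.

```typescript
// Sketch only: hypothetical shared helper for both XHR upload paths.

const ALT_TEXT_HEADER = "x-image-alt-text";

/** Minimal structural type for the one method used here;
 *  a browser XMLHttpRequest satisfies it. */
interface HeaderSink {
  setRequestHeader(headerName: string, value: string): void;
}

/** Set the alt text header in percent-encoded form.
 *  Assumption: the header is omitted when there is no alt text. */
function setAltTextHeader(target: HeaderSink, altText: string | undefined): void {
  if (!altText) return;
  target.setRequestHeader(ALT_TEXT_HEADER, encodeURIComponent(altText));
}
```

Each upload path would then call `setAltTextHeader(xhr, altText)` after `xhr.open(...)`, so a future change to the wire format only has to happen in one place.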
sharmeebuilds added this to the T3-26.26 milestone Jun 17, 2026

sharmeebuilds commented

Merge after BE uses decodeURI... @gbajaj91

sharmeebuilds and others added 5 commits June 17, 2026 15:15
Prefix the percent-encoded alt text with UTF-8'' so the charset is
explicitly declared per RFC 5987, making it safe for non-ASCII values
like Japanese across any compliant HTTP receiver.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the UTF-8'' prefixed encodeURIComponent approach with the
rfc5987-value-chars package, which correctly encodes chars forbidden by
RFC 5987 value-chars (apostrophes, parens, *) without adding a charset
prefix — keeping the wire format compatible with the existing server-side
decodeURIComponent decode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removes rfc5987-value-chars dependency. Plain encodeURIComponent is
sufficient for this internal API to safely encode non-ASCII characters
like Japanese in XHR headers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 participants