
fix(image-downloader): detect media type from magic bytes instead of URL extension #1280

Open
bakrsabeeh wants to merge 1 commit into anthropics:main from bakrsabeeh:fix/image-media-type-magic-bytes


Fix: Detect image media type from magic bytes instead of file extension

Problem

When Claude Code Action downloads images from GitHub comments, it determines
the media_type field (sent to the Anthropic API) from the file extension
in the URL — not from the actual file content.

This causes a 400 invalid_request_error whenever the real format doesn't
match the extension. The most common trigger is Claude's own spinner GIF,
which the action adds to comments during processing and then downloads back
as /tmp/github-images/image-xxx-0.png. Because the file is saved as .png
but contains GIF bytes, the API rejects it with:

API Error: 400
"Image does not match the provided media type image/png"

Fixes: #702, #495

Related issues in claude-code: #13396, #30124, #31642, #39146, #11931


Root cause

src/github/image-downloader.ts contained two functions:

// BEFORE — extension-based (broken)
function getImageExtension(url: string): string {
  const match = url.match(/\.([a-zA-Z0-9]+)(?:\?|$)/);
  if (match && ["png","jpg","jpeg","gif","webp"].includes(match[1].toLowerCase()))
    return match[1].toLowerCase();
  return "png"; // ← always falls back to "png", even for GIFs
}

function getMediaType(ext: string): string {
  const map: Record<string, string> = {
    png: "image/png", jpg: "image/jpeg", jpeg: "image/jpeg",
    gif: "image/gif", webp: "image/webp",
  };
  return map[ext] ?? "image/png"; // ← trusts the extension
}

The downloaded file was also saved with the URL-derived extension (e.g.
.png), so even checking the saved filename would reproduce the same error.
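
To make the failure mode concrete, the sketch below restates the two functions above in a self-contained form and runs them on two URLs. Both URLs are hypothetical and chosen only for illustration: the first has no file extension at all (the assumed shape of an attachment link), so the `"png"` fallback fires regardless of what the bytes actually are.

```typescript
// Restatement of the extension-based logic shown above (lightly tidied).
const SUPPORTED_EXTENSIONS: string[] = ["png", "jpg", "jpeg", "gif", "webp"];

function getImageExtension(url: string): string {
  const ext = url.match(/\.([a-zA-Z0-9]+)(?:\?|$)/)?.[1]?.toLowerCase();
  if (ext !== undefined && SUPPORTED_EXTENSIONS.includes(ext)) return ext;
  return "png"; // fallback when the URL carries no recognized extension
}

function getMediaType(ext: string): string {
  const map: Record<string, string> = {
    png: "image/png", jpg: "image/jpeg", jpeg: "image/jpeg",
    gif: "image/gif", webp: "image/webp",
  };
  return map[ext] ?? "image/png";
}

// Hypothetical URLs, for illustration only.
const extensionlessUrl = "https://github.com/user-attachments/assets/0a1b2c3d";
const gifUrl = "https://example.com/spinner.gif?raw=1";

// No extension in the URL, so the content is labeled PNG no matter what it is.
console.log(getMediaType(getImageExtension(extensionlessUrl))); // image/png
// Only a URL that happens to end in .gif yields image/gif.
console.log(getMediaType(getImageExtension(gifUrl))); // image/gif
```

Nothing in this path ever inspects the downloaded bytes, which is why a GIF served from an extensionless URL ends up declared as image/png.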


Fix

Replace extension-based detection with magic byte sniffing:

  1. Save downloaded files with a neutral .bin extension — no format assumption.
  2. detectMediaTypeFromBytes(buffer) reads the first 12 bytes of the file and
    matches against known signatures:
    Format   Signature
    PNG      \x89PNG\r\n\x1a\n (8 bytes)
    JPEG     \xFF\xD8\xFF (3 bytes)
    GIF87a   GIF87a (6 bytes)
    GIF89a   GIF89a (6 bytes)
    WebP     RIFF????WEBP (bytes 0-3 + 8-11)
  3. getMediaType(filePath) opens the file, reads 12 bytes, calls
    detectMediaTypeFromBytes, and falls back to extension-based detection
    only when the magic bytes are inconclusive.
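
The signature matching in step 2 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper names `hasSignature` and `ascii` and the `ImageMediaType` alias are assumptions, and the sketch accepts a `Uint8Array` (which a Node `Buffer` also satisfies). Returning `null` for inputs under 4 bytes mirrors the edge case listed in the tests.

```typescript
type ImageMediaType = "image/png" | "image/jpeg" | "image/gif" | "image/webp";

// Hypothetical helper: true when `bytes` contains `sig` starting at `offset`.
function hasSignature(bytes: Uint8Array, sig: number[], offset = 0): boolean {
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => bytes[offset + i] === b);
}

// Hypothetical helper: ASCII string to byte values.
const ascii = (s: string): number[] => Array.from(s, (c) => c.charCodeAt(0));

function detectMediaTypeFromBytes(bytes: Uint8Array): ImageMediaType | null {
  // Assumption: fewer than 4 bytes is treated as inconclusive.
  if (bytes.length < 4) return null;
  // PNG: \x89 P N G \r \n \x1a \n
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  // JPEG: \xFF \xD8 \xFF
  if (hasSignature(bytes, [0xff, 0xd8, 0xff])) return "image/jpeg";
  // GIF: either version string
  if (hasSignature(bytes, ascii("GIF87a")) || hasSignature(bytes, ascii("GIF89a"))) return "image/gif";
  // WebP: "RIFF" at bytes 0-3 and "WEBP" at bytes 8-11 (bytes 4-7 hold the chunk size)
  if (hasSignature(bytes, ascii("RIFF")) && hasSignature(bytes, ascii("WEBP"), 8)) return "image/webp";
  return null;
}

// Usage: a 12-byte GIF89a header is identified regardless of any file name.
const gifHeader = new Uint8Array([...ascii("GIF89a"), 0, 0, 0, 0, 0, 0]);
console.log(detectMediaTypeFromBytes(gifHeader)); // image/gif
```

Checking the WebP marker at offset 8 as well as the RIFF prefix matters because RIFF is a generic container also used by formats such as WAV and AVI.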

Only 12 bytes are read per file, so the added cost is negligible.
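
A rough sketch of how step 3 might read just the header and then fall back to the extension is shown below. The names `readHeaderBytes`, `getMediaTypeSketch`, `Detector`, and `EXTENSION_MEDIA_TYPES` are hypothetical. The PR describes `getMediaType(filePath)` as taking only a path; the extra `detect` parameter exists here solely to keep the example self-contained, and the tiny `gifOnly` detector stands in for the full signature check.

```typescript
import { closeSync, mkdtempSync, openSync, readSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { extname, join } from "node:path";

type Detector = (bytes: Uint8Array) => string | null;

const EXTENSION_MEDIA_TYPES: Record<string, string> = {
  ".png": "image/png", ".jpg": "image/jpeg", ".jpeg": "image/jpeg",
  ".gif": "image/gif", ".webp": "image/webp",
};

// Read at most `length` bytes from the start of the file, not the whole file.
function readHeaderBytes(filePath: string, length = 12): Uint8Array {
  const fd = openSync(filePath, "r");
  try {
    const buf = new Uint8Array(length);
    const bytesRead = readSync(fd, buf, 0, length, 0);
    return buf.subarray(0, bytesRead); // shorter files yield a shorter header
  } finally {
    closeSync(fd);
  }
}

// Magic bytes first; extension only when they are inconclusive; image/png last.
function getMediaTypeSketch(filePath: string, detect: Detector): string {
  const detected = detect(readHeaderBytes(filePath));
  if (detected !== null) return detected;
  return EXTENSION_MEDIA_TYPES[extname(filePath).toLowerCase()] ?? "image/png";
}

// Stand-in detector (assumption): recognizes only the "GIF" prefix.
const gifOnly: Detector = (b) =>
  b.length >= 3 && b[0] === 0x47 && b[1] === 0x49 && b[2] === 0x46 ? "image/gif" : null;

// Usage: GIF bytes saved under a misleading .png name.
const dir = mkdtempSync(join(tmpdir(), "media-type-"));
const mislabeled = join(dir, "image-0.png");
writeFileSync(mislabeled, new Uint8Array([0x47, 0x49, 0x46, 0x38, 0x39, 0x61, 0, 0, 0, 0, 0, 0]));
console.log(getMediaTypeSketch(mislabeled, gifOnly)); // image/gif
```

Opening the file and issuing a single fixed-size read keeps the cost independent of image size, which is the basis for the performance claim above.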


Files changed

  • src/github/image-downloader.ts — core fix (magic byte detection)

Tests

Unit tests cover:

  • All 5 supported formats detected correctly from magic bytes
  • Core bug scenario: GIF bytes in a .png file → correctly returns image/gif
  • JPEG bytes in a .png file → correctly returns image/jpeg
  • WebP bytes in a .jpg file → correctly returns image/webp
  • Extension fallback when magic bytes are inconclusive
  • Default image/png when both magic bytes and extension are unknown
  • Edge case: buffer shorter than 4 bytes returns null

Test output:

✓ detectMediaTypeFromBytes > detects PNG
✓ detectMediaTypeFromBytes > detects JPEG
✓ detectMediaTypeFromBytes > detects GIF87a
✓ detectMediaTypeFromBytes > detects GIF89a
✓ detectMediaTypeFromBytes > detects WebP
✓ detectMediaTypeFromBytes > returns null for unknown bytes
✓ detectMediaTypeFromBytes > returns null for buffer < 4 bytes
✓ getMediaType > returns image/gif for a GIF saved with .png extension (core bug)
✓ getMediaType > returns image/jpeg for a JPEG saved with .png extension
✓ getMediaType > returns image/webp for a WebP saved with .jpg extension
✓ getMediaType > returns image/png for a genuine PNG file
✓ getMediaType > falls back to extension when magic bytes unknown
✓ getMediaType > falls back to image/png when both unknown

…tension

Fixes 400 API errors when GitHub serves images with mismatched extensions
(e.g. Claude's own spinner GIF downloaded as .png). Now reads the first
12 bytes of each downloaded buffer to identify the real format before
saving the file.

Fixes anthropics#702, anthropics#495

Successfully merging this pull request may close these issues.

Bug: Image media type detection fails for GIF images saved with .png extension
