
feat(gemini): add image-with-ref adapter for multi-image uploads #2020

Open
mr7thing wants to merge 1 commit into jackwener:main from mr7thing:opencli-browser-upload-attempt


@mr7thing


Summary

Adds an image-with-ref adapter for Gemini: upload N reference images, then generate, all in one command.

Usage

opencli gemini image-with-ref --prompt "draw a temple courtyard at dawn" --ref "/path/a.png,/path/b.png"
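
The `--ref` value is a single comma-separated string. As a rough sketch of how such a value could be split into individual paths (the helper name `parseRefList` is hypothetical and not taken from this PR; trimming whitespace and dropping empty entries are assumptions):

```typescript
// Hypothetical helper, not the PR's actual code: split a comma-separated
// --ref value into a list of file paths.
function parseRefList(ref: string): string[] {
  return ref
    .split(",")
    .map((p) => p.trim()) // tolerate spaces around commas (assumption)
    .filter((p) => p.length > 0); // ignore empty entries such as a trailing comma
}

const exampleRefs = parseRefList("/path/a.png, /path/b.png,");
console.log(exampleRefs);
```

One consequence of this format is that a path containing a comma cannot be expressed without an escaping rule.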

How it works

  1. Clicks the upload toolbar button in the Gemini composer
  2. Clicks "Upload file" menu item (CDP native mouse event)
  3. Injects reference images via DOM.setFileInputFiles
  4. Types the prompt and sends it
  5. Waits for generated images and saves them locally
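
Steps 1 to 3 above could look roughly like the following at the CDP level. This is a sketch under assumptions, not the adapter's code: `CdpSend`, `nativeClick`, and `attachFiles` are hypothetical names, the `input[type="file"]` selector is a guess, and how the adapter obtains a CDP transport or the click coordinates is not described here. The CDP methods themselves (`Input.dispatchMouseEvent`, `DOM.getDocument`, `DOM.querySelector`, `DOM.setFileInputFiles`) are standard.

```typescript
// Hypothetical CDP transport: sends a method with params and resolves with
// the result object.
type CdpSend = (method: string, params?: Record<string, unknown>) => Promise<any>;

// Click at page coordinates using native mouse events (steps 1 and 2).
async function nativeClick(send: CdpSend, x: number, y: number): Promise<void> {
  const base = { x, y, button: "left", clickCount: 1 };
  await send("Input.dispatchMouseEvent", { type: "mousePressed", ...base });
  await send("Input.dispatchMouseEvent", { type: "mouseReleased", ...base });
}

// Attach files to the first <input type="file"> in the document (step 3).
// Resolves to false when no such input is present yet.
async function attachFiles(send: CdpSend, files: string[]): Promise<boolean> {
  const { root } = await send("DOM.getDocument");
  const { nodeId } = await send("DOM.querySelector", {
    nodeId: root.nodeId,
    selector: 'input[type="file"]', // assumed selector
  });
  if (!nodeId) return false; // a missing or zero nodeId is treated as "no match"
  await send("DOM.setFileInputFiles", { files, nodeId });
  return true;
}
```

Setting the files directly on the input node avoids driving the OS file picker, which is why the hidden input has to exist before this step runs.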

Testing

  • 11 new vitest cases
  • 106/106 gemini suite green (no existing tests modified)
  • Verified end-to-end on gemini.google.com/app (zh-CN): 3 reference images upload → composer thumbnail row → prompt → download

Known Limitation

OS picker popup: On some Gemini builds, clicking the "Upload file" menu item opens the native OS file picker dialog instead of inserting the hidden <input type=file>. When the picker appears, the adapter presses Escape to close it and then retries the upload. If the problem persists, restarting Chrome and reusing the same chat ID recovers the session without losing conversation state. This is Gemini frontend behavior that the adapter does not control.
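
The Escape-and-retry fallback could be sketched as below. All names (`CdpSend`, `pressEscape`, `uploadWithRetry`) and the retry count are hypothetical, and delivering Escape through the CDP `Input.dispatchKeyEvent` method is an assumption: this description does not say how the key press is sent or how many retries the adapter makes.

```typescript
// Hypothetical CDP transport: sends a method with params and resolves with
// the result object.
type CdpSend = (method: string, params?: Record<string, unknown>) => Promise<any>;

// Send an Escape key press and release (assumed delivery mechanism).
async function pressEscape(send: CdpSend): Promise<void> {
  const key = { key: "Escape", code: "Escape", windowsVirtualKeyCode: 27 };
  await send("Input.dispatchKeyEvent", { type: "keyDown", ...key });
  await send("Input.dispatchKeyEvent", { type: "keyUp", ...key });
}

// Run `attempt` up to maxAttempts times. After each failed attempt,
// press Escape to dismiss a possibly open picker before trying again.
async function uploadWithRetry(
  send: CdpSend,
  attempt: () => Promise<boolean>,
  maxAttempts = 3, // assumed limit
): Promise<boolean> {
  for (let i = 0; i < maxAttempts; i++) {
    if (await attempt()) return true;
    await pressEscape(send);
  }
  return false; // caller may fall back to restarting Chrome
}
```

Here `attempt` stands in for whatever upload step the adapter performs; it resolves to `true` once the files are attached.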

Changes: 2 files, +266 lines

Adds a Gemini adapter that supports N reference images via a comma-separated --ref string. Wraps the existing upload flow into a single command.

Verified end-to-end on gemini.google.com/app (zh-CN): 3 reference images attach to the composer thumbnail row, ready for the prompt.
Tests: 11 new vitest cases, 106/106 gemini suite green.