feat(slides): add write tools for Google Slides #235
Summary of Changes

Hello @stefanoamorelli, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a substantial expansion of Google Slides automation capabilities by integrating 15 new write tools. This enhancement allows users to programmatically create, modify, and populate Google Slides presentations, streamlining workflows that involve template manipulation, dynamic content generation, and presentation assembly. The changes include updating necessary API permissions and thoroughly testing the new functionalities to ensure reliability and ease of use.
Code Review
This pull request introduces 15 new write tools for Google Slides, significantly expanding the extension's capabilities for presentation management. The implementation correctly leverages the Google Slides API's batchUpdate method and handles OAuth scope upgrades. I've identified a few opportunities to improve type safety in the tool schemas and a functional bug in how speaker notes are cleared before updating.
69689c5 to a1e55ba
We can't take this PR right now because it requires new scopes in the default project. I'm working on a mechanism to allow for user-configurable scopes so that we can take features that aren't supported by the main GCP project. In those cases, a user who uses these features will have to host their own GCP project to go along with it until I can get the shared GCP project updated with new scopes.

Thanks @allenhutchison for having a look, makes sense! When building and testing this feature I used the script available on

Should we keep this PR open until we're ready to set up new scopes on the default GCP project? Happy to help with rebasing if/when the time comes.

Yes, let's keep it open. See #255 for discussion on this feature.
Retrieves and writes per-slide speaker notes. getSpeakerNotes returns an array — one entry per slide — with slideIndex, slideObjectId, speakerNotesObjectId, and notes text. updateSpeakerNotes replaces notes on a single slide by objectId. Approach adapted from gemini-cli-extensions#235 by @stefanoamorelli (MIT). Key difference: we target slides by objectId rather than index, matching the pattern used by getMetadata and createFromJson throughout this PR.
That's cool

@allenhutchison now that #255 is closed, should we move forward with this? (appreciate your feedback @Sum1cares, we'd need an approval from
…setup These two private helpers reduce boilerplate in the upcoming write methods, each of which needs to format success/error responses the same way. The existing read-only methods are left untouched to keep this change minimal and reviewable. On the test side, `create` and `batchUpdate` mocks are added to the Slides API mock object so the new method tests can use them.
Wraps `presentations.create` from the Google Slides API to allow creating a new empty presentation with a given title. Returns the presentation ID and a direct edit URL.
Wraps the `createSlide` batchUpdate request to add a new slide to an existing presentation. Supports optional insertion index, predefined layout (e.g. BLANK, TITLE_AND_BODY), or a specific layout ID from the presentation's masters.
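A minimal sketch of how such a `createSlide` entry could be assembled before it is placed in the `requests` array of `batchUpdate`. The helper name `buildCreateSlideRequest` and the trimmed-down layout list are illustrative assumptions, not the PR's actual code; rejecting a call that sets both layout options is also an assumption, based on the layout reference accepting only one kind.

```typescript
// Illustrative subset of the predefined layouts; the API defines more.
type PredefinedLayout = "BLANK" | "TITLE" | "TITLE_AND_BODY" | "TITLE_ONLY";

interface AddSlideOptions {
  insertionIndex?: number; // 0-based position; omit to append at the end
  predefinedLayout?: PredefinedLayout;
  layoutId?: string; // object ID of a layout from the presentation's masters
}

interface CreateSlideRequest {
  createSlide: {
    insertionIndex?: number;
    slideLayoutReference?: { predefinedLayout?: PredefinedLayout; layoutId?: string };
  };
}

// Hypothetical helper: builds one createSlide request object.
function buildCreateSlideRequest(opts: AddSlideOptions = {}): CreateSlideRequest {
  // Assumption: the two layout options are mutually exclusive.
  if (opts.predefinedLayout !== undefined && opts.layoutId !== undefined) {
    throw new Error("Specify either predefinedLayout or layoutId, not both.");
  }
  const request: CreateSlideRequest = { createSlide: {} };
  if (opts.insertionIndex !== undefined) {
    request.createSlide.insertionIndex = opts.insertionIndex;
  }
  if (opts.layoutId !== undefined) {
    request.createSlide.slideLayoutReference = { layoutId: opts.layoutId };
  } else if (opts.predefinedLayout !== undefined) {
    request.createSlide.slideLayoutReference = { predefinedLayout: opts.predefinedLayout };
  }
  return request;
}
```

Leaving optional fields out entirely, rather than sending `undefined` or `null`, keeps the request body free of empty keys and lets the API apply its own defaults.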
Wraps the `deleteObject` batchUpdate request to remove a slide from a presentation by its object ID.
Wraps the `duplicateObject` batchUpdate request to clone a slide. The duplicate is placed immediately after the original and gets a new unique object ID returned in the response.
Wraps the `updateSlidesPosition` batchUpdate request to move one or more slides to a new position within the presentation.
Reads speaker notes for every slide in a presentation by traversing the `notesPage` structure. Each slide's notes text, object ID, and speaker notes shape ID are returned so they can be used with updateSpeakerNotes later.
Replaces the speaker notes for a specific slide. The method first reads the current notes page structure to find the speaker notes shape ID, then issues a deleteText + insertText batchUpdate pair. Passing an empty string clears the notes entirely.
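The deleteText + insertText pairing described above might look roughly like the sketch below. The helper name `buildSpeakerNotesRequests` is hypothetical, and the guard that skips `deleteText` when the notes shape has no text is an assumption (deleting from an empty shape is a plausible source of API errors), not a confirmed detail of this PR.

```typescript
type NotesRequest =
  | { deleteText: { objectId: string; textRange: { type: "ALL" } } }
  | { insertText: { objectId: string; insertionIndex: number; text: string } };

// Hypothetical helper: returns the batchUpdate requests needed to replace notes.
function buildSpeakerNotesRequests(
  speakerNotesObjectId: string,
  existingText: string,
  newText: string,
): NotesRequest[] {
  const requests: NotesRequest[] = [];
  // Assumption: only clear the shape when it currently holds text.
  if (existingText.length > 0) {
    requests.push({
      deleteText: { objectId: speakerNotesObjectId, textRange: { type: "ALL" } },
    });
  }
  // An empty newText means "clear the notes", so nothing is inserted.
  if (newText.length > 0) {
    requests.push({
      insertText: { objectId: speakerNotesObjectId, insertionIndex: 0, text: newText },
    });
  }
  return requests;
}
```

When both the existing and the new text are empty, the returned array is empty and the caller can skip the `batchUpdate` call altogether.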
Wraps the `replaceAllText` batchUpdate request to find-and-replace
text across the entire presentation. Supports case-sensitive and
case-insensitive matching. Particularly useful for template
variable substitution (e.g. replacing `{{name}}` placeholders).
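As a rough illustration of the template use case, the sketch below turns a map of variable names into one `replaceAllText` request per placeholder. The helper name `buildTemplateRequests` and the `{{key}}` convention are assumptions for the example; `matchCase` defaults to `true` here only to mirror the wrapper default discussed in review.

```typescript
interface ReplaceAllTextRequest {
  replaceAllText: {
    containsText: { text: string; matchCase: boolean };
    replaceText: string;
  };
}

// Hypothetical helper: one replaceAllText request per template variable.
function buildTemplateRequests(
  values: { [key: string]: string },
  matchCase: boolean = true,
): ReplaceAllTextRequest[] {
  return Object.keys(values).map((key) => ({
    replaceAllText: {
      // Placeholders are assumed to be written as {{key}} in the template.
      containsText: { text: "{{" + key + "}}", matchCase },
      replaceText: values[key] ?? "",
    },
  }));
}
```

Sending all substitutions in a single `batchUpdate` call means the template is filled in one round trip, with one reply entry per request in the same order.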
Wraps the `insertText` batchUpdate request to insert text into a shape or table cell at a given index. Defaults to inserting at position 0 (beginning of the text content).
Wraps the `deleteText` batchUpdate request to remove text from a shape or table cell. Supports three range modes: FIXED_RANGE (start + end index), FROM_START_INDEX (from index to end), and ALL (clear everything).
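The three range modes can be modelled as a discriminated union so that each mode carries only the indices it needs. This is a sketch under that assumption; `buildTextRange` is a hypothetical helper, and the up-front validation is an illustrative choice rather than the PR's behavior.

```typescript
type TextRange =
  | { type: "FIXED_RANGE"; startIndex: number; endIndex: number }
  | { type: "FROM_START_INDEX"; startIndex: number }
  | { type: "ALL" };

// Hypothetical helper: validates indices for the chosen mode and builds the range.
function buildTextRange(
  type: TextRange["type"],
  startIndex?: number,
  endIndex?: number,
): TextRange {
  switch (type) {
    case "FIXED_RANGE":
      if (startIndex === undefined || endIndex === undefined) {
        throw new Error("FIXED_RANGE requires both startIndex and endIndex.");
      }
      return { type, startIndex, endIndex };
    case "FROM_START_INDEX":
      if (startIndex === undefined) {
        throw new Error("FROM_START_INDEX requires startIndex.");
      }
      return { type, startIndex };
    case "ALL":
      // ALL takes no indices; any that were passed are ignored.
      return { type };
  }
}
```

Failing early with a specific message avoids forwarding an incomplete range to the API, where the resulting error tends to be harder to interpret.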
Wraps the `createShape` batchUpdate request to add shapes like TEXT_BOX, RECTANGLE, ELLIPSE, etc. to a slide. Position and dimensions are specified in points (PT) using an affine transform.
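For illustration, position and size in points could be translated into the element properties of a `createShape` request as sketched below. The `Box` type and `buildCreateShapeRequest` helper are hypothetical; the transform uses unit scale factors and only translates, which is an assumption that covers unrotated, unscaled shapes.

```typescript
interface Box {
  x: number; // distance from the left edge of the slide, in points
  y: number; // distance from the top edge of the slide, in points
  width: number; // in points
  height: number; // in points
}

interface Dimension {
  magnitude: number;
  unit: "PT";
}

interface CreateShapeRequest {
  createShape: {
    shapeType: string;
    elementProperties: {
      pageObjectId: string;
      size: { width: Dimension; height: Dimension };
      transform: { scaleX: number; scaleY: number; translateX: number; translateY: number; unit: "PT" };
    };
  };
}

// Hypothetical helper: builds a createShape request for an axis-aligned box.
function buildCreateShapeRequest(pageObjectId: string, shapeType: string, box: Box): CreateShapeRequest {
  return {
    createShape: {
      shapeType,
      elementProperties: {
        pageObjectId,
        size: {
          width: { magnitude: box.width, unit: "PT" },
          height: { magnitude: box.height, unit: "PT" },
        },
        // Identity scale plus a translation places the box at (x, y).
        transform: { scaleX: 1, scaleY: 1, translateX: box.x, translateY: box.y, unit: "PT" },
      },
    },
  };
}
```

The same size-plus-transform structure would presumably serve image and table requests as well, since they also take element properties for placement.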
Wraps the `createImage` batchUpdate request to insert an image from a publicly accessible URL onto a slide. Position and dimensions are specified in points (PT).
Wraps the `createTable` batchUpdate request to add a table with a given number of rows and columns to a slide. Position and dimensions are specified in points (PT).
Wraps the `updateTextStyle` batchUpdate request to change text formatting (bold, italic, font size, color, etc.) on a shape or table cell. The style parameter accepts a JSON string to work around the MCP SDK's lack of support for `z.record(z.any())` schemas, with a clear error message on invalid input.
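A possible shape for the runtime parsing step is sketched below: parse the JSON string, then reject anything that is not a plain object, naming the offending parameter in the message. The function name `parseJsonObject` and the exact error wording are assumptions for the example, not the PR's implementation.

```typescript
// Hypothetical helper: parses a JSON-string tool parameter into a plain object.
function parseJsonObject(raw: string, paramName: string): { [key: string]: unknown } {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    const reason = e instanceof Error ? e.message : String(e);
    throw new Error('Invalid JSON in "' + paramName + '": ' + reason);
  }
  // JSON.parse also accepts arrays, numbers, strings and null; only objects are useful here.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error('"' + paramName + '" must be a JSON object, e.g. {"bold": true}');
  }
  return parsed as { [key: string]: unknown };
}
```

For example, `parseJsonObject('{"bold": true}', "style")` returns an object whose `bold` property is `true`, while `parseJsonObject("[1, 2]", "style")` throws because an array is not a usable style.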
Wraps the `updateShapeProperties` batchUpdate request to modify shape properties like background fill, outline, shadow, etc. Like updateTextStyle, the shapeProperties parameter accepts a JSON string to avoid the MCP SDK `z.record(z.any())` serialization issue.
The 14 new slides write tools (create, addSlide, deleteSlide, duplicateSlide, reorderSlides, updateSpeakerNotes, replaceAllText, insertText, deleteText, addShape, addImage, addTable, updateTextStyle, updateShapeProperties) were registered in index.ts but absent from the FEATURE_GROUPS registry, so the enabledTools gate in index.ts would have skipped them and the `presentations` write scope would never be requested.

Listing them under `slides.write` (defaultEnabled: false) wires them into the feature-flag system introduced in gemini-cli-extensions#323, so users opt into write capabilities via `WORKSPACE_FEATURE_OVERRIDES=slides.write:on`, which automatically pulls in the full `presentations` scope.

Also moves the new `slides.getSpeakerNotes` read tool into `slides.read` so it stays available with the default-on readonly scope.

This replaces the standalone scope-bump commit that originally sat at the head of this branch but became obsolete once SCOPES was refactored out of index.ts into feature-config.ts on main.

Signed-off-by: Stefano Amorelli <stefano@amorelli.tech>
a1e55ba to 4894713
  slidesService.getSlideThumbnail,
);

server.registerTool(
[blocking] All 15 new tool registrations (this slides.create plus every server.registerTool call through line 876) bypass the local registerTool wrapper defined at line 172. The wrapper is what gates registration on enabledTools — every other tool in this file goes through it.
Worst-of-both-worlds consequence: slides.write is intentionally defaultEnabled: false in feature-config.ts:230, but with this bypass the 14 write tools get registered regardless. Meanwhile SCOPES is computed from resolveFeatures().requiredScopes which respects the gate, so the write scope is not requested — every call to a write tool will then fail with 403 at runtime. Even WORKSPACE_FEATURE_OVERRIDES=slides.write:on won't fix it, because the tools were already registered with the wrong scope set.
slides.getSpeakerNotes at line 600 has the same bug — it lands in the default-on read group so visible behavior is correct, but slides.read:off overrides are silently ignored.
Fix: s/server.registerTool/registerTool/ on all 15 registrations. Might also be worth adding a test that asserts every registered tool name appears in FEATURE_GROUPS so this regression can't recur.
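The gating behavior and the suggested regression check can be sketched in isolation as follows. Everything here is a simplified stand-in: the two-group `FEATURE_GROUPS` array, `resolveEnabledTools`, `makeGatedRegister`, and `findUngroupedTools` are hypothetical names that only imitate the structure described in this comment, not the project's real registry or wrapper.

```typescript
interface FeatureGroup {
  name: string;
  defaultEnabled: boolean;
  tools: string[];
}

// Hypothetical, trimmed-down stand-in for the real FEATURE_GROUPS registry.
const FEATURE_GROUPS: FeatureGroup[] = [
  { name: "slides.read", defaultEnabled: true, tools: ["slides.getText", "slides.getSpeakerNotes"] },
  { name: "slides.write", defaultEnabled: false, tools: ["slides.create", "slides.addSlide"] },
];

type Overrides = { [groupName: string]: boolean | undefined };

// Collects tool names from every group that is on by default or by override.
function resolveEnabledTools(overrides: Overrides = {}): string[] {
  const enabled: string[] = [];
  for (const group of FEATURE_GROUPS) {
    const override = overrides[group.name];
    const isOn = override === undefined ? group.defaultEnabled : override;
    if (isOn) {
      for (const tool of group.tools) enabled.push(tool);
    }
  }
  return enabled;
}

// Returns a register function that silently skips tools outside the enabled set.
function makeGatedRegister(enabledTools: string[], registered: string[]) {
  return function registerTool(toolName: string): boolean {
    if (enabledTools.indexOf(toolName) === -1) return false;
    registered.push(toolName);
    return true;
  };
}

// Regression check: lists tool names that no feature group knows about.
function findUngroupedTools(toolNames: string[]): string[] {
  const known: string[] = [];
  for (const group of FEATURE_GROUPS) {
    for (const tool of group.tools) known.push(tool);
  }
  return toolNames.filter((toolName) => known.indexOf(toolName) === -1);
}
```

A test built on `findUngroupedTools` would pass the names handed to the registration calls and assert that the result is empty, so a tool registered outside every feature group fails the build instead of surfacing as a runtime 403.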
const errorMessage =
  error instanceof Error ? error.message : String(error);
logToFile(`[SlidesService] Error during ${method}: ${errorMessage}`);
return {
[blocking] formatError is missing isError: true on the returned object. The project convention (see DriveService.ts:38–56 and DocsService.ts:113–121) is that MCP error responses set isError: true as the discriminator. Without it, every error from every Slides tool — auth failures, 404s, 403s, JSON.parse failures, slide-not-found — gets reported to the model/client as a successful tool call whose content happens to contain an error key. The model will treat it as success and continue confidently. This is the same issue that bites the older inline catches in getText / getMetadata / getImages / getSlideThumbnail, but at least those are pre-existing.
One-line fix here propagates to all 15 new tools:
return {
  isError: true,
  content: [{ type: 'text' as const, text: JSON.stringify({ error: errorMessage }) }],
};

  },
});

const slideObjectId =
[blocking] response.data.replies?.[0]?.createSlide?.objectId ?? null returns success with slideObjectId: null whenever the API reply array is empty/malformed (partial success, throttle response, schema drift). The caller/model believes the slide was created, then any follow-up tool call using the null ID 400s with a confusing error.
Same pattern in duplicateSlide:436, addShape:835, addImage:904, addTable:976 — would suggest throwing instead:
const slideObjectId = response.data.replies?.[0]?.createSlide?.objectId;
if (!slideObjectId) {
  throw new Error('createSlide returned no objectId; batchUpdate reply was empty or malformed.');
}

  },
});

const occurrencesChanged =
[blocking] Same ?? 0 pattern as the slideObjectId case above — occurrencesChanged reports 0 both when (a) Google legitimately found zero matches and (b) the reply array is missing entirely (API change, partial failure). Users will spend hours debugging template variables that "weren't found" when the real problem was an API/permissions failure. Should distinguish missing reply (throw) from occurrencesChanged === 0 (legitimate).
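One way to separate the two cases is sketched below. It rests on an assumption worth verifying against the live API: that a successful call always includes a `replaceAllText` reply object, with `occurrencesChanged` possibly omitted when the count is zero. The interfaces are trimmed stand-ins for the client library's response types, and `readOccurrencesChanged` is a hypothetical helper.

```typescript
// Trimmed stand-ins for the batchUpdate response types.
interface BatchUpdateReply {
  replaceAllText?: { occurrencesChanged?: number | null };
}

interface BatchUpdateResponse {
  replies?: BatchUpdateReply[] | null;
}

// Hypothetical helper: throws on a missing reply, returns 0 only for a real zero.
function readOccurrencesChanged(data: BatchUpdateResponse): number {
  const reply = data.replies?.[0]?.replaceAllText;
  if (!reply) {
    throw new Error("replaceAllText returned no reply; batchUpdate response was empty or malformed.");
  }
  // Assumption: an omitted count on a present reply means zero matches.
  return reply.occurrencesChanged ?? 0;
}
```

With this split, a template placeholder that truly has no matches still reports `0`, while an empty `replies` array surfaces as an error that the `formatError` path can report.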
}

return {
  slideIndex: index + 1,
[blocking] Returns slideIndex: index + 1 (1-based), but every other index on the API surface — addSlide.insertionIndex, reorderSlides.insertionIndex, etc. — is 0-based. The tool description at index.ts:603 doesn't document the difference, so an LLM caller will reasonably assume 0-based and feed this back into other tools off-by-one. Either drop the + 1 or call out the 1-based indexing in the description (with a strong nudge to the former for interop).
  }
};

public addSlide = async ({
[important] Zod and the service-layer types are diverging:
- `addSlide` Zod uses `predefinedLayout: z.enum([...10 literals])`, but the service signature here widens to `predefinedLayout?: string`.
- `deleteText`/`updateTextStyle` Zod constrain `type` to the 3-literal enum, but the service uses `string`.
- `updateTextStyle.style` and `updateShapeProperties.shapeProperties` are typed `string | slides_v1.Schema$TextStyle` in the service, but the Zod schema only ever accepts `string`, so the object branch is dead code.

Would z.infer<typeof schema> work as the single source of truth? Compile-time guarantees from the Zod boundary are getting lost the moment values cross into the service.
server.registerTool(
  'slides.addSlide',
  {
    description:
[important] Description says Adds a new slide ... Optionally specify position and layout. but doesn't mention that the tool returns the new slideObjectId. Callers that need to chain (e.g. addShape on the new slide) won't know they can read it from the response. Compare to slides.create which does document its return.
server.registerTool(
  'slides.replaceAllText',
  {
    description:
[important] Wrapper defaults matchCase: true, which is opposite to Google's API default (false). The current wording (case-sensitive (default: true) lower in the schema) is correct in isolation but the description doesn't call out that omitting matchCase gives case-sensitive matching — a behavior surprise for anyone familiar with the underlying API.
  });
}

if (requests.length > 0) {
[nit] When both requests end up empty (no existing notes to delete and notes === ''), this skips the batchUpdate and still returns the success payload at 619. Defensible as a true no-op, but there's no signal to the caller that nothing happened. Worth a noOp: true field on the result, or at least a debug log.
  .describe('The object ID of the slide to add the image to.'),
imageUrl: z
  .string()
  .describe('The URL of the image to insert. Must be publicly accessible.'),
[nit] "Must be publicly accessible" is the most common gotcha but Google also requires HTTPS, ≤50MB, ≤25MP, and PNG/JPEG/GIF only. Worth a one-line addition since these are common failure modes that come back as opaque API errors.
Tip
Better reviewed commit-by-commit.
This adds 15 write tools for Google Slides, on top of the existing 5 read-only ones [1]. My team and I use Google Slides templates heavily, and being able to programmatically create presentations, replace placeholder text, add slides, insert images/tables, and update formatting would save us a lot of manual work.
All new tools map to existing Google Slides API v1 `presentations.batchUpdate` requests [2]. The only scope change needed is upgrading from `presentations.readonly` to `presentations` [3], which is also reflected in the GCP setup script [4].

NB: `updateTextStyle` and `updateShapeProperties` accept their style/properties as a JSON string instead of `z.record(z.any())` because the MCP SDK schema serializer does not support `z.any()` and causes `tools/list` to fail silently. The JSON is parsed at runtime with a descriptive error on invalid input.

Commits

- 80f4879 upgrade OAuth scope to full presentations access
- 0c7a6d1 align GCP setup script scope with index.ts
- a1e91ca add formatError/formatResult helpers and test mock setup
- f9d42ef add slides.create
- 77cc965 add slides.addSlide
- 3a56dd9 add slides.deleteSlide
- 639442c add slides.duplicateSlide
- 42361f6 add slides.reorderSlides
- 92f5eb1 add slides.getSpeakerNotes
- 656d145 add slides.updateSpeakerNotes
- adaf091 add slides.replaceAllText
- 79d7d41 add slides.insertText
- 18d1018 add slides.deleteText
- 290830a add slides.addShape
- 9755bf9 add slides.addImage
- 86eb02a add slides.addTable
- 14cb5dd add slides.updateTextStyle
- 69689c5 add slides.updateShapeProperties

Closes #234