feat(slides): add write tools for Google Slides #235

Open
stefanoamorelli wants to merge 17 commits into gemini-cli-extensions:main from stefanoamorelli:feat/slides-write-tools

@stefanoamorelli stefanoamorelli commented Feb 23, 2026

Tip

Better reviewed commit-by-commit.

This adds 15 write tools for Google Slides, on top of the existing 5 read-only ones [1]. My team and I use Google Slides templates heavily, and being able to programmatically create presentations, replace placeholder text, add slides, insert images/tables, and update formatting would save us a lot of manual work.

All new tools map to existing Google Slides API v1 presentations.batchUpdate requests [2]. The only scope change needed is upgrading from presentations.readonly to presentations [3], which is also reflected in the GCP setup script [4].

NB:
updateTextStyle and updateShapeProperties accept their style/properties as a JSON string instead of z.record(z.any()) because the MCP SDK schema serializer does not support z.any() and causes tools/list to fail silently. The JSON is parsed at runtime with a descriptive error on invalid input.
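
A minimal sketch of the runtime parsing described in this note, assuming a hypothetical helper named `parseJsonObjectParam`. The helper name, the error wording, and the rejection of non-object JSON are illustrative assumptions, not taken from the PR:

```typescript
// Hypothetical helper: parses a JSON-string tool parameter and fails with a
// message that names the offending parameter.
function parseJsonObjectParam(
  paramName: string,
  raw: string,
): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`Invalid JSON in "${paramName}": ${reason}`);
  }
  // JSON.parse also accepts primitives, arrays, and null; this sketch assumes
  // only plain objects are meaningful as a style or properties payload.
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error(`"${paramName}" must be a JSON object, e.g. {"bold": true}`);
  }
  return parsed as Record<string, unknown>;
}

// Example input as a model or client might send it.
const style = parseJsonObjectParam(
  'style',
  '{"bold": true, "fontSize": {"magnitude": 18, "unit": "PT"}}',
);
```

The parsed object would then be passed as the `style` (or `shapeProperties`) field of the corresponding batchUpdate request.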


Commits

  • 80f4879 upgrade OAuth scope to full presentations access
  • 0c7a6d1 align GCP setup script scope with index.ts
  • a1e91ca add formatError/formatResult helpers and test mock setup
  • f9d42ef add slides.create
  • 77cc965 add slides.addSlide
  • 3a56dd9 add slides.deleteSlide
  • 639442c add slides.duplicateSlide
  • 42361f6 add slides.reorderSlides
  • 92f5eb1 add slides.getSpeakerNotes
  • 656d145 add slides.updateSpeakerNotes
  • adaf091 add slides.replaceAllText
  • 79d7d41 add slides.insertText
  • 18d1018 add slides.deleteText
  • 290830a add slides.addShape
  • 9755bf9 add slides.addImage
  • 86eb02a add slides.addTable
  • 14cb5dd add slides.updateTextStyle
  • 69689c5 add slides.updateShapeProperties

Closes #234

@gemini-code-assist
Contributor

Summary of Changes

Hello @stefanoamorelli, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a substantial expansion of Google Slides automation capabilities by integrating 15 new write tools. This enhancement allows users to programmatically create, modify, and populate Google Slides presentations, streamlining workflows that involve template manipulation, dynamic content generation, and presentation assembly. The changes include updating necessary API permissions and thoroughly testing the new functionalities to ensure reliability and ease of use.

Highlights

  • Expanded Google Slides Capabilities: Added 15 new write tools for Google Slides, significantly enhancing programmatic control over presentations. These tools enable creating presentations, manipulating slides (add, delete, duplicate, reorder), managing text (replace all, insert, delete, update style), and adding elements (shapes, images, tables).
  • OAuth Scope Upgrade: Upgraded the required Google Slides OAuth scope from presentations.readonly to presentations to support the new write operations, ensuring full read/write access.
  • Robust Error Handling and Testing: Implemented consistent error formatting and added comprehensive test suites for all new Slides tools, increasing the total Slides tests from 12 to 49 and overall tests to 423 across 25 suites.
  • JSON String Input for Complex Properties: Addressed SDK limitations by accepting JSON strings for updateTextStyle and updateShapeProperties parameters, allowing flexible and detailed styling and property updates.

Changelog
  • scripts/setup-gcp.sh
    • Updated the Google Slides API scope from presentations.readonly to presentations to enable write operations.
  • workspace-server/src/tests/services/SlidesService.test.ts
    • Added mocks for create and batchUpdate methods of the Slides API.
    • Introduced extensive test suites for all 15 new Slides write tools, covering various scenarios and error handling.
  • workspace-server/src/index.ts
    • Updated the Google Slides API scope definition from presentations.readonly to presentations.
    • Registered 15 new Google Slides tools, including slides.create, slides.addSlide, slides.deleteSlide, slides.duplicateSlide, slides.reorderSlides, slides.getSpeakerNotes, slides.updateSpeakerNotes, slides.replaceAllText, slides.insertText, slides.deleteText, slides.addShape, slides.addImage, slides.addTable, slides.updateTextStyle, and slides.updateShapeProperties.
  • workspace-server/src/services/SlidesService.ts
    • Implemented formatError and formatResult private helper methods for standardized tool responses.
    • Implemented create method to create new Google Slides presentations.
    • Implemented addSlide method to add new slides with optional layout and position.
    • Implemented deleteSlide method to remove slides by object ID.
    • Implemented duplicateSlide method to create copies of existing slides.
    • Implemented reorderSlides method to change the order of slides within a presentation.
    • Implemented getSpeakerNotes method to retrieve speaker notes for all slides.
    • Implemented updateSpeakerNotes method to modify speaker notes for a specific slide.
    • Implemented replaceAllText method to find and replace text across an entire presentation.
    • Implemented insertText method to insert text into specific shapes or table cells.
    • Implemented deleteText method to remove text from shapes or table cells based on a specified range.
    • Implemented addShape method to add various types of shapes to slides.
    • Implemented addImage method to insert images from URLs into slides.
    • Implemented addTable method to add tables to slides with specified dimensions.
    • Implemented updateTextStyle method to apply text formatting, supporting JSON string input for style definitions.
    • Implemented updateShapeProperties method to modify shape properties, supporting JSON string input for property definitions.
Activity
  • The pull request author, stefanoamorelli, has systematically added 15 new write tools for Google Slides, each in a dedicated commit.
  • The author has addressed the necessary OAuth scope upgrade in both the GCP setup script and the main application configuration.
  • Comprehensive test cases have been added for each new tool, demonstrating functionality and error handling.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces 15 new write tools for Google Slides, significantly expanding the extension's capabilities for presentation management. The implementation correctly leverages the Google Slides API's batchUpdate method and handles OAuth scope upgrades. I've identified a few opportunities to improve type safety in the tool schemas and a functional bug in how speaker notes are cleared before updating.

Comment thread workspace-server/src/services/SlidesService.ts Outdated
Comment thread workspace-server/src/index.ts Outdated
Comment thread workspace-server/src/index.ts Outdated
Comment thread workspace-server/src/index.ts Outdated
@allenhutchison
Contributor

We can't take this PR right now because it requires new scopes in the default project. I'm working on a mechanism to allow for user-configurable scopes so that we can take features that aren't supported by the main GCP project. In those cases, a user who wants these features will have to host their own GCP project to go along with them until I can get the shared GCP project updated with the new scopes.

@stefanoamorelli
Author

Thanks @allenhutchison for having a look, makes sense!

When building and testing this feature, I used the scripts/setup-gcp.sh script to set up my own GCP project.

Should we keep this PR open until we're ready to set up the new scopes on the default GCP project? Happy to help with rebasing if/when the time comes.

@allenhutchison
Contributor

Yes let's keep it open.

See #255 for discussion on this feature.

n0012 added a commit to n0012/workspace that referenced this pull request Apr 26, 2026
Retrieves and writes per-slide speaker notes.

getSpeakerNotes returns an array — one entry per slide — with
slideIndex, slideObjectId, speakerNotesObjectId, and notes text.
updateSpeakerNotes replaces notes on a single slide by objectId.

Approach adapted from gemini-cli-extensions#235
by @stefanoamorelli (MIT). Key difference: we target slides
by objectId rather than index, matching the pattern used by
getMetadata and createFromJson throughout this PR.
@Sum1cares

That's cool

@stefanoamorelli
Author

@allenhutchison now that #255 is closed, should we move forward with this?

(appreciate your feedback, @Sum1cares; we'd need an approval from CODEOWNERS to proceed)

…setup

These two private helpers reduce boilerplate in the upcoming write
methods, each of which needs to format success/error responses the
same way. The existing read-only methods are left untouched to
keep this change minimal and reviewable.

On the test side, `create` and `batchUpdate` mocks are added to
the Slides API mock object so the new method tests can use them.

Wraps `presentations.create` from the Google Slides API to allow
creating a new empty presentation with a given title. Returns the
presentation ID and a direct edit URL.

Wraps the `createSlide` batchUpdate request to add a new slide to
an existing presentation. Supports optional insertion index,
predefined layout (e.g. BLANK, TITLE_AND_BODY), or a specific
layout ID from the presentation's masters.

Wraps the `deleteObject` batchUpdate request to remove a slide
from a presentation by its object ID.

Wraps the `duplicateObject` batchUpdate request to clone a slide.
The duplicate is placed immediately after the original and gets a
new unique object ID returned in the response.

Wraps the `updateSlidesPosition` batchUpdate request to move one
or more slides to a new position within the presentation.

Reads speaker notes for every slide in a presentation by traversing
the `notesPage` structure. Each slide's notes text, object ID, and
speaker notes shape ID are returned so they can be used with
updateSpeakerNotes later.

Replaces the speaker notes for a specific slide. The method first
reads the current notes page structure to find the speaker notes
shape ID, then issues a deleteText + insertText batchUpdate pair.
Passing an empty string clears the notes entirely.

Wraps the `replaceAllText` batchUpdate request to find-and-replace
text across the entire presentation. Supports case-sensitive and
case-insensitive matching. Particularly useful for template
variable substitution (e.g. replacing `{{name}}` placeholders).

Wraps the `insertText` batchUpdate request to insert text into a
shape or table cell at a given index. Defaults to inserting at
position 0 (beginning of the text content).

Wraps the `deleteText` batchUpdate request to remove text from a
shape or table cell. Supports three range modes: FIXED_RANGE
(start + end index), FROM_START_INDEX (from index to end), and
ALL (clear everything).

Wraps the `createShape` batchUpdate request to add shapes like
TEXT_BOX, RECTANGLE, ELLIPSE, etc. to a slide. Position and
dimensions are specified in points (PT) using an affine transform.

Wraps the `createImage` batchUpdate request to insert an image
from a publicly accessible URL onto a slide. Position and
dimensions are specified in points (PT).

Wraps the `createTable` batchUpdate request to add a table with
a given number of rows and columns to a slide. Position and
dimensions are specified in points (PT).

Wraps the `updateTextStyle` batchUpdate request to change text
formatting (bold, italic, font size, color, etc.) on a shape or
table cell. The style parameter accepts a JSON string to work
around the MCP SDK's lack of support for `z.record(z.any())`
schemas, with a clear error message on invalid input.

Wraps the `updateShapeProperties` batchUpdate request to modify
shape properties like background fill, outline, shadow, etc. Like
updateTextStyle, the shapeProperties parameter accepts a JSON
string to avoid the MCP SDK `z.record(z.any())` serialization
issue.

The 14 new slides write tools (create, addSlide, deleteSlide,
duplicateSlide, reorderSlides, updateSpeakerNotes, replaceAllText,
insertText, deleteText, addShape, addImage, addTable, updateTextStyle,
updateShapeProperties) were registered in index.ts but absent from
the FEATURE_GROUPS registry, so the enabledTools gate in index.ts
would have skipped them and the `presentations` write scope would
never be requested.

Listing them under `slides.write` (defaultEnabled: false) wires them
into the feature-flag system introduced in gemini-cli-extensions#323, so users opt into
write capabilities via `WORKSPACE_FEATURE_OVERRIDES=slides.write:on`,
which automatically pulls in the full `presentations` scope.

Also moves the new `slides.getSpeakerNotes` read tool into
`slides.read` so it stays available with the default-on readonly
scope.

This replaces the standalone scope-bump commit that originally sat
at the head of this branch but became obsolete once SCOPES was
refactored out of index.ts into feature-config.ts on main.

Signed-off-by: Stefano Amorelli <stefano@amorelli.tech>
@stefanoamorelli stefanoamorelli force-pushed the feat/slides-write-tools branch from a1e55ba to 4894713 Compare May 25, 2026 08:55
slidesService.getSlideThumbnail,
);

server.registerTool(
Contributor

[blocking] All 15 new tool registrations (this slides.create plus every server.registerTool call through line 876) bypass the local registerTool wrapper defined at line 172. The wrapper is what gates registration on enabledTools — every other tool in this file goes through it.

Worst-of-both-worlds consequence: slides.write is intentionally defaultEnabled: false in feature-config.ts:230, but with this bypass the 14 write tools get registered regardless. Meanwhile SCOPES is computed from resolveFeatures().requiredScopes which respects the gate, so the write scope is not requested — every call to a write tool will then fail with 403 at runtime. Even WORKSPACE_FEATURE_OVERRIDES=slides.write:on won't fix it, because the tools were already registered with the wrong scope set.

slides.getSpeakerNotes at line 600 has the same bug — it lands in the default-on read group so visible behavior is correct, but slides.read:off overrides are silently ignored.

Fix: s/server.registerTool/registerTool/ on all 15 registrations. Might also be worth adding a test that asserts every registered tool name appears in FEATURE_GROUPS so this regression can't recur.
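
A rough sketch of the gating and the suggested regression check, under stated assumptions: `FakeServer`, `resolveEnabledTools`, `makeGatedRegister`, and `findUngroupedTools` are hypothetical stand-ins, and the `FEATURE_GROUPS` shape shown here is assumed rather than copied from feature-config.ts.

```typescript
// Hypothetical stand-ins for the real structures in index.ts and feature-config.ts.
type ToolHandler = () => Promise<unknown>;

interface FeatureGroup {
  defaultEnabled: boolean;
  tools: string[];
}

// Assumed shape; only a few tool names are listed to keep the sketch short.
const FEATURE_GROUPS: Record<string, FeatureGroup> = {
  'slides.read': { defaultEnabled: true, tools: ['slides.getText', 'slides.getSpeakerNotes'] },
  'slides.write': { defaultEnabled: false, tools: ['slides.create', 'slides.addSlide'] },
};

// Collects the tool names of every group that is on, after applying overrides.
function resolveEnabledTools(
  overrides: Partial<Record<string, boolean>> = {},
): Set<string> {
  const enabled = new Set<string>();
  for (const group of Object.keys(FEATURE_GROUPS)) {
    const config = FEATURE_GROUPS[group];
    if (overrides[group] ?? config.defaultEnabled) {
      config.tools.forEach((tool) => enabled.add(tool));
    }
  }
  return enabled;
}

// Minimal fake that only records which tool names reached the server.
class FakeServer {
  readonly registered: string[] = [];
  registerTool(name: string, _handler: ToolHandler): void {
    this.registered.push(name);
  }
}

// The gate: disabled tools never reach server.registerTool.
function makeGatedRegister(server: FakeServer, enabledTools: Set<string>) {
  return (name: string, handler: ToolHandler): void => {
    if (!enabledTools.has(name)) return;
    server.registerTool(name, handler);
  };
}

// Regression check: returns tool names that belong to no feature group.
function findUngroupedTools(toolNames: string[]): string[] {
  const grouped = new Set<string>();
  for (const group of Object.keys(FEATURE_GROUPS)) {
    FEATURE_GROUPS[group].tools.forEach((tool) => grouped.add(tool));
  }
  return toolNames.filter((name) => !grouped.has(name));
}

const handler: ToolHandler = async () => ({ ok: true });
const server = new FakeServer();
const registerTool = makeGatedRegister(server, resolveEnabledTools());
registerTool('slides.getText', handler);
registerTool('slides.create', handler); // skipped: slides.write is off by default
```

A test along these lines could call `findUngroupedTools` with every name passed to the wrapper and assert that the result is empty.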

const errorMessage =
error instanceof Error ? error.message : String(error);
logToFile(`[SlidesService] Error during ${method}: ${errorMessage}`);
return {
Contributor

[blocking] formatError is missing isError: true on the returned object. The project convention (see DriveService.ts:38–56 and DocsService.ts:113–121) is that MCP error responses set isError: true as the discriminator. Without it, every error from every Slides tool — auth failures, 404s, 403s, JSON.parse failures, slide-not-found — gets reported to the model/client as a successful tool call whose content happens to contain an error key. The model will treat it as success and continue confidently. This is the same issue that bites the older inline catches in getText / getMetadata / getImages / getSlideThumbnail, but at least those are pre-existing.

One-line fix here propagates to all 15 new tools:

return {
  isError: true,
  content: [{ type: 'text' as const, text: JSON.stringify({ error: errorMessage }) }],
};

},
});

const slideObjectId =
Contributor

[blocking] response.data.replies?.[0]?.createSlide?.objectId ?? null returns success with slideObjectId: null whenever the API reply array is empty/malformed (partial success, throttle response, schema drift). The caller/model believes the slide was created, then any follow-up tool call using the null ID 400s with a confusing error.

Same pattern in duplicateSlide:436, addShape:835, addImage:904, addTable:976 — would suggest throwing instead:

const slideObjectId = response.data.replies?.[0]?.createSlide?.objectId;
if (!slideObjectId) {
  throw new Error('createSlide returned no objectId; batchUpdate reply was empty or malformed.');
}

},
});

const occurrencesChanged =
Contributor

[blocking] Same ?? 0 pattern as the slideObjectId case above — occurrencesChanged reports 0 both when (a) Google legitimately found zero matches and (b) the reply array is missing entirely (API change, partial failure). Users will spend hours debugging template variables that "weren't found" when the real problem was an API/permissions failure. Should distinguish missing reply (throw) from occurrencesChanged === 0 (legitimate).
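
One possible shape for that distinction, as a sketch only. The interfaces below mirror just the response fields used here, `readOccurrencesChanged` is a hypothetical name, and the code assumes that a reply which is present but omits the count means zero matches:

```typescript
// Narrow structural types covering only the fields this sketch reads.
interface ReplaceAllTextReply {
  occurrencesChanged?: number | null;
}
interface BatchUpdateReply {
  replaceAllText?: ReplaceAllTextReply;
}
interface BatchUpdateData {
  replies?: BatchUpdateReply[];
}

function readOccurrencesChanged(data: BatchUpdateData): number {
  const reply = data.replies?.[0]?.replaceAllText;
  if (!reply) {
    // No reply at all: treat as a failure rather than "zero matches".
    throw new Error(
      'replaceAllText returned no reply; batchUpdate response was empty or malformed.',
    );
  }
  // Assumption: a present reply without a count is a legitimate zero.
  return reply.occurrencesChanged ?? 0;
}

const changed = readOccurrencesChanged({
  replies: [{ replaceAllText: { occurrencesChanged: 3 } }],
});
```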

}

return {
slideIndex: index + 1,
Contributor

[blocking] Returns slideIndex: index + 1 (1-based), but every other index on the API surface — addSlide.insertionIndex, reorderSlides.insertionIndex, etc. — is 0-based. The tool description at index.ts:603 doesn't document the difference, so an LLM caller will reasonably assume 0-based and feed this back into other tools off-by-one. Either drop the + 1 or call out the 1-based indexing in the description (with a strong nudge to the former for interop).
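
To illustrate the interop argument, here is a small sketch in which a 0-based `slideIndex` can be passed straight to an `updateSlidesPosition` request. `listSpeakerNotes` and `buildMoveRequest` are hypothetical helpers, not the PR's code:

```typescript
interface SlideNotes {
  slideObjectId: string;
  notes: string;
}

// Returns one entry per slide with a 0-based slideIndex (no "+ 1").
function listSpeakerNotes(slides: SlideNotes[]) {
  return slides.map((slide, index) => ({
    slideIndex: index,
    slideObjectId: slide.slideObjectId,
    notes: slide.notes,
  }));
}

// Builds the batchUpdate request that moves slides to a 0-based position.
function buildMoveRequest(slideObjectIds: string[], insertionIndex: number) {
  return { updateSlidesPosition: { slideObjectIds, insertionIndex } };
}

const entries = listSpeakerNotes([
  { slideObjectId: 'slide_a', notes: 'Intro' },
  { slideObjectId: 'slide_b', notes: '' },
  { slideObjectId: 'slide_c', notes: 'Wrap-up' },
]);

// Move the last slide to where the first one currently sits, with no index
// arithmetic required by the caller.
const moveRequest = buildMoveRequest(
  [entries[2].slideObjectId],
  entries[0].slideIndex,
);
```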

}
};

public addSlide = async ({
Contributor

[important] Zod and the service-layer types are diverging:

  • addSlide Zod uses predefinedLayout: z.enum([...10 literals]), but the service signature here widens to predefinedLayout?: string.
  • deleteText/updateTextStyle Zod constrain type to the 3-literal enum but the service uses string.
  • updateTextStyle.style and updateShapeProperties.shapeProperties are typed string | slides_v1.Schema$TextStyle in the service, but the Zod schema only ever accepts string — the object branch is dead code.

Would z.infer<typeof schema> work as the single source of truth? Compile-time guarantees from the Zod boundary are getting lost the moment values cross into the service.
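
The single-source-of-truth idea can be sketched without a Zod dependency by deriving the type from a const tuple. This is a swapped-in illustration, not the PR's code: with Zod, `z.infer<typeof schema>` would play the role of `TextRangeType` below. All names here are hypothetical.

```typescript
// One list of literals drives both the runtime check and the static type.
const TEXT_RANGE_TYPES = ['FIXED_RANGE', 'FROM_START_INDEX', 'ALL'] as const;
type TextRangeType = (typeof TEXT_RANGE_TYPES)[number];

interface DeleteTextArgs {
  objectId: string;
  type: TextRangeType; // no widening to string at the service boundary
  startIndex?: number;
  endIndex?: number;
}

// Runtime validation at the boundary, returning the narrowed type.
function parseTextRangeType(value: string): TextRangeType {
  const match = TEXT_RANGE_TYPES.find((candidate) => candidate === value);
  if (!match) {
    throw new Error(
      `Unsupported range type "${value}"; expected one of ${TEXT_RANGE_TYPES.join(', ')}`,
    );
  }
  return match;
}

// Builds the deleteText request; the compiler rejects any other literal.
function buildDeleteTextRequest(args: DeleteTextArgs) {
  const { objectId, type, startIndex, endIndex } = args;
  if (type !== 'ALL' && startIndex === undefined) {
    throw new Error(`${type} requires startIndex`);
  }
  if (type === 'FIXED_RANGE' && endIndex === undefined) {
    throw new Error('FIXED_RANGE requires endIndex');
  }
  const textRange =
    type === 'ALL'
      ? { type }
      : type === 'FROM_START_INDEX'
        ? { type, startIndex }
        : { type, startIndex, endIndex };
  return { deleteText: { objectId, textRange } };
}

const clearAll = buildDeleteTextRequest({
  objectId: 'shape1',
  type: parseTextRangeType('ALL'),
});
```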

server.registerTool(
'slides.addSlide',
{
description:
Contributor

[important] Description says "Adds a new slide ... Optionally specify position and layout." but doesn't mention that the tool returns the new slideObjectId. Callers that need to chain (e.g. addShape on the new slide) won't know they can read it from the response. Compare to slides.create, which does document its return.

server.registerTool(
'slides.replaceAllText',
{
description:
Contributor

[important] Wrapper defaults matchCase: true, which is opposite to Google's API default (false). The current wording (case-sensitive (default: true) lower in the schema) is correct in isolation but the description doesn't call out that omitting matchCase gives case-sensitive matching — a behavior surprise for anyone familiar with the underlying API.
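
For reference, a minimal sketch of how the wrapper's default ends up in the request. The option names (`findText`, `replaceText`) and the function name are assumptions; the point is that `matchCase` is always sent explicitly, so the API's own default never applies:

```typescript
interface ReplaceAllTextOptions {
  findText: string;
  replaceText: string;
  matchCase?: boolean;
}

// Builds a replaceAllText batchUpdate request. Omitting matchCase yields
// case-sensitive matching, per the wrapper default described above.
function buildReplaceAllTextRequest({
  findText,
  replaceText,
  matchCase = true,
}: ReplaceAllTextOptions) {
  return {
    replaceAllText: {
      containsText: { text: findText, matchCase },
      replaceText,
    },
  };
}

// Template substitution: with the default, "{{NAME}}" would not be replaced.
const caseSensitive = buildReplaceAllTextRequest({
  findText: '{{name}}',
  replaceText: 'Ada',
});

// Callers who expect the underlying API's behavior must opt out explicitly.
const caseInsensitive = buildReplaceAllTextRequest({
  findText: '{{name}}',
  replaceText: 'Ada',
  matchCase: false,
});
```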

});
}

if (requests.length > 0) {
Contributor

[nit] When both requests end up empty (no existing notes to delete and notes === ''), this skips the batchUpdate and still returns the success payload at 619. Defensible as a true no-op, but there's no signal to the caller that nothing happened. Worth a noOp: true field on the result, or at least a debug log.
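
A sketch of what the `noOp` signal could look like, assuming the delete is skipped when there are no existing notes and the insert is skipped when the new notes are empty. `buildSpeakerNotesUpdate` and the result shape are hypothetical:

```typescript
type NotesRequest =
  | { deleteText: { objectId: string; textRange: { type: 'ALL' } } }
  | { insertText: { objectId: string; insertionIndex: number; text: string } };

interface SpeakerNotesUpdate {
  requests: NotesRequest[];
  noOp: boolean;
}

function buildSpeakerNotesUpdate(
  speakerNotesObjectId: string,
  existingNotes: string,
  newNotes: string,
): SpeakerNotesUpdate {
  const requests: NotesRequest[] = [];
  // Only clear text when there is something to clear.
  if (existingNotes.length > 0) {
    requests.push({
      deleteText: { objectId: speakerNotesObjectId, textRange: { type: 'ALL' } },
    });
  }
  // Only insert when the caller supplied non-empty notes.
  if (newNotes.length > 0) {
    requests.push({
      insertText: { objectId: speakerNotesObjectId, insertionIndex: 0, text: newNotes },
    });
  }
  // noOp tells the caller that no batchUpdate needs to be sent.
  return { requests, noOp: requests.length === 0 };
}

const nothingToDo = buildSpeakerNotesUpdate('notes_shape_1', '', '');
const replaceNotes = buildSpeakerNotesUpdate('notes_shape_1', 'Old notes', 'New notes');
```

The caller would skip the API call when `noOp` is true and include the flag in the tool result.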

.describe('The object ID of the slide to add the image to.'),
imageUrl: z
.string()
.describe('The URL of the image to insert. Must be publicly accessible.'),
Contributor

[nit] "Must be publicly accessible" is the most common gotcha but Google also requires HTTPS, ≤50MB, ≤25MP, and PNG/JPEG/GIF only. Worth a one-line addition since these are common failure modes that come back as opaque API errors.
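
If a client-side pre-check were wanted in addition to the description change, it might look like the sketch below. It is purely illustrative: `checkImageUrl` is a hypothetical helper, the constraints are the ones listed in this comment, the extension test is only a heuristic, and size, pixel count, and public reachability cannot be verified from the URL alone.

```typescript
const ALLOWED_IMAGE_EXTENSIONS = ['png', 'jpg', 'jpeg', 'gif'];

// Returns a list of human-readable problems; an empty list means the URL
// passed these cheap checks (not that the API will accept the image).
function checkImageUrl(imageUrl: string): string[] {
  const problems: string[] = [];

  if (!/^https:\/\//i.test(imageUrl)) {
    problems.push('Image URL must use HTTPS.');
  }

  // Strip scheme and host, then query and fragment, to isolate the path.
  const path = imageUrl
    .replace(/^[a-z][a-z0-9+.-]*:\/\/[^/?#]*/i, '')
    .split(/[?#]/)[0];
  const lastSegment = path.substring(path.lastIndexOf('/') + 1);
  const extension = /\.([a-z0-9]+)$/i.exec(lastSegment);

  // Heuristic only: URLs without an extension are left for the API to judge.
  if (
    extension &&
    ALLOWED_IMAGE_EXTENSIONS.indexOf(extension[1].toLowerCase()) === -1
  ) {
    problems.push(`Image format ".${extension[1]}" is not PNG, JPEG, or GIF.`);
  }

  return problems;
}

const okUrl = checkImageUrl('https://example.com/assets/logo.png?v=2');
const badUrl = checkImageUrl('http://example.com/diagram.svg');
```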

Development

Successfully merging this pull request may close these issues.

Feature: Add write tools for Google Slides

3 participants