fix(tool): allow custom default vision model in OpenAIMultiModalTool#1701

Merged
chickenlj merged 2 commits into agentscope-ai:main from BukJiang:fix/openai-multimodal-default-model
Jun 11, 2026

@BukJiang

Contributor

AgentScope-Java Version

2.0.0-RC2

Description

When openaiImageToText is called without a model parameter, the fallback
was hardcoded to "gpt-4o". Non-OpenAI backends (e.g. MiniMax proxies) don't
recognize that model name and return HTTP 503, which looks like a transient
outage and makes the real cause hard to diagnose.

The @ToolParam description also mentioned gpt-4o and gpt-4-vision-preview
explicitly, steering the LLM toward OpenAI-specific names even when a different
backend was configured.

Fixes #1694

Changes:

  • Add defaultModelName field and a new 3-argument constructor
    OpenAIMultiModalTool(String apiKey, String baseUrl, String defaultModelName)
  • Existing 1-arg and 2-arg constructors delegate to the new one with "gpt-4o"
    as the default (fully backward compatible)
  • Replace the hardcoded "gpt-4o" fallback in openaiImageToText with
    this.defaultModelName
  • Remove OpenAI-specific model name hints from the @ToolParam description
    and Javadoc
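
The constructor delegation and fallback described in the list above can be sketched roughly as follows. This is a minimal illustration, not the merged code: DefaultModelSketch is a hypothetical stand-in for OpenAIMultiModalTool, resolveModel and LEGACY_DEFAULT_MODEL are names invented here, and the exception type and the null baseUrl in the 1-arg path are assumptions.

```java
/** Hypothetical stand-in for OpenAIMultiModalTool; only the default-model logic is sketched. */
class DefaultModelSketch {

    // Assumed constant name; the PR only states that "gpt-4o" remains the default.
    static final String LEGACY_DEFAULT_MODEL = "gpt-4o";

    private final String apiKey;
    private final String baseUrl;
    private final String defaultModelName;

    DefaultModelSketch(String apiKey) {
        // Assumption: a null baseUrl means "use the client's standard endpoint".
        this(apiKey, null);
    }

    DefaultModelSketch(String apiKey, String baseUrl) {
        // Existing constructors delegate with the old default, so callers see no change.
        this(apiKey, baseUrl, LEGACY_DEFAULT_MODEL);
    }

    DefaultModelSketch(String apiKey, String baseUrl, String defaultModelName) {
        // Assumption: null or blank names are rejected with IllegalArgumentException.
        if (defaultModelName == null || defaultModelName.isBlank()) {
            throw new IllegalArgumentException("defaultModelName must not be null or blank");
        }
        this.apiKey = apiKey;
        this.baseUrl = baseUrl;
        this.defaultModelName = defaultModelName;
    }

    /** Returns the model to send: the caller's choice, else the configured default. */
    String resolveModel(String requestedModel) {
        return (requestedModel == null || requestedModel.isBlank())
                ? defaultModelName
                : requestedModel;
    }

    public static void main(String[] args) {
        // "my-proxy-vision-model" and the URL are placeholders, not real identifiers.
        DefaultModelSketch proxy =
                new DefaultModelSketch("key", "https://proxy.example/v1", "my-proxy-vision-model");
        System.out.println(proxy.resolveModel(null)); // prints "my-proxy-vision-model"
    }
}
```

With this shape, a caller targeting a non-OpenAI backend sets the model once at construction time, and a tool call that omits the model parameter no longer sends "gpt-4o".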

Checklist

  • Code has been formatted with mvn spotless:apply
  • All tests are passing (mvn test)
  • Javadoc comments are complete and follow project conventions
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

@BukJiang BukJiang requested a review from a team June 10, 2026 12:55
@CLAassistant

CLAassistant commented Jun 10, 2026

CLA assistant check
All committers have signed the CLA.

@BukJiang BukJiang force-pushed the fix/openai-multimodal-default-model branch from 7cf1af9 to 9a89bc4 Compare June 10, 2026 13:00
@codecov

codecov Bot commented Jun 10, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 8 lines in your changes missing coverage. Please review.

Files with missing lines                              | Patch % | Lines
...ope/core/tool/multimodal/OpenAIMultiModalTool.java | 50.00%  | 6 Missing and 2 partials ⚠️

@BukJiang BukJiang force-pushed the fix/openai-multimodal-default-model branch from d843e45 to ac88325 Compare June 10, 2026 13:38
@AgentScopeJavaBot AgentScopeJavaBot added the bug (Something isn't working) and area/core/tool (Tool, skill, RAG abstractions) labels Jun 11, 2026

@AgentScopeJavaBot AgentScopeJavaBot left a comment

Collaborator

🤖 AI Review

This PR cleanly addresses a real usability issue (#1694) where the hardcoded "gpt-4o" fallback in openaiImageToText causes opaque 503 errors on non-OpenAI-compatible backends. The fix introduces a defaultModelName field with a new 3-arg constructor while keeping existing 1-arg and 2-arg constructors fully backward compatible. Validation is consistent across both public and protected constructors, and the @ToolParam description is appropriately de-coupled from OpenAI-specific model names. Tests cover the custom default model path and constructor validation for null/blank inputs. The change is minimal, well-scoped, and correctly solves the reported issue.

Note: The same hardcoded-default pattern exists in sibling methods (openaiTextToImage → "dall-e-3", openaiTextToAudio → "tts-1", openaiAudioToText → "whisper-1"). These are out of scope for this PR but would benefit from a similar refactoring in a follow-up.
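
One possible shape for such a follow-up, sketched under assumptions: ModelDefaultsSketch, its component names, and the withImage and resolve helpers are hypothetical and do not exist in AgentScope-Java. Only the four model names come from this PR and the note above.

```java
/** Hypothetical holder for per-capability fallback models; not part of the merged change. */
record ModelDefaultsSketch(String vision, String image, String speech, String transcription) {

    // Compact constructor: reject null or blank names up front (assumed validation rule).
    ModelDefaultsSketch {
        requireText(vision, "vision");
        requireText(image, "image");
        requireText(speech, "speech");
        requireText(transcription, "transcription");
    }

    /** The fallbacks currently hardcoded in the four tool methods. */
    static ModelDefaultsSketch openAi() {
        return new ModelDefaultsSketch("gpt-4o", "dall-e-3", "tts-1", "whisper-1");
    }

    /** Returns a copy with a different text-to-image fallback. */
    ModelDefaultsSketch withImage(String model) {
        return new ModelDefaultsSketch(vision, model, speech, transcription);
    }

    /** Shared fallback rule: use the requested model unless it is null or blank. */
    static String resolve(String requested, String fallback) {
        return (requested == null || requested.isBlank()) ? fallback : requested;
    }

    private static void requireText(String value, String name) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(name + " default model must not be null or blank");
        }
    }
}
```

Grouping the defaults in one immutable value would keep the constructor list from growing by one overload per capability, though whether the project prefers this over separate fields is a design choice for the maintainers.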

@chickenlj chickenlj left a comment

Collaborator

LGTM.

@chickenlj chickenlj merged commit 32c716d into agentscope-ai:main Jun 11, 2026
5 of 6 checks passed
@BukJiang BukJiang deleted the fix/openai-multimodal-default-model branch June 11, 2026 03:16

Development

Successfully merging this pull request may close these issues.

[Bug]: OpenAIMultiModalTool throws HTTP 503 on non-OpenAI proxies due to hardcoded gpt-4o fallback model

4 participants