Skip to content

fix(mcp): coerce Role ORM objects to role names in UserInfo schema#40746

Open
goingforstudying-ctrl wants to merge 10 commits into
apache:masterfrom
goingforstudying-ctrl:fix-mcp-userinfo-roles
Open

fix(mcp): coerce Role ORM objects to role names in UserInfo schema#40746
goingforstudying-ctrl wants to merge 10 commits into
apache:masterfrom
goingforstudying-ctrl:fix-mcp-userinfo-roles

Conversation

@goingforstudying-ctrl

Copy link
Copy Markdown

Fixes #40733

Problem

Multiple MCP tools (get_dataset_info, get_chart_info, generate_chart) fail with a Pydantic validation error when the calling user has Superset roles assigned:

2 validation errors for UserInfo
roles.0
  Input should be a valid string [type=string_type, input_value=Admin, input_type=Role]

The UserInfo model uses from_attributes=True, so Pydantic maps user.roles directly. With the SQLAlchemy ORM, roles returns Role objects—not strings—causing serialization to fail.

Fix

Add a field_validator("roles", mode=before) to UserInfo that coerces each item:

  • If already a str, keep it
  • If it has a .name attribute (Role ORM object), extract str(item.name)

This is idempotent and backward-compatible.

Impact

  • Fixes false-negative responses from MCP tools
  • Prevents retry storms from LLM clients that interpret the error as a failure
  • generate_chart with save_chart=true no longer silently saves while reporting failure

@bito-code-review

bito-code-review Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #819c93

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: b7aa1f3..b7aa1f3
    • superset/mcp_service/user/schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses MCP tool response serialization failures by ensuring UserInfo.roles can be validated when SQLAlchemy provides Role ORM objects (instead of strings), preventing Pydantic validation errors in MCP tool outputs.

Changes:

  • Adds a field_validator("roles", mode="before") on UserInfo to coerce role-like objects to role-name strings.

Comment thread superset/mcp_service/user/schemas.py Outdated
Comment on lines +108 to +120
@field_validator("roles", mode="before")
@classmethod
def _extract_role_names(cls, v: Any) -> list[str] | None:
"""Coerce Role ORM objects to their .name strings."""
if v is None:
return None
result: list[str] = []
for item in v:
if isinstance(item, str):
result.append(item)
elif hasattr(item, "name"):
result.append(str(item.name))
return result if result else None

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. The validator already rejects bare strings (line 114-116) and handles DetachedInstanceError (line 124-125). The test file also covers the from_attributes round-trip path. Let me know if there is a specific scenario you think is still missing.

Comment on lines +108 to +110
@field_validator("roles", mode="before")
@classmethod
def _extract_role_names(cls, v: Any) -> list[str] | None:

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. The validator already rejects bare strings (line 114-116) and handles DetachedInstanceError (line 124-125). The test file also covers the from_attributes round-trip path. Let me know if there is a specific scenario you think is still missing.

Comment thread superset/mcp_service/user/schemas.py Outdated
result.append(item)
elif hasattr(item, "name"):
result.append(str(item.name))
return result if result else None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The validator collapses an explicitly empty role list to None via return result if result else None. This breaks the existing contract in serialize_user_object, which intentionally distinguishes "roles included but empty" ([]) from "roles omitted/redacted" (None). Preserve an empty list when the input is an empty iterable so callers can still tell those states apart. [logic error]

Severity Level: Major ⚠️
- ⚠️ get_user_info conflates empty roles with redacted roles.
- ⚠️ Clients cannot distinguish unprivileged users from redacted metadata.
Steps of Reproduction ✅
1. In `superset/mcp_service/user/schemas.py:58-67`, note that `serialize_user_object(user,
include_sensitive: bool, include_roles: bool)` is the centralized serializer used by both
`list_users` and `get_user_info` tools (see imports in
`superset/mcp_service/user/tool/list_users.py:29-39` and
`superset/mcp_service/user/tool/get_user_info.py:28-33`).

2. Consider the metadata-allowed path in `get_user_info` where
`user_can_view_data_model_metadata()` returns `True` (default in tests via fixture
`allow_data_model_metadata` at
`tests/unit_tests/mcp_service/user/tool/test_user_tools.py:84-99`); `get_user_info` then
calls `serialize_user_object(obj, include_sensitive=can_view_sensitive)` with
`include_sensitive=True` (`get_user_info.py:68-77`).

3. Construct or obtain a FAB user object whose `roles` relationship is an empty iterable
(e.g., via the same pattern as `create_mock_user` in
`tests/unit_tests/mcp_service/user/tool/test_user_tools.py:38-65`, but with `roles=[]` so
`user.roles` is `[]`). When passed to `serialize_user_object` with
`include_sensitive=True` and default `include_roles=True`, the block at
`user/schemas.py:71-77` sets local `roles` to an empty list `[]` (comprehension over an
empty iterable) instead of `None`, preserving the "roles included but empty" meaning at
this stage.

4. The serializer then instantiates `UserInfo` with `roles=[sanitize_for_llm_context(r,
field_path=("roles",)) for r in roles] if roles is not None else None`
(`user/schemas.py:80-95`). Because `roles` is `[]`, the argument passed into the
`UserInfo.roles` field is an empty list `[]`. The `@field_validator("roles",
mode="before")` `_extract_role_names` (`user/schemas.py:108-120`) receives `v=[]`, skips
the `v is None` branch (line 112), leaves `result` empty, and finally executes `return
result if result else None` (line 120), returning `None`. As a result, the constructed
`UserInfo` now has `roles=None`—indistinguishable from the "roles redacted / not included"
paths where `include_sensitive=False` (email and roles intentionally omitted), breaking
the previous semantic distinction between "no roles" (`[]`) and "redacted/omitted roles"
(`None`).

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/user/schemas.py
**Line:** 120:120
**Comment:**
	*Logic Error: The validator collapses an explicitly empty role list to `None` via `return result if result else None`. This breaks the existing contract in `serialize_user_object`, which intentionally distinguishes "roles included but empty" (`[]`) from "roles omitted/redacted" (`None`). Preserve an empty list when the input is an empty iterable so callers can still tell those states apart.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@codecov

codecov Bot commented Jun 5, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 25.00000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.58%. Comparing base (0a1e51f) to head (cbe01b7).

Files with missing lines Patch % Lines
superset/mcp_service/user/schemas.py 25.00% 15 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #40746      +/-   ##
==========================================
- Coverage   64.17%   63.58%   -0.59%     
==========================================
  Files        2654     2654              
  Lines      143763   143783      +20     
  Branches    33161    33166       +5     
==========================================
- Hits        92260    91427     -833     
- Misses      49887    50738     +851     
- Partials     1616     1618       +2     
Flag Coverage Δ
hive 39.47% <25.00%> (-0.01%) ⬇️
mysql 58.20% <25.00%> (-0.01%) ⬇️
postgres 58.27% <25.00%> (-0.02%) ⬇️
presto 41.06% <25.00%> (-0.01%) ⬇️
python 58.50% <25.00%> (-1.27%) ⬇️
sqlite 57.89% <25.00%> (-0.01%) ⬇️
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@pull-request-size pull-request-size Bot added size/L and removed size/S labels Jun 5, 2026
@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Pushed an update to the _extract_role_names validator:

  • Bare strings are now rejected instead of being split into characters.
  • Empty role lists are preserved as [] rather than collapsed to None.
  • DetachedInstanceError on Role.name is caught per-item and skipped.
  • Added unit tests covering string rejection, empty-list preservation, role-object coercion, detached-instance handling, and the serialize_user_object -> UserInfo round trip.

Let me know if anything else looks off.

@netlify

netlify Bot commented Jun 5, 2026

Copy link
Copy Markdown

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 3e5ad43
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/6a2b31d863dd3a0008006cec
😎 Deploy Preview https://deploy-preview-40746--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

@bito-code-review bito-code-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #22db82

Actionable Suggestions - 1
  • superset/mcp_service/user/schemas.py - 1
Review Details
  • Files reviewed - 2 · Commit Range: b7aa1f3..c397f7c
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread superset/mcp_service/user/schemas.py Outdated
Comment on lines +118 to +125
for item in v:
if isinstance(item, str):
result.append(item)
elif hasattr(item, "name"):
try:
result.append(str(item.name))
except DetachedInstanceError:
continue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate role extraction logic

The _extract_role_names validator duplicates the role-extraction logic already implemented in serialize_user_object (lines 298-303). This creates maintenance risk: if the extraction logic changes (e.g., adding error handling for edge cases), both locations must be updated in sync. Consider having the validator only handle type coercion (None/string validation) and let serialize_user_object remain the single source of truth for ORM-to-string conversion.

Code Review Run #22db82


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Good catch on the spacing. Added the missing blank line between the validator and the next field.

@bito-code-review

bito-code-review Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #718800

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: c397f7c..4aa73c4
    • superset/mcp_service/user/schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread superset/mcp_service/user/schemas.py Outdated
Comment on lines +121 to +125
elif hasattr(item, "name"):
try:
result.append(str(item.name))
except DetachedInstanceError:
continue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasattr is outside the try block. Python 3 hasattr only suppresses AttributeError — not DetachedInstanceError. If item is a real detached SQLAlchemy Role with name expired (e.g., after session.commit() with the default expire_on_commit=True), hasattr(item, "name") raises before the try block ever runs.

The existing serialize_user_object wraps both the hasattr call and the attribute access in try/except (AttributeError, DetachedInstanceError). The validator should follow the same pattern:

Suggested change
elif hasattr(item, "name"):
try:
result.append(str(item.name))
except DetachedInstanceError:
continue
else:
try:
if hasattr(item, "name"):
result.append(str(item.name))
except DetachedInstanceError:
continue

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks for the review.

role_good = MagicMock()
role_good.name = "Admin"
role_detached = MagicMock()
role_detached.name = MagicMock(side_effect=DetachedInstanceError())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mock fires the error at str(item.name) — MagicMock's __str__ goes through _mock_call which checks side_effect, so that branch is exercised. For a real detached SQLAlchemy instance, the error fires at hasattr(item, "name") before the try block (Python 3 hasattr only catches AttributeError). A property-based stub or __getattribute__ override raising DetachedInstanceError would cover the actual production failure point.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasattr is now inside the try block, so both paths are covered.

Comment thread superset/mcp_service/user/schemas.py Outdated
elif hasattr(item, "name"):
try:
result.append(str(item.name))
except DetachedInstanceError:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MEDIUM — Silent suppression with no diagnostic output. When a session closes before roles are loaded, the role is silently dropped. roles == [] from a user with three roles and an expired session is indistinguishable from roles == [] from a user with no roles. A logger.debug(...) before continue would make session-management issues diagnosable. (Prefer debug over warning here since the validator runs per-request.)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added logger.debug() call before continue.

Comment thread superset/mcp_service/user/schemas.py Outdated
for item in v:
if isinstance(item, str):
result.append(item)
elif hasattr(item, "name"):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT — hasattr(item, "name") matches any object with a .name attribute (Python module, enum member, namedtuple). Low risk in practice since roles is only populated from the ORM relationship, but an isinstance(item.name, str) check inside the restructured try block (see adjacent comment) would prevent silently coercing unexpected types.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added isinstance(item.name, str) check.

@bito-code-review

bito-code-review Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #ca073b

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 4aa73c4..ec7ac64
    • superset/mcp_service/user/schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@bito-code-review

bito-code-review Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #8f094e

Actionable Suggestions - 0
Additional Suggestions - 1
  • tests/unit_tests/mcp_service/user/test_schemas.py - 1
    • Incomplete test coverage for serialize_user_object · Line 65-101
      The `serialize_user_object` tests only cover the `include_sensitive=True, include_roles=True` path. The function has three other meaningful paths based on these two parameters (lines 305-328 of schemas.py): include_roles=False skips role traversal (important for N+1 prevention), include_sensitive=False excludes email, and user=None/0/"" returns None.
Review Details
  • Files reviewed - 2 · Commit Range: 10dd0ba..b784651
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread superset/mcp_service/user/schemas.py Outdated
from datetime import datetime, timezone
from typing import Annotated, Any, List, Literal

logger = logging.getLogger(__name__)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HIGH — logger = logging.getLogger(__name__) sits between the stdlib imports and the third-party imports. Ruff E402 fires on the from pydantic import ( block at line 28. Pre-commit CI is failing on this branch because of this ordering.

Move the logger assignment to after all imports (after the last from superset.mcp_service.utils.schema_utils import ... line).

Suggested change
logger = logging.getLogger(__name__)

Delete this line here and add logger = logging.getLogger(__name__) after line 47 (after the last import).

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved the logger after all imports.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — moved logger = logging.getLogger(name) after all imports.

Comment thread superset/mcp_service/user/schemas.py Outdated
logger.debug(
"Skipping role with detached instance in UserInfo.roles coercion"
)
continue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT — continue is the last statement in the loop body here; there's nothing after the try/except block within the for item in v loop, so this is a no-op. Remove it.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the no-op continue.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — removed the no-op continue.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed — good catch, the continue was indeed a no-op at the end of the loop body.

@bito-code-review

bito-code-review Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #ecceb6

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/mcp_service/user/schemas.py - 1
Review Details
  • Files reviewed - 1 · Commit Range: b784651..11c49f1
    • superset/mcp_service/user/schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@bito-code-review

bito-code-review Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #e07374

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/mcp_service/user/schemas.py - 1
Review Details
  • Files reviewed - 2 · Commit Range: ea0baeb..f2d9a88
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@bito-code-review

bito-code-review Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #715273

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 646ee7c..5327c63
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@bito-code-review bito-code-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #64cd4a

Actionable Suggestions - 2
  • tests/unit_tests/mcp_service/user/test_schemas.py - 2
Review Details
  • Files reviewed - 2 · Commit Range: f4d11ac..899150b
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment on lines +65 to +80
def test_serialize_user_object_round_trip_with_empty_roles() -> None:
"""serialize_user_object must produce UserInfo.roles == [] for empty roles."""
user = MagicMock()
user.id = 1
user.username = "admin"
user.first_name = "Admin"
user.last_name = "User"
user.active = True
user.email = "admin@example.com"
user.changed_on = None
user.roles = []

info = serialize_user_object(user, include_sensitive=True, include_roles=True)

assert info is not None
assert info.roles == []

@bito-code-review bito-code-review Bot Jun 8, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete test assertions

Per rule [6262], this test claims to verify 'serialize_user_object round trip' but only asserts info.roles == []. The function transforms username, first_name, last_name, email, and changed_on via escape_llm_context_delimiters and sanitize_for_llm_context (schemas.py lines 314-328), which can alter string values. Missing assertions allow bugs in those transformations to go undetected.

Code suggestion
Check the AI-generated fix before applying
 --- a/tests/unit_tests/mcp_service/user/test_schemas.py
 +++ b/tests/unit_tests/mcp_service/user/test_schemas.py
 @@ -74,9 +74,18 @@ def test_serialize_user_object_round_trip_with_empty_roles() -> None:
      user.changed_on = None
      user.roles = []
 
      info = serialize_user_object(user, include_sensitive=True, include_roles=True)
 
      assert info is not None
      assert info.roles == []
 +    assert info.username == "admin"
 +    assert info.first_name == "Admin"
 +    assert info.last_name == "User"
 +    assert info.active is True
 +    assert info.email == "admin@example.com"
 +    assert info.changed_on is None
 
 
  def test_serialize_user_object_round_trip_with_role_objects() -> None:
      """Full from_attributes path through serialize_user_object -> UserInfo."""
 @@ -96,6 +105,11 @@ def test_serialize_user_object_round_trip_with_role_objects() -> None:
      info = serialize_user_object(user, include_sensitive=True, include_roles=True)
 
      assert info is not None
      assert info.roles == ["Admin"]
 +    assert info.username == "admin"
 +    assert info.first_name == "Admin"
 +    assert info.last_name == "User"
 +    assert info.active is True
 +    assert info.email == "admin@example.com"
 +    assert info.changed_on is None

Code Review Run #64cd4a

Semantic duplication in test setup

The two serialize_user_object round-trip tests share ~15 lines of identical MagicMock setup for user attributes (id, username, first_name, last_name, active, email, changed_on). Extracting this to a helper or fixture would reduce maintenance burden and prevent divergence if attributes need to change.

Code Review Run #aa437d


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added assertions for all output fields in the round-trip tests — username, first_name, last_name, active, email, and changed_on.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added assertions in the round-trip tests are correct and address the reviewer's concern regarding incomplete test coverage. By explicitly asserting the values for username, first_name, last_name, active, email, and changed_on, the tests now properly verify that the serialization and transformation logic (including escape_llm_context_delimiters and sanitize_for_llm_context) is functioning as expected without introducing regressions.

tests/unit_tests/mcp_service/user/test_schemas.py

assert info.username == "admin"
    assert info.first_name == "Admin"
    assert info.last_name == "User"
    assert info.active is True
    assert info.email == "admin@example.com"
    assert info.changed_on is None

Comment on lines +83 to +101
def test_serialize_user_object_round_trip_with_role_objects() -> None:
"""Full from_attributes path through serialize_user_object -> UserInfo."""
role_admin = MagicMock()
role_admin.name = "Admin"

user = MagicMock()
user.id = 1
user.username = "admin"
user.first_name = "Admin"
user.last_name = "User"
user.active = True
user.email = "admin@example.com"
user.changed_on = None
user.roles = [role_admin]

info = serialize_user_object(user, include_sensitive=True, include_roles=True)

assert info is not None
assert info.roles == ["Admin"]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete test assertions

Per rule [6262], the test name 'round_trip_with_role_objects' promises full round-trip verification, but only roles is asserted. This mirrors the gap in test 5 and should be fixed simultaneously for consistency.

Code suggestion
Check the AI-generated fix before applying
 --- a/tests/unit_tests/mcp_service/user/test_schemas.py
 +++ b/tests/unit_tests/mcp_service/user/test_schemas.py
 @@ -96,6 +96,11 @@ def test_serialize_user_object_round_trip_with_role_objects() -> None:
      info = serialize_user_object(user, include_sensitive=True, include_roles=True)
 
      assert info is not None
      assert info.roles == ["Admin"]
 +    assert info.username == "admin"
 +    assert info.first_name == "Admin"
 +    assert info.last_name == "User"
 +    assert info.active is True
 +    assert info.email == "admin@example.com"
 +    assert info.changed_on is None

Code Review Run #64cd4a


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added assertions for all output fields in the round-trip tests — username, first_name, last_name, active, email, and changed_on.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion to add assertions for all output fields in the round-trip test is appropriate. It ensures that the test fully verifies the serialized user object, consistent with the expectations set by the test name.

tests/unit_tests/mcp_service/user/test_schemas.py

assert info.roles == ["Admin"]
 +    assert info.username == "admin"
 +    assert info.first_name == "Admin"
 +    assert info.last_name == "User"
 +    assert info.active is True
 +    assert info.email == "admin@example.com"
 +    assert info.changed_on is None

@bito-code-review

bito-code-review Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #f7560e

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • tests/unit_tests/mcp_service/user/test_schemas.py - 1
    • Test assertions use substring matching · Line 80-84
Review Details
  • Files reviewed - 1 · Commit Range: 899150b..292e567
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Hi @aminghadersohi, gentle follow-up — the CI workflows are still awaiting maintainer approval. Could you please approve them so the tests can run? Thanks for reviewing!

Comment on lines +57 to +58
role_detached = MagicMock()
role_detached.name = MagicMock(side_effect=DetachedInstanceError())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: This mock setup does not actually raise DetachedInstanceError when item.name is accessed, so the test does not exercise the new exception-handling branch and can pass even if that logic breaks. Use a mock object/property that raises on attribute access to validate the intended detached-instance behavior. [code quality]

Severity Level: Major ⚠️
- ⚠️ DetachedInstanceError path in roles validator untested.
- ⚠️ Future regressions in detached-role handling may slip through.
Steps of Reproduction ✅
1. Open `tests/unit_tests/mcp_service/user/test_schemas.py` and locate
`test_user_info_ignores_role_with_detached_instance` (lines 53–62), which is intended to
exercise the `DetachedInstanceError` handling in `_extract_role_names` at
`superset/mcp_service/user/schemas.py:125–132`.

2. Note that the test configures `role_detached = MagicMock()` and then
`role_detached.name = MagicMock(side_effect=DetachedInstanceError())` at lines 57–58,
meaning `role_detached.name` is a callable MagicMock whose side_effect fires only when it
is called.

3. In the validator `_extract_role_names` (lines 120–133 in
`superset/mcp_service/user/schemas.py`), the code path for non-string roles does `if
hasattr(item, "name") and isinstance(item.name, str): result.append(item.name)`; this
accesses `item.name` but never calls it, so the MagicMock side_effect is not triggered and
`DetachedInstanceError` is never raised during the test.

4. Run `pytest
tests/unit_tests/mcp_service/user/test_schemas.py::test_user_info_ignores_role_with_detached_instance`
and observe that the test passes even if you remove or break the `except
DetachedInstanceError` block in `_extract_role_names`, demonstrating that the
detached-instance exception branch is not actually covered by this test and regressions
there would go unnoticed.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/unit_tests/mcp_service/user/test_schemas.py
**Line:** 57:58
**Comment:**
	*Code Quality: This mock setup does not actually raise `DetachedInstanceError` when `item.name` is accessed, so the test does not exercise the new exception-handling branch and can pass even if that logic breaks. Use a mock object/property that raises on attribute access to validate the intended detached-instance behavior.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced the MagicMock with a small class whose property raises directly, so the exception now fires on attribute access rather than on conversion.


@field_validator("roles", mode="before")
@classmethod
def _extract_role_names(cls, v: Any) -> list[str] | None:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Add an explicit type hint for the cls parameter in the new validator method so all parameters in newly added method signatures are fully typed. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This is a newly added method, and its signature omits a type hint for cls. The custom rule requires new Python functions and methods to be fully typed, so this is a real violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/user/schemas.py
**Line:** 113:113
**Comment:**
	*Custom Rule: Add an explicit type hint for the `cls` parameter in the new validator method so all parameters in newly added method signatures are fully typed.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added type hint for cls parameter, docstring for DetachedRole stub, and return type annotation on the name property.

role_good = MagicMock()
role_good.name = "Admin"

class DetachedRole:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Add an inline docstring to the newly introduced class so the class-level behavior is documented. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The class is newly introduced in the changed file and has no docstring. This matches the custom rule requiring newly added Python classes to include inline docstrings.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/unit_tests/mcp_service/user/test_schemas.py
**Line:** 58:58
**Comment:**
	*Custom Rule: Add an inline docstring to the newly introduced class so the class-level behavior is documented.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +59 to +60
@property
def name(self):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Add a return type annotation to this new property method so the method signature is fully typed. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The newly added property method omits a return type annotation. Since the surrounding code is newly added, the custom rule requires new Python methods to be fully typed, so this is a real violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/unit_tests/mcp_service/user/test_schemas.py
**Line:** 59:60
**Comment:**
	*Custom Rule: Add a return type annotation to this new property method so the method signature is fully typed.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +122 to +127
if isinstance(item, str):
result.append(item)
continue
try:
if hasattr(item, "name") and isinstance(item.name, str):
result.append(item.name)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Role names are now emitted in MCP responses, but they are passed through unchanged and are not sanitized/escaped for LLM context like other user-facing text fields in this serializer. A crafted role name can inject control delimiters or prompt text into downstream LLM context; sanitize/escape each role name before adding it to the output. [security]

Severity Level: Major ⚠️
- ⚠️ get_user_info MCP tool exposes unsanitized role names.
- ⚠️ list_users MCP tool returns raw role names to LLM clients.
- ⚠️ Role metadata can carry unwrapped, instruction-like text.
Steps of Reproduction ✅
1. In `superset/mcp_service/user/schemas.py:94-109`, the `UserInfo` schema defines `roles:
list[str] | None`, and the `_extract_role_names` validator at
`superset/mcp_service/user/schemas.py:111-133` converts ORM Role objects into plain
strings but does not apply `escape_llm_context_delimiters` or `sanitize_for_llm_context`
to those strings.

2. The serializer `serialize_user_object` in `superset/mcp_service/user/schemas.py:32-68`
builds the UserInfo instance for MCP tools: it sanitizes `username` using
`escape_llm_context_delimiters` and wraps `first_name` and `last_name` with
`sanitize_for_llm_context` (lines 55-61), but it passes `roles` through unchanged as the
list of `r.name` values collected at lines 45-52.

3. The MCP tool `get_user_info`
(`superset/mcp_service/user/tool/get_user_info.py:24-27,52-60`) calls this serializer via
`_serializer` and returns the resulting `UserInfo` to the MCP client; similarly,
`list_users` (`superset/mcp_service/user/tool/list_users.py:12-19,21-33,35-65`) uses
`serialize_user_object` for each row and then emits the response via
`result.model_dump(mode="json", context=...)`, exposing `UserInfo.roles` directly in tool
JSON consumed by LLM agents.

4. Given the sanitization utilities in
`superset/mcp_service/utils/sanitization.py:38-41,80-93,98-181` are explicitly designed to
wrap and/or escape user-controlled strings before placing them in LLM context, the fact
that role names bypass these utilities means that a role whose `name` contains prompt-like
content or custom delimiters (e.g. long instructions or tokens intended to look like
`<UNTRUSTED-CONTENT>` blocks) will be surfaced verbatim in MCP tool JSON and into
downstream LLM prompts, unlike other user-facing fields on UserInfo that are sanitized.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/user/schemas.py
**Line:** 122:127
**Comment:**
	*Security: Role names are now emitted in MCP responses, but they are passed through unchanged and are not sanitized/escaped for LLM context like other user-facing text fields in this serializer. A crafted role name can inject control delimiters or prompt text into downstream LLM context; sanitize/escape each role name before adding it to the output.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Applied escape_llm_context_delimiters to each role name in serialize_user_object, matching the sanitization already applied to username and email fields.

@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Hi @aminghadersohi, gentle follow-up — the CI workflows are still awaiting maintainer approval. Could you please approve them so the tests can run? Thanks for reviewing!

@bito-code-review

bito-code-review Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #29fbd4

Actionable Suggestions - 0
Additional Suggestions - 1
  • superset/mcp_service/user/schemas.py - 1
    • Inconsistent sanitization for roles field · Line 326-326
      The `roles` field is no longer sanitized via `sanitize_for_llm_context` while `first_name` and `last_name` still are. Role names are user-controlled data that could contain HTML or delimiter tokens. Apply sanitization consistently within the validator to maintain uniform response formatting.
Review Details
  • Files reviewed - 2 · Commit Range: f9ec54e..e3a1878
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@goingforstudying-ctrl goingforstudying-ctrl force-pushed the fix-mcp-userinfo-roles branch 2 times, most recently from 40d1aae to 75dc11e Compare June 11, 2026 17:17
@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Hi @aminghadersohi, gentle follow-up — the CI workflows are still awaiting maintainer approval. Could you please approve them so the tests can run? Thanks for reviewing!

@bito-code-review

bito-code-review Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #5a4084

Actionable Suggestions - 0
Additional Suggestions - 1
  • superset/mcp_service/user/schemas.py - 1
    • CWE-79: Inconsistent LLM Context Escaping · Line 309-309
      The `roles` field uses `escape_llm_context_delimiters` while `first_name`/`last_name` use `sanitize_for_llm_context` (lines 316-321). This creates inconsistent LLM context wrapping: other user fields are wrapped in `` delimiters but roles are only escaped. This inconsistency could allow delimiter-like content in role names to break LLM context boundaries.
      Code suggestion
      --- a/superset/mcp_service/user/schemas.py
      +++ b/superset/mcp_service/user/schemas.py
       @@ -306,7 +306,7 @@ def serialize_user_object(
                user_roles = getattr(user, "roles", None)
                if user_roles is not None:
                    try:
      -                roles = [escape_llm_context_delimiters(r.name) for r in user_roles if hasattr(r, "name")]
      +                roles = [sanitize_for_llm_context(r.name, field_path=("roles",)) for r in user_roles if hasattr(r, "name")]
                    except (AttributeError, DetachedInstanceError):
                        roles = None
Review Details
  • Files reviewed - 2 · Commit Range: 7beb3d4..75dc11e
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Fixed — roles now use sanitize_for_llm_context with the same field_path=("roles",) wrapping as first_name/last_name, so all user fields are consistent in LLM context output.

@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Hi @aminghadersohi, gentle follow-up — the CI workflows are still awaiting maintainer approval. Could you please approve them so the tests can run? Thanks for reviewing!

goingforstudying-ctrl and others added 8 commits June 11, 2026 18:08
…pache#40733)

The UserInfo Pydantic model in mcp_service/user/schemas.py has
`from_attributes=True`, which causes Pydantic to map ORM attributes
directly. When `user.roles` returns Role ORM objects instead of
strings, the serializer fails with:

  "Input should be a valid string [type=string_type, input_value=Admin, input_type=Role]"

This breaks MCP tools (get_dataset_info, get_chart_info, generate_chart)
for any user with multiple roles assigned.

Changes:
- Add a `field_validator("roles", mode=before)` to UserInfo that
  coerces Role objects to their `.name` strings while preserving
  backward compatibility for already-string inputs

Fixes apache#40733
…n UserInfo validator

- Reject bare strings for roles so Pydantic's list[str] validation is preserved.
- Preserve empty role lists as [] instead of collapsing to None.
- Handle DetachedInstanceError when reading Role.name.
- Add unit tests for the coercion path including from_attributes scenario.
- Move hasattr(item, 'name') inside try/except to catch DetachedInstanceError
  that can fire during attribute access on detached SQLAlchemy instances
- Add isinstance(item.name, str) guard to avoid matching non-string .name attrs
- Add logger.debug() before skipping detached instances for observability
- Refactor to early-continue pattern for plain strings
…user fields

Bito review pointed out that roles used escape_llm_context_delimiters
while first_name/last_name used sanitize_for_llm_context, creating
inconsistent LLM context wrapping. Role names now get the same
delimiter wrapping as other user fields.
@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Hi @aminghadersohi, gentle follow-up — the CI workflows are still awaiting maintainer approval. Could you please approve the workflows when you have a moment? Thanks!

Comment thread superset/mcp_service/user/schemas.py Outdated
if user_roles is not None:
try:
roles = [r.name for r in user_roles if hasattr(r, "name")]
roles = [sanitize_for_llm_context(r.name, field_path=("roles",)) for r in user_roles if hasattr(r, "name") and isinstance(r.name, str)]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high: sanitize_for_llm_context(r.name, field_path=("roles",)) wraps each name as "<UNTRUSTED-CONTENT>\nAdmin\n</UNTRUSTED-CONTENT>" — so test_serialize_user_object_round_trip_with_role_objects (test_schemas.py:106) now asserts info.roles == ["Admin"] against the wrapped string and will fail. Pre-commit/CI will break.

Either update the test assertion to match the wrapped output, or move the sanitize_for_llm_context call into _extract_role_names so both construction paths produce consistently sanitized values and the test expectation is clear.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Role names are operational identifiers like username and email, so escape_llm_context_delimiters fits better here — it escapes the delimiter tokens without wrapping the string content. Switched to that in the latest push.

… sanitize_for_llm_context

Role names are operational identifiers (like username and email), not free-form
user text. Using sanitize_for_llm_context wraps each name in UNTRUSTED-CONTENT
delimiters, breaking round-trip tests. escape_llm_context_delimiters provides
defense-in-depth without changing string identity.
The continue in the except DetachedInstanceError block was the last
statement of the loop body, making it a no-op.
@bito-code-review

bito-code-review Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #3d2130

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/mcp_service/user/schemas.py - 1
  • tests/unit_tests/mcp_service/user/test_schemas.py - 1
    • Semantic duplication across diff files · Line 70-106
Review Details
  • Files reviewed - 2 · Commit Range: 95dfa07..e5ddb0d
    • superset/mcp_service/user/schemas.py
    • tests/unit_tests/mcp_service/user/test_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@goingforstudying-ctrl

Copy link
Copy Markdown
Author

Hi @aminghadersohi, gentle follow-up — the CI workflows are still awaiting maintainer approval. I've addressed all review feedback (ruff E402 logger placement, vacuous ternary, role sanitization, etc.). Could you please approve the workflows so the tests can run when you have a moment? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MCP tools fail with UserInfo Pydantic role-serialization error (input_value=Admin, input_type=Role)

3 participants