Skip to content

expression: add full SQL auto embedding support for starter mode#69014

Open
ChangRui-Ryan wants to merge 4 commits into
pingcap:masterfrom
ChangRui-Ryan:changrui_cse_embed-text
Open

expression: add full SQL auto embedding support for starter mode#69014
ChangRui-Ryan wants to merge 4 commits into
pingcap:masterfrom
ChangRui-Ryan:changrui_cse_embed-text

Conversation

@ChangRui-Ryan
Copy link
Copy Markdown
Contributor

@ChangRui-Ryan ChangRui-Ryan commented Jun 8, 2026

What problem does this PR solve?

Issue Number: ref #67765

Problem Summary:
TiDB Cloud Starter needs SQL Auto Embedding support in the upstream TiDB kernel so that Starter can use an upstream-built TiDB binary while still providing the documented auto-embedding experience.

Before this PR, upstream TiDB did not have a complete SQL-side embedding path. Users could not call EMBED_TEXT() from SQL to obtain vector embeddings, could not define stored generated vector columns backed by embedding models, and could not use text-query vector distance helper functions such as VEC_EMBED_L2_DISTANCE() to automatically embed query text during vector search.

The feature also needs to be deployment-aware. Auto Embedding is intended to be available only in Starter deployment mode. Non-Starter deployments should not accidentally invoke external embedding services or hosted TiDB Cloud embedding infrastructure through EMBED_TEXT().

What changed and how does it work?

This PR adds full SQL Auto Embedding support to the upstream TiDB kernel.

At the SQL layer, it introduces the EMBED_TEXT(model, text[, options]) builtin, which returns a VECTOR<FLOAT32> embedding. The function supports provider-qualified model names and JSON options, and it is treated as a non-foldable, mutable-effect function because it may call external inference services and depends on runtime configuration.

At the inference layer, this PR adds a shared embedding framework under pkg/inference, including provider implementations for OpenAI-compatible endpoints, Jina AI, Cohere, Gemini, Hugging Face, NVIDIA, and TiDB Cloud hosted embedding. It also adds batching, request cancellation, per-session/global configuration integration, API key masking, endpoint validation, and test mock providers.

For auto-embedding generated columns, this PR allows EMBED_TEXT() to be used as the top-level expression of a stored generated vector column. The DDL validation enforces the supported shape: the generated column must be stored, the model/options arguments must be constants, and dependent generated columns on auto-embedding columns are rejected. During DML, TiDB evaluates the embedding expression and materializes the generated vector value on insert/update.

For vector search, this PR adds planner rewrite support for VEC_EMBED_*_DISTANCE() helper functions. When the vector column is backed by an auto-embedding generated column, TiDB rewrites text-query distance functions into the corresponding vector distance function with an EMBED_TEXT() call on the query text. This lets users query by text without manually embedding the query first.

The embedding function is managed through the TiDB domain so all sessions can share the same embedding manager and provider configuration. A small domain adaptor is added to avoid package dependency cycles between expression evaluation and domain initialization.

Finally, this PR gates runtime use of EMBED_TEXT() by deployment mode. EMBED_TEXT() now checks deploymode.IsStarter() before evaluating arguments or calling any provider. If the current TiDB instance is not running in Starter deployment mode, the function returns an error instead of invoking an embedding provider.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

  • New Features

    • EMBED_TEXT() builtin, auto-generated stored vector columns, vector-search rewrites, and new vector distance functions
    • Batched, concurrent embedding evaluation with multiple provider backends (OpenAI, Gemini, Cohere, HuggingFace, NVIDIA, Jina, TiDB Cloud, mock)
  • Configuration

    • New hosted-embedding config section and sysvars for embedding API credentials; API base URL normalization and masked key display
  • Tests

    • Extensive unit and integration coverage across embedding engines, batcher, and auto-embedding flows

@ti-chi-bot ti-chi-bot Bot added do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. labels Jun 8, 2026
@pantheon-ai
Copy link
Copy Markdown

pantheon-ai Bot commented Jun 8, 2026

@ChangRui-Ryan I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot Bot added sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 8, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Jun 8, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign d3hunter, terry1purcell, xuhuaiyu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tiprow
Copy link
Copy Markdown

tiprow Bot commented Jun 8, 2026

Hi @ChangRui-Ryan. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 8, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 908f779d-1217-4a74-9f6e-0fc2fe5f817f

📥 Commits

Reviewing files that changed from the base of the PR and between c3c302f and b6428d6.

📒 Files selected for processing (2)
  • pkg/planner/util/null_misc_test.go
  • tests/integrationtest/r/executor/show.result
✅ Files skipped from review due to trivial changes (1)
  • tests/integrationtest/r/executor/show.result

📝 Walkthrough

Walkthrough

Adds hosted-embedding config and sysvars, an EmbedFn runtime with batching and provider clients, EMBED_TEXT builtin and AST helpers, DDL validation for auto-embedding, executor pre-evaluation for generated embedding columns, planner rewrites, and extensive provider/unit/integration tests.

Changes

Embedding feature merge

Layer / File(s) Summary
Config and examples
pkg/config/config.go, pkg/config/config.toml.example, pkg/config/config.toml.nextgen.example
Adds HostedEmbedding config type, TOML example sections, and default Enabled: false.
DDL generated-column validation
pkg/ddl/add_column.go, pkg/ddl/create_table.go, pkg/ddl/generated_column.go
Enforces EMBED_TEXT() constraints for generated columns (top-level call, STORED only, literal model name) and disallows generated columns depending on auto-embedding columns.
Domain lifecycle
pkg/domain/inference.go, pkg/domain/domain.go, pkg/domain/BUILD.bazel
Initializes and closes a domain-managed inference.EmbedFn and wires the new source into domain build.
Inference runtime and providers
pkg/inference/*, pkg/inference/embedding/*, pkg/inference/BUILD.bazel
Adds EmbedFn registry, batching layer, base embedder interface, multiple provider clients (OpenAI, Gemini, HuggingFace, Cohere, Jina, NVIDIA, TiDB Cloud, mock), provider protocols, caching/singleflight, and many tests/build targets.
Expression builtin & AST helpers
pkg/expression/builtin_inference.go, pkg/expression/inference_helper.go, pkg/expression/*
Adds EMBED_TEXT builtin implementation, Eval helpers, AST detection/extraction helpers, function trait updates, thread-safety generator handling, column GeneratedExprString plumbing, and tests.
Planner rewrite and column plumbing
pkg/planner/core/expression_rewriter.go, pkg/planner/core/logical_plan_builder.go, pkg/parser/ast/functions.go
Preserves GeneratedExprString, adds embedding function constants, and rewrites VEC_EMBED_*_DISTANCE() calls into vector-distance expressions using constructed EMBED_TEXT() calls.
Executor insert-time evaluation
pkg/executor/insert_common.go, pkg/executor/BUILD.bazel
Adds concurrent pre-evaluation and population of auto-embedding generated columns before batch exec, with deploy-mode gating and concurrency controls.
Sysvars and normalization
pkg/sessionctx/vardef/tidb_vars.go, pkg/sessionctx/variable/embedding_vars.go, pkg/sessionctx/variable/sysvar.go
Adds embedding API sysvars (API keys and OpenAI base), OpenAI base normalization/whitelist, masked sysvar retrieval, and tests.
Integration and unit tests & build files
pkg/expression/integration_test/*, many pkg/inference/embedding/*_test.go, pkg/expression/*_test.go
Extensive tests covering builtin behavior, provider clients, batcher semantics, auto-embedding DDL/DML, and protocol handling; Bazel BUILD updates include added sources and deps.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Planner
  participant Executor
  participant Domain
  participant EmbedFn
  participant Batcher
  participant Provider
  Client->>Planner: SQL parse & rewrite (VEC_EMBED_* -> EMBED_TEXT)
  Planner->>Executor: execute plan with generated columns
  Executor->>Domain: request embed function (via domain.GetEmbedFn)
  Domain->>EmbedFn: GetEmbedFn()
  Executor->>EmbedFn: Embed(shouldCancel, model/provider, text, opts)
  EmbedFn->>EmbedFn: check cache (get)
  EmbedFn->>Batcher: CreateEmbeddings(model/provider, texts, opts)
  Batcher->>Provider: CreateEmbeddings(batch request)
  Provider-->>Batcher: embeddings or error
  Batcher-->>EmbedFn: batched results
  EmbedFn->>EmbedFn: cache set
  EmbedFn-->>Executor: embedding vector
  Executor-->>Client: DML/Query result
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Suggested labels

sig/sql-infra, ok-to-test

Suggested reviewers

  • AilinKid
  • qw4990
  • yudongusa
  • D3Hunter

Poem

"I nibble configs, tests, and glue,
I batch the texts and forward too.
From planner roots to runtime's den,
Embeddings hop — then bloom again.
— a rabbit, joyfully bright 🐇"

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

🧹 Nitpick comments (12)
pkg/planner/core/expression_rewriter.go (1)

3027-3031: ⚡ Quick win

Improve error message robustness by handling empty column name.

Line 3029 uses vecArg.OrigName in the error message, which may be empty. Consider falling back to vecArg.String() or a placeholder to ensure the error message is always informative.

♻️ Proposed improvement
+	colName := vecArg.OrigName
+	if colName == "" {
+		colName = vecArg.String()
+	}
 	vecArg, ok := args[0].(*expression.Column)
 	if !ok || vecArg.GeneratedExprString == "" || !vecArg.RetType.EvalType().IsVectorKind() {
-		er.err = fmt.Errorf("%s() first argument must be a vector embedding column generated by EMBED_TEXT()", v.FnName.O)
+		er.err = fmt.Errorf("%s() first argument must be a vector embedding column generated by EMBED_TEXT(), got %s", v.FnName.O, colName)
 		return true
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/planner/core/expression_rewriter.go` around lines 3027 - 3031, The error
message for the vector argument in expression_rewriter.go can be empty because
vecArg.OrigName may be empty; update the check around the vecArg handling in the
block that sets er.err (where vecArg is derived from args[0] and v.FnName.O is
used) to build a robust name: if vecArg.OrigName is empty, use vecArg.String()
or a constant placeholder like "<unknown column>" when formatting the error text
so the message is always informative and never blank.
pkg/inference/embedding/base/BUILD.bazel (1)

15-15: 💤 Low value

Reconsider the flaky marking for deterministic tests.

The base_test is marked as flaky = True, but reviewing base_test.go shows it only performs deterministic base64 decoding and validation tests without concurrency or timing dependencies. Unless there's a known flakiness issue, consider removing this marking to ensure test failures are investigated rather than automatically retried.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/base/BUILD.bazel` at line 15, The test target
"base_test" in the BUILD rule is marked flaky = True even though base_test only
runs deterministic base64 decode/validation; remove the flaky = True attribute
(or set flaky = False) from the base_test rule so failures surface for
investigation, then run the test suite locally to confirm no flakiness before
committing.
pkg/inference/embedding/batcher/batcher.go (1)

244-260: ⚡ Quick win

Document the batch-level context cancellation semantics.

The batch request context (reqCtx) only cancels when all individual request contexts are done. This means if requests with different timeouts are batched together, a request with a short timeout may wait longer than expected (bounded by batchWindow). While this is correct batching behavior, consider adding a comment explaining this trade-off for future maintainers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/batcher/batcher.go` around lines 244 - 260, Add a
clear comment above the reqCtx/cancelReq creation and goroutine explaining that
reqCtx only cancels after all individual request contexts in thisBatch.calls are
done (the goroutine loops over call.ctx and only calls cancelReq after every
call is done), so batching can cause a request with a short timeout to wait
longer (up to batchWindow) before its batch-level context cancels; state this is
intentional for batching and note the trade-off for maintainers and any
expectations around embedder.CreateEmbeddings using reqCtx.
pkg/inference/embedding/huggingface/BUILD.bazel (2)

23-23: ⚡ Quick win

Investigate why tests are marked flaky.

The flaky = True attribute suggests these tests have non-deterministic failures. Consider investigating and fixing the root cause rather than marking tests as flaky, as this can mask real issues and reduce test signal quality.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/huggingface/BUILD.bazel` at line 23, The BUILD.bazel
target currently sets flaky = True, which masks nondeterministic test failures;
remove or revert the flaky = True attribute on the target and instead reproduce
the intermittent failures by running the corresponding Bazel test target(s)
repeatedly (e.g., bazel test --runs_per_test=50), add deterministic fixes
(stabilize timers, seed randomness, isolate shared state, add proper
setup/teardown) in the test code, and only reintroduce flaky metadata after root
causes are identified; look for references to the BUILD target that contains
flaky = True to locate the tests to run and stabilize.

18-26: 💤 Low value

Inconsistent shard_count across provider tests.

This test uses shard_count = 10, while the Gemini tests use shard_count = 7. Consider establishing a consistent sharding strategy across similar test suites, or document why different providers need different shard counts.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/huggingface/BUILD.bazel` around lines 18 - 26, The
shard_count for the Huggingface test target "huggingface_test" is set to 10
which is inconsistent with the Gemini provider tests (which use shard_count =
7); update the "huggingface_test" target in BUILD.bazel to use the same
shard_count value as other provider tests (e.g., change shard_count to 7) or add
a brief comment in the BUILD file next to the "huggingface_test" rule explaining
why it must remain different so the discrepancy is documented and intentional.
pkg/inference/embedding/gemini/BUILD.bazel (1)

18-26: ⚡ Quick win

Investigate why tests are marked flaky.

The flaky = True attribute suggests these tests have non-deterministic failures. Consider investigating and fixing the root cause rather than marking tests as flaky, as this can mask real issues and reduce test signal quality.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/gemini/BUILD.bazel` around lines 18 - 26, The test
target "gemini_test" in the BUILD.bazel currently sets flaky = True which masks
nondeterministic failures; investigate and fix the root causes in gemini_test.go
(race conditions, timing, shared state, or reliance on external services) by
reproducing failures locally (run the "gemini_test" target without flaky and
with shard_count reduced), add deterministic seeds or explicit synchronization
in the test code, increase the timeout only if legitimately slow, and then
remove the flaky = True attribute from the "gemini_test" rule once the tests are
stable.
pkg/inference/embedding/jina/jina.go (2)

55-55: ⚖️ Poor tradeoff

Configure HTTP client timeout to prevent indefinite hangs.

The http.Client is initialized with default settings (no timeout), which can cause requests to hang indefinitely if the JinaAI API is slow or unresponsive. While the context provides cancellation, the client itself should have a reasonable timeout.

⏱️ Suggested fix to add timeout configuration
 func NewJinaEmbedder(cfg EmbedderConfig) *Embedder {
 	return &Embedder{
-		client: http.Client{},
+		client: http.Client{
+			Timeout: 30 * time.Second,
+		},
 		cfg:    cfg,
 	}

Consider making the timeout configurable via EmbedderConfig if different deployments need different values.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/jina/jina.go` at line 55, The http.Client is created
with no timeout (client: http.Client{}), risking indefinite hangs; update the
embedder initialization (where the client is constructed in jina.go) to set a
sensible Timeout on the http.Client (e.g., from a new EmbedderConfig field) and
wire that timeout through the constructor/path that creates the client so the
timeout becomes configurable via EmbedderConfig and used when initializing the
client.

116-119: 💤 Low value

API key may be logged in error responses.

When logging failed requests at line 118, the body may contain echoed authorization information. While the Authorization header isn't logged directly, verify whether JinaAI's error responses could leak sensitive data.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/jina/jina.go` around lines 116 - 119, The error log
in pkg/inference/embedding/jina/jina.go currently logs the full response body in
the logutil.BgLogger().Error call (the line calling logutil.BgLogger().Error
with zap.String("body", string(body))) which could leak echoed API keys; before
calling that logger (or replace that call), sanitize the response body by
parsing/masking any sensitive fields (e.g., "authorization", "api_key", "token",
"access_token", "credentials") or redact Authorization-like patterns (e.g.,
Bearer <token>) and then log the sanitized string; update the code around the
resp handling in the same function to produce a safeBody string and pass
zap.String("body", safeBody) to the logger.
pkg/inference/embedding/huggingface/huggingface.go (2)

121-124: 💤 Low value

API key may be logged in error responses.

When logging failed requests at line 123, the body may contain echoed authorization information or API keys from the error response. While the Authorization header itself isn't logged here, some APIs echo request details in error responses. Consider whether this is a concern for HuggingFace's API.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/huggingface/huggingface.go` around lines 121 - 124,
The error log at the HuggingFace request failure uses the raw response body
(zap.String("body", string(body))) which may contain echoed API keys or
authorization data; update the failure handling in the function that makes the
request (the code that calls resp and builds the log with logutil.BgLogger()) to
avoid logging sensitive contents by either removing the full body from the log
or sanitizing it first (e.g., detect and redact common keys like
"authorization", "api_key", "token" or apply a safe snippet/length limit and
replace sensitive substrings with "[REDACTED]"), and keep the existing metadata
(resp.StatusCode) in the log. Ensure you change the zap.String("body", ...)
usage to use the sanitized string instead of the raw body.

54-54: ⚖️ Poor tradeoff

Configure HTTP client timeout to prevent indefinite hangs.

The http.Client is initialized with default settings, which means no timeout. This can cause requests to hang indefinitely if the HuggingFace API is slow or unresponsive. While the context passed to http.NewRequestWithContext provides cancellation, the client itself should have a reasonable timeout for connection and overall request duration.

⏱️ Suggested fix to add timeout configuration
 func NewHuggingFaceEmbedder(cfg EmbedderConfig) *Embedder {
 	return &Embedder{
-		client: http.Client{},
+		client: http.Client{
+			Timeout: 30 * time.Second,
+		},
 		cfg:    cfg,
 	}

Alternatively, make the timeout configurable via EmbedderConfig if different deployments need different values.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/huggingface/huggingface.go` at line 54, The
http.Client is created with zero timeout (client: http.Client{}) which can hang;
update the HuggingFace embedder initialization to set a sensible Timeout (e.g.,
15–30s) on the http.Client instance used by the embedder, or wire a timeout
value from EmbedderConfig into the client construction so the Timeout field is
configurable; change the client creation referenced by the client: http.Client{}
initializer in the huggingface embedder to use http.Client{Timeout: ...} (or
build the client with a config-supplied duration).
pkg/inference/embedding/jina/jina_test.go (1)

1-227: ⚡ Quick win

Consider adding missing API key test coverage.

The HuggingFace tests include TestHuggingFaceEmbedder_NoAPIKey, TestHuggingFaceEmbedder_CustomErrorHandling, and TestHuggingFaceEmbedder_CustomUnauthorizedError to verify API key validation and custom error handling. The Jina tests lack equivalent coverage for these scenarios. Consider adding similar tests for consistency and completeness.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/jina/jina_test.go` around lines 1 - 227, Add unit
tests to cover missing API key and custom error handling for the Jina embedder:
create tests similar to HuggingFace ones that exercise NewJinaEmbedder and
CreateEmbeddings for (1) no API key supplied (e.g., TestJinaEmbedder_NoAPIKey)
verifying it returns an error and nil embeddings, (2) custom unauthorized
response handling (e.g., TestJinaEmbedder_CustomUnauthorizedError) mocking a 401
response and asserting the error contains expected message, and (3) custom error
payload handling (e.g., TestJinaEmbedder_CustomErrorHandling) mocking non-2xx
responses with body and asserting CreateEmbeddings surfaces the body details;
place these tests alongside existing TestJinaEmbedder_* functions and reuse
httptest.NewServer, EmbedderConfig GetAPIKey/GetBaseURL, and assertions on
returned embeddings/errors to mirror the HuggingFace tests.
pkg/inference/embedding/nvidia/protocol.go (1)

17-22: 💤 Low value

Remove unused Request struct or use it in the implementation.

The Request struct is defined but never used. In nvidia.go lines 89-93, the request body is built using a map literal passed to base.MarshalJSONWithOptions instead of marshalling this struct. This is dead code that adds maintenance burden.

🧹 Option 1: Remove the unused struct
-// Request is the model for Nvidia NIM embeddings API request.
-type Request struct {
-	Input          []string `json:"input"`
-	Model          string   `json:"model"`
-	EncodingFormat string   `json:"encoding_format"`
-}
-
 // Response is the model for Nvidia NIM embeddings API response.

Alternatively, if the struct is intended for future use or documentation purposes, add a comment explaining why it's not currently used in the implementation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/inference/embedding/nvidia/protocol.go` around lines 17 - 22, The Request
struct (type Request) in protocol.go is dead code because the Nvidia embeddings
request body is constructed directly via a map passed to
base.MarshalJSONWithOptions in nvidia.go; either remove the unused Request type
or explicitly use it when marshalling (replace the map literal with an instance
of Request and marshal that), or if you keep the type for future/documentation
purposes add a clear comment above type Request explaining why it is
intentionally unused today and referencing the code path in nvidia.go that
builds the payload via base.MarshalJSONWithOptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/executor/insert_common.go`:
- Around line 688-699: The code captures a single shared evalCtx (evalCtx :=
e.Ctx().GetExprCtx().GetEvalCtx()) and reuses it from every goroutine, which can
race statement-scoped state when calling task.expr.Eval; instead create a
distinct EvalCtx per worker before calling task.expr.Eval (e.g., call
e.Ctx().GetExprCtx().GetEvalCtx() or use a provided Clone/New method inside the
eg.Go closure) so each goroutine has its own evalCtx; update the goroutine body
around task.expr.Eval and chunk.MutRowFromDatums(task.row).ToRow() to use that
per-goroutine evalCtx and ensure any cleanup if the EvalCtx type requires it.
- Around line 647-650: The code snapshots a single uint64 rowCntInLoadData from
e.rowCount for the whole batch when
e.Ctx().GetSessionVars().StmtCtx.InLoadDataStmt is true, then passes that same
value into HandleBadNull for every row causing LOAD DATA errors to be reported
against the batch's last row; change the logic to use the per-row LOAD DATA
index instead of the single snapshot: inside the per-row loop that calls
HandleBadNull, compute the correct per-row LOAD DATA index (using the per-row
counter/variable used to iterate the batch or the LOAD DATA line index) and pass
that per-row value to HandleBadNull rather than rowCntInLoadData; update any use
sites referencing rowCntInLoadData (e.g., where HandleBadNull is invoked) to use
the per-row index so failures are reported against the failing row.

In `@pkg/expression/builtin_inference.go`:
- Around line 1-13: Update the license header year in
pkg/expression/builtin_inference.go to match the corresponding test file
(builtin_inference_test.go) so the copyright years are consistent across related
files; open builtin_inference.go, locate the top-of-file license comment block
and change the "Copyright 2025" year to the same year used in
builtin_inference_test.go (2026) or vice versa if you prefer to standardize on
2025—ensure both files have the identical year in their header comment.

In `@pkg/expression/inference_helper.go`:
- Around line 83-85: ExtractAutoEmbedInfoFromAST currently only rejects fewer
than 2 args but silently accepts extra args; change the argument validation to
explicitly allow only the supported arities (e.g., exactly 2 or exactly 3
arguments) and return an error for any other arity. Update the len(fnCall.Args)
check(s) that currently read "if len(fnCall.Args) < 2" to instead validate
membership in the allowed set (len == 2 || len == 3) and return
fmt.Errorf("invalid EMBED_TEXT() usage: unexpected arity") when not allowed;
apply this same change to the second arg-count validation block in the file that
also inspects fnCall.Args (the later branch inside ExtractAutoEmbedInfoFromAST).

In `@pkg/expression/integration_test/integration_test.go`:
- Around line 4507-4513: The test mutates the global sysvar
tidb_exp_embed_jina_ai_api_key and currently only resets it at the end of the
happy path, risking leakage if an assertion fails; register an unconditional
cleanup before the first mutation (e.g., call t.Cleanup(func(){ tk.MustExec("set
@@global.tidb_exp_embed_jina_ai_api_key = ''") })) so the global is always
reset, and then proceed with the existing tk.MustExec/tk.MustQuery assertions
that set and check tidb_exp_embed_jina_ai_api_key.

In `@pkg/inference/embedding/cohere/cohere.go`:
- Around line 52-56: NewCohereEmbedder constructs an http.Client without a
timeout causing potential indefinite hangs; update NewCohereEmbedder to
initialize client: http.Client{Timeout: time.Second * <reasonable_value>}
(choose e.g. 10s or configurable), and add the "time" import; ensure the
Embedder struct's client field still uses that configured http.Client so all
Cohere requests get the client-level timeout as defense-in-depth.

In `@pkg/inference/embedding/gemini/gemini.go`:
- Around line 98-100: The model string is interpolated directly into the path
when building fullURL, which can break URLs or allow unintended endpoints;
before the fmt.Sprintf call that sets fullURL, validate that model is non-empty
and does not contain path traversal or illegal chars (e.g., reject segments with
"/" or "..") or alternatively escape it with a path-encoding function (e.g.,
url.PathEscape) and use the escaped value (escapedModel) in place of model;
update the code around variables fullURL, model and baseURL to perform this
validation/escaping and return an error if validation fails.
- Around line 51-56: The NewGeminiEmbedder function constructs an Embedder with
an http.Client that lacks a timeout; update NewGeminiEmbedder to set a sensible
client timeout (e.g., via http.Client{Timeout: time.Second * <N>}) so requests
cannot hang indefinitely, and add the "time" import; modify the Embedder
construction (function NewGeminiEmbedder, type Embedder and its client field) to
use the configured timeout value.

In `@pkg/inference/embedding/huggingface/protocol.go`:
- Around line 22-24: The comment above the Response type is inconsistent with
the actual type; update the docstring for Response (type Response [][]float32 in
protocol.go) to state that HuggingFace returns a 2D array of float32 values, or
if you confirm the API returns float64 instead, change the type to [][]float64
and update all usages accordingly; ensure the comment and the type (Response)
match exactly.

In `@pkg/inference/embedding/jina/BUILD.bazel`:
- Around line 23-24: The BUILD target currently sets flaky = True and
shard_count = 6; remove these patch-work flags and fix the underlying test
instability instead: locate the failing tests under the
pkg/inference/embedding/jina package, reproduce the flakes, and address root
causes such as race conditions, timing-dependent assertions, shared state, or
unmocked external dependencies by adding proper synchronization, deterministic
fixtures, explicit mocks, or adjusting timeouts; once tests are deterministic,
delete flaky = True and shard_count = 6 from the BUILD.bazel target so CI runs a
single stable shard.

In `@pkg/inference/embedding/mock/BUILD.bazel`:
- Around line 16-17: The BUILD.bazel target in pkg/inference/embedding/mock
currently sets flaky = True and shard_count = 3—remove those attributes and make
the tests deterministic instead of masking failures; locate the mock test target
in BUILD.bazel and (1) remove flaky and shard_count, (2) audit tests in
pkg/inference/embedding/mock for shared global state, non-deterministic
timing/race conditions, or reliance on test ordering, and (3) fix by adding
proper setup/teardown (reset globals, use fresh temp dirs, inject deterministic
mocks/fakes, or serialize accesses with locks) so the tests pass reliably in
parallel without sharding. Ensure the target no longer needs flaky/shard_count
after these fixes and run the test suite to confirm stability.

In `@pkg/inference/embedding/mock/mock.go`:
- Around line 72-84: The delay handling sleeps unconditionally and ignores the
provided context, causing hangs if the context is cancelled; modify the block
that parses opts["delay"] so that instead of time.Sleep(dur) you wait with a
select on time.After(dur) and ctx.Done(), returning early with ctx.Err() when
the context is cancelled. Keep the existing parsing and error messages (the opts
map and delay parsing code) and only replace the unconditional sleep with a
context-aware select that returns (nil, ctx.Err()) on cancellation.

In `@pkg/inference/embedding/nvidia/nvidia.go`:
- Around line 147-161: The current loop assigns embeddings by the sorted
position (row) which can misplace vectors if indices are duplicated or
non-sequential; instead validate that respObj.Data contains exactly one entry
for each index 0..len(texts)-1 and then assign using the actual Index.
Specifically, after sort.Slice(respObj.Data, ...) iterate respObj.Data and for
each item check that 0 <= item.Index < len(texts) and that
embeddings[item.Index] is nil (to detect duplicates); on any invalid or
duplicate index return an error indicating the problematic item.Index; otherwise
decode with base.DecodeFloat32ArrayBytes and set embeddings[item.Index] = e (not
embeddings[row]).
- Around line 54-56: The HTTP client for Embedder is created with zero timeout
(http.Client{}), which can cause indefinite hangs; set a sensible Timeout on the
client (e.g., http.Client{Timeout: ...}) and make that timeout configurable via
EmbedderConfig (add a Timeout field and use it when constructing the Embedder's
client); update the constructor that returns &Embedder{client: http.Client{},
cfg: cfg, ...} to initialize client with the configured Timeout and provide a
reasonable default if cfg.Timeout is unset.

In `@pkg/inference/embedding/openai/openai_test.go`:
- Line 28: Replace the misleading comment "Mock successful response from real
Jina API" in openai_test.go with a correct description for the OpenAI embedder
(e.g., "Mock successful response from real OpenAI API" or "Mock successful
response for OpenAI embedder") so the comment accurately matches the test;
locate the comment by searching for that exact string in the test file and
update it in the test function that sets up the mocked OpenAI response.

In `@pkg/inference/embedding/openai/openai.go`:
- Around line 142-157: After sorting respObj.Data by Index, validate that each
item's Index is within [0, len(texts)-1] and that indices form a unique sequence
(no duplicates) before assigning into embeddings; for each item in respObj.Data
call base.DecodeFloat32ArrayBytes(item.Embedding) as before but place the
decoded vector into embeddings[item.Index] (not embeddings[row]), and return a
clear error if any item.Index is out of range or a duplicate/missing so we never
misassign embeddings; reference respObj.Data, item.Index, embeddings and
base.DecodeFloat32ArrayBytes to locate and implement the checks and assignment
change.
- Around line 54-58: The http.Client in NewOpenAIEmbedder is created with
default settings (no timeout); update NewOpenAIEmbedder to set a sensible
timeout on Embedder.client by using a configured value from EmbedderConfig
(e.g., add a Timeout field to EmbedderConfig) and fall back to a reasonable
default if not provided; ensure the Embedder struct uses http.Client{Timeout:
cfg.Timeout} (or default) so embedding requests cannot hang indefinitely.

In `@pkg/planner/core/expression_rewriter.go`:
- Around line 3038-3042: The error wrapping currently replaces the original
error with a new one via fmt.Errorf in the block that calls
expression.ExtractAutoEmbedInfoFromAST(astExpr), losing the original
stack/context; change the assignment to preserve the original error by using
errors.Annotatef (or the project's equivalent) to annotate err and assign that
to er.err (refer to expression.ExtractAutoEmbedInfoFromAST, er.err, and
fmt.Errorf) so the original error context is preserved while adding the "column
%s cannot be used in %s" message.
- Around line 3057-3060: The inline initializer fnInit currently does an
unchecked type assertion to *expression.BuiltinEmbedTextSig which can panic;
update fnInit (used by er.newFunctionWithInit) to perform a safe type assertion
(e.g., using the "v, ok := sf.Function.(*expression.BuiltinEmbedTextSig)"
pattern or a type switch), set v.IsFromVecSearch = true only when ok, and return
a descriptive error (instead of panicking) when the function is not the expected
BuiltinEmbedTextSig type so callers can handle the failure.

---

Nitpick comments:
In `@pkg/inference/embedding/base/BUILD.bazel`:
- Line 15: The test target "base_test" in the BUILD rule is marked flaky = True
even though base_test only runs deterministic base64 decode/validation; remove
the flaky = True attribute (or set flaky = False) from the base_test rule so
failures surface for investigation, then run the test suite locally to confirm
no flakiness before committing.

In `@pkg/inference/embedding/batcher/batcher.go`:
- Around line 244-260: Add a clear comment above the reqCtx/cancelReq creation
and goroutine explaining that reqCtx only cancels after all individual request
contexts in thisBatch.calls are done (the goroutine loops over call.ctx and only
calls cancelReq after every call is done), so batching can cause a request with
a short timeout to wait longer (up to batchWindow) before its batch-level
context cancels; state this is intentional for batching and note the trade-off
for maintainers and any expectations around embedder.CreateEmbeddings using
reqCtx.

In `@pkg/inference/embedding/gemini/BUILD.bazel`:
- Around line 18-26: The test target "gemini_test" in the BUILD.bazel currently
sets flaky = True which masks nondeterministic failures; investigate and fix the
root causes in gemini_test.go (race conditions, timing, shared state, or
reliance on external services) by reproducing failures locally (run the
"gemini_test" target without flaky and with shard_count reduced), add
deterministic seeds or explicit synchronization in the test code, increase the
timeout only if legitimately slow, and then remove the flaky = True attribute
from the "gemini_test" rule once the tests are stable.

In `@pkg/inference/embedding/huggingface/BUILD.bazel`:
- Line 23: The BUILD.bazel target currently sets flaky = True, which masks
nondeterministic test failures; remove or revert the flaky = True attribute on
the target and instead reproduce the intermittent failures by running the
corresponding Bazel test target(s) repeatedly (e.g., bazel test
--runs_per_test=50), add deterministic fixes (stabilize timers, seed randomness,
isolate shared state, add proper setup/teardown) in the test code, and only
reintroduce flaky metadata after root causes are identified; look for references
to the BUILD target that contains flaky = True to locate the tests to run and
stabilize.
- Around line 18-26: The shard_count for the Huggingface test target
"huggingface_test" is set to 10 which is inconsistent with the Gemini provider
tests (which use shard_count = 7); update the "huggingface_test" target in
BUILD.bazel to use the same shard_count value as other provider tests (e.g.,
change shard_count to 7) or add a brief comment in the BUILD file next to the
"huggingface_test" rule explaining why it must remain different so the
discrepancy is documented and intentional.

In `@pkg/inference/embedding/huggingface/huggingface.go`:
- Around line 121-124: The error log at the HuggingFace request failure uses the
raw response body (zap.String("body", string(body))) which may contain echoed
API keys or authorization data; update the failure handling in the function that
makes the request (the code that calls resp and builds the log with
logutil.BgLogger()) to avoid logging sensitive contents by either removing the
full body from the log or sanitizing it first (e.g., detect and redact common
keys like "authorization", "api_key", "token" or apply a safe snippet/length
limit and replace sensitive substrings with "[REDACTED]"), and keep the existing
metadata (resp.StatusCode) in the log. Ensure you change the zap.String("body",
...) usage to use the sanitized string instead of the raw body.
- Line 54: The http.Client is created with zero timeout (client: http.Client{})
which can hang; update the HuggingFace embedder initialization to set a sensible
Timeout (e.g., 15–30s) on the http.Client instance used by the embedder, or wire
a timeout value from EmbedderConfig into the client construction so the Timeout
field is configurable; change the client creation referenced by the client:
http.Client{} initializer in the huggingface embedder to use
http.Client{Timeout: ...} (or build the client with a config-supplied duration).

In `@pkg/inference/embedding/jina/jina_test.go`:
- Around line 1-227: Add unit tests to cover missing API key and custom error
handling for the Jina embedder: create tests similar to HuggingFace ones that
exercise NewJinaEmbedder and CreateEmbeddings for (1) no API key supplied (e.g.,
TestJinaEmbedder_NoAPIKey) verifying it returns an error and nil embeddings, (2)
custom unauthorized response handling (e.g.,
TestJinaEmbedder_CustomUnauthorizedError) mocking a 401 response and asserting
the error contains expected message, and (3) custom error payload handling
(e.g., TestJinaEmbedder_CustomErrorHandling) mocking non-2xx responses with body
and asserting CreateEmbeddings surfaces the body details; place these tests
alongside existing TestJinaEmbedder_* functions and reuse httptest.NewServer,
EmbedderConfig GetAPIKey/GetBaseURL, and assertions on returned
embeddings/errors to mirror the HuggingFace tests.

In `@pkg/inference/embedding/jina/jina.go`:
- Line 55: The http.Client is created with no timeout (client: http.Client{}),
risking indefinite hangs; update the embedder initialization (where the client
is constructed in jina.go) to set a sensible Timeout on the http.Client (e.g.,
from a new EmbedderConfig field) and wire that timeout through the
constructor/path that creates the client so the timeout becomes configurable via
EmbedderConfig and used when initializing the client.
- Around line 116-119: The error log in pkg/inference/embedding/jina/jina.go
currently logs the full response body in the logutil.BgLogger().Error call (the
line calling logutil.BgLogger().Error with zap.String("body", string(body)))
which could leak echoed API keys; before calling that logger (or replace that
call), sanitize the response body by parsing/masking any sensitive fields (e.g.,
"authorization", "api_key", "token", "access_token", "credentials") or redact
Authorization-like patterns (e.g., Bearer <token>) and then log the sanitized
string; update the code around the resp handling in the same function to produce
a safeBody string and pass zap.String("body", safeBody) to the logger.

In `@pkg/inference/embedding/nvidia/protocol.go`:
- Around line 17-22: The Request struct (type Request) in protocol.go is dead
code because the Nvidia embeddings request body is constructed directly via a
map passed to base.MarshalJSONWithOptions in nvidia.go; either remove the unused
Request type or explicitly use it when marshalling (replace the map literal with
an instance of Request and marshal that), or if you keep the type for
future/documentation purposes add a clear comment above type Request explaining
why it is intentionally unused today and referencing the code path in nvidia.go
that builds the payload via base.MarshalJSONWithOptions.

In `@pkg/planner/core/expression_rewriter.go`:
- Around line 3027-3031: The error message for the vector argument in
expression_rewriter.go can be empty because vecArg.OrigName may be empty; update
the check around the vecArg handling in the block that sets er.err (where vecArg
is derived from args[0] and v.FnName.O is used) to build a robust name: if
vecArg.OrigName is empty, use vecArg.String() or a constant placeholder like
"<unknown column>" when formatting the error text so the message is always
informative and never blank.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6b57fe1c-2b9b-4cb6-9768-bcc6e68c1971

📥 Commits

Reviewing files that changed from the base of the PR and between b399d61 and b98639b.

📒 Files selected for processing (75)
  • pkg/config/config.go
  • pkg/config/config.toml.example
  • pkg/config/config.toml.nextgen.example
  • pkg/ddl/add_column.go
  • pkg/ddl/create_table.go
  • pkg/ddl/generated_column.go
  • pkg/domain/BUILD.bazel
  • pkg/domain/domain.go
  • pkg/domain/inference.go
  • pkg/executor/insert_common.go
  • pkg/expression/BUILD.bazel
  • pkg/expression/builtin.go
  • pkg/expression/builtin_inference.go
  • pkg/expression/builtin_inference_test.go
  • pkg/expression/builtin_threadunsafe_generated.go
  • pkg/expression/column.go
  • pkg/expression/expression.go
  • pkg/expression/function_traits.go
  • pkg/expression/function_traits_test.go
  • pkg/expression/generator/builtin_threadsafe.go
  • pkg/expression/inference_helper.go
  • pkg/expression/inference_helper_test.go
  • pkg/expression/integration_test/BUILD.bazel
  • pkg/expression/integration_test/integration_test.go
  • pkg/inference/BUILD.bazel
  • pkg/inference/domainadaptor/BUILD.bazel
  • pkg/inference/domainadaptor/adaptor.go
  • pkg/inference/embedding/base/BUILD.bazel
  • pkg/inference/embedding/base/base.go
  • pkg/inference/embedding/base/base_test.go
  • pkg/inference/embedding/batcher/BUILD.bazel
  • pkg/inference/embedding/batcher/batcher.go
  • pkg/inference/embedding/batcher/batcher_test.go
  • pkg/inference/embedding/cohere/BUILD.bazel
  • pkg/inference/embedding/cohere/cohere.go
  • pkg/inference/embedding/cohere/cohere_test.go
  • pkg/inference/embedding/cohere/protocol.go
  • pkg/inference/embedding/gemini/BUILD.bazel
  • pkg/inference/embedding/gemini/gemini.go
  • pkg/inference/embedding/gemini/gemini_test.go
  • pkg/inference/embedding/gemini/protocol.go
  • pkg/inference/embedding/huggingface/BUILD.bazel
  • pkg/inference/embedding/huggingface/huggingface.go
  • pkg/inference/embedding/huggingface/huggingface_test.go
  • pkg/inference/embedding/huggingface/protocol.go
  • pkg/inference/embedding/jina/BUILD.bazel
  • pkg/inference/embedding/jina/jina.go
  • pkg/inference/embedding/jina/jina_test.go
  • pkg/inference/embedding/jina/protocol.go
  • pkg/inference/embedding/mock/BUILD.bazel
  • pkg/inference/embedding/mock/mock.go
  • pkg/inference/embedding/mock/mock_test.go
  • pkg/inference/embedding/nvidia/BUILD.bazel
  • pkg/inference/embedding/nvidia/nvidia.go
  • pkg/inference/embedding/nvidia/nvidia_test.go
  • pkg/inference/embedding/nvidia/protocol.go
  • pkg/inference/embedding/openai/BUILD.bazel
  • pkg/inference/embedding/openai/openai.go
  • pkg/inference/embedding/openai/openai_test.go
  • pkg/inference/embedding/openai/protocol.go
  • pkg/inference/embedding/tidbcloud/BUILD.bazel
  • pkg/inference/embedding/tidbcloud/protocol.go
  • pkg/inference/embedding/tidbcloud/tidbcloud_free.go
  • pkg/inference/embedding/tidbcloud/tidbcloud_free_test.go
  • pkg/inference/manager_test.go
  • pkg/inference/sqlembed.go
  • pkg/parser/ast/functions.go
  • pkg/planner/core/BUILD.bazel
  • pkg/planner/core/expression_rewriter.go
  • pkg/planner/core/logical_plan_builder.go
  • pkg/sessionctx/vardef/tidb_vars.go
  • pkg/sessionctx/variable/BUILD.bazel
  • pkg/sessionctx/variable/embedding_vars.go
  • pkg/sessionctx/variable/embedding_vars_test.go
  • pkg/sessionctx/variable/sysvar.go

Comment thread pkg/executor/insert_common.go Outdated
Comment thread pkg/executor/insert_common.go Outdated
Comment thread pkg/expression/builtin_inference.go Outdated
Comment thread pkg/expression/inference_helper.go Outdated
Comment thread pkg/expression/integration_test/integration_test.go
Comment thread pkg/inference/embedding/openai/openai_test.go Outdated
Comment thread pkg/inference/embedding/openai/openai.go
Comment thread pkg/inference/embedding/openai/openai.go Outdated
Comment thread pkg/planner/core/expression_rewriter.go
Comment thread pkg/planner/core/expression_rewriter.go
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.8714%. Comparing base (d568a85) to head (b6428d6).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #69014        +/-   ##
================================================
+ Coverage   76.3213%   76.8714%   +0.5500%     
================================================
  Files          2041       2051        +10     
  Lines        562689     567130      +4441     
================================================
+ Hits         429452     435961      +6509     
+ Misses       132324     129438      -2886     
- Partials        913       1731       +818     
Flag Coverage Δ
integration 50.0414% <ø> (+10.3126%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 60.4610% <ø> (ø)
parser ∅ <ø> (∅)
br 65.6917% <ø> (+2.9013%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/expression/builtin_inference.go (1)

147-158: ⚡ Quick win

Consider adding a comment to explain the @search suffix behavior.

The @search suffix processing logic is subtle: when isFromVecSearch is true, options like "key@search": value are renamed to "key": value; when false, they're dropped entirely. This allows callers to specify search-time-only options that are ignored during direct EMBED_TEXT() calls but honored when the function is rewritten from VEC_EMBED_*_DISTANCE(). A brief comment would help future maintainers understand this intent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/expression/builtin_inference.go` around lines 147 - 158, Add a short
clarifying comment above the block that collects and processes keys with the
"`@search`" suffix explaining that entries like "key@search" are treated as
search-time-only options: if isFromVecSearch is true they are promoted to "key"
(opts[strings.TrimSuffix(k, "`@search`")] = opts[k]) to be honored when the call
was rewritten from a VEC_EMBED_*_DISTANCE path, otherwise they are removed
(delete(opts, k)) so direct EMBED_TEXT() calls ignore them; reference the
variables searchOpts, opts and isFromVecSearch in the comment so future readers
understand the intent of this rename-or-drop behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/expression/builtin_inference.go`:
- Around line 147-158: Add a short clarifying comment above the block that
collects and processes keys with the "`@search`" suffix explaining that entries
like "key@search" are treated as search-time-only options: if isFromVecSearch is
true they are promoted to "key" (opts[strings.TrimSuffix(k, "`@search`")] =
opts[k]) to be honored when the call was rewritten from a VEC_EMBED_*_DISTANCE
path, otherwise they are removed (delete(opts, k)) so direct EMBED_TEXT() calls
ignore them; reference the variables searchOpts, opts and isFromVecSearch in the
comment so future readers understand the intent of this rename-or-drop behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 32cbf36e-e183-4934-addb-0aaa6faec878

📥 Commits

Reviewing files that changed from the base of the PR and between b98639b and c3c302f.

📒 Files selected for processing (28)
  • pkg/executor/BUILD.bazel
  • pkg/executor/insert_common.go
  • pkg/expression/builtin_inference.go
  • pkg/expression/inference_helper.go
  • pkg/expression/inference_helper_test.go
  • pkg/expression/integration_test/integration_test.go
  • pkg/inference/embedding/base/BUILD.bazel
  • pkg/inference/embedding/base/base.go
  • pkg/inference/embedding/base/base_test.go
  • pkg/inference/embedding/cohere/cohere.go
  • pkg/inference/embedding/gemini/BUILD.bazel
  • pkg/inference/embedding/gemini/gemini.go
  • pkg/inference/embedding/gemini/gemini_test.go
  • pkg/inference/embedding/huggingface/huggingface.go
  • pkg/inference/embedding/huggingface/protocol.go
  • pkg/inference/embedding/jina/BUILD.bazel
  • pkg/inference/embedding/jina/jina.go
  • pkg/inference/embedding/jina/jina_test.go
  • pkg/inference/embedding/mock/mock.go
  • pkg/inference/embedding/mock/mock_test.go
  • pkg/inference/embedding/nvidia/BUILD.bazel
  • pkg/inference/embedding/nvidia/nvidia.go
  • pkg/inference/embedding/nvidia/nvidia_test.go
  • pkg/inference/embedding/openai/BUILD.bazel
  • pkg/inference/embedding/openai/openai.go
  • pkg/inference/embedding/openai/openai_test.go
  • pkg/inference/embedding/tidbcloud/tidbcloud_free.go
  • pkg/planner/core/expression_rewriter.go
🚧 Files skipped from review as they are similar to previous changes (20)
  • pkg/inference/embedding/base/BUILD.bazel
  • pkg/inference/embedding/openai/BUILD.bazel
  • pkg/inference/embedding/huggingface/protocol.go
  • pkg/inference/embedding/mock/mock.go
  • pkg/inference/embedding/jina/BUILD.bazel
  • pkg/inference/embedding/mock/mock_test.go
  • pkg/inference/embedding/cohere/cohere.go
  • pkg/inference/embedding/nvidia/nvidia.go
  • pkg/inference/embedding/openai/openai.go
  • pkg/inference/embedding/gemini/BUILD.bazel
  • pkg/inference/embedding/jina/jina_test.go
  • pkg/inference/embedding/nvidia/BUILD.bazel
  • pkg/inference/embedding/openai/openai_test.go
  • pkg/inference/embedding/huggingface/huggingface.go
  • pkg/planner/core/expression_rewriter.go
  • pkg/expression/inference_helper_test.go
  • pkg/inference/embedding/tidbcloud/tidbcloud_free.go
  • pkg/inference/embedding/gemini/gemini_test.go
  • pkg/expression/integration_test/integration_test.go
  • pkg/executor/insert_common.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant