[superlog] Log batch_union fallback as WARN instead of ERROR#502
[superlog] Log batch_union fallback as WARN instead of ERROR#502superlog-app[bot] wants to merge 7 commits into
Conversation
…-markdown-evals [codex] Improve Slack insight digest markdown
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
|
The latest updates on your projects. Learn more about Unkey Deploy
|
| if (requestLog) { | ||
| if (payload) { | ||
| requestLog.warn(err.message, payload); | ||
| } else { | ||
| requestLog.warn(err.message); | ||
| } | ||
| return; | ||
| } |
There was a problem hiding this comment.
captureWarning passes the string err.message to requestLog.warn(), while captureError passes the full Error object (err) to requestLog.error(). This drops the stack trace from warn-level logs, so when the fallback fires it will be harder to determine where in ClickHouse (or the query compilation path) the union failure originated. Passing the Error object is consistent with the existing captureError pattern and most logger APIs accept an Error as the first argument to warn just as well as to error.
| if (requestLog) { | |
| if (payload) { | |
| requestLog.warn(err.message, payload); | |
| } else { | |
| requestLog.warn(err.message); | |
| } | |
| return; | |
| } | |
| if (requestLog) { | |
| if (payload) { | |
| requestLog.warn(err, payload); | |
| } else { | |
| requestLog.warn(err); | |
| } | |
| return; | |
| } |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
The
POST /v1/queryendpoint logs at ERROR level whenever the ClickHouse UNION ALL batch optimization fails, even though the code immediately falls back to individual queries and always returns HTTP 200 with data. This produces noisy incidents and can mask real errors.The root cause is that
captureError()is called in therunGroupcatch block before the fallback runs.captureErrorcallsrequestLog.error(), which promotes the entire wide-event log to ERROR level regardless of the recovery that follows.This patch adds a
captureWarninghelper topackages/ai/src/lib/tracing.tsthat callsrequestLog.warn()instead ofrequestLog.error(), and swapscaptureErrorforcaptureWarningin the batch union catch block ofpackages/ai/src/query/batch-executor.ts. Thebatch_union_fallback,batch_union_error, andoperationfields are still recorded — the only change is the log level for these recovered failures.An alternative approach would be to omit any telemetry call in the catch block and rely solely on the
mergeWideEvent({ batch_union_fallback: 1 })fields for monitoring fallback rate, which would produce cleaner INFO-level logs. The warn approach was chosen to preserve the ability to detect when the fallback fires via log-level filtering.Incident on Superlog
Was this PR helpful? Leave feedback — goes straight to the Superlog team.
Summary by cubic
Log batch-union fallback as WARN instead of ERROR in POST /v1/query to reduce noisy incidents while keeping visibility and telemetry intact.
captureWarninginpackages/ai/src/lib/tracing.tsto log recovered failures withrequestLog.warn(...).captureErrorwithcaptureWarninginpackages/ai/src/query/batch-executor.tsforbatch_unionfailures; retainedbatch_union_fallback,batch_union_error, andoperationfields.Written for commit 8d891ce. Summary will update on new commits.