feat(metrics): migrate app/legacyabci, loadtest, utils/logging, wasmbinding to OpenTelemetry by amir-deris · Pull Request #3446 · sei-protocol/sei-chain

amir-deris · 2026-05-15T21:54:43Z

Adds OTel instrumentation to app/legacyabci, loadtest, utils/logging, and wasmbinding following the same pattern as PLT-329, PLT-330, and PLT-339 (#3439).

New instruments

app/legacyabci (meter legacyabci)

begin_blocker_duration — histogram, seconds, fine-grained buckets; dual-emit TODO(PLT-343)
ibc_begin_blocker_duration — histogram, seconds, fine-grained buckets; dual-emit TODO(PLT-343)
tx_duration — histogram, seconds, mode label (check/recheck/deliver); dual-emit TODO(PLT-343)

loadtest (meter loadtest)

produce — counter, msg_type label (replaces metrics.IncrProducerEventCount)
consume — counter, msg_type label (replaces metrics.IncrConsumerEventCount)
tps — gauge, msg_type label (replaces metrics.SetThroughputMetricByType)

utils/logging (meter utils_logging)

log_not_done_after — counter, label label (replaces metrics.IncrLogIfNotDoneAfter)

wasmbinding (meter wasmbinding)

wasm_query_association_error — counter, scenario + type labels (replaces metrics.IncrementErrorMetrics)
wasm_query_sdk_error — counter, scenario + codespace + code labels (new; fires for any structured SDK error that is not an association error)

Notes

loadtest and utils/logging (only used for tests) are direct replacements with no dual-emit; the legacy calls are removed entirely.
app/legacyabci uses dual-emit with TODO(PLT-343) comments pending dashboard verification.
utils/panic.MetricsPanicCallback was unused and removed as part of this cleanup.

cursor · 2026-05-15T21:55:08Z

PR Summary

Medium Risk
Touches BeginBlock, CheckTx, and DeliverTx hot paths to add OTel recording (with dual-emitted legacy telemetry), so any mistakes could impact node performance or metric cardinality, though business logic is otherwise unchanged.

Overview
Migrates instrumentation in app/legacyabci to OpenTelemetry by introducing begin_blocker_duration, ibc_begin_blocker_duration, and tx_duration histograms and recording them inside the existing defers for BeginBlock, IBC begin-block, and tx processing (check/recheck/deliver), while temporarily dual-emitting the prior telemetry.* measurements behind TODO(PLT-343).

Updates loadtest to emit OTel produce/consume counters and tps gauge (tagged by msg_type), and updates utils/logging.LogIfNotDoneAfter to increment a new OTel log_not_done_after counter instead of the legacy helper.

Adds OTel error counters for wasmbinding queries and routes query error recording through recordQueryError, while removing several unused legacy metric helpers (including utils/panic.MetricsPanicCallback) and deleting the old app/ante/metrics.go pending-nonce metric.

^{Reviewed by Cursor Bugbot for commit 9685149. Bugbot is set up for automated code reviews on this repo. Configure here.}

github-actions · 2026-05-15T21:55:42Z

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`✅ passed`	`✅ passed`	`✅ passed`	May 19, 2026, 5:06 PM

codecov · 2026-05-15T21:59:38Z

Codecov Report

❌ Patch coverage is 46.37681% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.34%. Comparing base (25a36cb) to head (9685149).
⚠️ Report is 1 commit behind head on main.

Files with missing lines	Patch %	Lines
wasmbinding/metrics.go	26.31%	13 Missing and 1 partial ⚠️
app/legacyabci/check_tx.go	0.00%	11 Missing ⚠️
loadtest/metrics.go	0.00%	4 Missing ⚠️
app/legacyabci/metrics.go	50.00%	1 Missing and 1 partial ⚠️
loadtest/loadtest_client.go	0.00%	2 Missing ⚠️
utils/logging/metrics.go	50.00%	1 Missing and 1 partial ⚠️
loadtest/main.go	0.00%	1 Missing ⚠️
wasmbinding/queries.go	50.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3446      +/-   ##
==========================================
+ Coverage   59.32%   59.34%   +0.01%     
==========================================
  Files        2127     2130       +3     
  Lines      175898   175859      -39     
==========================================
+ Hits       104353   104364      +11     
+ Misses      62457    62404      -53     
- Partials     9088     9091       +3

Flag	Coverage Δ
sei-chain-pr	`23.54% <46.37%> (?)`
sei-db	`70.41% <ø> (ø)`


Files with missing lines	Coverage Δ
app/legacyabci/begin_block.go	`100.00% <100.00%> (ø)`
app/legacyabci/deliver_tx.go	`96.05% <100.00%> (+0.21%)`	⬆️
utils/logging/time.go	`100.00% <100.00%> (ø)`
utils/metrics/metrics_util.go	`86.36% <ø> (+19.32%)`	⬆️
utils/panic.go	`78.57% <ø> (+37.83%)`	⬆️
loadtest/main.go	`0.00% <0.00%> (ø)`
wasmbinding/queries.go	`27.94% <50.00%> (-0.43%)`	⬇️
app/legacyabci/metrics.go	`50.00% <50.00%> (ø)`
loadtest/loadtest_client.go	`0.00% <0.00%> (ø)`
utils/logging/metrics.go	`50.00% <50.00%> (ø)`
... and 3 more


amir-deris · 2026-05-15T22:35:35Z

 			// Generate a message type first
 			messageType := c.getRandomMessageType(config.MessageTypes)
-			metrics.IncrProducerEventCount(messageType)
+			loadtestMetrics.produceCount.Add(context.Background(), 1, otelmetric.WithAttributes(attribute.String("msg_type", messageType)))


For loadtest, I replaced the legacy metrics outright: loadtest is test-only, so there is no need to dual-emit.

amir-deris · 2026-05-15T22:36:42Z

 			// reraise panic in main goroutine
 			panic(err)
 		case <-time.After(after):
-			metrics.IncrLogIfNotDoneAfter(label)


LogIfNotDoneAfter is only used in tests, so there is no need to keep the older metric.

amir-deris · 2026-05-15T22:37:05Z

-// Metric Names:
-//
-//	sei_log_not_done_after_counter
-func IncrLogIfNotDoneAfter(label string) {


Removed functions that were only used in tests.

amir-deris · 2026-05-15T22:37:28Z

 	}
 }

-func MetricsPanicCallback(err any, ctx sdk.Context, key string) {


Unused function.


masih · 2026-05-19T08:56:09Z

+		wasmQueryMetrics.sdkError.Add(ctx, 1, metric.WithAttributes(
+			attribute.String("scenario", scenario),
+			attribute.String("codespace", codespace),
+			attribute.String("code", fmt.Sprintf("%d", code)),


Why convert to string? Attributes can have numeric types.

What is the cardinality of code? For example, if we read this code over the wire, we mustn't add it as a metric attribute, since that opens the door to arbitrary memory growth.

That is great feedback! I will remove the code attribute to keep cardinality manageable, while keeping scenario and codespace, which tell us which query type is failing in which module.


cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-05-19T17:09:07Z

-	"github.com/sei-protocol/sei-chain/utils/metrics"
 	tokenfactorytypes "github.com/sei-protocol/sei-chain/x/tokenfactory/types"
+	"go.opentelemetry.io/otel/attribute"
+	otelmetric "go.opentelemetry.io/otel/metric"


Loadtest OTel provider never configured

Medium Severity

The loadtest binary records produce, consume, and tps via OpenTelemetry but never configures a global MeterProvider (unlike seid, which calls SetupOtelMetricsProvider). Instruments use the default no-op provider, so those metrics are dropped and no longer show up on the loadtest /metrics scrape after the legacy telemetry helpers were removed.

Additional Locations (2)

loadtest/loadtest_client.go#L197-L198

loadtest/main.go#L276-L277
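The failure mode Bugbot describes can be modelled in a few lines. The sketch below is a deliberately simplified stand-in and not the OpenTelemetry API: `meterProvider`, `noopProvider`, `memoryProvider`, `setProvider`, and `produce` are all hypothetical names, and the real global provider in OTel Go behaves in more involved ways than a single package-level variable.

```go
package main

import "fmt"

// meterProvider is a hypothetical, much-simplified provider that only
// accepts counter increments.
type meterProvider interface {
	add(name string, n int64)
}

// noopProvider silently discards every recording.
type noopProvider struct{}

func (noopProvider) add(string, int64) {}

// memoryProvider keeps totals so a scrape would have something to report.
type memoryProvider struct{ totals map[string]int64 }

func (m *memoryProvider) add(name string, n int64) { m.totals[name] += n }

// global defaults to the no-op provider until the binary installs a real one.
var global meterProvider = noopProvider{}

func setProvider(p meterProvider) { global = p }

// produce mimics an instrumented call site in the load generator.
func produce() { global.add("produce", 1) }

func main() {
	mem := &memoryProvider{totals: map[string]int64{}}

	produce() // dropped: no provider has been configured yet
	setProvider(mem)
	produce() // recorded

	fmt.Println("produce total:", mem.totals["produce"]) // 1
}
```

In this model the call site never errors, which is why a missing provider is easy to overlook: the instrumented code runs normally and the data simply never reaches a scrape endpoint.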


I'm not spending much time on loadtest, since sei-load is the preferred load-testing framework.

Added metrics for legacyabci, loadtest, utils and wasmbinding

8d71861

amir-deris self-assigned this May 15, 2026

amir-deris added the non-app-hash-breaking label May 15, 2026

amir-deris changed the title ~~Added metrics for legacyabci, loadtest, utils and wasmbinding~~ feat(metrics): migrate app/legacyabci, loadtest, utils/logging, wasmbinding to OpenTelemetry May 15, 2026

cursor Bot reviewed May 15, 2026


Comment thread wasmbinding/queries.go Outdated

Comment thread utils/metrics/metrics_util.go

amir-deris added 2 commits May 15, 2026 15:05

Added the dual emit metric for wasmbinding

45b43f9

Code cleanup

9ccc1c6

cursor Bot reviewed May 15, 2026


Comment thread wasmbinding/metrics.go Outdated

Removed dual import

bc0a6eb

amir-deris requested review from bdchatham and masih May 15, 2026 22:33

amir-deris commented May 15, 2026


Merge branch 'main' into amir/plt-342-migrate-appLegacyAbci-loadTest-…

d0f0022

…and-some-others-Otel

bdchatham approved these changes May 18, 2026


masih approved these changes May 19, 2026


amir-deris added 2 commits May 19, 2026 10:03

Removed code from query error metric, cleaned up some unused methods

cbe1714

Merge branch 'main' into amir/plt-342-migrate-appLegacyAbci-loadTest-…

9685149

…and-some-others-Otel

cursor Bot reviewed May 19, 2026


amir-deris added this pull request to the merge queue May 19, 2026

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 19, 2026

amir-deris added this pull request to the merge queue May 19, 2026

Merged via the queue into main with commit 77504a6 May 19, 2026
42 checks passed

amir-deris deleted the amir/plt-342-migrate-appLegacyAbci-loadTest-and-some-others-Otel branch May 19, 2026 18:29


3 participants
