Introduce kv slo workload for userver#621
Conversation
Co-authored-by: Artem Ermoshkin <shfdis@yandex-team.ru>
There was a problem hiding this comment.
AI Review Summary
Verdict: \u274c 1 critical issue(s) found
Critical issues
- Critical | Medium: Division by zero when
maxIdis 0 \u2014tests/slo_workloads/userver/key_value/run.cpp:155
Other findings
- Major | High: Signal handlers call non-async-signal-safe functions (TStringBuilder, Cout) \u2014
tests/slo_workloads/userver/key_value/run.cpp:130-138 - Major | High: Blocking OS-level sleep (
SleepUntil) inside userver coroutines starves worker threads \u2014tests/slo_workloads/userver/key_value/run.cpp:186(viaTRpsProvider::Use()) - Major | Medium: Unbounded task vector growth during long runs (600s x 1000 rps = ~600k handles) \u2014
tests/slo_workloads/userver/key_value/run.cpp:180 - Minor | High: Uninitialized variable
result\u2014tests/slo_workloads/utils/utils_main.cpp:121 - Minor | Medium: Division by zero in
PrintStatisticswhentimePassedis 0 \u2014tests/slo_workloads/utils/statistics.cpp:121(pre-existing, not in diff)
This review was generated automatically. Critical issues require attention; other findings are advisory.
If this comment was useful, please give it a \U0001f44d \u2014 it helps us improve the review bot.
| auto readRps = runOptions.Read_rps; | ||
| auto readTimeout = runOptions.ReadTimeout; | ||
| auto maxInfly = runOptions.CommonOptions.MaxInfly; | ||
| auto objectIdRange = static_cast<std::uint32_t>(maxId * 1.25); |
There was a problem hiding this comment.
Severity: Critical | Confidence: Medium\n\nIf maxId is 0 (table empty or initial population failed), objectIdRange becomes 0, and RandomNumber<std::uint32_t>() % objectIdRange on line 183 is a division by zero (undefined behavior, typically SIGFPE crash).\n\nSuggested fix: auto objectIdRange = std::max(1u, static_cast<std::uint32_t>(maxId * 1.25));
| ShouldStop.store(false); | ||
| signal(SIGUSR1, [](int) { | ||
| Cout << TInstant::Now().ToRfc822StringLocal() << " SIGUSR1 handle" << Endl; | ||
| if (GlobalReporter) { |
There was a problem hiding this comment.
Severity: Major | Confidence: High\n\nThe SIGUSR1 handler calls GlobalReporter->ShowProgress() which uses TStringBuilder, Cout, and accesses shared atomics/data. These are not async-signal-safe functions. The SIGINT handler on line 137 also uses Cerr. This is undefined behavior per POSIX.\n\nConsider limiting signal handlers to setting an atomic flag, then checking it in the main loop. Alternatively, use a self-pipe or signalfd to handle signals within the userver event loop.
| std::uint32_t idToSelect = RandomNumber<std::uint32_t>() % objectIdRange; | ||
| const std::uint32_t objectIdKey = GetHash(idToSelect); | ||
|
|
||
| rpsProvider.Use(); |
There was a problem hiding this comment.
Severity: Major | Confidence: High\n\nTRpsProvider::Use() (from utils.cpp) calls SleepUntil() which is a blocking OS-level sleep. This lambda runs inside userver::engine::AsyncNoSpan, so it's a userver coroutine. A blocking sleep blocks the entire worker thread instead of yielding the coroutine. With only 4 worker threads (hardcoded in userver_utils.cpp:141), the read and write RPS limiter loops will each block one thread, leaving only 2 threads for actual request dispatch.\n\nSuggested fix: Replace SleepUntil calls in the RPS limiter with userver::engine::SleepUntil or userver::engine::SleepFor so the coroutine yields instead of blocking the OS thread. This likely requires a userver-specific RPS provider implementation.
|
|
||
| semaphore.lock_shared(); | ||
|
|
||
| tasks.push_back(userver::engine::AsyncNoSpan( |
There was a problem hiding this comment.
Severity: Major | Confidence: Medium\n\nBoth the read loop (lines 180-208) and write loop (lines 256-283) push TaskWithResult<void> into a tasks vector without ever removing completed tasks during execution. They are only waited on at the end. For a 600-second run at 1000 read rps, this accumulates ~600,000 task handles in memory, each holding engine-internal state.\n\nConsider periodically draining completed tasks (e.g., batch-wait and clear every N iterations), or use a counter + semaphore pattern instead of accumulating handles.
| StartStatCollecting(driver, statConfigFile); | ||
|
|
||
| TDatabaseOptions dbOptions{ driver, prefix }; | ||
| int result; |
There was a problem hiding this comment.
Severity: Minor | Confidence: High\n\nint result; is uninitialized. While every code path through the switch sets it before use, some compilers will emit -Wmaybe-uninitialized warnings. The userver variant (userver_utils.cpp:133) correctly initializes this to EXIT_FAILURE.\n\nSuggested fix: int result = EXIT_FAILURE;
|
Analysis performed by claude, claude-opus-4-6. |
No description provided.