refactor: committor service#1263
Conversation
# Conflicts: # magicblock-accounts/src/scheduled_commits_processor.rs # magicblock-api/src/magic_validator.rs
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR replaces the actor-based CommittorService with an intent-execution model: CommittorProcessor now correlates execution results to callers via oneshot listeners and a dispatcher; a new IntentExecutionService accepts scheduled bundles, tracks in-flight intent metadata, processes execution results, and uses an ER intent client to signal sent commits. Persister gains pending-bundle reconstruction and tests were migrated to use CommittorProcessor. Validator, MagicSysAdapter, and tickers were rewired to the new components; ComputeBudget and CommittorServiceError sets were narrowed; account-cloner and scheduled-commits processor public surfaces were removed; Cargo/dev-deps updated accordingly. Suggested reviewers
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@magicblock-api/src/magic_sys_adapter.rs`:
- Around line 39-63: Replace the enter() + tokio::spawn + drop(_guard) pattern
in fetch_current_commit_nonces_sync with scheduling the task directly onto the
stored runtime Handle: clone self.handle into a local handle (e.g. let handle =
self.handle.clone()), remove the _guard and enter()/drop calls, and call
handle.spawn(async move { ... }) to run the async block that calls
committor_processor.fetch_current_commit_nonces(&pubkeys,
min_context_slot).await and then sender.send(result). This keeps
committor_processor, pubkeys and sender moved into the async block and avoids
relying on a runtime-guard lifetime.
In `@magicblock-api/src/magic_validator.rs`:
- Around line 423-424: Remove the stale NOTE comment that says "set during
[Self::start]" for the struct field intent_execution_service and update the
comment to reflect that intent_execution_service is constructed earlier in
try_from_config and moved into the struct at construction time (or simply remove
the NOTE altogether); locate the field declaration for intent_execution_service
in the MagicValidator struct and delete or replace the misleading comment so the
code accurately documents initialization happening in try_from_config rather
than Self::start.
- Around line 1009-1016: The shutdown sequence in MagicValidator::stop is wrong:
canceling self.token before stopping the committor/intent pipeline can cause
IntentExecutionService::result_processor to exit on cancellation and drop
in‑flight committor results; reorder shutdown so the committor/intent pipeline
is drained/stopped before calling self.token.cancel() (e.g., drain
processor.subscribe_for_results() or call intent_execution_service.stop() and
await it fully while ensuring result_processor consumes outstanding messages),
and replace the ignored let _ = self.intent_execution_service.stop().await; with
proper error handling/logging (capture the Result from
intent_execution_service.stop().await and log failures via your logger) so
shutdown errors aren’t discarded.
In `@magicblock-committor-service/src/committor_processor.rs`:
- Line 165: The .expect(POISONED_MUTEX_MSG) call on the mutex lock must not
panic; replace it with proper error handling by matching the lock result or
using lock().map_err(|e| CommittorServiceError::PoisonedLock(format!("{}: {:?}",
POISONED_MUTEX_MSG, e)))? (or an equivalent CommittorServiceError variant) at
both lock sites referenced (the call using POISONED_MUTEX_MSG and the other at
the 240-240 location), returning the error up the call chain; alternatively, if
you truly guarantee the lock cannot be poisoned, add an explicit invariant
comment near the lock location documenting why a panic is acceptable and keep
tests demonstrating the invariant.
- Around line 160-193: The code inserts oneshot senders into
self.pending_result_listeners while building `receivers` but doesn't remove them
on error paths, leaking entries and causing permanent `RepeatingMessageError`
for those ids; fix by tracking which ids were inserted while iterating
`intent_bundles` (e.g., push inserted `intent.id`s into a local Vec) and on any
early failure (duplicate detection or after a failing `schedule_intent_bundles`
call) remove those ids from the `pending_result_listeners` map before returning
(or use a scope guard/RAII to automatically rollback); ensure this cleanup
references the same mutex-protected map (`pending_result_listeners`) and handles
the oneshot senders you inserted so future calls can reuse those ids.
- Around line 98-112: The code currently calls
pending_result_listeners.lock().expect(POISONED_MUTEX_MSG) in
execute_intent_bundles and dispatcher; replace those panicking expects with
proper PoisonError handling: in execute_intent_bundles (the function that
returns a Result) map or match the Mutex::lock() result and convert a
PoisonError into a propagated Err (or a clear domain-specific error) instead of
panicking; in dispatcher (the async task spawned with tokio::spawn) match the
lock result and, on poisoning, log an error via the existing logger and
gracefully return/stop the spawned task rather than calling expect. Use the
MutexGuard from Ok(...) for normal flow and reference the same
pending_result_listeners Arc<Mutex<...>> to locate the calls.
In `@magicblock-committor-service/src/service.rs`:
- Line 187: The call to self.intents_meta_map.lock().expect(POISONED_MUTEX_MSG)
uses expect on a potentially poisoned mutex; replace this panic-on-poison with
proper handling by mapping the PoisonError into a controlled Err return (or
propagate a custom error) from the surrounding function, or if you can guarantee
the invariant, add a clear comment documenting why poisoning cannot occur and
convert expect into unwrap_or_else with a documented unreachable!() guarded by
that invariant; specifically update the code paths where intents_meta_map.lock()
and the other occurrence at line ~305 are used (referencing
self.intents_meta_map.lock() and POISONED_MUTEX_MSG) to either return a Result
with an appropriate error variant or to assert the invariant in a
well-documented, reviewed comment rather than calling expect.
- Around line 184-207: When inserting ScheduledBaseIntentMeta entries into
intents_meta_map for each intent in intent_bundles, collect the inserted intent
IDs and ensure you remove those same IDs if
processor.schedule_intent_bundles(...) returns an error; specifically, capture
the inserted IDs (before calling process_undelegation_requests or immediately
after), call process_undelegation_requests, then call schedule_intent_bundles
inside a match/if-let so on Err(e) you lock intents_meta_map again and remove
each collected id (or use retain) to undo the inserts, then return the error;
reference intents_meta_map, ScheduledBaseIntentMeta::new, intent_bundles,
process_undelegation_requests, and processor.schedule_intent_bundles when
locating where to add the cleanup.
In `@magicblock-committor-service/src/service/intent_client.rs`:
- Around line 88-91: The code panics via expect when fetching the MagicContext
account (get_account(&MAGIC_CONTEXT_PUBKEY)) which must be handled or documented
as an invariant; replace the expect by explicit error handling in the function
that calls get_account (e.g.,
match/self.accounts_db.get_account(&MAGIC_CONTEXT_PUBKEY)) and return a proper
error variant (e.g., Err(Self::Error::MissingMagicContext) or map to an existing
error type) when the account is absent, or if a panic is truly intended add a
clear invariant comment next to MAGIC_CONTEXT_PUBKEY and magic_context_acc
explaining why the account is guaranteed to exist in production; ensure you
reference and update the function signature to propagate the error if needed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: edccc0c8-3e2d-4553-b65f-f3d735a7d6c9
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.locktest-integration/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (27)
magicblock-account-cloner/Cargo.tomlmagicblock-account-cloner/src/account_cloner.rsmagicblock-account-cloner/src/lib.rsmagicblock-account-cloner/src/util.rsmagicblock-accounts/Cargo.tomlmagicblock-accounts/src/errors.rsmagicblock-accounts/src/lib.rsmagicblock-accounts/src/scheduled_commits_processor.rsmagicblock-api/src/errors.rsmagicblock-api/src/magic_sys_adapter.rsmagicblock-api/src/magic_validator.rsmagicblock-api/src/tickers.rsmagicblock-committor-service/Cargo.tomlmagicblock-committor-service/src/committor_processor.rsmagicblock-committor-service/src/compute_budget.rsmagicblock-committor-service/src/error.rsmagicblock-committor-service/src/lib.rsmagicblock-committor-service/src/pubkeys_provider.rsmagicblock-committor-service/src/service.rsmagicblock-committor-service/src/service/intent_client.rsmagicblock-committor-service/src/service_ext.rsmagicblock-committor-service/src/stubs/changeset_committor_stub.rsmagicblock-committor-service/src/stubs/mod.rsmagicblock-committor-service/src/transactions.rsmagicblock-processor/tests/replay_base_slot.rstest-integration/test-committor-service/Cargo.tomltest-integration/test-committor-service/tests/test_ix_commit_local.rs
💤 Files with no reviewable changes (13)
- magicblock-committor-service/src/pubkeys_provider.rs
- magicblock-account-cloner/src/lib.rs
- magicblock-committor-service/src/stubs/mod.rs
- magicblock-accounts/src/scheduled_commits_processor.rs
- magicblock-committor-service/src/stubs/changeset_committor_stub.rs
- magicblock-accounts/Cargo.toml
- magicblock-committor-service/src/service_ext.rs
- magicblock-account-cloner/src/util.rs
- magicblock-account-cloner/src/account_cloner.rs
- magicblock-account-cloner/Cargo.toml
- magicblock-accounts/src/lib.rs
- magicblock-committor-service/src/transactions.rs
- magicblock-committor-service/src/compute_budget.rs
| fn fetch_current_commit_nonces_sync( | ||
| &self, | ||
| pubkeys: &[Pubkey], | ||
| min_context_slot: u64, | ||
| ) -> std::sync::mpsc::Receiver<TaskInfoFetcherResult<HashMap<Pubkey, u64>>> | ||
| { | ||
| let (sender, receiver) = std::sync::mpsc::channel(); | ||
| let committor_processor = self.committor_processor.clone(); | ||
| let pubkeys = pubkeys.to_owned(); | ||
|
|
||
| // This is required to switch from TransactionExecutor runtime | ||
| // blocking on it would cause a panic | ||
| let _guard = self.handle.enter(); | ||
| tokio::spawn(async move { | ||
| let result = committor_processor | ||
| .fetch_current_commit_nonces(&pubkeys, min_context_slot) | ||
| .await; | ||
| if let Err(err) = sender.send(result) { | ||
| error!(error = ?err, "Failed to send result back"); | ||
| } | ||
| }); | ||
| drop(_guard); | ||
|
|
||
| receiver | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Prefer handle.spawn over enter() + tokio::spawn + drop.
The enter()/tokio::spawn/drop(_guard) dance works, but its correctness silently depends on the guard being live exactly around the spawn call. Handle::spawn schedules directly onto the target runtime without mutating the current thread's runtime context, removing that footgun and the explanatory comment dependency.
♻️ Proposed refactor
- // This is required to switch from TransactionExecutor runtime
- // blocking on it would cause a panic
- let _guard = self.handle.enter();
- tokio::spawn(async move {
- let result = committor_processor
- .fetch_current_commit_nonces(&pubkeys, min_context_slot)
- .await;
- if let Err(err) = sender.send(result) {
- error!(error = ?err, "Failed to send result back");
- }
- });
- drop(_guard);
-
- receiver
+ // Spawn on the validator runtime handle rather than the caller's
+ // (TransactionExecutor) runtime.
+ self.handle.spawn(async move {
+ let result = committor_processor
+ .fetch_current_commit_nonces(&pubkeys, min_context_slot)
+ .await;
+ if let Err(err) = sender.send(result) {
+ error!(error = ?err, "Failed to send result back");
+ }
+ });
+
+ receiver🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@magicblock-api/src/magic_sys_adapter.rs` around lines 39 - 63, Replace the
enter() + tokio::spawn + drop(_guard) pattern in
fetch_current_commit_nonces_sync with scheduling the task directly onto the
stored runtime Handle: clone self.handle into a local handle (e.g. let handle =
self.handle.clone()), remove the _guard and enter()/drop calls, and call
handle.spawn(async move { ... }) to run the async block that calls
committor_processor.fetch_current_commit_nonces(&pubkeys,
min_context_slot).await and then sender.send(result). This keeps
committor_processor, pubkeys and sender moved into the async block and avoids
relying on a runtime-guard lifetime.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
magicblock-committor-service/src/service.rs (4)
38-40:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftTrack both worker handles in
Started.
start()launches two tasks, butStartedstores only theaccept_workerhandle. Ifresult_processorexits or panics, the service cannot observe it, andstop()cannot await it, so commit notifications can fail silently while shutdown still looks clean.Also applies to: 82-90, 132-144
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@magicblock-committor-service/src/service.rs` around lines 38 - 40, The Started variant only stores one JoinHandle so the service can’t observe or await the result_processor task; update the IntentExecutionService enum (Created/Started) so Started holds both JoinHandle<()> values (accept_worker and result_processor), modify start() to store both handles when transitioning to Started, and update stop() (and any matching pattern arms at other spots like the other mentions around the enum use) to join/await both handles and propagate or log any panic/errors from result_processor so the service can detect failures.
191-193:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject duplicate in-flight intent IDs instead of overwriting the existing metadata.
HashMap::insertreplaces the previous entry for the sameintent.id. If the same intent is accepted again before the first execution result arrives, the first result will be paired with the newer metadata, and the later result will miss the map entirely. This needs a collision check before scheduling.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@magicblock-committor-service/src/service.rs` around lines 191 - 193, The current scheduling loop uses intent_metas.insert(intent.id, ScheduledBaseIntentMeta::new(intent)) which silently overwrites existing in-flight metadata; change this to detect collisions and reject duplicates instead of replacing them: before inserting (in the function processing intent_bundles / loop over intent_bundles) check intent_metas.contains_key(&intent.id) or use the HashMap::entry API to only insert when Vacant, and return or propagate a scheduling error for that intent ID so callers know the duplicate was rejected; reference the intent_bundles iterator, intent.id, intent_metas map, and ScheduledBaseIntentMeta::new when making this change.
304-325:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep the intent metadata until
notify_commit_sentsucceeds.This code removes the map entry before awaiting the external
notify_commit_sent(...)call. On a transient client error,result_processorjust logs and continues, but the metadata is already gone, so that notification cannot be retried and is effectively lost.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@magicblock-committor-service/src/service.rs` around lines 304 - 325, The code removes the entry from intents_meta_map before awaiting intent_client.notify_commit_sent, which loses metadata on transient failures; instead, do not remove the map entry up front—use intents_meta_map.lock().expect(...).get_mut(&intent_id) (or clone the minimal data you need such as intent_meta.intent_sent_transaction and build_sent_commit from a cloned intent_meta) to prepare sent_transaction and sent_commit, then await intent_client.notify_commit_sent(sent_transaction, sent_commit), and only after notify_commit_sent returns Ok remove the entry (or mem::take the intent_meta.intent_sent_transaction and update/remove the map) so metadata remains available for retries on errors.
66-73:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid mutating the service into
Erroron invalid state calls.
self.take()swaps the enum toErrorbefore the state check runs. A secondstart()/stop()call therefore destroys the previous state and can drop the liveStartedhandle, which makes later shutdown/join impossible. Check the state first, then only extract it in the valid branch.Also applies to: 82-87
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@magicblock-committor-service/src/service.rs` around lines 66 - 73, The start() method (and similarly stop()) currently calls self.take() which replaces the enum with Self::Error before checking the state, destroying a live Started handle; instead, first pattern-match on &self (or use matches/if let) to verify it's in Self::Created (or the expected state) and only then call take() to move the value out; update the start() function to return Err(IntentExecutionServiceError::InvalidState(...)) when the pattern check fails without mutating self, and apply the same change to the corresponding stop() branch referenced in the comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@magicblock-committor-service/src/service.rs`:
- Around line 38-40: The Started variant only stores one JoinHandle so the
service can’t observe or await the result_processor task; update the
IntentExecutionService enum (Created/Started) so Started holds both
JoinHandle<()> values (accept_worker and result_processor), modify start() to
store both handles when transitioning to Started, and update stop() (and any
matching pattern arms at other spots like the other mentions around the enum
use) to join/await both handles and propagate or log any panic/errors from
result_processor so the service can detect failures.
- Around line 191-193: The current scheduling loop uses
intent_metas.insert(intent.id, ScheduledBaseIntentMeta::new(intent)) which
silently overwrites existing in-flight metadata; change this to detect
collisions and reject duplicates instead of replacing them: before inserting (in
the function processing intent_bundles / loop over intent_bundles) check
intent_metas.contains_key(&intent.id) or use the HashMap::entry API to only
insert when Vacant, and return or propagate a scheduling error for that intent
ID so callers know the duplicate was rejected; reference the intent_bundles
iterator, intent.id, intent_metas map, and ScheduledBaseIntentMeta::new when
making this change.
- Around line 304-325: The code removes the entry from intents_meta_map before
awaiting intent_client.notify_commit_sent, which loses metadata on transient
failures; instead, do not remove the map entry up front—use
intents_meta_map.lock().expect(...).get_mut(&intent_id) (or clone the minimal
data you need such as intent_meta.intent_sent_transaction and build_sent_commit
from a cloned intent_meta) to prepare sent_transaction and sent_commit, then
await intent_client.notify_commit_sent(sent_transaction, sent_commit), and only
after notify_commit_sent returns Ok remove the entry (or mem::take the
intent_meta.intent_sent_transaction and update/remove the map) so metadata
remains available for retries on errors.
- Around line 66-73: The start() method (and similarly stop()) currently calls
self.take() which replaces the enum with Self::Error before checking the state,
destroying a live Started handle; instead, first pattern-match on &self (or use
matches/if let) to verify it's in Self::Created (or the expected state) and only
then call take() to move the value out; update the start() function to return
Err(IntentExecutionServiceError::InvalidState(...)) when the pattern check fails
without mutating self, and apply the same change to the corresponding stop()
branch referenced in the comment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b8f60698-abd1-4c6f-9243-c39ad655f55e
📒 Files selected for processing (1)
magicblock-committor-service/src/service.rs
# Conflicts: # magicblock-accounts/src/scheduled_commits_processor.rs # magicblock-api/src/magic_validator.rs # magicblock-committor-service/src/committor_processor.rs # magicblock-committor-service/src/service.rs # test-integration/test-committor-service/tests/test_ix_commit_local.rs
# Conflicts: # magicblock-api/src/magic_validator.rs # magicblock-committor-service/src/committor_processor.rs
# Conflicts: # Cargo.lock # test-integration/Cargo.lock
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@magicblock-committor-service/src/tasks/task_strategist.rs`:
- Around line 22-23: The temporary field `standalone_action_nonce` is public but
marked for removal; change its visibility to `pub(crate)` to avoid expanding the
public API and add a short doc comment above `standalone_action_nonce`
explaining its purpose, that it is temporary, and referencing the TODO for
removal (e.g., "Temporary nonce used for standalone actions; kept for
migration/testing; remove once X is implemented"). Update the field declaration
in the same struct that contains `standalone_action_nonce` to reflect
`pub(crate)` and include the documentation line(s).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7d1fab1b-54f6-457b-b2a2-6774a7ad0b93
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.locktest-integration/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
magicblock-account-cloner/Cargo.tomlmagicblock-account-cloner/src/lib.rsmagicblock-committor-service/src/tasks/task_strategist.rs
💤 Files with no reviewable changes (1)
- magicblock-account-cloner/Cargo.toml
# Conflicts: # magicblock-committor-service/src/transactions.rs
refactor(committor-service): simplify code, remove unused and dead code
Summary
Simplifies
magicblock-committor-serviceby removing the Actor pattern (CommittorService/CommittorActor/CommittorMessage/BaseIntentCommittor/CommittorServiceExt) and replacing it with direct use ofCommittorProcessorand a newIntentExecutionService, aligning the committor service with the same service model already used by task-scheduler and actions_callback. No business logic was changed.Breaking Changes
Test Plan
test-integration/test-committor-serviceupdated to use the newCommittorProcessorAPI directly and pass unchanged.Summary by CodeRabbit
Refactor
New Features
Tests