Harden recorder state sync and cloud sync scheduling #85
Merged
Pull Request Checklist
Please ensure your PR meets the following requirements:
Summary
This PR hardens recorder/transfer state coordination around reconnects, timeouts, and connection takeover, adds device state streaming support, improves timeout logging, and introduces an opt-in cloud sync auto-scan toggle.
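To make the connection takeover idea concrete, here is a minimal sketch, not the code from this PR. It assumes a hub that keeps at most one live connection per device, where a late close from a replaced connection must not evict its successor. The names `hub`, `conn`, `Register`, `Unregister`, and `Current` are hypothetical and do not come from `transfer_hub.go`.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a device connection (not the PR's type).
type conn struct {
	id       int
	replaced bool // set when a newer connection has taken over
}

// hub keeps at most one live connection per device ID.
type hub struct {
	mu    sync.Mutex
	next  int
	conns map[string]*conn
}

func newHub() *hub { return &hub{conns: make(map[string]*conn)} }

// Register installs a new connection for deviceID and returns it together
// with the connection it replaced (nil if there was none).
func (h *hub) Register(deviceID string) (cur, old *conn) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.next++
	cur = &conn{id: h.next}
	old = h.conns[deviceID]
	if old != nil {
		old.replaced = true
	}
	h.conns[deviceID] = cur
	return cur, old
}

// Unregister removes c only if it is still the current connection, so a
// stale close from a replaced connection cannot evict its successor.
func (h *hub) Unregister(deviceID string, c *conn) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.conns[deviceID] != c {
		return false
	}
	delete(h.conns, deviceID)
	return true
}

// Current returns the live connection for deviceID, or nil.
func (h *hub) Current(deviceID string) *conn {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.conns[deviceID]
}

func main() {
	h := newHub()
	first, _ := h.Register("dev-1")
	second, old := h.Register("dev-1") // reconnect takes over

	fmt.Println(old == first, first.replaced) // true true
	fmt.Println(h.Unregister("dev-1", first)) // false: stale close is ignored
	fmt.Println(h.Current("dev-1") == second) // true
}
```

Comparing the pointer on unregister is one simple way to reject stale closes; a generation counter per device would work equally well under the same assumption.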
Motivation
Changes
Modified Files
internal/api/handlers/axon_rpc.go - Harden recorder RPC handling, state confidence, reconnect reconciliation, and stale state protections.
internal/api/handlers/task.go - Improve task callback and recorder/transfer coordination behavior.
internal/api/handlers/transfer.go - Handle transfer connection takeover and replaced messages more safely.
internal/services/recorder_hub.go - Track recorder state and connection lifecycle more defensively.
internal/services/transfer_hub.go - Add transfer hub support for connection takeover and stale connection handling.
internal/services/device_state_broker.go - Add device state broadcast infrastructure.
internal/api/handlers/device_state_stream.go - Expose device state stream handling.
internal/cloud/* - Add timeout logging helpers and improve cloud client timeout visibility.
internal/config/config.go - Add KEYSTONE_SYNC_AUTO_SCAN_ENABLED, defaulting to false.
internal/services/sync_worker.go - Gate only newly eligible episode auto-discovery behind the new auto-scan setting while keeping pending/retry processing active.
internal/api/handlers/sync.go - Return auto_scan_enabled from /api/v1/sync/config.
cmd/keystone-edge/main.go - Pass the auto-scan setting into the sync worker and include it in startup logs.
README.md and ARCHITECTURE.md - Document manual-first cloud sync scheduling.
docs/designs/* - Document recorder interaction tests and cloud sync scheduling behavior.
Added Files
internal/api/handlers/recorder_axon_interaction_test.go - Adds recorder/Axon interaction coverage.
internal/api/handlers/task_state_recovery_test.go - Adds task state recovery coverage.
internal/api/handlers/transfer_connection_takeover_test.go - Adds transfer connection takeover coverage.
internal/services/device_state_broker_test.go - Adds device state broker tests.
internal/cloud/timeout_log.go - Adds shared timeout logging helpers.
docs/designs/recorder-axon-interaction-test-plan.md - Documents recorder interaction test coverage.
docs/designs/recorder-axon-interaction-tests.html - Adds human-readable recorder interaction test documentation.
Deleted Files
None.
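The auto-scan gating described for `internal/config/config.go` and `internal/services/sync_worker.go` can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: only the `KEYSTONE_SYNC_AUTO_SCAN_ENABLED` variable and its `false` default come from the description above, while `autoScanEnabled`, `worker`, `episode`, and `tick` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// autoScanEnabled reads KEYSTONE_SYNC_AUTO_SCAN_ENABLED through getenv.
// Unset or unparsable values fall back to false, matching the documented
// default. The fallback for unparsable values is an assumption.
func autoScanEnabled(getenv func(string) string) bool {
	v, err := strconv.ParseBool(getenv("KEYSTONE_SYNC_AUTO_SCAN_ENABLED"))
	return err == nil && v
}

// episode is a hypothetical queue entry.
type episode struct {
	ID string
}

// worker is a hypothetical, heavily simplified sync worker.
type worker struct {
	autoScan bool
	pending  []episode        // already queued or awaiting retry
	discover func() []episode // finds newly eligible episodes
}

// tick returns the episodes to process in one cycle. Pending and retry work
// is always included; discovery of new episodes runs only with auto-scan on.
func (w *worker) tick() []episode {
	batch := append([]episode(nil), w.pending...)
	if w.autoScan {
		batch = append(batch, w.discover()...)
	}
	return batch
}

func main() {
	w := &worker{
		autoScan: autoScanEnabled(os.Getenv),
		pending:  []episode{{ID: "ep-retry"}},
		discover: func() []episode { return []episode{{ID: "ep-new"}} },
	}
	for _, e := range w.tick() {
		fmt.Println("processing", e.ID)
	}
}
```

Passing `getenv` as a function keeps the parsing testable without touching the process environment; with the variable unset, only `ep-retry` is processed in this sketch.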
Type of Change
Impact Analysis
Breaking Changes
None.
Backward Compatibility
Fully backward compatible. Cloud sync remains available when KEYSTONE_SYNC_ENABLED=true; only automatic discovery of newly eligible episodes is now opt-in through KEYSTONE_SYNC_AUTO_SCAN_ENABLED=true.
Testing
Test Environment
/usr/local/go/bin/go
Test Cases
Manual Testing Steps
/usr/local/go/bin/go fmt ./...
/usr/local/go/bin/go test ./...
git diff --check passes.
Test Coverage
Screenshots / Recordings
Not applicable.
Performance Impact
Documentation
Related Issues
Additional Notes
KEYSTONE_SYNC_AUTO_SCAN_ENABLED defaults to false, so newly approved unsynced episodes are no longer auto-discovered unless explicitly enabled.
Reviewers
@
Notes for Reviewers
internal/api/handlers/axon_rpc.go and internal/services/recorder_hub.go.
internal/services/sync_worker.go to confirm manual sync and retry recovery remain active while auto-scan is disabled.
Checklist for Reviewers