Skip to content

feat(batch): support external device task workflow#78

Merged
shark0F0497 merged 7 commits into
mainfrom
keystone-worktree2
May 13, 2026
Merged

feat(batch): support external device task workflow#78
shark0F0497 merged 7 commits into
mainfrom
keystone-worktree2

Conversation

@shark0F0497
Copy link
Copy Markdown
Collaborator

Pull Request Checklist

Please ensure your PR meets the following requirements:

  • Code follows the style guidelines
  • Tests pass locally
  • Code is formatted
  • Documentation updated if needed
  • Commit messages follow conventional commits
  • PR description is complete and clear

Summary

This PR adds the Keystone backend support for an external-device operator workflow, including collector-owned batch task completion by task group and the design documentation for the mobile handoff payload.


Motivation

  • Some data collection devices run their own recording and upload program and do not depend on Axon recorder or Axon transfer.
  • Keystone still needs to manage planned production work, enforce order quotas, and let data collectors mark planned tasks complete from a mobile workflow.
  • The operator-facing handoff now needs stable task-group metadata instead of a raw batch_id.

Changes

Modified Files

  • [internal/api/handlers/batch.go](internal/api/handlers/batch.go) - Adds data-collector task-group completion for planned batch tasks and enforces dispatch/adjustment quota boundaries.
  • [internal/api/handlers/batch_test.go](internal/api/handlers/batch_test.go) - Adds and updates coverage for external-device completion, task-group validation, quota enforcement, ownership checks, and terminal batch rejection.
  • [internal/api/handlers/auth.go](internal/api/handlers/auth.go) - Includes workstation robot type capability metadata in /auth/me so Synapse can select the correct operator workflow.
  • [internal/api/handlers/order.go](internal/api/handlers/order.go) - Aligns order completion and open-batch finalization behavior with the planned-task workflow.
  • [internal/server/server.go](internal/server/server.go) - Registers the collector completion route.
  • [docs/designs/production-units.md](docs/designs/production-units.md) - Updates production-unit quota semantics for task-row-based dispatch.
  • [docs/designs/task-manage.md](docs/designs/task-manage.md) - Updates task management behavior for external-device completion.

Added Files

  • [docs/designs/external-device-mobile-task-flow.md](docs/designs/external-device-mobile-task-flow.md) - Documents the external-device mobile task flow, backend behavior, copy payload shape, and acceptance criteria.

Deleted Files

  • None

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update (documentation changes only)
  • Refactoring (code improvement without functional changes)
  • Performance improvement (code changes that improve performance)
  • Test changes (adding, modifying, or removing tests)

Impact Analysis

Breaking Changes

None

Backward Compatibility

Fully backward compatible. Existing robot types continue to use the Axon workflow unless robot_types.capabilities.requires_axon is explicitly set to false.


Testing

Test Environment

  • Local development environment
  • Go test suite from keystone-worktree2

Test Cases

  • Unit tests pass locally
  • Integration tests pass locally
  • E2E tests pass (if applicable)
  • Manual testing completed

Manual Testing Steps

No manual browser or device testing was performed for this backend PR.

Test Coverage

  • New tests added
  • Existing tests updated
  • Coverage maintained or improved

Command run:

go test ./...

Screenshots / Recordings

Not applicable for this backend and documentation PR.


Performance Impact

  • Memory usage: No change
  • CPU usage: No change
  • Throughput: No change expected
  • Lock contention: Slightly increased only during the new batch task completion transaction, where row locking is intentional to prevent concurrent over-completion.

Documentation


Related Issues

  • Fixes #
  • Related to #
  • Refers to #

Additional Notes

  • The external-device workflow completes planned tasks only; it does not create additional task rows.
  • The task-group copy payload is intentionally documented as a stable one-line JSON string for the external device parser.

Reviewers

@reviewer1 @reviewer2


Notes for Reviewers

  • Please review the changes to [internal/api/handlers/batch.go](internal/api/handlers/batch.go).
  • Focus on collector ownership checks, task-group validation, transaction locking, and order finalization behavior.
  • Verify the documented copy payload remains stable for the Synapse operator workflow.

Checklist for Reviewers

  • Code changes are correct and well-implemented
  • Tests are adequate and pass
  • Documentation is updated and accurate
  • No unintended side effects
  • Performance impact is acceptable
  • Backward compatibility maintained (if applicable)

Comment thread internal/api/handlers/batch.go Outdated
@@ -1198,7 +1554,7 @@ func (h *BatchHandler) AdjustBatchTasks(c *gin.Context) {
}
var batch batchStatusRow
if err := tx.Get(&batch,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL: AdjustBatchTasks acquires the batch row lock before acquiring the order row lock. tryAdvanceOrderStatus (called by CompleteTasks) locks the order first and then updates batches. This inconsistent lock ordering can cause deadlocks when both operations run concurrently on the same order and batch (each holds one lock and waits for the other). Fix: always lock the order before the batch. For example, first read the batch's order_id using a non-locking SELECT, then lock the order with FOR UPDATE, and finally lock the batch with FOR UPDATE.


if info.Status == "in_progress" && completedCount == info.TargetCount {
if info.Status == "in_progress" && completedCount >= info.TargetCount {
if _, err := tx.Exec(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HIGH: After executing the UPDATE to set order status='completed', the code does not check RowsAffected(). If another concurrent request already completed the order, this UPDATE will affect 0 rows, but the function still calls finalizeOpenBatchesAfterOrderCompletedTx and sends recorder notifications. This causes duplicate finalization and spurious device cancellations. Fix: capture sql.Result, check RowsAffected(), and return early if 0 rows were updated.

@kilo-code-bot
Copy link
Copy Markdown

kilo-code-bot Bot commented May 13, 2026

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 1
HIGH 1
Issue Details (click to expand)

CRITICAL

File Line Issue
internal/api/handlers/batch.go 1556 AdjustBatchTasks locks batch before order, creating deadlock risk with tryAdvanceOrderStatus which locks order first. Should lock order before batch.

HIGH

File Line Issue
internal/api/handlers/order.go 1026 tryAdvanceOrderStatus does not check rows affected after updating order to completed. If another request already completed the order, this still triggers finalization and duplicate recorder notifications. Check RowsAffected and exit early if 0.
Other Observations (not in diff)
  • The created→in_progress transition in tryAdvanceOrderStatus also lacks rows-affected check (less severe but should be fixed for consistency).
  • Test coverage for concurrent scenarios is limited but acceptable.
Files Reviewed (8 files)
  • docs/designs/external-device-mobile-task-flow.md (new)
  • docs/designs/production-units.md
  • docs/designs/task-manage.md
  • internal/api/handlers/auth.go
  • internal/api/handlers/batch.go
  • internal/api/handlers/batch_test.go
  • internal/api/handlers/order.go
  • internal/server/server.go

Fix Link: Fix these issues in Kilo Cloud


Reviewed by step-3.5-flash · 2,439,796 tokens

@shark0F0497 shark0F0497 merged commit ff800d5 into main May 13, 2026
6 checks passed
@shark0F0497 shark0F0497 deleted the keystone-worktree2 branch May 13, 2026 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant