Coordinate V1 production cutover and human-gate closeout #71

@alexeygrigorev

Status: blocked
Tags: enhancement, portal, work-engine, assistant, podcast, migration, infra, data, testing, human, P0
Depends on: #7, #30, #41
Blocks: final V1 production launch acceptance

Scope

Coordinate the final DataOps V1 production cutover so launch readiness is proven in one place instead of scattered across implementation and human-gated issues.

This is a launch/readiness checklist, not a feature implementation issue. Agent-verifiable V1 launch-candidate work is complete for current main commit b5ab507: the app is deployed through GitHub Actions OIDC, seeded with users/templates/recurring work, shows actionable daily work in Operations Home, links seeded work to process docs, hydrates contextual docs images, and has current export/restore evidence.

The issue remains open because final production acceptance still needs real external-system and operator checks that an agent cannot safely perform without human credentials/authority: Telegram/Podcast Assistant smoke, real assistant execution decision, Trello export/import approval, and final operator acceptance.

Use these issues and comments as source of truth:

The cutover target is the deployed V1 app at ops.dtcdev.click or the current DataOps Lambda Function URL if the custom domain is not available. The operator should be able to log in, see what needs action today, open workflows/tasks, act on due/overdue/waiting/follow-up work, complete tasks with proof, use SOP/process context, and trust that export/restore evidence exists before valuable production data is imported.

Current Launch Candidate Evidence

  • Commit: b5ab507 (Hydrate docs image assets in production cache).
  • Deploy workflow passed: https://github.com/DataTalksClub/dataops/actions/runs/28332978858
  • Content validation passed: https://github.com/DataTalksClub/dataops/actions/runs/28332978870
  • Planning docs validation passed: https://github.com/DataTalksClub/dataops/actions/runs/28332978859
  • AWS CloudFormation stack dataops-v1 is UPDATE_COMPLETE.
  • Production Process Quality after refresh: blocking=0, broken-asset-reference=0.
  • Current production archive: s3://dataops-v1-dataopsexportarchivebucket-iqwwwmalyjbl/execution-exports/prod/2026-06-28/dataops-execution-2026-06-28T19-24-17-410Z.tar.gz.
  • Archive checksum: sha256:1a22c1294aaefdf3f20c3b25012aa5b4344a44c88111786402926bf5c74626a3.
  • Restore drill evidence: .tmp/exports/current-launch-candidate-b5ab507/restore-drill/restore-evidence-2026-06-28T19-24-25-817Z.json.
  • Restore drill result: validation.valid=true, dry_run_import.valid=true, dryRunTotalRecords=25, targetEnvironment=staging-drill.
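
The archive checksum above is recorded as `sha256:<hex>`. A minimal sketch of how an operator might re-verify a downloaded copy of the archive against that string before trusting it; the function names and the local file path are hypothetical and not part of the DataOps codebase:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_checksum(path: Path, recorded: str) -> bool:
    """Compare a local file against a recorded 'sha256:<hex>' string."""
    algorithm, _, expected_hex = recorded.partition(":")
    if algorithm != "sha256" or not expected_hex:
        raise ValueError(f"unsupported checksum format: {recorded!r}")
    return sha256_of_file(path) == expected_hex.lower()
```

For example, after downloading the archive to a local path of your choice, `matches_recorded_checksum(Path("dataops-execution.tar.gz"), "sha256:1a22c129...")` (with the full recorded digest) should return `True` if the copy is intact.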

Acceptance Criteria

  • #69 (Make first-run V1 production land on actionable workflow work): first-run/actionable workflow work is complete, tested, accepted, merged, deployed, and superseded by later production daily-work evidence.
  • #70 (Seed production-safe recurring operations for daily V1 work): the production-safe recurring operations seed is complete, tested, accepted, merged, deployed, and superseded by daily-task production evidence from #73 (Show seeded daily tasks in Operations Home today queue) and #74 (Attach process instructions to seeded recurring tasks).
  • #73 (Show seeded daily tasks in Operations Home today queue): seeded daily tasks appear in the Operations Home today queue.
  • #74 (Attach process instructions to seeded recurring tasks): process instructions, systems, and tags are attached to seeded recurring tasks, and generated recurring tasks that were missing this context are repaired.
  • #75 (Hydrate content images in production docs cache): content images are hydrated in the production docs cache; seeded daily-task SOP images no longer produce broken-asset findings.
  • Latest Deploy DataOps V1 Lambda run for the launch candidate passes end to end: repo checks, SAM validation/build, stack deploy, deployed docs/full-app smoke, workflow template smoke, and private work-engine Lambda smoke.
  • Production runtime seed evidence is recorded for the launch candidate: users exist, workflow templates exist, recurring configs exist, notifications exist, and the template API returns seeded podcast and newsletter templates.
  • Production runtime has real actionable daily work for the operations manager: two seeded daily tasks are visible in Operations Home with process instruction context.
  • Offsite export/archive evidence is current for the launch candidate: export archive exists in the retained DataOps export bucket with checksum and entity counts.
  • Restore drill evidence exists for the current production archive/export path, with export validation and dry-run import validation passing without production writes.
  • Before valuable Trello/workflow imports, current backup/export/archive evidence and restore drill evidence have been recorded in this issue.
  • Current CI/CD failure notifications were triaged; latest deploy/content/planning workflows on current main are green, and historical failures are superseded.
  • [HUMAN] Real Podcast Assistant Telegram smoke from #7 (Import Podcast Assistant into assistants/podcast) is performed with real TELEGRAM_BOT_API_KEY / TELEGRAM_CHAT_ID: an allowed chat sends a harmless intake item, /status reports it, and a processing run or safe dry-run starts without unauthorized writes. Evidence is attached to #7 and summarized here.
  • [HUMAN] If real assistant execution is considered production-ready for V1, the external execution path of #30 (Implement assistant job model and lifecycle) is manually verified with the real Telegram/Heru/Codex/Claude credential set that production will use. Evidence must show the job lifecycle in the app and no raw secrets or unbounded logs in exported data. If not production-ready for V1, explicitly defer it from launch acceptance.
  • [HUMAN] Alexey or Valeria confirms the real Trello export source for #41 (Migrate active Trello cards into integrated operations-manager work): board, lists included, export date/time, redaction requirements, and whether sensitive card content or attachments must be excluded before import.
  • [HUMAN] Before any production DynamoDB import write from #41 (Migrate active Trello cards into integrated operations-manager work), Alexey or Valeria approves the exact target environment, backup/export/archive location, dry-run summary, skipped/error report, and rollback/restore plan.
  • [HUMAN] If production Trello import is run for V1 cutover, the post-import production state is verified by the operator: imported bundles/tasks appear in the normal today/overdue/waiting/follow-up/workflow surfaces, proof blockers work, process-doc links open, and no sensitive Trello-only URLs or credentials appear in export metadata.
  • [HUMAN] Final operator acceptance is recorded from the daily workflow perspective: the operations manager can log in, identify what needs action today, open work, complete or block tasks with required proof, see follow-ups/waiting work, and use process docs in context without returning to disconnected tools for the core V1 flow.
  • [HUMAN] Any remaining open human-gated issues after cutover are explicitly classified as post-launch follow-up, launch blocker, or intentionally deferred. The issue comment links each decision to the relevant issue number.

Test Scenarios

Scenario: Launch candidate deploy is green

Given: the launch candidate is on main
When: Deploy DataOps V1 Lambda runs through GitHub Actions OIDC
Then: repository checks, SAM validation/build, stack deploy, deployed docs/full-app smoke, workflow template smoke, and private work-engine Lambda smoke all pass, with run URLs recorded in this issue.

Scenario: Production workspace is not a dead end

Given: production has seeded users, workflow templates, recurring configs, generated daily tasks, and notifications
When: the operations manager opens the deployed V1 workspace
Then: Operations Home shows real daily work and the work has contextual process instructions instead of only disconnected docs or hidden templates.

Scenario: Daily/recurring work is production-safe

Given: recurring/default work seeding or catch-up generation runs more than once for the same date/date range
When: CI/CD or an operator reruns seeding
Then: it creates or repairs documented daily work idempotently, prevents duplicates, reports created/skipped/recovered/failure counts, and does not overwrite operator-created work.
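
The idempotency expectation in this scenario can be sketched as below. This is an illustration under stated assumptions, not the DataOps seeding code: the `Task` fields, the `created_by == "seed"` marker, the in-memory `store` keyed by `(template_id, due_date)`, and the template ids are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    template_id: str
    due_date: str                      # ISO date string
    created_by: str = "seed"           # hypothetical marker: "seed" or an operator id
    process_doc: Optional[str] = None  # link to the SOP/process doc


@dataclass
class SeedReport:
    created: int = 0
    skipped: int = 0
    recovered: int = 0
    failed: int = 0


def seed_daily_tasks(store: dict, templates: dict, dates: list) -> SeedReport:
    """Create or repair one seeded task per (template, date) without duplicates."""
    report = SeedReport()
    for due_date in dates:
        for template_id, process_doc in templates.items():
            key = (template_id, due_date)
            try:
                existing = store.get(key)
                if existing is None:
                    # First run for this template/date: create the task.
                    store[key] = Task(template_id, due_date, process_doc=process_doc)
                    report.created += 1
                elif (existing.created_by == "seed"
                      and existing.process_doc is None
                      and process_doc is not None):
                    # Seeded task that lost its process context: repair in place.
                    existing.process_doc = process_doc
                    report.recovered += 1
                else:
                    # Already present, or operator-created: never overwrite.
                    report.skipped += 1
            except Exception:
                # A real store write could fail; count it instead of aborting.
                report.failed += 1
    return report
```

Running the function twice with the same templates and dates should report only `created` on the first pass and only `skipped` on the second, which is the rerun behavior the scenario asks for.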

Scenario: Process docs are contextual

Given: the seeded daily tasks are visible to the operator
When: the operator opens task context or Process Quality
Then: linked SOPs and referenced SOP images load from the production docs cache, and the current Process Quality report has no blocking findings or broken asset findings.

Scenario: Export and restore evidence precedes valuable imports

Given: production may receive valuable Trello/workflow data
When: the operator reviews launch readiness
Then: there is a current offsite archive/export and restore-drill evidence report, and any production import approval references that evidence before writes occur.
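
One way to make this gate mechanical is to check the restore-drill evidence before approving an import. The sketch below assumes the evidence JSON nests the fields as `validation.valid`, `dry_run_import.valid`, and `targetEnvironment`, matching the names recorded under Current Launch Candidate Evidence; the exact file layout, the function name, and the production environment names are assumptions.

```python
def import_gate_problems(evidence: dict, recorded_checksum: str) -> list:
    """Return reasons the evidence should block a production import (empty if none)."""
    problems = []
    if not evidence.get("validation", {}).get("valid"):
        problems.append("export validation did not pass")
    if not evidence.get("dry_run_import", {}).get("valid"):
        problems.append("dry-run import validation did not pass")
    # Assumed names for the production environment; the drill must not target it.
    if evidence.get("targetEnvironment") in (None, "prod", "production"):
        problems.append("restore drill must target a non-production environment")
    if not recorded_checksum.startswith("sha256:"):
        problems.append("archive checksum is missing or not sha256")
    return problems
```

An empty list means the recorded evidence is consistent with the criteria; it does not replace the [HUMAN] approval, which still has to reference this evidence before any write.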

Scenario: Real Podcast Assistant smoke is manually verified

Given: real Telegram and assistant credentials are available to a human verifier
When: the verifier sends a harmless test intake and starts a safe processing path
Then: #7/#30 evidence shows the intake, job lifecycle or dry-run behavior, artifact/log boundaries, and no unintended external writes.

Scenario: Trello production import is manually approved and checked

Given: a real Trello export is reviewed and redacted as needed
When: the production import dry-run summary and rollback plan are approved and the import is run
Then: imported work appears as normal active workflow bundles/tasks, waiting/follow-up/proof behavior is visible, export validation remains clean, and sensitive Trello-only content is not exposed.
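
The "sensitive Trello-only content is not exposed" check could be supported by a simple scan of export metadata. This is a rough sketch, assuming the metadata is JSON-like nested dicts and lists; the patterns and the `find_sensitive` helper are hypothetical, will miss some cases, and do not replace operator review.

```python
import re

# Heuristic patterns (assumptions): Trello links and secret-looking key names.
TRELLO_URL = re.compile(r"https?://(?:[\w-]+\.)?trello\.com/\S*", re.IGNORECASE)
SECRET_KEY_NAME = re.compile(r"(token|secret|api[_-]?key|password)", re.IGNORECASE)


def find_sensitive(value, path="$"):
    """Walk nested dicts/lists and return the paths of suspicious entries."""
    findings = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            # Flag non-empty values stored under secret-looking keys.
            if SECRET_KEY_NAME.search(str(key)) and child not in (None, ""):
                findings.append(f"{child_path}: secret-like key")
            findings.extend(find_sensitive(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            findings.extend(find_sensitive(child, f"{path}[{index}]"))
    elif isinstance(value, str) and TRELLO_URL.search(value):
        findings.append(f"{path}: Trello URL")
    return findings
```

Any non-empty result would be a reason to stop and redact before attaching the export as evidence.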

Scenario: Final operator acceptance

Given: deploy, seed, daily work, export/restore, assistant, and Trello gates are resolved or explicitly deferred
When: the operations manager performs the V1 daily workflow acceptance pass
Then: the operator can see what needs action today, act on workflows/tasks, complete tasks with proof, follow up on waiting work, and use process docs in context from the deployed production app.

Out of Scope

  • Implementing new feature work in this cutover issue.
  • Changing source repositories ../dtc-operations, ../datatasks, or ../podcast-assistant.
  • Creating fake/demo production work just to make the dashboard non-empty.
  • Replacing the GitHub Actions OIDC deploy path with a normal manual app deploy.
  • Applying shared AWS infra templates from this repo; shared deploy-role infra still requires a credentialed AWS operator when it changes.
  • Running production Trello imports or destructive restores, using real external Telegram/assistant credentials, or making other real external writes without the explicit [HUMAN] approvals above.

Dependencies
