Skip to content

fix(zaparoo): shut down frontend before FPGA reconfig to stop atari launch freeze#13

Merged
wizzomafizzo merged 2 commits into
masterfrom
fix/atari-frontend-launch-freeze
Jun 26, 2026
Merged

fix(zaparoo): shut down frontend before FPGA reconfig to stop atari launch freeze#13
wizzomafizzo merged 2 commits into
masterfrom
fix/atari-frontend-launch-freeze

Conversation

@wizzomafizzo

@wizzomafizzo wizzomafizzo commented Jun 26, 2026

Copy link
Copy Markdown
Member

Problem

Launching a core from the launcher frontend could hard-freeze the device — silent black screen, serial output dead, reboot required. Reliably reproduced on atari5200/atari800; the identical launch works on stock MiSTer.

Root cause (traced)

The freeze is silent, right after Bitstream size: N bytes, with no further serial output:

MiSTer_cmd: load_core /media/fat/.LASTLAUNCH.mgl
xml_load [...]  RBF: .../Atari800_20260603.rbf
Loading RBF: .../Atari800_20260603.rbf
Bitstream size: 4063756 bytes
<freeze — reboot required>

Every loop in the path after that print (fpgamgr_program_write, fpgamgr_program_poll_*) is bounded and prints on error — there is no software infinite loop and no silent failure path. A totally silent, reboot-requiring freeze is therefore a hardware AXI bus deadlock during FPGA reconfiguration.

The only fork-unique factor: the frontend process is still alive, holding mmaps into /dev/fb0 — the HPS framebuffer the FPGA scans out over the same f2sdram bridge that fpga_load_rbf resets via do_bridge(0). The fork only killed the frontend afterward, in app_restart. So the live frontend + active framebuffer overlap the bridge teardown → bus deadlock. Upstream has no frontend, so it never hits this. The atari bitstream just reproduces the timing window reliably.

Fix

  • fpga_io.cpp: call alt_launcher_shutdown() at the top of fpga_load_rbf, so the frontend (and its framebuffer) is torn down before do_bridge(0)/socfpga_load. The call is idempotent, so app_restart's existing call is left untouched.
  • support/zaparoo/alt_launcher.cpp: harden the VT takeover — replace the unbounded video_chvt() (its VT_WAITACTIVE blocks forever if the frontend stalls bringing up video) with a bounded VT_ACTIVATE + VT_GETSTATE poll, and defer the takeover until the FPGA is ready. Fixes a separate poll-cothread wedge during menu+frontend bring-up after a 15kHz core.

Both changes are surgical and fork-friendly (one added line in the upstream file plus fork-owned code) — minimal merge surface, no upstream reformatting, no Makefile change.

Testing

Built via ./docker-build.sh and deployed to hardware:

  • ✅ atari800 and atari5200 now boot from the frontend instead of freezing.
  • ✅ Generic cores (SNES/Genesis) still launch fine — no regression.

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability when loading FPGA configurations by shutting down the launcher interface earlier to help prevent freezes during reconfiguration.
    • Fixed a potential hang when returning to the launcher screen by making the screen/terminal switching process more resilient and waiting for hardware readiness before switching.

…freeze

Launching a core from the launcher frontend could hard-freeze the device
(silent black screen, reboot required) — reliably reproduced on atari5200
/atari800. The frontend process stays alive holding mmaps into /dev/fb0,
the HPS framebuffer the FPGA scans out over the f2sdram bridge. fpga_load_rbf
calls do_bridge(0) to reset that bridge while the frontend is still mapped to
it, deadlocking the AXI bus mid-reconfiguration. Upstream has no frontend, so
it never hits this.

Move the frontend teardown ahead of reconfiguration: call alt_launcher_shutdown()
at the top of fpga_load_rbf so the frontend (and its framebuffer) is gone before
do_bridge(0)/socfpga_load. The call is idempotent, so app_restart's existing
call is left untouched.

Also harden the frontend VT takeover: replace the unbounded video_chvt()
(VT_WAITACTIVE blocks forever if the frontend stalls bringing up video) with a
bounded VT_ACTIVATE + VT_GETSTATE poll, and defer the takeover until the FPGA
is ready. This fixes a separate poll-cothread wedge during menu+frontend
bring-up after a 15kHz core.
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ea9a7f46-75ad-4591-b6cc-93d74c41b07d

📥 Commits

Reviewing files that changed from the base of the PR and between 01f8cfe and a2eacf8.

📒 Files selected for processing (1)
  • support/zaparoo/alt_launcher.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • support/zaparoo/alt_launcher.cpp

📝 Walkthrough

Walkthrough

Launcher VT switching now waits for FPGA readiness and uses a bounded activation loop. FPGA loading now shuts down the launcher before OSD disable and bridge reset.

Changes

Launcher and FPGA reconfiguration timing

Layer / File(s) Summary
Bounded VT takeover
support/zaparoo/alt_launcher.cpp
Adds fpga_io.h, gates VT switching on is_fpga_ready(1), clears the TTY deadline, and replaces the direct VT switch with a bounded /dev/tty0 activation loop.
Launcher shutdown before reconfiguration
fpga_io.cpp
Calls alt_launcher_shutdown() before OsdDisable() and FPGA reconfiguration work in fpga_load_rbf.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

A bunny taps the TTY gate,
And waits for FPGA, calm and straight.
The launcher bows, then hops aside,
While bounded VTs swiftly guide.
🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: shutting down the frontend before FPGA reconfiguration to prevent launch freezes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/atari-frontend-launch-freeze

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@support/zaparoo/alt_launcher.cpp`:
- Around line 485-490: The activation wait loop in switch_to_launcher_vt is
blocking the poll cothread by using usleep inside a fixed retry loop. Replace
the busy-sleep with a cooperative yield to the libco scheduler (for example via
scheduler_yield() in the alt_launcher_poll/finalize_spawn path) and keep the
bounded retry behavior using a wall-clock deadline or equivalent non-blocking
timeout logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f03b4a06-1ec3-4cc1-a9c4-3dee46076d64

📥 Commits

Reviewing files that changed from the base of the PR and between 82f4021 and 01f8cfe.

📒 Files selected for processing (2)
  • fpga_io.cpp
  • support/zaparoo/alt_launcher.cpp

Comment thread support/zaparoo/alt_launcher.cpp Outdated
switch_to_launcher_vt() ran in the poll cothread. usleep() in libco's
cooperative single-thread model sleeps the whole process, so the bounded VT
wait stalled both cothreads (no UI/OSD, no poll) for up to 500ms. Replace the
fixed usleep retry loop with a wall-clock deadline (GetTimer/CheckTimer) plus
scheduler_yield(), keeping the same bound while staying cooperative.
@wizzomafizzo wizzomafizzo merged commit d6254a5 into master Jun 26, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant