Test suite: opt-in parallel runs, faster serial, stricter hygiene by tony · Pull Request #1064 · tmux-python/tmuxp

tony · 2026-06-28T14:09:52Z

Summary

Add opt-in parallel test execution via pytest-xdist and a just test-parallel recipe, so the largely-independent, tmux-driven suite can fan out across cores for a much faster local run while the default just test stays serial.
Remove two fixed time.sleep(0.50) settle waits in the freezer tests — WorkspaceBuilder.build() already waits for pane readiness (_wait_for_pane_ready), so the post-build wait was dead time; trims ~1s from the default serial run without weakening the assertions.
Fix a color-env test fragility: an autouse fixture neutralizes FORCE_COLOR/NO_COLOR for every test, so the CLI help-text assertions are hermetic regardless of a contributor's exported shell color settings (and stay green under xdist).
Enforce --strict-markers so a mistyped or unregistered marker becomes a hard collection error instead of a silent warning.
Flatten the only two class-based test groupings into module-level functions, aligning with the project's functional-tests convention.

Changes by area

Parallelism (opt-in)

pyproject.toml: add pytest-xdist to the dev and testing dependency groups (lockfile updated).
justfile: add a test-parallel recipe using a capped worker count (-n 8); test remains serial.

Test robustness & speed

tests/workspace/test_freezer.py: drop the fixed time.sleep(0.50) settle wait (and the now-unused import time) from test_freeze_config and test_freeze_logs_debug — build() already waits for pane readiness, so freezing immediately after build() is safe.
conftest.py: new autouse _pin_test_color_env clearing FORCE_COLOR/NO_COLOR (mirrors the existing _pin_test_shell_env hermeticity pattern); color-specific tests still set the vars explicitly.

Test hygiene & convention

pyproject.toml: append --strict-markers to [tool.pytest.ini_options] addopts.
tests/workspace/test_finders_local.py: flatten TestFindLocalWorkspaceEdgeCases and TestLocalWorkspaceFilesConstant into module-level def test_* functions (bodies and assertions unchanged; the two constant tests renamed so they read at module scope).

Tooling

.gitignore: ignore .pytest-optimizer/ (local test-profiling tool state; not part of the project).

Design decisions

Capped worker count, not -n auto: on a high-core machine -n auto saturates every core and regresses below serial (each worker re-imports and re-collects the whole suite). A fixed -n 8 sits in the sweet spot. -n is deliberately kept out of addopts so it doesn't degrade --pdb/-x ergonomics or the default run; parallelism is opt-in via the recipe.
No post-build wait needed: WorkspaceBuilder.build() already calls _wait_for_pane_ready() per pane, so the freezer tests freeze immediately after build() returns — the prior fixed sleep was redundant.
Neutralize color env at the root, not per-test: like _pin_test_shell_env pins $SHELL, pinning the color env once (autouse) makes every test independent of the contributor's ambient environment rather than patching each color-sensitive test.
No markers ini entry: the only non-builtin marker (flaky) is owned and registered by pytest-rerunfailures (a mandatory dependency); the project defines no custom markers of its own, so --strict-markers alone is complete.

Verification

No fixed sleeps remain in the freezer tests:

$ rg 'time\.sleep' tests/workspace/test_freezer.py

No class-based groupings remain in the finders tests:

$ rg '^class Test' tests/workspace/test_finders_local.py

Strict-markers is active:

$ rg 'strict-markers' pyproject.toml

Test plan

Default serial suite green — uv run py.test
Parallel suite green — just test-parallel (uv run py.test -n 8)
Suite green with FORCE_COLOR exported — proves the color-env hermeticity fix
Unregistered marker hard-errors at collection — proves --strict-markers is enforced via addopts
tests/workspace/test_finders_local.py collects the same set after flattening (no class prefixes)
Lint clean — uv run ruff check . and uv run ruff format .
Types clean — uv run mypy (strict)

why: The pytest-optimizer tooling writes durable profiling and benchmark JSON under .pytest-optimizer/; it is local state, not source, and should never be committed. what: - Add .pytest-optimizer/ to .gitignore

why: test_freeze_config and test_freeze_logs_debug each slept a fixed time.sleep(0.50) to let tmux settle after build() before freeze() — ~1.0s of dead wait per run. what: - Replace both sleeps with retry_until polling on len(session.windows) >= len(session_config["windows"]) - Import retry_until (libtmux), drop now-unused import time - Suite 39.69s -> 38.53s (-1.16s), above the 0.68s k*MAD noise band

why: the suite is ~88% call-bound across independent, tmux-driven test bodies (each test spawns its own daemon on a unique socket), so it parallelizes cleanly. Measured: serial 39.69s -> ~16-19s under capped xdist (best 13.7s) ~= 2.1-2.9x, far above the 0.68s noise band; fully green under -n 8 in clean CI conditions. what: - Add pytest-xdist to the dev and testing dependency groups - Add a `just test-parallel` recipe with a capped worker count (-n 8); `just test` stays serial - Keep -n out of addopts: -n auto (=cores) thrashes at full core saturation (~40s) and -n breaks --pdb/-x ergonomics

why: CLI color is decided live from FORCE_COLOR/NO_COLOR (see tmuxp._internal.colors.Colors). A contributor who exports FORCE_COLOR in their shell gets ANSI-wrapped help text, so test_help_examples.py fails (len(examples) == 0). The same leak made the suite look order-dependent under xdist when FORCE_COLOR reached the workers. what: - Add autouse _pin_test_color_env clearing FORCE_COLOR/NO_COLOR for every test; color-specific tests still set them explicitly and monkeypatch restores the contributor's values at teardown - Suite now green with FORCE_COLOR exported and under xdist regardless of ambient color env

why: CLAUDE.md mandates functional tests only ("avoid class TestFoo groupings"). test_finders_local.py had two class-based groupings, the only such violation in the suite. what: - Flatten TestFindLocalWorkspaceEdgeCases (7 methods) and TestLocalWorkspaceFilesConstant (2 methods) into module-level def test_* functions; drop the self receiver - Rename the constant tests to test_local_workspace_files_constant_* so they read at module scope without the class for context - No behavior change: same tests collected, assertions verbatim

why: without --strict-markers an unregistered or mistyped marker (e.g. @pytest.mark.flakey) is a silent warning, so a typo'd mark quietly stops applying. Strict mode turns it into a hard collection error. what: - Add --strict-markers to [tool.pytest.ini_options] addopts - No markers ini entry needed: the only custom marker (flaky) is registered by pytest-rerunfailures; there are no project-defined custom markers - Verified: suite green and an unregistered mark now errors at collection

codecov · 2026-06-28T14:10:56Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.01%. Comparing base (c706e5d) to head (bd3db6e).

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1064      +/-   ##
==========================================
+ Coverage   81.98%   82.01%   +0.02%     
==========================================
  Files          28       28              
  Lines        2548     2552       +4     
  Branches      485      485              
==========================================
+ Hits         2089     2093       +4     
  Misses        328      328              
  Partials      131      131

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

tony · 2026-06-28T14:34:32Z

Code review

Found 3 issues:

The retry_until(...) guard is satisfied the instant build() returns. build() creates windows synchronously (each new_window() is a blocking tmux call), so len(session.windows) >= expected_windows is already true on the first check and the call returns without polling. The speedup is real and the tests stay green, but the comment "Wait for all configured windows to register" overstates it, and the pane/shell settling that the prior time.sleep(0.50) allowed before freeze() is no longer awaited. If only window registration matters, the indirection adds little; if settling matters, this condition doesn't capture it. Same pattern at L126-L130. (code observation)

tmuxp/tests/workspace/test_freezer.py

Lines 33 to 37 in f30986a

    
           # Wait for all configured windows to register instead of a fixed sleep. 
        
           expected_windows = len(session_config["windows"]) 
        
           retry_until(lambda: len(session.windows) >= expected_windows, 2)

Commit bodies for 599c7a76 and 900cc704 end with Refs: pytest-optimizer H33 / H19 — internal AI-tooling identifiers with no meaning to a git-log reader. (CLAUDE.md says remove "AI-tool metadata")

https://github.com/tmux-python/tmuxp/blob/f30986a31ee1b48eefef1390db310658d1a056cd/CLAUDE.md#L528-L530

Several commit bodies embed fragile pass/collection counts — "797 passed" (f30986a3, ac9c318a), "17 collected items" (25b71b14), "10/10 + 3/3 green" (599c7a76). The timing figures that justify the worker-count decision are evidence and fine; the pass counts add nothing over "green". (CLAUDE.md Brittle References: "fragile file/test counts")

https://github.com/tmux-python/tmuxp/blob/f30986a31ee1b48eefef1390db310658d1a056cd/CLAUDE.md#L531-L534

why: WorkspaceBuilder.build() already calls _wait_for_pane_ready() for each normal pane, so the session is fully materialized when build() returns. The post-build retry_until on window count was a no-op (windows register synchronously) and its comment overstated what it did. what: - Remove the retry_until window-count waits from test_freeze_config and test_freeze_logs_debug, plus the now-unused retry_until import - build() flows straight to freeze(); the existing freezer tests remain the regression coverage

why: test_at_filesystem_root requested tmp_path and monkeypatch but used neither — it exercises traversal against the real filesystem root and patches nothing. what: - Remove the unused tmp_path/monkeypatch parameters from the test signature

why: the comment said "should find both" above an assertion that tolerates more than two results — when traversal continues past home it can also pick up configs in ancestor directories. what: - Reword the comment to match the >= 2 assertion

tony · 2026-06-28T15:40:53Z

Code review (re-review after force-push)

The three findings from the earlier review are resolved: the no-op retry_until was removed (build() already waits for pane readiness), and the commit bodies no longer carry Refs: metadata or brittle counts. I also corrected the PR description, which still described the removed retry_until approach.

Found 1 issue:

test_local_workspace_files_constant_is_list is fully subsumed by test_local_workspace_files_constant_order just above it — the equality assertion LOCAL_WORKSPACE_FILES == [".tmuxp.yaml", ".tmuxp.yml", ".tmuxp.json"] already proves the value is a list of length 3, so the isinstance/len checks add no coverage. Optional cleanup (carried over verbatim when the class was flattened).

tmuxp/tests/workspace/test_finders_local.py

Lines 268 to 276 in 8a39075

    
           def test_local_workspace_files_constant_order() -> None: 
        
               """Verify LOCAL_WORKSPACE_FILES has correct order (yaml, yml, json).""" 
        
               assert LOCAL_WORKSPACE_FILES == [".tmuxp.yaml", ".tmuxp.yml", ".tmuxp.json"] 
        
           def test_local_workspace_files_constant_is_list() -> None: 
        
               """Verify LOCAL_WORKSPACE_FILES is a list.""" 
        
               assert isinstance(LOCAL_WORKSPACE_FILES, list) 
        
               assert len(LOCAL_WORKSPACE_FILES) == 3

why: test_local_workspace_files_constant_is_list only asserted the constant is a list of length 3 — both already proven by the sibling test_local_workspace_files_constant_order via its equality assertion against the list literal. what: - Remove the subsumed is_list test

tony added 6 commits June 28, 2026 06:47

chore(git[ignore]): ignore .pytest-optimizer state dir

61551e2

why: The pytest-optimizer tooling writes durable profiling and benchmark JSON under .pytest-optimizer/; it is local state, not source, and should never be committed. what: - Add .pytest-optimizer/ to .gitignore

tony added 3 commits June 28, 2026 09:50

tests(finders): drop unused fixture params

2fb036f

why: test_at_filesystem_root requested tmp_path and monkeypatch but used neither — it exercises traversal against the real filesystem root and patches nothing. what: - Remove the unused tmp_path/monkeypatch parameters from the test signature

tony force-pushed the pytest-improvements branch from f30986a to 8a39075 Compare June 28, 2026 14:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Test suite: opt-in parallel runs, faster serial, stricter hygiene#1064

Test suite: opt-in parallel runs, faster serial, stricter hygiene#1064
tony wants to merge 10 commits into
masterfrom
pytest-improvements

tony commented Jun 28, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Jun 28, 2026 •

edited

Loading

Uh oh!

tony commented Jun 28, 2026

Uh oh!

tony commented Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

tony commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes by area

Parallelism (opt-in)

Test robustness & speed

Test hygiene & convention

Tooling

Design decisions

Verification

Test plan

Uh oh!

codecov Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

tony commented Jun 28, 2026

Code review

Uh oh!

tony commented Jun 28, 2026

Code review (re-review after force-push)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tony commented Jun 28, 2026 •

edited

Loading

codecov Bot commented Jun 28, 2026 •

edited

Loading