gdb/testsuite: support optimized code in gdb.rocm testsuite#103
gdb/testsuite: support optimized code in gdb.rocm testsuite#103spatrang wants to merge 4 commits into
Conversation
|
Hi @lancesix This PR is a re-submission of the optimized-code series that was originally
The corresponding fix-up commit is on top of this branch. Please take cc: @lumachad |
|
pre-commit seems to be grumpy. |
e05abcd to
12346d2
Compare
|
As stated elsewhere, the series only handles GDB_TESTCASE_OPTIONS. If someone passes CFLAGS directly to the compiler the optimization level checks won't work properly. Do we want to fix that as well? Maybe as a follow up if not in this one? |
|
There's a pre-commit failure. Other than that, I'm still seeing FAIL's for gdb.rocm/watchpoint-basic.exp when running the testsuite with -O1 or -O3. It passes when running with -O0. Could you please check that you're running the testsuite correctly? |
7c21341 to
a912e0e
Compare
|
Did the latest force-pushes provide anything that would fix gdb.rocm/watchpoint-basic.exp? I can't tell. |
7320b08 to
c729dec
Compare
|
For -lto runs, it looks like gdb.rocm/hip-builtin-variables.exp and gdb.rocm/hip-lang-detect.exp are still failing. |
Both hip-builtin-variables.exp and hip-lang-detect.exp now require no_lto. LTO can re-inline across TUs (defeating noinline/optnone boundaries) and rewrite the per-CU DW_LANG attribute, so the test premises don't hold at -flto. Matches the existing require no_lto pattern in watchpoint-at-end-of-shader.exp and mi-aspace.exp. |
c729dec to
65a4a86
Compare
65a4a86 to
24d7933
Compare
Introduce the necessary helper functions into the gdb.rocm
framework so that support for optimized code can be implemented.
- check_for_any_flags: detects whether GDB_TESTCASE_OPTIONS,
CFLAGS_FOR_TARGET or CXXFLAGS_FOR_TARGET contains any of the
given flags. Designed for whitespace-free flags such as
-O2, -flto.
- gdb_caching_proc no_optimized_code: predicate for -O1/-O2/-O3.
- gdb_caching_proc no_lto: predicate for -flto.
Co-Authored-By: Andrey Kasaurov <andrey.kasaurov@amd.com>
Co-Authored-By: Luis Machado <luis.machado@amd.com>
Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
24d7933 to
d415a28
Compare
|
Things look clean from -O1 through -O3. I do see a FAIL with -flto though: FAIL: gdb.rocm/hip-builtin-completions.exp: language detection (dwarfdump execution) Since we claim to be clean on -flto runs as well, we should fix that or skip it. |
d415a28 to
ea1067d
Compare
|
Confirmed a clean run for -flto as well. Thanks. |
ea1067d to
42d183e
Compare
lumachad
left a comment
There was a problem hiding this comment.
Leaving my approval here. If Lancelot doesn't have any restrictions on this series, feel free to merge.
lancesix
left a comment
There was a problem hiding this comment.
Comments for patch 2.
It seems that this pattern is not consistently implemented
- Make breakpoint-anchor helper functions ("done", "foo",
"bar") non-static and noinline, with an asm volatile body,
so the symbol survives the optimiser and the breakpoint
can be inserted.
Seems that some of that logic could be factored out to help (would avoid having to repeat the comment giving the rationale for the asm and so on).
lancesix
left a comment
There was a problem hiding this comment.
Comments for the 3rd path.
It really seems that many of the changes from this patch belong to the previous one (many changes are just modifying what the previous patch). Arguably the previous patch does too much at once and could be split, but regardless the fixup should not end-up here.
Seems like Lancelot has more comments. I'll let him take over.
Adapt several gdb.rocm tests so they pass under -O1/-O2/-O3:
- Apply __attribute__((optnone)) to device functions whose
locals or call structure the tests rely on, and to host
main() where its store-then-exit pattern would otherwise
be tail-optimised.
- Make breakpoint-anchor helper functions ("done", "foo",
"bar") non-static and noinline, with an asm volatile body,
so the symbol survives the optimiser and the breakpoint
can be inserted.
- Mark local variables volatile where the test reads them
via convenience variables or takes their address.
- Restore "__global__ void" qualifier on device-attach
kernel, and mark its spin-loop guard volatile so the loop
survives optimisation.
- Relax regex/line-number expectations under optimisation
in deep-stack, hip-builtin-variables, snapshot-objfile-on-
load, and step-schedlock-spurious-waves.
- Brace expr-substitutions in deep-stack and until-tests
(tclint hygiene).
Co-Authored-By: Luis Machado <luis.machado@amd.com>
Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
|
Hello @lancesix @lumachad
Re-tested the modified tests at -O0/-O1/-O2/-O3: 0 failures (the 3 UNSUPPORTED at -O1+ are the intended watchpoint-basic before-runtime-load skips). |
42d183e to
b94cbc7
Compare
Match the optional "0xADDR in " prefix explicitly: GDB prints it at -O0 (and for non-innermost frames) but omits it at -O1/-O2/-O3. The old "#N .* NAME .*" patterns also depended on frame-number column alignment for frames #10-#12. Make the address prefix optional for all frames #0-#12 so the backtrace matches at every optimization level. Signed-off-by: Sarang Patrange <spatrang@amd.com>
For tests whose semantics cannot be reconciled with optimised
code, gate them so they UNSUPPORTED-out cleanly instead of
failing.
- "require no_optimized_code" at the top of: aspace-user-
input, aspace-watchpoint, convenience_variables, device-
barrier, finish, fork-exec-non-gpu-to-gpu, lane-execution,
mi-aspace, mi-lanes, nonstop-displaced, precise-memory-
warning-watchpoint, register-watchpoint, runtime-core,
shared-memory.
- "require no_lto" at the top of: hip-builtin-variables,
hip-lang-detect, mi-aspace, watchpoint-at-end-of-shader.
- watchpoint-basic.exp: proc-level skip of
test_non_write_watchpoint_before_runtime_load's rwatch/
awatch sub-cases under optimisation. The HIP runtime is
initialised before main() reaches "Break before runtime
load", so amd-dbgapi is already attached and rejects the
host-side read/access watchpoint at insert time. Only
this sub-test depends on the "before runtime load"
premise; the other procs still work under optimisation,
so use a proc-level check_for_any_flags skip rather than
a file-level "require no_optimized_code".
Co-Authored-By: Andrey Kasaurov <andrey.kasaurov@amd.com>
Co-Authored-By: Luis Machado <luis.machado@amd.com>
Change-Id: I750d64e842413429e281bb8f7d85c4f46319a627
Signed-off-by: Sarang Patrange <spatrang@amd.com>
b94cbc7 to
83529fb
Compare
Summary
This series makes the
gdb.rocmtestsuite pass when the test programs arecompiled with optimization (
-O1/-O2/-O3,-flto) passed viaGDB_TESTCASE_OPTIONS, rather than the implicit-O0the suite assumes today.Three commits, split by concern:
Update
gdb.rocmframework to support optimized code — adds helpers ingdb/testsuite/lib/rocm.exp:check_for_any_flags {flags}— detects whetherGDB_TESTCASE_OPTIONScontains any of the listed flags.
gdb_caching_proc no_optimized_code— predicate for-O1/-O2/-O3.gdb_caching_proc no_lto— predicate for-flto.Update some
gdb.rocmtests to support optimized code (22 files) —two patterns applied surgically:
.cpp/.c: addvolatileon locals the test inspects, and__attribute__((optnone))on helper functions the test breakpoints into.Add a trivial
volatile intstatement to empty "break here" functionsso their body is not elided.
.exp: relax line-number arithmetic, backtrace PC formatting,expected source strings after
step, and add a bounded retry loop forextra ticks under optimization.
Skip unsupported tests in case of optimized code (19 files) — for tests
whose semantics cannot be reconciled with optimized code, add
require no_optimized_codeat the top;mi-aspace.expalso getsrequire no_lto.register-watchpoint.expsoft-skips the tail viacheck_for_any_flagsafter the earlier portion has run.Diffstat: 42 files changed, +146 / -46.
Context
This continues the work started in internal PR
AMD-ROCm-Internal/ROCgdb#6,
rebased onto the current
amd-stagingof the public repo. The commits arecontent-identical to the internal PR, preserving the original authors
(Aleksandar Rikalo, Andrey Kasaurov) with their
Co-Authored-By:trailersand adding my
Signed-off-by.