Fix script bugs from EventDisplay_v4 issue #352#145
Merged
Conversation
C1 ANALYSIS.anasum.sh: execute UTILITY.readSubmissionCommand.sh instead
of storing its path as a string (no jobs were ever submitted)
C2 ANALYSIS.mscw_energy_sub.sh: fix RedHV DispBDT fallback condition
(use HVSETTINGS=obsLowHV instead of exact SIMTYPE_RUN match)
C5 ANALYSIS.mscw_energy_sub.sh: add missing $ before DISPBDT in log guard
C10 IRF.optimizeTMVAforGammaHadronSeparation.sh: escape * in grep/sed
ENERGYBINS regex (was undefined BRE behaviour)
C11 IRF.optimizeTMVAforGammaHadronSeparation.sh: add exit 1 on missing
effective-area file
C20 ANALYSIS.v2dl3.sh: fix LOGIDR→LOGDIR typo; fix AFILE→J (undefined var)
C21 ANALYSIS.anasum_allcuts.sh: use $(dirname "$0") for sibling scripts
D1 IRF.production.sh: remove redundant inner epoch/atmo loops inside the
TRAINTMVA/OPTIMIZETMVA block (caused N²-fold over-submission)
D2 IRF.production.sh: replace ${EPOCH:0:2} with ${VX:0:2} inside the
per-epoch loop (EPOCH is the full list, VX is the loop variable)
D3 IRF.mscw_energy_MC_sub.sh: quote ${INDIR} in glob expansion
D4 IRF.generate_lookup_table_parts.sh, IRF.mscw_energy_MC.sh: FSCRIPT
already ends in .sh; remove spurious .sh suffix in submission calls
D6 IRF.combine_lookup_table_parts.sh: check $ODIR/$OFILE, not $OFILE
D7 Four helper scripts: replace 'which zstd' with 'command -v zstd'
(POSIX-portable; 'which' may return 0 even when absent)
D9 IRF.optimizeTMVAforGammaHadronSeparation_sub.sh: move
'rm -f ${MVADIR}/rates.log' to after MVADIR is assigned
D10 IRF.production.sh: replace ./IRF.*.sh with $(dirname "$0")/IRF.*.sh
for dispXGB and trainXGB calls
E3 IRF.trainXGBforAngularReconstructionBinned.sh: use RECID${RECID}_DISP
instead of hard-coded RECID0_DISP
E7 IRF.trainXGBforAngularReconstruction_sub.sh: replace undefined
$MSCW_FILE with $LLIST in TEMPDIR; add uuidgen for uniqueness
Bonus IRF.mscw_energy_MC_sub.sh: fix RedHV 55deg path being appended
instead of replacing the 60/65deg path
Ref: VERITAS-Observatory/EventDisplay_v4#352
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
67 tasks
C12: guard pip install in v2dl3_sub.sh to avoid per-job reinstall C13: check anasum.root files exist before combining in anasum_combine.sh C14: add DISPDIR existence check in mscw_energy_sub.sh C15: initialize index variables before while loops in db_run.sh C16: use get_file_status to check laserrun sidecar before skipping re-read C17: move convert_table_comment_to_ascii inside None guard in db_write_fits.py C18: change join_type from exact to outer in db_combine_dqm_fits.py C19: guard l3_rate and dead-time divisions against zero denominators Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR applies a set of targeted bug fixes across the EventDisplay VTS analysis/IRF production shell scripts and DB/FITS utilities, addressing failures and unsafe behaviors reported in EventDisplay_v4 issue #352.
Changes:
- Fixes job submission/execution logic, path handling, and portability issues in IRF/analysis shell scripts (submission command parsing, relative
./invocations, missing guards,command -vchecks,.sh.shpaths, etc.). - Improves robustness of DB-related scripts (FITS writing/combining, tarball/sidecar checks, division-by-zero guards).
- Corrects configuration and training script issues (RECID handling for XGB angular reconstruction, schema drift handling in DQM combines).
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/set_environment.sh | Updates the apptainer image reference (rc3 → rc4). |
| scripts/IRF.trainXGBforAngularReconstructionBinned.sh | Fixes hard-coded RECID0 input directory to respect RECID argument. |
| scripts/IRF.selectRunsForGammaHadronSeparationTraining.sh | Adjusts zenith-bin parsing path; currently adds extra stdout output (see comment). |
| scripts/IRF.production.sh | Fixes redundant TMVA training loops and makes IRF script invocations robust to cwd. |
| scripts/IRF.optimizeTMVAforGammaHadronSeparation.sh | Fixes unsafe grep regex for * ENERGYBINS and exits on missing effective-area file. |
| scripts/IRF.mscw_energy_MC.sh | Fixes job submission using $FSCRIPT (avoids .sh.sh). |
| scripts/IRF.generate_lookup_table_parts.sh | Fixes job submission using $FSCRIPT (avoids .sh.sh). |
| scripts/IRF.combine_lookup_table_parts.sh | Fixes output-file existence check to use $ODIR/$OFILE. |
| scripts/helper_scripts/IRF.trainXGBforAngularReconstruction_sub.sh | Improves temp dir uniqueness; needs quoting fixes for robustness (see comment). |
| scripts/helper_scripts/IRF.optimizeTMVAforGammaHadronSeparation_sub.sh | Removes rm that referenced MVADIR before it is set. |
| scripts/helper_scripts/IRF.mscw_energy_MC_sub.sh | Fixes RedHV 55° directory selection, zstd detection, and copying behavior. |
| scripts/helper_scripts/IRF.lookup_table_parallel_sub.sh | Uses command -v instead of non-portable which-style checks. |
| scripts/helper_scripts/IRF.evndisp_MC_sub.sh | Adds nullglob protection and uses command -v for zstd checks. |
| scripts/helper_scripts/IRF.compress_evndisp_MC_sub.sh | Uses command -v for zstd checks. |
| scripts/helper_scripts/ANALYSIS.v2dl3_sub.sh | Avoids per-job pip install -e by checking package presence first. |
| scripts/helper_scripts/ANALYSIS.mscw_energy_sub.sh | Fixes RedHV fallback condition, adds missing $ in DISPBDT check, and validates DispBDT directory exists. |
| scripts/helper_scripts/ANALYSIS.evndisp_sub.sh | Adds error handling and readability checks for unpacked DBTEXT metadata. |
| scripts/helper_scripts/ANALYSIS.anasum_sub.sh | Removes atmosphere “normalization” helper and uses runinfo-derived ATMO directly. |
| scripts/db_scripts/db_write_fits.py | Fixes None-guard ordering and guards divide-by-zero in rate/deadtime computations. |
| scripts/db_scripts/db_run.sh | Initializes header indices and improves tarball/sidecar presence logic. |
| scripts/db_scripts/db_combine_dqm_fits.py | Switches vstack join type to outer to tolerate schema drift. |
| scripts/ANALYSIS.v2dl3.sh | Fixes ${LOGIDR} typo and undefined $AFILE in status logging. |
| scripts/ANALYSIS.mscw_energy.sh | Updates usage/help text and removes unused forced-atmosphere argument parsing. |
| scripts/ANALYSIS.anasum.sh | Fixes submission command acquisition; current syntax needs correction (see comment). |
| scripts/ANALYSIS.anasum_combine.sh | Adds guard to ensure there are .anasum.root files before combining. |
| scripts/ANALYSIS.anasum_allcuts.sh | Makes script invocations robust to being run outside scripts/ directory. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bug-Fix Summary — EventDisplay Issue #352
Fixes applied to
Eventdisplay_AnalysisScripts_VTSfor bugs reported inVERITAS-Observatory/EventDisplay_v4#352.
Analysis scripts (Section C)
C1 —
ANALYSIS.anasum.sh: job submission never executedFile:
scripts/ANALYSIS.anasum.shSUBCwas assigned the path to the helper script as a string instead ofexecuting it. The submission command was therefore never built, so no jobs
were ever submitted.
C2 —
ANALYSIS.mscw_energy_sub.sh: wrong condition for RedHV DispBDT fallbackFile:
scripts/helper_scripts/ANALYSIS.mscw_energy_sub.shThe
get_disp_dir()function selected a 55° zenith-angle fallback trainingdirectory for RedHV data only when
${SIMTYPE_RUN} == "CARE_RedHV"(exactmatch). The function already uses
${HVSETTINGS}for the base directory,so the condition was changed to match that variable instead:
C5 —
ANALYSIS.mscw_energy_sub.sh: missing$beforeDISPBDTFile:
scripts/helper_scripts/ANALYSIS.mscw_energy_sub.shLog message printed the literal string
DISPBDTinstead of the variablevalue:
C10 —
IRF.optimizeTMVAforGammaHadronSeparation.sh: unsafe regex for* ENERGYBINSFile:
scripts/IRF.optimizeTMVAforGammaHadronSeparation.shRunparameter files use
* ENERGYBINS(literal asterisk). The*withoutescaping is undefined in POSIX BRE right after
^:C11 —
IRF.optimizeTMVAforGammaHadronSeparation.sh: noexit 1on missing effective-area fileFile:
scripts/IRF.optimizeTMVAforGammaHadronSeparation.shThe script printed an error and continued instead of stopping when the
effective-area file was missing:
echo "ERROR: effective area file not found: ${EFFAREA}" +exit 1C20 —
ANALYSIS.v2dl3.sh:LOGIDRtypo and undefined$AFILEFile:
scripts/ANALYSIS.v2dl3.shTwo distinct bugs:
${LOGIDR}(typo) →${LOGDIR}— the cleanuprm -ftargeted thewrong (unexpanded/empty) directory.
$AFILE(undefined variable) →$J(the loop variable) — run IDs inlog output were blank.
C21 —
ANALYSIS.anasum_allcuts.sh: fragile relative./invocationsFile:
scripts/ANALYSIS.anasum_allcuts.shThe script invoked sibling scripts with
./ANALYSIS.…, which fails unlessthe user's working directory is the
scripts/folder:IRF production chain (Section D)
D1+D2 —
IRF.production.sh: redundant inner loops and wrong loop variableFile:
scripts/IRF.production.shThe TRAINTMVA/OPTIMIZETMVA block contained extra
for VX in $EPOCHandfor ATM in $ATMOSloops nested inside the outer epoch/atmosphere loops,causing N_epoch × N_atmo times more job submissions than intended. After
removing the redundant inner loops,
${EPOCH:0:2}(the full epoch list)was replaced with
${VX:0:2}(the correct loop variable):D3 —
IRF.mscw_energy_MC_sub.sh: unquoted glob, can fail on spaces / special charsFile:
scripts/helper_scripts/IRF.mscw_energy_MC_sub.shD4 —
IRF.generate_lookup_table_parts.sh/IRF.mscw_energy_MC.sh: double.shextensionFiles:
scripts/IRF.generate_lookup_table_parts.sh,scripts/IRF.mscw_energy_MC.shFSCRIPTis defined with the.shsuffix, but submission calls used$FSCRIPT.sh, producing a nonexistent….sh.shpath:(Same change applied to all submission/execution branches: SGE, Condor,
GNU parallel, and simple.)
D6 —
IRF.combine_lookup_table_parts.sh: existence check on filename aloneFile:
scripts/IRF.combine_lookup_table_parts.sh$OFILEis just a filename, not a path — the check always failed when thecurrent directory differed from
$ODIR:D7 —
zstdavailability check used non-portablewhichFiles:
scripts/helper_scripts/IRF.mscw_energy_MC_sub.sh,IRF.compress_evndisp_MC_sub.sh,IRF.evndisp_MC_sub.sh,IRF.lookup_table_parallel_sub.shwhich zstdreturns exit 0 even when the command is absent on somesystems. Replaced with POSIX-portable
command -v:D9 —
IRF.optimizeTMVAforGammaHadronSeparation_sub.sh:rmusesMVADIRbefore it is definedFile:
scripts/helper_scripts/IRF.optimizeTMVAforGammaHadronSeparation_sub.shrm -f ${MVADIR}/rates.logappeared 11 lines beforeMVADIR=…wasassigned, so it silently operated on an empty path. Moved to immediately
after the assignment:
D10 —
IRF.production.sh: fragile./IRF.invocationsFile:
scripts/IRF.production.shThree calls used
./IRF.*.sh, which fails when the script is launched fromoutside the
scripts/directory:Bonus —
IRF.mscw_energy_MC_sub.sh: RedHV 55° path was appended, not substitutedFile:
scripts/helper_scripts/IRF.mscw_energy_MC_sub.shAnalogous to C2, for MC production: when zenith angle ≥ 58° and the
simulation type is
CARE_RedHV*, the script should replace the60degor
65degtraining path with55deg, but instead it appended/55deg/to the already-constructed path. Fixed by restructuring the if/else to set
the path once for each case.
XGB angular reconstruction (Section E)
E3 —
IRF.trainXGBforAngularReconstructionBinned.sh: hard-codedRECID0File:
scripts/IRF.trainXGBforAngularReconstructionBinned.shThe input directory always pointed to
MSCW_RECID0_DISPregardless of theRECIDargument ($6):E7 —
IRF.trainXGBforAngularReconstruction_sub.sh: temp dir used undefined$MSCW_FILEFile:
scripts/helper_scripts/IRF.trainXGBforAngularReconstruction_sub.shMSCW_FILEis never set in this script;LLIST(the input.listfile)is the actual variable. The temp directory name was also not unique enough
for parallel runs:
Python / DB scripts (Section C continued)
C12 —
ANALYSIS.v2dl3_sub.sh:pip install -eran on every batch jobFile:
scripts/helper_scripts/ANALYSIS.v2dl3_sub.shInstalling the package unconditionally on every worker job caused unnecessary
network traffic and race conditions. Now guarded with
pip show:C13 —
ANALYSIS.anasum_combine.sh: no completeness check before combiningFile:
scripts/ANALYSIS.anasum_combine.shThe combine step could silently produce an empty/invalid result when the
parallel anasum jobs had not yet finished. A guard now aborts early:
C14 —
ANALYSIS.mscw_energy_sub.sh: DISPDIR used without existence checkFile:
scripts/helper_scripts/ANALYSIS.mscw_energy_sub.shget_disp_dir()can return a path that does not exist (e.g., wrong ZA bin).Added a guard matching the existing TABFILE check pattern:
C15 —
db_run.sh: index variables uninitialized before parser loopsFile:
scripts/db_scripts/db_run.shstart_time_index,end_time_index, andsource_indexwere used in${a[$index]}subscripts on the first loop iteration before the headerline set them. This caused reading from index 0 (silently wrong) rather
than the correct column:
+start_time_index=0 while IFS="|" read -ra a; do …Same fix applied to
end_time_indexandsource_index.C16 —
db_run.sh: tarball presence suppressed laser re-read even when sidecar missingFile:
scripts/db_scripts/db_run.shThe original comment admitted "implementation of testing missing". The
existing
get_file_status()function is now used to also verify the.laserrunsidecar exists inside the tarball:C17 —
db_write_fits.py:convert_table_comment_to_asciicalled beforeNonecheckFile:
scripts/db_scripts/db_write_fits.pyWhen
read_file()returnsNone(unreadable file), the DQM converter wascalled on
None, raising anAttributeError. Moved inside theNoneguard:
C18 —
db_combine_dqm_fits.py:join_type="exact"crashes on schema driftFile:
scripts/db_scripts/db_combine_dqm_fits.pyastropy.table.vstackwithjoin_type="exact"raises an exception whencolumn sets differ across runs (e.g., a new DQM field added mid-season).
Changed to
"outer"so missing columns are filled with masked values:C19 —
db_write_fits.py: unguarded division by zero for L3 rate and dead timeFile:
scripts/db_scripts/db_write_fits.pyl3_values / time_diffsandbusy / ten_mhzcould divide by zero whenconsecutive timestamps are equal or
TenMHzScalerwraps. Both divisionsare now guarded with
np.where:Issues intentionally skipped
$8present)