Merged
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,2 +1,4 @@
.DS_Store
.env
pr-summary.md
backend/bin/sheets/
20 changes: 19 additions & 1 deletion AGENTS.md
@@ -52,14 +52,32 @@
- Accept a file path as the first positional argument and fall back to stdin.
- Add an entry to `dotpy/README.md` following the existing format.

## Pull Request Summaries

- **When the developer asks for a PR summary**, write it to `pr-summary.md` in
  the project root and open the file so they can select-all and copy from the
  editor. `pr-summary.md` is listed in `.gitignore` and will never be
  accidentally committed.
- Use standard GitHub-flavoured Markdown: `##` / `###` headings, `**bold**`,
  inline backticks, and bullet lists. Do not use HTML tags.
- Structure the summary as:
  1. **`## <Title>`** — one-line description matching the branch purpose.
  2. **`### Summary`** — 2–4 sentences on what the PR does and why.
  3. **`### Changes`** — one bold entry per changed file or directory with
     bullet sub-points explaining what changed.
  4. **`### Notes`** *(optional)* — follow-up items, known limitations, or
     things the reviewer should verify manually.
- Delete `pr-summary.md` after the PR is created; do not commit it.
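
The structure listed above can be sketched as a small Python helper. This is only an illustration, not part of the repository: the function name `render_pr_summary`, its parameters, and the exact way a "bold entry" with sub-points is rendered are all assumptions.

```python
def render_pr_summary(title, summary, changes, notes=None):
    """Return PR summary text in the Title / Summary / Changes / Notes layout.

    Hypothetical helper. `changes` maps a changed file or directory to a
    list of bullet points; `notes` is an optional list of follow-up items.
    """
    lines = [f"## {title}", "", "### Summary", "", summary, "", "### Changes", ""]
    for path, points in changes.items():
        # One bold entry per changed file or directory (assumed rendering).
        lines.append(f"- **`{path}`**")
        lines.extend(f"  - {point}" for point in points)
    if notes:
        # The Notes section is optional and omitted when there is nothing to say.
        lines += ["", "### Notes", ""]
        lines.extend(f"- {note}" for note in notes)
    return "\n".join(lines) + "\n"
```

The returned string could then be written to `pr-summary.md` in the project root, for example with `pathlib.Path("pr-summary.md").write_text(text, encoding="utf-8")`.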

## Markdown Formatting

- **Format tables correctly**: Every column in a Markdown table must be padded so that all cells in that column (header, separator, and every data row) are the same width. The separator row must use dashes (`-`) at least as wide as the widest cell in each column. Mismatched widths cause IDE warnings ("Table is not correctly formatted").
- Determine the widest cell in each column (considering the rendered source text, not the display text of links).
- Pad every shorter cell with trailing spaces to match that width.
- Use the same number of dashes in the separator row as the column width.
- **The data rows — not just the header — define the required column width.** The header and separator must be padded/extended to match the widest data cell, not the other way around.
- To compute the exact separator, run: `python3 dotpy/calc_widths.py <file.md>` — it prints the maximum between-pipe width per column and the ready-to-paste separator row for every table in the file.
- To auto-format a table (strip whitespace, recalculate all widths, pad in place), run: `python3 dotpy/format_table.py <file.md>` — rewrites the file with every table correctly padded. **Use this first.**
- To compute the exact separator without editing, run: `python3 dotpy/calc_widths.py <file.md>` — it prints the maximum between-pipe width per column and the ready-to-paste separator row for every table in the file.
- To validate alignment after editing, run: `python3 dotpy/check_tables.py <file.md>` — exits `0` if all tables are consistent, `1` with error details if not.
- If a table requires very long lines (e.g., > 120 characters per row), prefer using a shorter link display text or a bullet-list format instead of a wide table.
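
The padding rule described above (find the widest cell per column, pad every shorter cell, match the dashes to the column width) can be sketched in a few lines of Python. This is an illustrative sketch and not the code of `dotpy/format_table.py`, whose implementation is not shown here; the names `split_row` and `format_table` are hypothetical. It assumes one well-formed table whose second line is the separator, the same number of cells in every row, no escaped pipes inside cells, and no alignment colons in the separator.

```python
def split_row(line):
    """Split one Markdown table row into stripped cell strings."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def format_table(lines):
    """Return the table with every column padded to its widest cell."""
    rows = [split_row(line) for line in lines]
    header, body = rows[0], rows[2:]  # rows[1] is the old separator; it is rebuilt
    # Width is measured on the source text (len counts characters, not display
    # width), across the header and every data row, with a floor of one dash.
    widths = [
        max(1, *(len(row[i]) for row in [header] + body))
        for i in range(len(header))
    ]

    def render(cells):
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    separator = "| " + " | ".join("-" * width for width in widths) + " |"
    return [render(header), separator] + [render(row) for row in body]
```

Because the widths are taken over the data rows as well as the header, a long data cell widens the header and separator, which is the behaviour the bullet on data rows asks for.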

20 changes: 20 additions & 0 deletions DONE.md
@@ -1,5 +1,25 @@
# DONE

## 2026-04-22T00:00:00 — Backend Bin Scripts README
Documented all 58 scripts and data files in `backend/bin/` (excluding
`README.md` itself and the CSV inventory sheets under `sheets/`). Initial
descriptions were synthesised from the v2 CSV inventory, then corrected via
systematic file-by-file review. Thirteen factual errors were found and fixed;
all corrections are documented as numbered footnotes in a "Review Notes"
section appended to the README. All six Markdown tables pass `check_tables.py`
validation.

- [x] Extract descriptions for all files from inventory v2 CSV
- [x] Document top-level scripts: `additonal_stats_data`, `consolidate_ips`, `embargo-item`, `find_crawlers`, `ip_stats_data`, `meta_file`, `meta_file_delete`, `meta_file_orcid`, `monthly_report`, `prep-logs`, `remove_ips`, `update_all_ip_table`
- [x] Document `aptrust/` directory: `email-aptrust-bagging-errors`, `metadata2aptrust-info.xsl`, `metadata2bag-info.xsl`, `move-to-aptrust-regular`, `report-aptrust-status`, `saxon9he.jar`
- [x] Document `cronjobs/` directory: `check-checksum`, `filter-media-cronjob`, `find-items-to-unrestrict-bio`, `find-items-to-unrestrict-rack`, `meta_file_author`, `meta_file_delete`, `meta_file_desc`, `perl_mailer`, `pubmedV2`, `report-about-dearborn-items`, `report-double-original`, `report-embargo`, `report-tombstone`, `report-too-many-authors`, `report-too-many-db-connections`, `update_author_list_too_long`
- [x] Document `monthlies/` directory: `JOURNAL_SUBJECTS`, `change-type-martha`, `check-retiree.rb`, `clear-bit-description`, `find-authors-monthly`, `find-bit-change-perpmission`, `prepare-wiley`, `prepare-wiley_1`, `prepare-wiley_2`, `prepare-wiley_3`, `prepare-wiley_4`, `replace-funny-char-for-wiley`, `report-martha-types`, `report-users-out`, `update-orcid-values-monthly`
- [x] Document `rackham/` directory: `prepare-rackham`, `report-for-rackham`
- [x] Document `stats_monthlies/` directory: `StatsUtils.pm`, `find-alicia-stats`, `find-size-stats`, `monthly_admin_report`, `monthly_admin_report_for_bentley`, `monthly_individual_report`, `monthly_individual_report_based_on_author`
- [x] Write `backend/bin/README.md` with all sections
- [x] Validate README.md tables with `python3 dotpy/check_tables.py backend/bin/README.md`
- [x] Verify with the developer that the task is complete

## 2026-04-21T00:00:00 — Address Minor Issues from PR Review (DEEPBLUE-466/Refactor)
Resolved all actionable follow-up items flagged during the PR review: consolidated
`dspace/backend.dockerfile` ant/wget layers, merged `dspace-uid/solr.dockerfile`
1 change: 1 addition & 0 deletions TODO.md
@@ -1,6 +1,7 @@
# TODO



## Scrub Deleted `.cpt` Files from Git History
The five encrypted config files (`backend/config/*.cpt`) and the production log
(`backend/logs/dspace.log.2023-11-01`) were deleted from the working tree in