Merged
35 commits
85b7b3c
WIP: action to check engine against published rules
alexfurmenkov Apr 21, 2026
44da428
added workflow options to test it
alexfurmenkov Apr 22, 2026
77d07d7
debug step
alexfurmenkov Apr 22, 2026
f0273b7
set rules_2 as default branch for open-rules
alexfurmenkov Apr 22, 2026
1771592
set rules_2 as default branch for open-rules
alexfurmenkov Apr 22, 2026
490de0c
Merge branch 'main' into 798-test-against-published
alexfurmenkov Apr 28, 2026
57ce680
report adjustments
alexfurmenkov Apr 30, 2026
37551af
indentation fix
alexfurmenkov Apr 30, 2026
118685e
indentation fix(2)
alexfurmenkov Apr 30, 2026
e9b1a69
indentation fix(3) -- heredoc in tmp file
alexfurmenkov Apr 30, 2026
ac7e2c3
Merge branch 'refs/heads/main' into 798-test-against-published
alexfurmenkov May 11, 2026
7ad65fe
moved validation logic to python script
alexfurmenkov May 11, 2026
0df39d2
removed trigger on feature branch push event
alexfurmenkov May 11, 2026
d1179a3
fix action
alexfurmenkov May 12, 2026
3bdab27
Merge branch 'main' into 798-test-against-published
RamilCDISC May 12, 2026
b05b140
fixed naming in report
alexfurmenkov May 21, 2026
26bccad
Merge branch 'main' into 798-test-against-published
gerrycampion Jun 2, 2026
b63e1c4
temp allow to run on branch
gerrycampion Jun 2, 2026
89ceff7
try to fix failure
gerrycampion Jun 2, 2026
91ff691
still trying
gerrycampion Jun 3, 2026
24713b5
got/actual update
gerrycampion Jun 3, 2026
352cc7e
more actual got fix
gerrycampion Jun 3, 2026
ba9773e
comment change
gerrycampion Jun 3, 2026
8e19c8c
got->actual
gerrycampion Jun 3, 2026
6f93af6
core ids arg to limit number of rules run
gerrycampion Jun 4, 2026
4a4e627
cross
gerrycampion Jun 4, 2026
0f3f781
fix the cross
gerrycampion Jun 4, 2026
6689709
change csv conversion to an engine output format
gerrycampion Jun 4, 2026
d84e54e
let's run the entire suite
gerrycampion Jun 4, 2026
5264865
add unit tests for csv reports
gerrycampion Jun 4, 2026
d46e92a
fix regression test
gerrycampion Jun 4, 2026
39ffafe
Merge branch 'main' into 798-test-against-published
gerrycampion Jun 4, 2026
19ec15e
remove execution column and put exec fails in actual
gerrycampion Jun 4, 2026
5d72869
run all again
gerrycampion Jun 4, 2026
c84ce47
remove todo's
gerrycampion Jun 5, 2026
136 changes: 136 additions & 0 deletions .github/workflows/validate-published-rules.yml
@@ -0,0 +1,136 @@
# ==============================================================================
# This workflow:
# 1. Checks out cdisc-rules-engine (the engine itself)
# 2. Checks out cdisc-open-rules (rules + test data) into ./open-rules/
# 3. Installs engine Python dependencies
# 4. Iterates every Published/ rule from cdisc-open-rules
# 5. Runs the engine against each test case
# 6. Compares actual output with expected results.csv baseline
# 7. Publishes a Markdown report to Job Summary and as an artifact
# ==============================================================================
name: Validate Published Rules

on:
push:
branches:
- main
workflow_dispatch:
inputs:
rules_ref:
description: "Branch/tag/SHA of cdisc-open-rules to validate against"
required: false
default: "main"
core_ids:
description: "Space-separated list of rule IDs to validate (e.g. CORE-000001 CORE-000002). Leave blank to validate all."
required: false
default: ""

jobs:
validate-published-rules:
runs-on: ubuntu-latest
permissions:
contents: read

steps:
# -----------------------------------------------------------------------
# 1. Checkout cdisc-rules-engine
# -----------------------------------------------------------------------
- name: Checkout cdisc-rules-engine
uses: actions/checkout@v6
with:
repository: cdisc-org/cdisc-rules-engine
path: engine
token: ${{ secrets.GITHUB_TOKEN }}

# -----------------------------------------------------------------------
# 2. Checkout cdisc-open-rules (rules + test data + helper scripts)
# -----------------------------------------------------------------------
- name: Checkout cdisc-open-rules
uses: actions/checkout@v6
with:
repository: cdisc-org/cdisc-open-rules
ref: ${{ inputs.rules_ref }}
path: open-rules

# -----------------------------------------------------------------------
# 2b. Debug — verify directory layout
# -----------------------------------------------------------------------
- name: Debug — list workspace layout
run: |
echo "=== Workspace root ==="
ls -la
echo "=== open-rules/ ==="
ls -la open-rules/ || echo "open-rules/ NOT FOUND"
echo "=== open-rules/Published/ (first 10) ==="
ls open-rules/Published/ 2>/dev/null | head -10 || echo "Published/ NOT FOUND"
echo "=== engine/ ==="
ls engine/ | head -10 || echo "engine/ NOT FOUND"

Check warning (Code scanning / CodeQL, Medium): Checkout of untrusted code in a trusted context. Potential unsafe checkout of untrusted pull request on privileged workflow.

# -----------------------------------------------------------------------
# 3. Set up Python
# -----------------------------------------------------------------------
- name: Set up Python 3.12
uses: actions/setup-python@v6
with:
python-version: "3.12"

# -----------------------------------------------------------------------
# 4. Install engine dependencies
# -----------------------------------------------------------------------
- name: Install engine dependencies
run: |
python -m venv venv
./venv/bin/pip install --upgrade pip
./venv/bin/pip install -r engine/requirements.txt

# -----------------------------------------------------------------------
# 5. Run validation for every Published rule
# -----------------------------------------------------------------------
- name: Run validation for all Published rules
id: validate
continue-on-error: true
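env:
# Pass the workflow input through the environment instead of
# interpolating it into the script, so its content is not parsed as shell.
CORE_IDS: ${{ inputs.core_ids }}
run: |
chmod +x open-rules/.github/scripts/run_validation.sh

CORE_IDS_ARG=""
if [ -n "$CORE_IDS" ]; then
CORE_IDS_ARG="--core-ids $CORE_IDS"
fi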

./venv/bin/python engine/scripts/validate_published_rules.py \
--rules-root "$(pwd)/open-rules" \
--engine-dir "$(pwd)/engine" \
--python-cmd "$(pwd)/venv/bin/python" \
--output-dir "$(pwd)" \
$CORE_IDS_ARG

# -----------------------------------------------------------------------
# 6. Upload both reports + raw results as artifacts
# -----------------------------------------------------------------------
- name: Upload validation artifacts
if: always()
uses: actions/upload-artifact@v6
with:
name: published-rules-validation-${{ github.run_id }}
path: |
open-rules/Published/**/results/results.csv
summary_table.md
detail_report.md
if-no-files-found: warn

# -----------------------------------------------------------------------
# 7. Write ONLY the summary table to GitHub Actions Job Summary
# -----------------------------------------------------------------------
- name: Write summary table to workflow summary
if: always()
run: |
[ -f summary_table.md ] && cat summary_table.md >> "$GITHUB_STEP_SUMMARY" || true

# -----------------------------------------------------------------------
# 8. Fail the job if any rule failed
# -----------------------------------------------------------------------
- name: Check overall status
if: steps.validate.outcome == 'failure'
run: |
echo "One or more published rules failed validation — see the artifacts for detail_report.md."
exit 1
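
Note on step 6 of the header comment ("Compares actual output with expected results.csv baseline"): `engine/scripts/validate_published_rules.py` is not part of the diff shown here, so the following is only a minimal sketch of what such a comparison could look like, assuming both files share a header row and that row order does not matter. The function names `read_rows` and `compare_results` are hypothetical and are not taken from the script.

```python
import csv
import io
from collections import Counter


def read_rows(text: str) -> Counter:
    """Parse CSV text into a multiset of data rows (header excluded)."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row
    return Counter(tuple(row) for row in reader)


def compare_results(expected_csv: str, actual_csv: str) -> dict:
    """Hypothetical helper: report rows missing from, or unexpected in, the actual output."""
    expected = read_rows(expected_csv)
    actual = read_rows(actual_csv)
    return {
        "missing": sorted((expected - actual).elements()),
        "unexpected": sorted((actual - expected).elements()),
    }


# Illustrative data only; not taken from cdisc-open-rules.
expected = "Dataset,Record,Variable,Value\nae,1,AESEV,MILD\nae,2,AESEV,\n"
actual = "Dataset,Record,Variable,Value\nae,1,AESEV,MILD\nae,3,AESEV,HIGH\n"

diff = compare_results(expected, actual)
print(diff["missing"])     # rows in the baseline that the engine did not report
print(diff["unexpected"])  # rows the engine reported that the baseline lacks
```

Using a `Counter` rather than a `set` keeps duplicate issue rows significant, which may or may not match what the real script does.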
2 changes: 1 addition & 1 deletion cdisc_rules_engine/constants/__init__.py
@@ -5,7 +5,7 @@

NULL_FLAVORS = ["", None, {}, {None}, [], [None], np.nan]

KNOWN_REPORT_EXTENSIONS = [".json", ".xlsx", ".xls"]
KNOWN_REPORT_EXTENSIONS = [".json", ".xlsx", ".xls", ".csv"]

VALIDATION_FORMATS_MESSAGE = (
"SAS V5 XPT, Dataset-JSON (JSON or NDJSON), or Excel (XLSX)"
1 change: 1 addition & 0 deletions cdisc_rules_engine/enums/report_types.py
@@ -4,3 +4,4 @@
class ReportTypes(BaseEnum):
XLSX = "XLSX"
JSON = "JSON"
CSV = "CSV"
10 changes: 9 additions & 1 deletion cdisc_rules_engine/services/reporting/base_report_data.py
@@ -1,4 +1,4 @@
from abc import ABC
from abc import ABC, abstractmethod
from io import IOBase
from typing import Iterable

@@ -53,3 +53,11 @@ def process_values(values: list[str]) -> list[str]:
else:
processed_values.append(value)
return processed_values

@abstractmethod
def get_csv_rows(self) -> tuple[list[str], list[list[str]]]:
"""
Return (header, rows) for the CSV output format.
Each row is a list of string values matching the header columns.
"""
pass
43 changes: 43 additions & 0 deletions cdisc_rules_engine/services/reporting/csv_report.py
@@ -0,0 +1,43 @@
import csv
import os
from io import IOBase
from typing import override

from cdisc_rules_engine.enums.report_types import ReportTypes
from cdisc_rules_engine.models.validation_args import Validation_args
from cdisc_rules_engine.services.reporting.base_report_data import BaseReportData

from .base_report import BaseReport


class CsvReport(BaseReport):
"""
Writes a results.csv file in the format defined by the report standard,
compatible with the cdisc-open-rules test harness baselines.
"""

def __init__(
self,
report_standard: BaseReportData,
args: Validation_args,
template: IOBase | None = None,
):
super().__init__(report_standard, args, template)

@property
@override
def _file_ext(self) -> str:
return ReportTypes.CSV.value.lower()

@override
def write_report(self) -> None:
output_dir = os.path.dirname(self._output_name)
if output_dir:
os.makedirs(output_dir, exist_ok=True)

header, rows = self._report_standard.get_csv_rows()

with open(self._output_name, "w", newline="", encoding="utf-8") as fh:
writer = csv.writer(fh)
writer.writerow(header)
writer.writerows(rows)
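
Note on `write_report`: the write pattern above can be tried outside the engine. This is a standalone sketch that reproduces only the directory creation and `csv.writer` calls; `write_csv` is a hypothetical free function standing in for the method, and the sample row is made up.

```python
import csv
import os
import tempfile


def write_csv(path: str, header: list[str], rows: list[list[str]]) -> None:
    """Hypothetical stand-in for CsvReport.write_report, without the engine classes."""
    output_dir = os.path.dirname(path)
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results", "results.csv")
    write_csv(
        path,
        ["Dataset", "Record", "Variable", "Value"],
        [["ae", "1", "AETERM", "HEADACHE, MILD"]],
    )
    with open(path, newline="", encoding="utf-8") as fh:
        content = fh.read()

print(repr(content))
```

With the default dialect, `csv.writer` terminates rows with `\r\n` and quotes only fields that need it, so a value containing a comma is written in double quotes. A baseline comparison that reads files as raw text rather than through `csv.reader` would need to account for both.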
2 changes: 2 additions & 0 deletions cdisc_rules_engine/services/reporting/report_factory.py
@@ -13,6 +13,7 @@
from .base_report import BaseReport
from .excel_report import ExcelReport
from .json_report import JsonReport
from .csv_report import CsvReport


class ReportFactory:
@@ -46,6 +47,7 @@ def __init__(
self._output_type_service_map: dict[str, Type[BaseReport]] = {
ReportTypes.XLSX.value: ExcelReport,
ReportTypes.JSON.value: JsonReport,
ReportTypes.CSV.value: CsvReport,
}
self._standard_type_map: dict[str, Type[BaseReportData]] = {
"usdm": USDMReportData,
12 changes: 12 additions & 0 deletions cdisc_rules_engine/services/reporting/sdtm_report_data.py
@@ -347,6 +347,18 @@ def _generate_error_details(
)
return errors

def get_csv_rows(self) -> tuple[list[str], list[list[str]]]:
header = ["Dataset", "Record", "Variable", "Value"]
rows = []
for issue in self.data_sheets.get("Issue Details", []):
dataset = (issue.get("dataset") or "").removesuffix(".csv")
record = str(issue.get("row", ""))
variables = issue.get("variables") or []
values = issue.get("values") or []
for variable, value in zip(variables, values):
rows.append([dataset, record, variable, str(value)])
return header, rows

def get_rules_report_data(self) -> list[dict]:
"""
Generates the rules report data that goes into the excel export.
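
Note on the SDTM `get_csv_rows`: the flattening logic can be exercised on its own. The sketch below copies the loop body into a hypothetical free function, `flatten_issues`, and runs it on made-up issue dictionaries; the shape of real "Issue Details" entries is assumed from the keys the method reads.

```python
def flatten_issues(issues: list[dict]) -> list[list[str]]:
    """Hypothetical free-function copy of the loop in get_csv_rows (SDTM variant)."""
    rows = []
    for issue in issues:
        dataset = (issue.get("dataset") or "").removesuffix(".csv")
        record = str(issue.get("row", ""))
        variables = issue.get("variables") or []
        values = issue.get("values") or []
        # zip stops at the shorter sequence, so unmatched variables are dropped.
        for variable, value in zip(variables, values):
            rows.append([dataset, record, variable, str(value)])
    return rows


# Illustrative issues only; not real engine output.
issues = [
    {
        "dataset": "ae.csv",
        "row": 3,
        "variables": ["AESTDTC", "AEENDTC"],
        "values": ["2024-02-01", "2024-01-15"],
    },
    {"dataset": "dm.csv", "row": 1, "variables": ["AGE", "SEX"], "values": [None]},
]

for row in flatten_issues(issues):
    print(row)
```

Two behaviours are worth keeping in mind when comparing against baselines: a variable with no corresponding value produces no row at all, and a `None` value is written as the string `None` because of `str(value)`.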
11 changes: 11 additions & 0 deletions cdisc_rules_engine/services/reporting/usdm_report_data.py
@@ -245,6 +245,17 @@ def _generate_error_details(
)
return errors

def get_csv_rows(self) -> tuple[list[str], list[list[str]]]:
header = ["path", "attribute", "value"]
rows = []
for issue in self.data_sheets.get("Issue Details", []):
path = issue.get("path") or ""
attributes = issue.get("attributes") or []
values = issue.get("values") or []
for attribute, value in zip(attributes, values):
rows.append([path, attribute, str(value)])
return header, rows

def get_rules_report_data(self) -> list[dict]:
"""
Generates the rules report data that goes into the excel export.
16 changes: 8 additions & 8 deletions docs/cli-reference.md
@@ -69,14 +69,14 @@ python core.py validate --help

### Output

| Flag | Description |
| -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
| `-o, --output TEXT` | Output file path (without extension). Extension is added automatically based on format. |
| `-of, --output-format [JSON\|XLSX]` | Output format. |
| `-rr, --raw-report` | Raw output format (JSON only). |
| `-mr, --max-report-rows INTEGER` | Max rows in the Issue Details tab of Excel output (default: 1000; 0 = unlimited). Also via `MAX_REPORT_ROWS` env var. |
| `-me, --max-errors-per-rule INTEGER BOOLEAN` | Limit errors per rule. Format: `-me <limit> <per_dataset_flag>`. See below. |
| `-rt, --report-template TEXT` | Path to a custom Excel report template. |
| Flag | Description |
| -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `-o, --output TEXT` | Output file path (without extension). Extension is added automatically based on format. |
| `-of, --output-format [JSON\|XLSX\|CSV]` | Output format. `CSV` writes issue rows directly (Dataset, Record, Variable, Value for SDTM; path, attribute, value for USDM), compatible with the open-rules test harness. |
| `-rr, --raw-report` | Raw output format (JSON only). |
| `-mr, --max-report-rows INTEGER` | Max rows in the Issue Details tab of Excel output (default: 1000; 0 = unlimited). Also via `MAX_REPORT_ROWS` env var. |
| `-me, --max-errors-per-rule INTEGER BOOLEAN` | Limit errors per rule. Format: `-me <limit> <per_dataset_flag>`. See below. |
| `-rt, --report-template TEXT` | Path to a custom Excel report template. |

#### `--max-errors-per-rule` Detail
