MGPOCKY/Action-Analyzer

Action-Analyzer

A TypeScript-based static analysis tool that parses GitHub Actions workflow YAML files using @actions/workflow-parser and analyzes CI patterns across top-starred GitHub repositories.


Project Structure

Action-Analyzer/
├── src/
│   ├── index.ts              # Main analyzer (version, library, self-hosted, readability checks)
│   ├── use-analyzer.ts       # Validates reusable action commit existence via GitHub API
│   ├── package-parser.ts     # Matches Node.js version against package.json declarations
│   └── package-parser.py     # Matches Python version against pyproject.toml declarations
├── result/
│   ├── python/               # Analysis results for Python repositories
│   └── node/                 # Analysis results for TypeScript repositories
├── patches/                  # patch-package patch files
├── dist/                     # esbuild bundle output
└── .github/workflows/
    └── sample.yml            # Sample workflow for local single-file testing

How to Run

1. Install dependencies

npm install

The postinstall hook automatically runs patch-package, applying the required patches to @actions/workflow-parser and @actions/expressions.

2. Run the analyzer

npm run start

By default, this parses .github/workflows/sample.yml.

3. Select an analysis function

Uncomment the desired function at the bottom of src/index.ts:

// pythonVersionAnalyzer(false);    // Extract Python versions → python_version_result.txt
// nodeVersionAnalyzer(false);      // Extract Node.js versions → node_version_result.txt
// libraryInstallAnalyzer(false);   // Check missing package install in TS workflows
pythonLibraryInstallAnalyzer(false); // Check missing pip install before python script (currently active)
// useCommandExtractor(false);      // Collect all uses: values → use_command_result.txt
// selfHostChecker(false);          // Check self-hosted runner label ordering
// nameChecker(false);              // Check readability (missing step names)
// parseChecker(false);             // Generate list of successfully parsed workflows

Each function takes a single isLocal argument. Pass true instead of false to run against the local analyzer/ directory for testing.

4. Python version mismatch analysis

# Run after python_version_result.txt is generated
python src/package-parser.py

5. Node.js version mismatch analysis

Uncomment the nodeMain() call at the bottom of src/package-parser.ts, then run the script (for example with npx tsx, as in step 6). node_version_result.txt must already exist.

6. Reusable action validation

# Run after use_command_result.txt is generated
npx tsx src/use-analyzer.ts

This script calls the GitHub API, so the Personal Access Token embedded in the source must be valid.

7. Update CSV aggregations

python result/python/update.py
python result/node/update.py

Each script reads the txt result files and updates the aggregation columns in py-github-stars.csv and ts-github-stars.csv, respectively.


Analysis Pipeline

python-repos / typescript-repos (locally cloned repositories)
        │
        ▼
 src/index.ts  ──────────────────────────────────────────────────────────────┐
  · nodeVersionAnalyzer / pythonVersionAnalyzer                              │
  · libraryInstallAnalyzer / pythonLibraryInstallAnalyzer                    │
  · useCommandExtractor / selfHostChecker / nameChecker / parseChecker       │
        │                                                                    │
        ▼                                                                    │
*_version_result.txt / use_command_result.txt                               │
        │                                                                    │
        ├─► src/package-parser.py  → result/python/*.txt                    │
        ├─► src/package-parser.ts  → result/node/*.txt                      │
        └─► src/use-analyzer.ts    → result/python|node/reuse_*.txt ◄───────┘
                │
                ▼
        result/python/update.py  →  py-github-stars.csv
        result/node/update.py    →  ts-github-stars.csv

Core Code

src/index.ts — Main Analyzer

Provides shared utilities to recursively scan workflow directories and parse each YAML file.
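
As a rough illustration of the recursive scan described above, the following sketch collects YAML files that sit inside a .github/workflows directory. The names findWorkflowFiles and isWorkflowFile are hypothetical and are not taken from src/index.ts; the actual utilities may differ.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: true for *.yml / *.yaml files directly inside .github/workflows.
function isWorkflowFile(filePath: string): boolean {
  const normalized = filePath.split(path.sep).join("/");
  return /(^|\/)\.github\/workflows\/[^/]+\.ya?ml$/.test(normalized);
}

// Hypothetical helper: walk a directory tree and return all workflow files, sorted.
function findWorkflowFiles(root: string): string[] {
  const found: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full); // descend into subdirectories
      } else if (isWorkflowFile(full)) {
        found.push(full);
      }
    }
  };
  walk(root);
  return found.sort();
}
```

Each returned path could then be handed to the parser. Symbolic links are not followed in this sketch.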

| Function | What it checks | Output |
| --- | --- | --- |
| nodeVersionAnalyzer | Extracts node-version from actions/setup-node, evaluates matrix expressions | node_version_result.txt |
| pythonVersionAnalyzer | Extracts python-version from actions/setup-python | python_version_result.txt |
| libraryInstallAnalyzer | Detects missing npm/yarn/pnpm/bun install before lint, build, test, deploy, release | Console output |
| pythonLibraryInstallAnalyzer | Detects missing pip install before python *.py execution | Console output |
| useCommandExtractor | Collects all uses: values across all workflows | use_command_result.txt |
| selfHostChecker | Checks whether self-hosted is the first element in the runs-on array | self_host_checker_result.txt |
| nameChecker | Detects jobs that have at least 2 steps and contain an unnamed step longer than 300 characters | result/readability_issue_workflows.txt |
| parseChecker | Generates a list of successfully parsed workflow file paths | result/python/parsed_workflows_list.txt |
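
Two of the checks listed above can be sketched on simplified data. The Step shape, the regular expressions, and the function names hasMisplacedSelfHosted and findStepsMissingInstall are assumptions made for illustration; the real analyzers work on the parsed workflow template and may use different rules.

```typescript
// Hypothetical, simplified step shape (not the @actions/workflow-parser type).
interface Step {
  run?: string;
  uses?: string;
}

// True when runs-on is an array that lists "self-hosted" anywhere but first.
function hasMisplacedSelfHosted(runsOn: string | string[]): boolean {
  if (!Array.isArray(runsOn)) return false;
  return runsOn.indexOf("self-hosted") > 0;
}

// Assumed patterns: a deliberately small subset of install and task commands.
const INSTALL = /\b(npm (ci|install)|yarn install|pnpm install|bun install)\b/;
const TASK = /\b(npm|yarn|pnpm|bun)\s+(run\s+)?(lint|build|test|deploy|release)\b/;

// Returns the indexes of steps that run a task before any install step was seen.
function findStepsMissingInstall(steps: Step[]): number[] {
  const flagged: number[] = [];
  let installed = false;
  steps.forEach((step, index) => {
    const script = step.run || "";
    if (INSTALL.test(script)) {
      installed = true; // covers "npm ci && npm test" in a single step too
    } else if (!installed && TASK.test(script)) {
      flagged.push(index);
    }
  });
  return flagged;
}
```

For example, a job whose steps are checkout, npm run lint, npm ci, npm test would have only the lint step flagged, because the test step runs after an install.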

The base types in @actions/workflow-parser expose neither a with field nor a step_len field, so the patches in patches/ add them as _with and step_len, respectively.
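
To show how a patched step_len field might feed the readability check, here is a minimal sketch. PatchedStep, Job, and hasReadabilityIssue are hypothetical names, and the sketch assumes step_len holds the character length of the step; the 300-character threshold and the 2-step minimum come from the nameChecker description.

```typescript
// Hypothetical, simplified shapes; step_len mirrors the field added by the patches.
interface PatchedStep {
  name?: string;
  step_len: number; // assumed: character length of the step
}

interface Job {
  id: string;
  steps: PatchedStep[];
}

const MAX_UNNAMED_STEP_LEN = 300;

// A job is flagged when it has at least 2 steps and one of them is
// unnamed and longer than the threshold.
function hasReadabilityIssue(job: Job): boolean {
  if (job.steps.length < 2) return false;
  return job.steps.some(
    (step) => !step.name && step.step_len > MAX_UNNAMED_STEP_LEN
  );
}
```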

src/package-parser.py — Python Version Matching

Compares python_version_result.txt (versions declared in workflows) against each repository's pyproject.toml / .python-version (declared required version).

  • Mismatch: result/python/version_not_matched_workflows.txt
  • Pre-release versions: result/python/prerelease_workflow_versions.txt
  • Unparseable versions: result/python/invalid_workflow_versions.txt
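
The three outcomes above can be illustrated with a small classifier. The actual script is written in Python; this TypeScript sketch only mirrors the idea, and the names classifyPythonVersion and meetsMinimum, the regular expressions, and the restriction to a simple >= lower bound are all assumptions. Interpreter prefixes such as pypy are treated as unparseable here.

```typescript
type VersionKind = "release" | "prerelease" | "invalid";

// Assumed classification rules for a python-version string from a workflow.
function classifyPythonVersion(raw: string): VersionKind {
  const v = raw.trim();
  if (/^\d+(\.\d+){0,2}(\.x)?$/.test(v)) return "release";
  if (/^\d+(\.\d+){0,2}(-dev|\.?(a|b|rc)\d+|-(alpha|beta|rc)(\.\d+)?)$/.test(v)) {
    return "prerelease";
  }
  return "invalid";
}

// Compares a workflow version against the ">=X.Y" part of a requires-python
// string. Returns null when either side cannot be parsed.
function meetsMinimum(version: string, requiresPython: string): boolean | null {
  const bound = />=\s*(\d+)(?:\.(\d+))?/.exec(requiresPython);
  const actual = /^(\d+)(?:\.(\d+))?/.exec(version.trim());
  if (!bound || !actual) return null;
  const boundMajor = Number(bound[1]);
  const boundMinor = Number(bound[2] || "0");
  const major = Number(actual[1]);
  // A missing minor ("3" or "3.x") is treated as the newest minor of that major.
  const minor = actual[2] === undefined ? Infinity : Number(actual[2]);
  return major > boundMajor || (major === boundMajor && minor >= boundMinor);
}
```

Under these assumptions, a workflow pinned to 3.7 in a repository declaring requires-python >=3.8 would count as a mismatch.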

src/package-parser.ts — Node.js Version Matching

Compares node_version_result.txt against each repository's package.json engines.node / volta.node / .nvmrc.

  • Version mismatch: result/node/1. version_not_found_workflows.txt
  • Odd major version (non-LTS): result/node/2. odd_version_workflows.txt
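
The odd-major check relies on the Node.js convention that only even-numbered major releases become LTS. A minimal sketch, with hypothetical helper names nodeMajor and isOddMajor that are not taken from src/package-parser.ts:

```typescript
// Extracts the major version from strings such as "18", "18.x" or "v19.2.0".
// Returns null for aliases and ranges (for example "lts/*" or ">=18").
function nodeMajor(spec: string): number | null {
  const match = /^v?(\d+)(?:\.|$)/.exec(spec.trim());
  return match ? Number(match[1]) : null;
}

// True when the version names an odd (non-LTS) major release.
function isOddMajor(spec: string): boolean {
  const major = nodeMajor(spec);
  return major !== null && major % 2 === 1;
}
```

Matching against engines.node ranges would need a semver-aware comparison, which this sketch leaves out.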

src/use-analyzer.ts — Reusable Action Commit Validation

Reads use_command_result.txt and validates each owner/repo@sha action reference against the GitHub Commits API. References whose commits cannot be found are written to result.txt.
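
As an illustration of the first half of that process, the sketch below splits an owner/repo@sha reference and builds the URL of the GitHub REST endpoint for a single commit. ActionRef, parseActionRef, and commitApiUrl are hypothetical names; the request itself, authentication, and the handling of error responses are omitted and may work differently in src/use-analyzer.ts.

```typescript
interface ActionRef {
  owner: string;
  repo: string;
  sha: string;
}

// Parses "owner/repo@<40-hex-sha>" (an optional "/sub/path" before "@" is ignored).
// Returns null for tags, branches, local actions ("./...") and docker:// references.
function parseActionRef(uses: string): ActionRef | null {
  const match = /^([^/@\s]+)\/([^/@\s]+)(?:\/[^@\s]*)?@([0-9a-f]{40})$/i.exec(
    uses.trim()
  );
  if (!match) return null;
  return { owner: match[1], repo: match[2], sha: match[3].toLowerCase() };
}

// URL of the "get a commit" endpoint for the referenced SHA.
function commitApiUrl(ref: ActionRef): string {
  return `https://api.github.com/repos/${ref.owner}/${ref.repo}/commits/${ref.sha}`;
}
```

A non-success response from that URL would suggest the pinned commit cannot be found in the action repository.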

result/*/update.py — CSV Aggregation

Extracts repository folder names from each txt result file and appends per-category counts as columns to py-github-stars.csv / ts-github-stars.csv.
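
The aggregation scripts are written in Python; the following TypeScript sketch only mirrors the counting idea. It assumes that each result line is a workflow path containing a repositories directory (such as python-repos) followed by the repository folder, and that the first CSV column holds that folder name. Both assumptions, and the names countByRepo and appendCountColumn, are illustrative.

```typescript
// Counts result lines per repository folder found right after "/<reposDir>/".
function countByRepo(lines: string[], reposDir: string): Map<string, number> {
  const counts = new Map<string, number>();
  const marker = `/${reposDir}/`;
  for (const line of lines) {
    const normalized = line.trim().replace(/\\/g, "/");
    const at = normalized.indexOf(marker);
    if (at === -1) continue; // line does not point into the repositories directory
    const repo = normalized.slice(at + marker.length).split("/")[0];
    if (!repo) continue;
    counts.set(repo, (counts.get(repo) || 0) + 1);
  }
  return counts;
}

// Appends a count column to CSV rows (first row is the header,
// first cell of each data row is assumed to be the repository folder name).
function appendCountColumn(
  rows: string[][],
  column: string,
  counts: Map<string, number>
): string[][] {
  const [header, ...body] = rows;
  return [
    [...header, column],
    ...body.map((row) => [...row, String(counts.get(row[0]) || 0)]),
  ];
}
```

Repositories with no flagged workflows receive a count of 0 rather than an empty cell in this sketch.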


Result Folder Contents

result/python/ — Python Repository Analysis Results

| File | Contents |
| --- | --- |
| parsed_workflows_list.txt | Absolute paths of all successfully parsed workflow files (6,545 entries) |
| 1. version_not_matched_workflows.txt | Workflows where the Python version in the workflow does not match requires-python in pyproject.toml |
| 3. reuse_action_not_found_workflows copy.txt | Workflows containing reusable actions whose commits could not be found via the GitHub API |
| 4. reuse_action_not_exist_workflows.txt | Workflows referencing reusable action repositories that do not exist |
| 5. invalid_self_host_workflows.txt | Workflows where self-hosted is not the first element in the runs-on array |
| 6. readability_issue_workflows.txt | Workflows containing jobs with unnamed steps longer than 300 characters |
| py-github-stars.csv | Metadata and per-category issue counts for 1,000 Python repositories |
| update.py | Aggregation script that reads txt results and updates CSV columns |

CSV aggregation columns: workflow_count, version_not_matched_workflows, prerequisite_violation_workflows, reuse_action_not_found_workflows, reuse_action_not_exist_workflows, invalid_self_host_workflows, readability_issue_workflows

result/node/ — TypeScript Repository Analysis Results

| File | Contents |
| --- | --- |
| parsed_workflows_list.txt | Absolute paths of all successfully parsed workflow files (7,081 entries) |
| 1. version_not_found_workflows.txt | Workflows where the Node.js version does not match engines in package.json or .nvmrc |
| 2. odd_version_workflows.txt | Workflows using an odd (non-LTS) Node.js major version |
| 3. library_workflows.txt | Workflows running lint/build/test without a preceding package install step |
| 4. reuse_action_not_found_workflows.txt | Workflows containing reusable actions that failed GitHub API validation |
| 5. invalid_self_host_workflows.txt | Workflows where self-hosted is not the first element in the runs-on array |
| 6. readability_issue_workflows.txt | Workflows containing jobs with unnamed steps longer than 300 characters |
| ts-github-stars.csv | Metadata and per-category issue counts for 1,000 TypeScript repositories |
| update.py | Aggregation script that reads txt results and updates CSV columns |

CSV aggregation columns: workflow_count, version_not_found_workflows, odd_version_workflows, prerequisite_violation_workflows, reuse_action_not_found_workflows, invalid_self_host_workflows, readability_issue_workflows
