Skip to content

File list and settings file extensions#208

Merged
Alex-1089 merged 4 commits into
mainfrom
feat/alex/SP-4349_file-list-input
May 21, 2026
Merged

File list and settings file extensions#208
Alex-1089 merged 4 commits into
mainfrom
feat/alex/SP-4349_file-list-input

Conversation

@Alex-1089
Copy link
Copy Markdown
Contributor

@Alex-1089 Alex-1089 commented May 20, 2026

Summary by CodeRabbit

  • New Features

    • Added --files-from / -ff flag to scan a newline-delimited list of file paths from a text file
    • Scan input validation updated so exactly one input method is required; --files and --files-from cannot be combined
  • Tests

    • Added tests covering file-list parsing (whitespace trimming, comments, empty lines, CRLF, UTF-8) and error/empty-file handling
  • Chores

    • Version bumped to 1.53.0

Review Change Stack

@Alex-1089 Alex-1089 requested a review from eeisegn May 20, 2026 16:08
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

Warning

Rate limit exceeded

@Alex-1089 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 39 minutes and 38 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1132110a-a148-443e-a209-647b657fddc2

📥 Commits

Reviewing files that changed from the base of the PR and between 907067c and fecd9f0.

📒 Files selected for processing (1)
  • src/scanoss/cli.py
📝 Walkthrough

Walkthrough

Version bumped to 1.53.0 and scan gains --files-from/-ff to read newline-delimited file paths. A new load_files_from_file() parses paths (trim, skip blanks and # comments). CLI validation enforces a single input source and routes parsed lists into scanner.scan_files_with_options. Tests and changelog updated.

Changes

Files-from CLI feature

Layer / File(s) Summary
Version bump to 1.53.0
src/scanoss/__init__.py
__version__ constant updated from 1.52.1 to 1.53.0.
CLI imports and option
src/scanoss/cli.py
Adds Optional typing import and new --files-from/-ff argument on the scan subcommand.
Argument validation and execution routing
src/scanoss/cli.py
Validation extended to require exactly one input source and to reject --files + --files-from; execution branch loads file list via load_files_from_file() and forwards to scanner.scan_files_with_options(), exiting on empty lists.
File-list loader implementation
src/scanoss/cli.py
Adds load_files_from_file(filepath) that reads newline-delimited paths, trims whitespace, skips blank and # comment lines, and returns an empty list on read failure.
File-list loader test suite
tests/cli-test.py
New TestLoadFilesFromFile test class and import updates; tests cover trimming, skipping blanks/comments, empty/nonexistent/None/empty-string inputs, Windows CRLF lines, and UTF‑8 paths.
CHANGELOG update
CHANGELOG.md
Adds v1.53.0 Unreleased changelog entry for --files-from and appends [1.52.1] / [1.53.0] compare links.

Sequence Diagram

sequenceDiagram
  participant CLI as Scan CLI
  participant Validator as Arg Validator
  participant Loader as load_files_from_file
  participant Scanner as scanner.scan_files_with_options

  CLI->>Validator: parse args (files/files_from/dir/stdin/dep/wfp)
  Validator->>Validator: ensure exactly one input mode
  Validator-->>CLI: validation passed or help/error
  CLI->>Loader: read files_from filepath
  Loader->>Loader: open file, read lines, trim, skip blanks/comments
  Loader-->>CLI: file_list (or empty)
  CLI->>Scanner: scanner.scan_files_with_options(file_list)
  Scanner-->>CLI: scan results
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

Suggested reviewers

  • eeisegn
  • ortizjeronimo
  • scanossmining

Poem

🐰 A list of paths in a file so neat,
I trim the edges, skip blank lines, skip the sheet,
Comments I ignore, UTF‑8 I embrace,
Tests hop around to confirm every case,
Version bumped up — a tidy little trace.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'File list and settings file extensions' is vague and overly broad. While the PR does add file-list functionality, the actual changes focus on adding a --files-from CLI option for scanning files from a list, not on 'settings file extensions'. Consider a more specific title like 'Add --files-from CLI option to scan files from a list' that clearly describes the primary feature being added.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/alex/SP-4349_file-list-input

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 111
  • Detected files undeclared: 0
  • Detected files declared: 111
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/scanoss/cli.py`:
- Around line 2359-2366: The file-list parsing currently opens files with
open(..., encoding='utf-8') and only catches (OSError, IOError), so a
UnicodeDecodeError will crash the CLI; update the error handling around the
open/for loop to also catch UnicodeDecodeError (e.g., except (OSError, IOError,
UnicodeDecodeError) as e) and call print_stderr with the same formatted message
using filepath and error e, or alternatively open the file with errors='replace'
to avoid decoding exceptions; ensure the code paths that append to files and
call print_stderr remain unchanged.
- Around line 1565-1568: The current conflict check uses "if args.files and
args.files_from" which treats an empty --files list as false and misses the
conflict; update the condition to explicitly detect a provided-but-empty --files
(e.g., treat args.files as a provided value when it's not None) by changing the
check to something like "if args.files_from and args.files is not None" or "if
args.files_from and (args.files is not None and len(args.files) >= 0)" so that
when args.files was supplied (even as an empty list) and args.files_from is set,
the branch runs; adjust the block around print_stderr, parser.parse_args and
sys.exit accordingly to ensure the error triggers for empty --files inputs.
- Around line 1714-1717: When load_files_from_file(args.files_from) returns no
usable entries, the code currently calls sys.exit(1) silently; update the block
around load_files_from_file and the subsequent check so that if file_list is
falsy you print a clear user-facing error to stderr (or use existing logger)
mentioning the provided args.files_from and that no valid file paths were found
(handle empty/comment-only files), then exit with code 1; keep the call to
scanner.scan_files_with_options(file_list, args.dep, scanner.winnowing.file_map)
unchanged when file_list is valid.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3c42a109-ebe7-42a9-a749-41f812f8aa35

📥 Commits

Reviewing files that changed from the base of the PR and between 33df277 and da3fd78.

📒 Files selected for processing (3)
  • src/scanoss/__init__.py
  • src/scanoss/cli.py
  • tests/cli-test.py

Comment thread src/scanoss/cli.py
Comment thread src/scanoss/cli.py Outdated
Comment thread src/scanoss/cli.py
@github-actions
Copy link
Copy Markdown

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 111
  • Detected files undeclared: 0
  • Detected files declared: 111
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/scanoss/cli.py (2)

2366-2367: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle Unicode decode failures in file-list parsing.

Opening with encoding='utf-8' can raise UnicodeDecodeError when the file contains invalid UTF-8 sequences. The current exception handler catches RuntimeError, which is unrelated to file decoding errors. This will cause the CLI to crash with an unhandled exception instead of printing the graceful error message on line 2367.

🔧 Suggested fix
-        except (OSError, IOError, RuntimeError) as e:
+        except (OSError, IOError, UnicodeError) as e:
             print_stderr(f'ERROR: Failed to read input file; {filepath}: {e}')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/scanoss/cli.py` around lines 2366 - 2367, The file-list parsing error
handler currently catches (OSError, IOError, RuntimeError) but misses
UnicodeDecodeError, so add UnicodeDecodeError to the exception tuple (e.g.,
except (OSError, IOError, UnicodeDecodeError) as e) in the block that reads the
input file (the code around the file-list parsing / read logic and the except
that calls print_stderr with filepath) so decoding failures produce the same
graceful error print instead of an unhandled crash; remove the unrelated
RuntimeError if desired.

1565-1568: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix --files/--files-from conflict detection for empty --files input.

The condition if args.files and args.files_from: treats an empty --files list as falsy and misses the conflict. When a user provides --files without values, args.files becomes [], which is falsy, so both modes may be accepted together.

🔧 Suggested fix
-    if args.files and args.files_from:
+    if args.files is not None and args.files_from is not None:
         print_stderr('Please specify only one of --files or --files-from')
         parser.parse_args([args.subparser, '-h'])
         sys.exit(1)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/scanoss/cli.py` around lines 1565 - 1568, Change the conflict check to
detect the presence of the --files flag even when it yielded an empty list:
replace the current condition using truthiness with an explicit None check, e.g.
use "if args.files is not None and args.files_from is not None" (referencing
args.files and args.files_from in the block that currently calls print_stderr
and parser.parse_args) so an empty [] for args.files is treated as a provided
value and the mutual-exclusion error is triggered.
🧹 Nitpick comments (1)
src/scanoss/cli.py (1)

2342-2368: 💤 Low value

Consider returning None on read failure to distinguish from empty input.

When the file cannot be read, the function prints an error and returns an empty list. The caller then prints a second, generic error message ("No valid file paths found"). This produces two error messages for a single failure, which may confuse users into thinking the file was read successfully but contained no valid paths.

Consider returning None on exception (matching the Optional[list] return type hint on line 1714) and updating the caller to distinguish between read failure and empty/comment-only input.

♻️ Proposed fix
         except (OSError, IOError, UnicodeError) as e:
             print_stderr(f'ERROR: Failed to read input file; {filepath}: {e}')
+            return None
-    return files
+    return files if files or filepath else None

Then update caller to handle None:

     elif args.files_from:
         file_list: Optional[list] = load_files_from_file(args.files_from)
-        if not file_list:
+        if file_list is None:
+            sys.exit(1)
+        if not file_list:
             print_stderr(f'ERROR: No valid file paths found in: {args.files_from}')
             sys.exit(1)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/scanoss/cli.py` around lines 2342 - 2368, The function
load_files_from_file should return None on read failure instead of an empty list
so callers can distinguish a read error from a legitimately empty or
comment-only file; modify load_files_from_file to catch exceptions and return
None (and update any type hints to Optional[list]) and keep printing the
detailed error via print_stderr, then update the caller(s) that call
load_files_from_file to check for None (treat as read failure and emit a single,
specific error message) versus an empty list (treat as no valid paths) so you
don't produce two error messages for one failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/scanoss/cli.py`:
- Around line 2366-2367: The file-list parsing error handler currently catches
(OSError, IOError, RuntimeError) but misses UnicodeDecodeError, so add
UnicodeDecodeError to the exception tuple (e.g., except (OSError, IOError,
UnicodeDecodeError) as e) in the block that reads the input file (the code
around the file-list parsing / read logic and the except that calls print_stderr
with filepath) so decoding failures produce the same graceful error print
instead of an unhandled crash; remove the unrelated RuntimeError if desired.
- Around line 1565-1568: Change the conflict check to detect the presence of the
--files flag even when it yielded an empty list: replace the current condition
using truthiness with an explicit None check, e.g. use "if args.files is not
None and args.files_from is not None" (referencing args.files and
args.files_from in the block that currently calls print_stderr and
parser.parse_args) so an empty [] for args.files is treated as a provided value
and the mutual-exclusion error is triggered.

---

Nitpick comments:
In `@src/scanoss/cli.py`:
- Around line 2342-2368: The function load_files_from_file should return None on
read failure instead of an empty list so callers can distinguish a read error
from a legitimately empty or comment-only file; modify load_files_from_file to
catch exceptions and return None (and update any type hints to Optional[list])
and keep printing the detailed error via print_stderr, then update the caller(s)
that call load_files_from_file to check for None (treat as read failure and emit
a single, specific error message) versus an empty list (treat as no valid paths)
so you don't produce two error messages for one failure.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3512a71c-8365-47d4-bb9c-f2384fd704de

📥 Commits

Reviewing files that changed from the base of the PR and between da3fd78 and 907067c.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • src/scanoss/cli.py
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md

@github-actions
Copy link
Copy Markdown

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 111
  • Detected files undeclared: 0
  • Detected files declared: 111
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@github-actions
Copy link
Copy Markdown

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 111
  • Detected files undeclared: 0
  • Detected files declared: 111
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@Alex-1089 Alex-1089 merged commit d24f64a into main May 21, 2026
6 checks passed
@Alex-1089 Alex-1089 deleted the feat/alex/SP-4349_file-list-input branch May 21, 2026 08:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants