Skip to content

feat(version-scanner): support target list inputs via --targets#17478

Draft
chalmerlowe wants to merge 1 commit into
mainfrom
feat/version-scanner-targets
Draft

feat(version-scanner): support target list inputs via --targets#17478
chalmerlowe wants to merge 1 commit into
mainfrom
feat/version-scanner-targets

Conversation

@chalmerlowe

Copy link
Copy Markdown
Contributor

This pull request adds support for scanning multiple dependency target packages and versions in a single recursive scan run.

Key changes:

  • Adds a unified --targets parameter supporting YAML/JSON targets files or inline configuration mappings (e.g. python: [3.7, 3.8]).
  • Integrates a consolidated parser which resolves multiple target tuples.
  • Updates repository scanning logic to perform single-pass multi-target scanning.
  • Adds comprehensive unit tests validating parser string/file inputs and scanning engine checks.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for scanning multiple dependency targets simultaneously via a new --targets CLI argument, which accepts either a YAML/JSON file path or an inline string. It updates the scanning, rule resolution, and CSV reporting logic to handle multiple targets, and adds comprehensive unit tests. The review feedback highlights two critical issues where calling os.path.exists() on inline JSON/YAML strings containing invalid path characters can cause the script to crash on Windows. Additionally, the reviewer recommends improving error handling for non-existent target files and validating parsed version types to prevent unexpected scanning behavior.

Comment on lines +590 to +623
raw_targets = {}
content = targets_input

# Check if the input is a file path
if os.path.exists(targets_input):
try:
with open(targets_input, 'r', encoding='utf-8') as f:
content = f.read()
except PermissionError:
print(f"Error: Permission denied reading targets file: {targets_input}", file=sys.stderr)
sys.exit(1)
except Exception as e:
print(f"Error reading targets file {targets_input}: {e}", file=sys.stderr)
sys.exit(1)

try:
raw_targets = yaml.safe_load(content)
except Exception as e:
print(f"Error parsing targets YAML/JSON content: {e}", file=sys.stderr)
sys.exit(1)

if not isinstance(raw_targets, dict):
print("Error: Targets input must resolve to a JSON object or YAML mapping", file=sys.stderr)
sys.exit(1)

targets = []
for dep, versions in raw_targets.items():
if isinstance(versions, list):
for v in versions:
targets.append((str(dep), str(v)))
else:
targets.append((str(dep), str(versions)))

return targets

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This implementation has a few issues:

  1. Windows Crash: Calling os.path.exists() on Windows with an inline JSON/YAML string (which contains invalid path characters like {, ", :, [, ]) will raise an OSError and crash the script.
  2. Confusing Error Message: If a user passes a file path that does not exist, os.path.exists() returns False, and the script falls back to parsing the path string as YAML. This succeeds (parsing as a plain string) but then fails the isinstance(raw_targets, dict) check, printing a confusing error about the input not resolving to a JSON object or YAML mapping instead of a "File not found" error.
  3. Invalid Version Types: If a dependency has a null/None version or a nested dictionary/list structure, it is converted directly to a string (e.g., "None" or "{'major': 3}"), leading to incorrect scanning behavior.

We can resolve all of these by checking if the input looks like a file path first, handling non-existent files gracefully, and validating the parsed version types. Avoid broad except-blocks that silently pass; instead, log the exception to aid in debugging.

    raw_targets = {}
    content = targets_input
    
    # Check if the input is a file path
    is_file = False
    looks_like_file = not any(char in targets_input for char in ('{', '\\n', ':')) or targets_input.endswith(('.json', '.yaml', '.yml'))
    if looks_like_file:
        try:
            if os.path.exists(targets_input):
                is_file = True
            else:
                print(f"Error: Targets file not found: {targets_input}", file=sys.stderr)
                sys.exit(1)
        except Exception as e:
            logging.warning(f"Failed to check if path exists: {e}")

    if is_file:
        try:
            with open(targets_input, 'r', encoding='utf-8') as f:
                content = f.read()
        except PermissionError:
            print(f"Error: Permission denied reading targets file: {targets_input}", file=sys.stderr)
            sys.exit(1)
        except Exception as e:
            print(f"Error reading targets file {targets_input}: {e}", file=sys.stderr)
            sys.exit(1)

    try:
        raw_targets = yaml.safe_load(content)
    except Exception as e:
        print(f"Error parsing targets YAML/JSON content: {e}", file=sys.stderr)
        sys.exit(1)
        
    if not isinstance(raw_targets, dict):
        print("Error: Targets input must resolve to a JSON object or YAML mapping", file=sys.stderr)
        sys.exit(1)
        
    targets = []
    for dep, versions in raw_targets.items():
        if isinstance(versions, list):
            for v in versions:
                if v is None or isinstance(v, (dict, list)):
                    print(f"Error: Invalid version '{v}' for dependency '{dep}'", file=sys.stderr)
                    sys.exit(1)
                targets.append((str(dep), str(v)))
        elif versions is not None and not isinstance(versions, dict):
            targets.append((str(dep), str(versions)))
        else:
            print(f"Error: Invalid version '{versions}' for dependency '{dep}'", file=sys.stderr)
            sys.exit(1)
            
    return targets
References
  1. Avoid broad except Exception: blocks that silently return None. Instead, log the exception (e.g., using logger.warning) to aid in debugging and prevent masking underlying issues.

Comment on lines +803 to +808
if has_targets_list:
if os.path.exists(args.targets):
base_name = os.path.splitext(os.path.basename(args.targets))[0]
else:
base_name = "targets"
output_path = os.path.join(results_dir, f"{base_name}-{timestamp}.csv")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Calling os.path.exists(args.targets) when args.targets is an inline JSON/YAML string can raise an OSError on Windows due to invalid path characters (such as {, ", [). This will crash the script at the very end of its execution. Wrapping this check in a try-except block ensures it fails gracefully and defaults to "targets" as the base name. Be sure to log the exception rather than letting it pass silently.

Suggested change
if has_targets_list:
if os.path.exists(args.targets):
base_name = os.path.splitext(os.path.basename(args.targets))[0]
else:
base_name = "targets"
output_path = os.path.join(results_dir, f"{base_name}-{timestamp}.csv")
if has_targets_list:
base_name = "targets"
try:
if os.path.exists(args.targets):
base_name = os.path.splitext(os.path.basename(args.targets))[0]
except Exception as e:
logging.warning(f"Failed to check if targets path exists: {e}")
output_path = os.path.join(results_dir, f"{base_name}-{timestamp}.csv")
References
  1. Avoid broad except Exception: blocks that silently return None. Instead, log the exception (e.g., using logger.warning) to aid in debugging and prevent masking underlying issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant