⚡ Create batch processing mode #11

@oscarvalenzuelab

Description

Implement batch processing capabilities to analyze multiple directories or repositories efficiently in a single operation, with parallel processing and progress tracking.

Acceptance Criteria

  • Create BatchProcessor class in shpi/core/batch.py
  • Support multiple input methods:
    • List of directory paths
    • File containing directory paths
    • Git repository with multiple projects
  • Implement parallel processing with configurable worker count
  • Add progress tracking and reporting
  • Aggregate results across multiple directories
  • Export batch results in multiple formats (JSON, CSV, XML)
  • Handle individual directory failures gracefully
  • Add resume capability for interrupted batch jobs
  • Implement batch job configuration and scheduling
  • Add comprehensive unit tests for batch processing
  • Document batch processing usage and best practices
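
As a starting point for discussion, the core of these criteria (configurable worker count, progress reporting, aggregation, and graceful handling of individual failures) could look roughly like the sketch below. Everything beyond the `BatchProcessor` name is an assumption made for illustration: the `analyze` coroutine stands in for whatever per-directory analysis the project exposes, and `BatchResult`, `max_workers`, and `on_progress` are hypothetical names, not an existing API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Dict, Iterable, Optional


@dataclass
class BatchResult:
    """Aggregated outcome of one batch run (hypothetical structure)."""
    succeeded: Dict[str, object] = field(default_factory=dict)  # path -> result
    failed: Dict[str, str] = field(default_factory=dict)        # path -> error text


class BatchProcessor:
    """Sketch only: `analyze` is a placeholder for the real per-directory analysis."""

    def __init__(
        self,
        analyze: Callable[[str], Awaitable[object]],
        max_workers: int = 4,
        on_progress: Optional[Callable[[int, int], None]] = None,
    ) -> None:
        if max_workers < 1:
            raise ValueError("max_workers must be at least 1")
        self.analyze = analyze
        self.max_workers = max_workers
        self.on_progress = on_progress

    async def run(self, paths: Iterable[str]) -> BatchResult:
        paths = list(paths)
        total = len(paths)
        done = 0
        result = BatchResult()
        # The semaphore caps how many directories are analyzed at the same time.
        semaphore = asyncio.Semaphore(self.max_workers)

        async def worker(path: str) -> None:
            nonlocal done
            async with semaphore:
                try:
                    result.succeeded[path] = await self.analyze(path)
                except Exception as exc:
                    # Record the failure and keep going with the other directories.
                    result.failed[path] = f"{type(exc).__name__}: {exc}"
                finally:
                    done += 1
                    if self.on_progress is not None:
                        self.on_progress(done, total)

        await asyncio.gather(*(worker(p) for p in paths))
        return result
```

A caller would run it with something like `asyncio.run(BatchProcessor(analyze, max_workers=8).run(paths))`. Note that `asyncio.gather` creates one task per path up front; for very large batches, a fixed pool of workers pulling from an `asyncio.Queue` would keep memory use bounded. Resume support could build on `BatchResult` by persisting the finished paths and skipping them on the next run.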

Technical Notes

  • Use asyncio for concurrent processing
  • Consider memory usage with large batch jobs
  • Implement proper error isolation between directories
  • Add rate limiting coordination across parallel workers
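
One possible way to coordinate rate limiting across asyncio workers is a single limiter object shared by all of them, sketched below. `SharedRateLimiter` and `rate_per_second` are hypothetical names chosen for this example, and the fixed-interval spacing is just one simple policy; a token bucket would allow short bursts instead.

```python
import asyncio


class SharedRateLimiter:
    """Hypothetical helper: spaces calls from all workers to at most
    `rate_per_second` calls per second."""

    def __init__(self, rate_per_second: float) -> None:
        if rate_per_second <= 0:
            raise ValueError("rate_per_second must be positive")
        self._interval = 1.0 / rate_per_second
        self._next_slot = 0.0  # event-loop time of the next free slot

    async def acquire(self) -> None:
        # Reserving a slot involves no await, so it cannot be interleaved
        # with another coroutine on the same event loop; no lock is needed.
        now = asyncio.get_running_loop().time()
        slot = max(now, self._next_slot)
        self._next_slot = slot + self._interval
        if slot > now:
            await asyncio.sleep(slot - now)
```

Each worker would call `await limiter.acquire()` immediately before a rate-limited request (for example, a remote API call), so the limit holds for the batch as a whole regardless of the worker count.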

Priority

Low (Advanced Feature)

Labels

enhancement, batch-processing, performance
