Skip to content

Feature Request: parallel matching for prescored variants #82

@yangyxt

Description

@yangyxt

When I try to run CADD on a VCF file with 200k variants, I found the prescore match step executed by extract_scored.py is pretty time consuming. I think maybe this step can be accelerated by parallel matching per chromosome.

I suggest split the prescore file to 24 pieces by chromosome and split the input VCF to pieces by chormosome as well. For each chromosome, perform the extract_scored.py once and let them perform in parallel.

If it is OK for you, I can offer a PR later. Thanks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions