PathoGenOmics-Lab/get_MNV

Multi-Nucleotide Variant detection - codon-level annotation from VCF or iVar TSV. Pure Rust · no C dependencies · cross-platform (macOS, Linux, Windows)

Quick Start · GUI · Features · Docs · Citation

Paula Ruiz-Rodriguez¹ and Mireia Coscolla¹
¹ Institute for Integrative Systems Biology, I2SysBio, University of Valencia-CSIC, Valencia, Spain


What is get_MNV?

get_MNV finds cases where two or more SNVs fall in the same codon and should be interpreted together. These combined changes can produce a different amino acid effect than the individual SNVs alone.
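
As a worked illustration of why the codon must be read as a whole, the sketch below translates a hypothetical reference codon with each SNV applied alone and then with both applied together. The codon, the offsets, and the helper names (`translate`, `apply_snvs`) are assumptions made for this example and are not part of get_MNV; only the standard genetic code (NCBI table 1) is used.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(codon: str) -> str:
    """Return the one-letter amino acid for a DNA codon."""
    i, j, k = (BASES.index(base) for base in codon.upper())
    return AMINO_ACIDS[16 * i + 4 * j + k]


def apply_snvs(codon: str, snvs: dict) -> str:
    """Apply {offset_in_codon: alt_base} substitutions to a codon."""
    bases = list(codon)
    for offset, alt in snvs.items():
        bases[offset] = alt
    return "".join(bases)


ref = "GGA"        # hypothetical reference codon (Gly)
snv_a = {0: "C"}   # first SNV, interpreted alone
snv_b = {1: "A"}   # second SNV, interpreted alone

alone_a = translate(apply_snvs(ref, snv_a))
alone_b = translate(apply_snvs(ref, snv_b))
together = translate(apply_snvs(ref, {**snv_a, **snv_b}))

print(translate(ref), alone_a, alone_b, together)  # G R E Q
```

Read separately, the two SNVs suggest Gly to Arg and Gly to Glu. Read together, the codon becomes CAA, so the actual change is Gly to Gln.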

The tool takes:

  • Variant calls: VCF or iVar variants.tsv
  • Reference sequence: FASTA
  • Gene annotation: GFF/GFF3/GTF or a simple TSV file
  • Optional aligned reads: BAM, used to count SNP and MNV read support

It writes annotated variants as TSV, VCF, or both.

Warning

get_MNV is designed for SNV/MNV interpretation. VCF insertions and deletions can be reported as INDEL, but their codon and amino-acid annotation is limited. iVar TSV indel rows, such as +A or -N, are skipped. For detailed indel consequence annotation, use a dedicated variant annotation tool.

MNV amino acid reclassification

Main features:

  • Groups SNVs by codon and reports SNP, MNV, or SNP/MNV calls
  • Recalculates amino acid changes from the full codon haplotype
  • Reads VCF and iVar TSV variant calls
  • Uses BAM reads, when provided, to count SNP/MNV support and strand bias
  • Supports 9 NCBI genetic code tables
  • Includes a desktop GUI for drag-and-drop analysis
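
The grouping step in the first feature can be sketched as follows. This is a simplified illustration, not get_MNV's implementation: it assumes a single forward-strand CDS with 1-based coordinates and no introns, and `group_by_codon` and `cds_start` are hypothetical names.

```python
from collections import defaultdict


def group_by_codon(positions, cds_start):
    """Group 1-based genomic positions by 0-based codon index.

    Assumes one contiguous forward-strand CDS that starts at cds_start.
    """
    groups = defaultdict(list)
    for pos in sorted(positions):
        groups[(pos - cds_start) // 3].append(pos)
    return dict(groups)


# Hypothetical CDS starting at position 100 with four SNV positions.
groups = group_by_codon([100, 104, 105, 110], cds_start=100)
print(groups)  # {0: [100], 1: [104, 105], 3: [110]}

# Codons hit by more than one SNV are the MNV candidates.
candidates = {codon: pos for codon, pos in groups.items() if len(pos) > 1}
print(candidates)  # {1: [104, 105]}
```

A reverse-strand or multi-exon feature would need the codon index computed from the spliced, strand-aware coding coordinate instead of a simple offset.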

Installation

Desktop GUI

Download the latest release for your platform:

Platform                  Download
🍎 macOS (Apple Silicon)  get_MNV_1.1.3_aarch64.dmg
🍎 macOS (Intel)          get_MNV_1.1.3_x64.dmg
🐧 Linux                  Releases page
🪟 Windows                Releases page

Note

macOS users: The app is not signed with an Apple Developer certificate. On first launch, right-click the app → Open → click Open in the dialog. See Apple support for details.

All releases are available on the Releases page.

Command line

conda install -c bioconda get_mnv

or download a pre-built binary:

wget https://github.com/PathoGenOmics-Lab/get_MNV/releases/latest/download/get_mnv
chmod +x get_mnv
./get_mnv --help

or build from source:

git clone https://github.com/PathoGenOmics-Lab/get_MNV.git
cd get_MNV
cargo install --path .

Quick Start

VCF input

get_mnv \
  --vcf variants.vcf \
  --fasta reference.fasta \
  --gff genes.gff3

iVar TSV input

get_mnv \
  --tsv sample_variants.tsv \
  --bam reads.bam \
  --fasta reference.fasta \
  --gff genes.gff3

Use --tsv for the variants.tsv file produced by ivar variants.

With BAM read support

get_mnv \
  --vcf variants.vcf \
  --bam reads.bam \
  --fasta reference.fasta \
  --gff genes.gff3

TSV and VCF output

get_mnv \
  --vcf variants.vcf \
  --bam reads.bam \
  --fasta reference.fasta \
  --gff genes.gff3 \
  --both \
  --summary-json run.summary.json \
  --run-manifest run.manifest.json

Run get_mnv --help for the full list of options.

Common Arguments

Argument                 What it does
--vcf <FILE>             Variant input file in VCF/BCF format.
--tsv <FILE>             iVar variants.tsv input file.
--bam <FILE>             Optional sorted and indexed BAM for read support.
--fasta <FILE>           Reference FASTA. Contig names must match the variant file.
--gff <FILE>             Gene annotation in GFF/GFF3/GTF format.
--genes <FILE>           Simple gene annotation TSV. Use instead of --gff.
--gff-features <LIST>    Feature types to analyze, for example CDS or gene,pseudogene.
--quality <N>            Minimum variant quality. Default: 20.
--min-mapq <N>           Minimum read mapping quality when using BAM. Default: 0.
--snp <N>                Minimum SNP-supporting reads. Default: 0.
--min-snp-frequency <F>  Minimum BAM-derived SNP frequency, from 0 to 1. Default: 0.
--min-snp-strand <N>     Minimum SNP-supporting reads required on each strand. Default: 0.
--mnv <N>                Minimum MNV-supporting reads. Default: 0.
--min-mnv-frequency <F>  Minimum BAM-derived MNV haplotype frequency, from 0 to 1. Default: 0.
--min-mnv-strand <N>     Minimum MNV-supporting reads required on each strand. Default: 0.
--both                   Write both TSV and VCF outputs.
--summary-json <FILE>    Write a machine-readable run summary.
--run-manifest <FILE>    Write command, version, inputs, outputs, and checksums.

Frequency filters use read support recalculated from --bam, not the original OFREQ value from VCF/iVar input. Use values such as 0.05 for 5% or 0.20 for 20%. When VCF output is requested, low-frequency records are skipped by default or marked with FILTER=LowFrequency when --emit-filtered is enabled.

SNP and MNV frequency filters are independent: --min-snp-frequency applies to individual SNP observations, while --min-mnv-frequency applies to the phased MNV haplotype. In mixed SNP/MNV calls, a strong MNV haplotype is not removed just because the individual SNP observations are below the SNP threshold.

The read-count and strand-support filters are independent in the same way: --snp and --min-snp-strand apply to SNP observations, while --mnv and --min-mnv-strand apply to the MNV haplotype.
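
The independence of the two frequency filters can be illustrated with a small sketch. It is an assumption-laden model, not the tool's code: the `Thresholds` class, the `evaluate` function, and the choice of total read depth as the denominator are all hypothetical, and get_MNV may compute frequencies differently.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Hypothetical mirror of --min-snp-frequency and --min-mnv-frequency.
    min_snp_frequency: float = 0.0
    min_mnv_frequency: float = 0.0


def frequency(support_reads: int, depth: int) -> float:
    """Fraction of reads supporting an observation (0.0 when depth is 0)."""
    return support_reads / depth if depth else 0.0


def evaluate(snp_reads: int, mnv_reads: int, depth: int, t: Thresholds) -> dict:
    """Check the SNP and MNV observations against their own thresholds."""
    return {
        "snp": frequency(snp_reads, depth) >= t.min_snp_frequency,
        "mnv": frequency(mnv_reads, depth) >= t.min_mnv_frequency,
    }


# Mixed call: 3 reads carry a lone SNV, 40 reads carry the full haplotype.
result = evaluate(snp_reads=3, mnv_reads=40, depth=100,
                  t=Thresholds(min_snp_frequency=0.05, min_mnv_frequency=0.20))
print(result)  # {'snp': False, 'mnv': True}
```

In this model the SNP observation falls below 5% while the MNV haplotype, at 40%, passes its own 20% threshold and is kept.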

Outputs

By default, get_MNV writes:

<input_name>.MNV.tsv

With --convert or --both, it also writes:

<input_name>.MNV.vcf

The most important output fields are:

Column        Meaning
Chromosome    Contig name
Gene          Gene or feature name
Positions     One position for SNPs, multiple positions for MNVs
Base Changes  Alternative bases
AA Changes    Amino acid change after combining SNVs in the codon
Variant Type  SNP, MNV, SNP/MNV, or INDEL
Change Type   Synonymous, non-synonymous, stop gained/lost, unknown, etc.

When a BAM is provided, extra columns report read depth, SNP support, MNV support, frequency, and strand counts.
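
For downstream scripting, the TSV can be read with any tab-aware parser. The sketch below assumes the file is tab-delimited and uses exactly the column names listed above; the sample rows are embedded so the example runs on its own, and `read_mnv_rows` is a hypothetical helper name.

```python
import csv
import io

# Embedded sample with the documented columns (tab-delimited is an assumption).
SAMPLE = (
    "Chromosome\tGene\tPositions\tBase Changes\tAA Changes\tVariant Type\tChange Type\n"
    "MTB_anc\tRv0095c\t104838\tT\tAsp126Glu\tSNP\tNon-synonymous\n"
    "MTB_anc\tRv0095c\t104941,104942\tT,G\tGly92Gln\tSNP/MNV\tNon-synonymous\n"
    "MTB_anc\tesxL\t1341102,1341103\tT,C\tArg33Ser\tMNV\tNon-synonymous\n"
)


def read_mnv_rows(handle):
    """Return rows whose Variant Type is MNV or SNP/MNV."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if "MNV" in row["Variant Type"].split("/")]


rows = read_mnv_rows(io.StringIO(SAMPLE))
for row in rows:
    positions = [int(p) for p in row["Positions"].split(",")]
    print(row["Gene"], positions, row["AA Changes"])
```

To read a real result, replace `io.StringIO(SAMPLE)` with `open("<input_name>.MNV.tsv", newline="")`.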

Features

Feature              Description
MNV detection        Groups SNVs in the same codon and reclassifies as MNVs
Accurate AA changes  Computes amino acid changes from the full codon haplotype
Read support         BAM-based SNP/MNV read counts with strand-specific metrics
Strand bias          Fisher exact test (SB, FS, SOR) with configurable filtering
Multiple outputs     TSV, VCF (plain/BGZF+Tabix), BCF, JSON summary, run manifest
Parallel             Multi-threaded contig processing with Rayon
Genetic codes        9 NCBI translation tables (1, 2, 3, 4, 5, 6, 11, 12, 25)
Flexible input       VCF or iVar TSV variant calls; GFF3/GTF or TSV annotations; multi-contig and multi-sample VCFs
Validation           Dry-run mode, strict metrics, input checksums, error JSON
Desktop GUI          Native Tauri app with drag-and-drop, genomic track viewer, dark mode
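
For readers unfamiliar with the strand-bias metrics named in the table, here is a minimal sketch of a two-sided Fisher exact test on a 2x2 table of reference/alternate reads by forward/reverse strand, followed by Phred scaling as used for GATK-style FS values. Whether get_MNV computes its SB, FS, and SOR fields in exactly this way is an assumption; the function names are hypothetical.

```python
from math import comb, log10


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher exact p-value for a 2x2 strand table."""
    row_ref = ref_fwd + ref_rev
    row_alt = alt_fwd + alt_rev
    col_fwd = ref_fwd + alt_fwd
    total = row_ref + row_alt
    if total == 0:
        return 1.0
    denom = comb(total, col_fwd)

    def prob(x):
        # Hypergeometric probability of x forward reads in the ref row.
        return comb(row_ref, x) * comb(row_alt, col_fwd - x) / denom

    observed = prob(ref_fwd)
    lowest = max(0, col_fwd - row_alt)
    highest = min(row_ref, col_fwd)
    # Sum every table that is at most as likely as the observed one.
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


def phred(p):
    """Phred-scale a p-value; larger values mean stronger strand bias."""
    return max(0.0, -10 * log10(max(p, 1e-300)))


balanced = fisher_exact_two_sided(10, 10, 10, 10)  # no strand imbalance
skewed = fisher_exact_two_sided(3, 0, 0, 3)        # alt reads on one strand only
print(round(balanced, 6), round(skewed, 6), round(phred(skewed), 3))
```

A balanced table gives a p-value of 1 and a Phred score of 0, while the fully skewed table gives a small p-value and a higher score.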

Desktop GUI

The desktop app gives the same analysis workflow in a visual interface:

  • Drop VCF or iVar TSV variant files
  • Drop FASTA, GFF/GTF/GFF3, and optional BAM files
  • Choose common parameters from the form
  • Run one sample or multiple matched samples
  • Inspect, filter, and export results

To run the GUI in development mode or build the production bundle from a source checkout:

bash scripts/dev.sh   # development
bash scripts/build_gui_bundle.sh  # production .app + .dmg bundle

Example Output

Chromosome  Gene      Positions       Base Changes  AA Changes  Variant Type  Change Type
MTB_anc     Rv0095c   104838          T             Asp126Glu   SNP           Non-synonymous
MTB_anc     Rv0095c   104941,104942   T,G           Gly92Gln    SNP/MNV       Non-synonymous
MTB_anc     esxL      1341102,1341103 T,C           Arg33Ser    MNV           Non-synonymous

Variant types:

  • SNP: single nucleotide change, one SNV per codon
  • MNV: all reads carry multiple SNVs together (Multi-Nucleotide Variant)
  • SNP/MNV: some reads carry individual SNVs, others carry the MNV combination
  • INDEL: insertion or deletion; detected/reported with limited codon and amino-acid annotation
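
One way to read the SNP, MNV, and SNP/MNV definitions is as a decision on read counts. The sketch below is an interpretation of the list above, not get_MNV's actual logic; `classify` and its parameters are hypothetical, and it ignores thresholds, INDELs, and the case where no BAM is provided.

```python
def classify(snvs_in_codon: int, mnv_reads: int, snp_only_reads: int) -> str:
    """Label a codon from the reads that support it.

    mnv_reads: reads carrying all SNVs of the codon together.
    snp_only_reads: reads carrying only some of the SNVs.
    """
    if snvs_in_codon < 2:
        return "SNP"
    if mnv_reads > 0 and snp_only_reads > 0:
        return "SNP/MNV"
    if mnv_reads > 0:
        return "MNV"
    # The SNVs share a codon but never occur on the same read.
    return "SNP"


print(classify(1, 0, 25))   # SNP
print(classify(2, 40, 0))   # MNV
print(classify(2, 30, 12))  # SNP/MNV
```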

Documentation

Document         Description
Usage            Full CLI reference and examples
Input formats    VCF, FASTA, GFF, TSV, BAM specifications
Output formats   TSV, VCF, BCF, JSON output details
Troubleshooting  Common errors and solutions
Benchmarking     Performance testing
Changelog        Version history

For Developers

The core CLI and library live in src/. The desktop app uses Tauri in src-tauri/ and React/TypeScript in frontend/.

Useful commands:

cargo test --workspace
npm run build --prefix frontend
bash scripts/build_get_mnv.sh
bash scripts/build_gui_bundle.sh

Limitations

  • Designed for SNVs against a reference sequence
  • VCF insertions and deletions are detected but not fully codon-annotated; iVar TSV indel notation such as +A or -N is skipped
  • Multiallelic VCF records require --split-multiallelic or pre-splitting (bcftools norm -m -)
  • Variant contig names must match FASTA and GFF exactly
  • Multiple transcripts per gene: when using --gff-features CDS with a GFF file that contains multiple transcripts for the same gene, each transcript is annotated independently, producing one output line per transcript per variant. If you want a single line per variant, filter your GFF to keep only the canonical transcript before running get_MNV (e.g., using AGAT agat_sp_keep_longest_isoform.pl or a similar tool)
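
Because contig names must match exactly, a quick pre-flight check can save a failed run. The sketch below compares the CHROM column of a plain-text VCF with FASTA header names. It assumes the contig name is the first whitespace-delimited token of each FASTA header; the sample data and helper names are hypothetical, and a GFF check could be added the same way using its first column.

```python
import io


def fasta_contigs(handle):
    """Contig names: first token of each non-empty FASTA header line."""
    return {line[1:].split()[0] for line in handle
            if line.startswith(">") and line[1:].strip()}


def vcf_contigs(handle):
    """Contig names from the CHROM column of uncompressed VCF text."""
    return {line.split("\t", 1)[0] for line in handle
            if line.strip() and not line.startswith("#")}


def missing_contigs(vcf_handle, fasta_handle):
    """VCF contigs that have no exact match in the FASTA."""
    return vcf_contigs(vcf_handle) - fasta_contigs(fasta_handle)


# Hypothetical inputs; the second record differs only in letter case.
FASTA = ">MTB_anc ancestral reference\nACGTACGT\n"
VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "MTB_anc\t104838\t.\tG\tT\t60\tPASS\t.\n"
    "mtb_anc\t104941\t.\tC\tT\t60\tPASS\t.\n"
)

missing = missing_contigs(io.StringIO(VCF), io.StringIO(FASTA))
print(sorted(missing))  # ['mtb_anc']
```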

Citation

If you use get_MNV in your research, please cite:

Ruiz-Rodriguez P, Coscolla M. get_MNV: Multi-Nucleotide Variant detection tool. Zenodo. doi: 10.5281/zenodo.13907423

@software{ruiz-rodriguez_get_mnv_2026,
  title     = {get\_MNV: Multi-Nucleotide Variant detection tool},
  author    = {Ruiz-Rodriguez, Paula and Coscoll{\'a}, Mireia},
  year      = {2026},
  doi       = {10.5281/zenodo.13907423},
  url       = {https://github.com/PathoGenOmics-Lab/get_MNV},
  version   = {1.1.3},
  license   = {AGPL-3.0}
}

License

GNU Affero General Public License v3.0

Fun

Click for the 3D printable logo:

get_MNV 3D logo

