@@ -49,10 +49,10 @@ Input:
  Example input file:
  sample1 /path/to/your/file.vcf
  sample2 /path/to/your/file2.vcf
- * --reference: path to the FASTA genome reference (indexes expected *.fai, *.dict)
- * --vcf-without-ad: indicate when the VCFs to normalize do not have the FORMAT/AD annotation
 
 Optional input:
+ * --reference: path to the FASTA genome reference (indexes expected *.fai, *.dict) [required for normalization]
+ * --vcf-without-ad: indicate when the VCFs to normalize do not have the FORMAT/AD annotation
  * --output: the folder where to publish output
  * --skip_normalization: flag indicating to skip all normalization steps
  * --skip_decompose_complex: flag indicating not to split complex variants (ie: MNVs and combinations of SNVs and indels)
@@ -117,13 +117,13 @@ The normalization pipeline consists of the following steps:
  * Decomposition of multiallelic variants into biallelic variants (ie: A > C,G is decomposed into two variants A > C and A > G)
  * Trim redundant sequence and left align indels, indels in repetitive sequences can have multiple representations
  * Remove duplicated variants
-
- Optionally if BAM files are provided (through `--bam_files`) VCFs are annotated with allele frequencies and depth of
- coverage by Vafator (https://github.com/TRON-Bioinformatics/vafator).
 
 The output consists of:
  * The normalized VCF
  * Summary statistics before and after normalization
+
+ The parameter `--reference` pointing to the reference genome FASTA file is required for normalization.
+ The reference genome requires *.fai and *.dict indexes.
 
 Normalization is not applied if the parameter `--skip_normalization` is passed.
 
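Note on the normalization steps listed in the hunk above: the following is a minimal Python sketch of three of them (multiallelic decomposition, trimming of redundant sequence, duplicate removal). It is not the pipeline's implementation, which this diff does not show. The function names and the `(chrom, pos, ref, alt)` tuple layout are assumptions made for illustration, and left alignment of indels is deliberately left out because it needs access to the reference FASTA.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout for this sketch: (chrom, 1-based pos, ref, alt)
Variant = Tuple[str, int, str, str]


def decompose_multiallelic(chrom: str, pos: int, ref: str, alts: List[str]) -> List[Variant]:
    """Split one multiallelic record into one biallelic record per ALT allele."""
    return [(chrom, pos, ref, alt) for alt in alts]


def trim(variant: Variant) -> Variant:
    """Remove bases shared by REF and ALT, keeping at least one base in each."""
    chrom, pos, ref, alt = variant
    # Trim the shared suffix first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the shared prefix; each trimmed base shifts the position by one.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return chrom, pos, ref, alt


def normalize(records: Iterable[Tuple[str, int, str, List[str]]]) -> List[Variant]:
    """Decompose, trim and deduplicate records, preserving first-seen order."""
    seen = set()
    out: List[Variant] = []
    for chrom, pos, ref, alts in records:
        for variant in decompose_multiallelic(chrom, pos, ref, alts):
            variant = trim(variant)
            if variant not in seen:
                seen.add(variant)
                out.append(variant)
    return out
```

With this sketch, the record `A > C,G` from the help text becomes two records, `A > C` and `A > G`, and a second record that repeats `A > C` at the same position is dropped as a duplicate.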