-* --reference: path to the FASTA genome reference (indexes expected *.fai, *.dict) [required for normalization]
+* --reference: absolute path to the FASTA genome reference (indexes expected *.fai, *.dict) [required for normalization and for functional annotation with BCFtools]
+* --gff: absolute path to a GFF gene annotations file [required for functional annotation with BCFtools, only Ensembl-like GFF files]
 * --vcf-without-ad: indicate when the VCFs to normalize do not have the FORMAT/AD annotation
 * --output: the folder where to publish output
 * --skip_normalization: flag indicating to skip all normalization steps
@@ -269,7 +270,10 @@ No technical annotations are performed if the parameter `--input_bams` is not pa
 ## Functional annotations
 
 The functional annotations provide a biological context for every variant. Such as the overlapping genes or the effect
-of the variant in a protein. These annotations are provided by SnpEff (Cingolani, 2012).
+of the variant in a protein. These annotations are provided by SnpEff (Cingolani, 2012) or by BCFtools csq (Danecek, 2017).
+Only one of the previous can be used.
+
+### Using SnpEff
 
 The SnpEff available human annotations are:
 * GRCh37.75
@@ -289,7 +293,15 @@ To provide any additional SnpEff arguments use `--snpeff_args` such as
-No functional annotations are performed if the parameters `--snpeff_organism` and `--snpeff_datadir` are not passed.
+No SnpEff functional annotations are performed if the parameters `--snpeff_organism` and `--snpeff_datadir` are not passed.
+
+### Using BCFtools csq
+
+BCFtools does not require any previous preparation. It expects two parameters:
+*`--reference`: absolute path to the FASTA reference genome
+*`--gff`: absolute path to the Ensembl-like GFF annotations (ie: Gencode GFF files do not work https://github.com/samtools/bcftools/issues/1078)
+
+Importantly, BCFtools does use the available phasing information to evaluate all mutations affecting any given transcript together.
 
 
 ## References
@@ -298,3 +310,4 @@ No functional annotations are performed if the parameters `--snpeff_organism` an
 * Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. Twelve years of SAMtools and BCFtools. Gigascience. 2021 Feb 16;10(2):giab008. doi: 10.1093/gigascience/giab008. PMID: 33590861; PMCID: PMC7931819.
 * Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316–319. 10.1038/nbt.3820
 * Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.". Fly (Austin). 2012 Apr-Jun;6(2):80-92. PMID: 22728672
+* Danecek, P., & McCarthy, S. A. (2017). BCFtools/csq: haplotype-aware variant consequences. Bioinformatics, 33(13), 2037–2039. https://doi.org/10.1093/bioinformatics/btx100
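
The new "Using BCFtools csq" section asks for two absolute paths, `--reference` and `--gff`. As a rough illustration only, the sketch below assembles a matching `nextflow run` command line and rejects relative paths before anything is launched. The helper name `build_csq_command` and all file paths are hypothetical; only the flags (`--input_vcfs`, `--reference`, `--gff`, `--output`) come from the diff, and whether these flags alone are sufficient for a given setup is an assumption.

```python
import shlex
from pathlib import PurePosixPath


def build_csq_command(input_vcfs, reference, gff, output):
    """Assemble a `nextflow run` argument list for BCFtools csq annotation.

    Hypothetical helper: only the flag names are taken from the README diff.
    """
    # The README asks for absolute paths for both --reference and --gff
    for flag, path in (("--reference", reference), ("--gff", gff)):
        if not PurePosixPath(path).is_absolute():
            raise ValueError(f"{flag} must be an absolute path, got: {path}")
    return [
        "nextflow", "run", "main.nf",
        "--input_vcfs", input_vcfs,
        "--reference", reference,
        "--gff", gff,
        "--output", output,
    ]


# Placeholder paths, not real files
command = build_csq_command(
    input_vcfs="input_vcfs.tsv",
    reference="/data/genome/reference.fasta",
    gff="/data/genome/annotations.gff3",
    output="results",
)
# Print a shell-ready version of the command for copy and paste
print(shlex.join(command))
```

Since the README states that only one of SnpEff or BCFtools csq can be used, the sketch deliberately leaves out `--snpeff_organism` and `--snpeff_datadir`.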
nextflow.config (7 additions, 6 deletions)
@@ -40,15 +40,15 @@ manifest {
 }
 
 params.help_message ="""
-TronFlow variant normalization v${VERSION}
+TronFlow VCF postprocessing v${VERSION}
 
 Usage:
-nextflow run main.nf -profile conda --input_vcfs input_vcfs --reference reference.fasta
+nextflow run main.nf --input_vcfs input_vcfs --reference reference.fasta
 
 
 Input:
 * --input_vcf: the path to a single VCF to normalize (not compatible with --input_files)
-* --input_vcfs: a tab-separated values file containing in each row the sample name and path to the VCF file (not compatible with --input_vcf)
+* --input_vcfs: the path to a tab-separated values file containing in each row the sample name and path to the VCF file (not compatible with --input_vcf)
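
The `--input_vcfs` help text describes a tab-separated file with one row per sample holding the sample name and the path to its VCF. A minimal sketch of producing such a table follows. The sample names and paths are hypothetical, and writing no header row is an assumption, since the help text does not mention one.

```python
import csv
import io


def write_input_vcfs(rows, handle):
    """Write one `sample name<TAB>VCF path` row per sample, without a header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for sample_name, vcf_path in rows:
        writer.writerow([sample_name, vcf_path])


# Hypothetical sample names and paths
samples = [
    ("sample_1", "/data/vcfs/sample_1.vcf"),
    ("sample_2", "/data/vcfs/sample_2.vcf"),
]

# An in-memory buffer keeps the example self-contained;
# pass an open file handle instead to create the real table
buffer = io.StringIO()
write_input_vcfs(samples, buffer)
table = buffer.getvalue()
print(table, end="")
```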
-* --reference: path to the FASTA genome reference (indexes expected *.fai, *.dict) [required for normalization]
+* --reference: absolute path to the FASTA genome reference (indexes expected *.fai, *.dict) [required for normalization and for functional annotation with BCFtools]
+* --gff: absolute path to a GFF gene annotations file [required for functional annotation with BCFtools, only Ensembl-like GFF files]
+* --vcf-without-ad: indicate when the VCFs to normalize do not have the FORMAT/AD annotation
 * --output: the folder where to publish output
 * --skip_normalization: flag indicating to skip all normalization steps
 * --skip_decompose_complex: flag indicating not to split complex variants (ie: MNVs and combinations of SNVs and indels)
-* --filter: specify a comma-separated list of filters to apply (e.g.: PASS,.), only variants with these values will be kept. If not provided all varianst are kept
-* --vcf-without-ad: indicate when the VCFs to normalize do not have the FORMAT/AD annotation
+* --filter: specify the filter to apply if any (e.g.: PASS), only variants with this value will be kept
 * --input_bams: a tab-separated values file containing in each row the sample name, tumor and normal BAM files for annotation with Vafator
 * --skip_multiallelic_filter: after VAFator annotations if any multiallelic variant is present (ie: two different
 mutations in the same position) only the highest VAF variant is kept unless this flag is passed
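
The `--skip_multiallelic_filter` help text states a simple rule: when two different mutations share a position, only the one with the highest VAF is kept. The sketch below illustrates that rule on in-memory records. It is not the pipeline's implementation: the `Variant` record, its `vaf` field and the tie-breaking behaviour (the first variant seen wins) are assumptions made for illustration.

```python
from collections import namedtuple

# Minimal, hypothetical variant record; `vaf` stands in for the VAF annotation
Variant = namedtuple("Variant", ["chrom", "pos", "ref", "alt", "vaf"])


def keep_highest_vaf(variants):
    """Keep a single variant per (chrom, pos): the one with the highest VAF."""
    best = {}
    for variant in variants:
        key = (variant.chrom, variant.pos)
        # Store the first variant at this position, or replace the stored
        # one when a strictly higher VAF is found
        if key not in best or variant.vaf > best[key].vaf:
            best[key] = variant
    # dicts preserve insertion order, so positions come out in first-seen order
    return list(best.values())


variants = [
    Variant("chr1", 100, "A", "T", 0.10),
    Variant("chr1", 100, "A", "G", 0.35),
    Variant("chr2", 200, "C", "T", 0.50),
]
filtered = keep_highest_vaf(variants)
for variant in filtered:
    print(variant)
```

In this toy input the two mutations at `chr1:100` collapse to the `A>G` record, while the single variant at `chr2:200` passes through unchanged.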