Commit 8cee886

denovo nf-core pipeline is running, working on adding google batch
1 parent 2a9d943 commit 8cee886

188 files changed

Lines changed: 15762 additions & 1 deletion

GoogleCloud/denovotranscript

Lines changed: 0 additions & 1 deletion
This file was deleted.
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+# nf-core/denovotranscript: Changelog
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## v1.2.0 - [2025-01-30]
+
+- [PR #25](https://github.com/nf-core/denovotranscript/pull/25) - Bump version to 1.2.0dev
+- [PR #27](https://github.com/nf-core/denovotranscript/pull/27) - Update spades resources and usage instructions
+- [PR #28](https://github.com/nf-core/denovotranscript/pull/28) - Template update for nf-core/tools v3.1.0
+- [PR #29](https://github.com/nf-core/denovotranscript/pull/29) - Template update for nf-core/tools v3.1.1
+- [PR #30](https://github.com/nf-core/denovotranscript/pull/30) - Fix ch_transcripts and ch_pubids for AWS
+- [PR #32](https://github.com/nf-core/denovotranscript/pull/32) - Fix transrate dependencies issue
+- [PR #33](https://github.com/nf-core/denovotranscript/pull/33) - Bump version to 1.2.0
+- [PR #36](https://github.com/nf-core/denovotranscript/pull/36) - Template update for nf-core/tools v3.2.0
+
+## v1.1.0 - [2024-11-28]
+
+Enhancements & fixes:
+
+- [PR #14](https://github.com/nf-core/denovotranscript/pull/14) - Template update for nf-core/tools v3.0.2
+- [PR #15](https://github.com/nf-core/denovotranscript/pull/15) - Fix logic for skip_assembly parameter
+- [PR #16](https://github.com/nf-core/denovotranscript/pull/16) - Increase resources for Trinity
+- [PR #19](https://github.com/nf-core/denovotranscript/pull/21) - Fix Salmon quantification when running in skip_assembly mode
+- [PR #24](https://github.com/nf-core/denovotranscript/pull/24) - Update Trinity module and fix dependency issue
+
+### Software dependencies
+
+| Dependency | Old version | New version |
+| ---------- | ----------- | ----------- |
+| `Trinity` | 2.15.1 | 2.15.2 |
+
+## v1.0.0 - [2024-08-15]
+
+Initial release of nf-core/denovotranscript, created with the [nf-core](https://nf-co.re/) template.
+
+This pipeline was created using code from [avani-bhojwani/TransFuse](https://github.com/avani-bhojwani/TransFuse) and [timslittle/nf-core-transcriptcorral](https://github.com/timslittle/nf-core-transcriptcorral/). It also incorporates modules and subworkflows developed by the nf-core community.
+
+The main features of this pipeline are:
+
+- Pre-processing of paired-end RNA-seq reads with FastQC, fastp, and SortMeRNA
+- _De novo_ transcriptome assembly with Trinity and rnaSPAdes
+- Redundancy reduction with EvidentialGene
+- Quality assessment of the assembly with BUSCO, rnaQUAST, and TransRate
+- Transcript-level quantification with Salmon
+- Reporting of the results with MultiQC
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+# nf-core/denovotranscript: Citations
+
+## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)
+
+> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.
+
+## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)
+
+> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.
+
+## Pipeline tools
+
+- [BUSCO](https://busco.ezlab.org/)
+
+> Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol Biol Evol. 2021 Sep 27;38(10):4647-4654. doi: 10.1093/molbev/msab199. PMID: 34320186; PMCID: PMC8476166.
+
+- [Evidential Gene](http://arthropods.eugenes.org/EvidentialGene/)
+
+> Gilbert, D. G. (2019). Longest protein, longest transcript or most expression, for accurate gene reconstruction of transcriptomes? bioRxiv. https://doi.org/10.1101/829184
+
+- [fastp](https://github.com/OpenGene/fastp)
+
+> Chen S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. Imeta. 2023 May 8;2(2):e107. doi: 10.1002/imt2.107. PMID: 38868435; PMCID: PMC10989850.
+
+- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
+
+> Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online].
+
+- [gawk](https://www.gnu.org/software/gawk/)
+
+- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
+
+> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
+
+- [rnaQUAST](https://github.com/ablab/rnaquast)
+
+> Bushmanova E, Antipov D, Lapidus A, Suvorov V, Prjibelski AD. rnaQUAST: a quality assessment tool for de novo transcriptome assemblies. Bioinformatics. 2016 Jul 15;32(14):2210-2. doi: 10.1093/bioinformatics/btw218. Epub 2016 Apr 23. PMID: 27153654.
+
+- [rnaSPAdes](https://ablab.github.io/spades/rna.html)
+
+> Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes De Novo Assembler. Curr Protoc Bioinformatics. 2020 Jun;70(1):e102. doi: 10.1002/cpbi.102. PMID: 32559359.
+
+- [Salmon](https://salmon.readthedocs.io/en/latest/salmon.html)
+
+> Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017 Apr;14(4):417-419. doi: 10.1038/nmeth.4197. Epub 2017 Mar 6. PMID: 28263959; PMCID: PMC5600148.
+
+- [SortMeRNA](https://github.com/sortmerna/sortmerna)
+
+> Kopylova E, Noé L, Touzet H. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics. 2012 Dec 15;28(24):3211-7. doi: 10.1093/bioinformatics/bts611. Epub 2012 Oct 15. PMID: 23071270.
+
+- [TransRate](https://hibberdlab.com/transrate/)
+
+> Smith-Unna R, Boursnell C, Patro R, Hibberd JM, Kelly S. TransRate: reference-free quality assessment of de novo transcriptome assemblies. Genome Res. 2016 Aug;26(8):1134-44. doi: 10.1101/gr.196469.115. Epub 2016 Jun 1. PMID: 27252236; PMCID: PMC4971766.
+
+This pipeline uses TransRate from the [Oyster River Protocol](https://github.com/macmanes-lab/Oyster_River_Protocol/tree/master/software).
+
+> MacManes MD. The Oyster River Protocol: a multi-assembler and kmer approach for de novo transcriptome assembly. PeerJ. 2018 Aug 3;6:e5428. doi: 10.7717/peerj.5428. PMID: 30083482; PMCID: PMC6078068.
+
+- [Trinity](https://github.com/trinityrnaseq/trinityrnaseq/wiki)
+
+> Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, MacManes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, LeDuc RD, Friedman N, Regev A. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013 Aug;8(8):1494-512. doi: 10.1038/nprot.2013.084. Epub 2013 Jul 11. PMID: 23845962; PMCID: PMC3875132.
+
+## Software packaging/containerisation tools
+
+- [Anaconda](https://anaconda.com)
+
+> Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.
+
+- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)
+
+> Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.
+
+- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)
+
+> da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.
+
+- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)
+
+> Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.
+
+- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)
+
+> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
