Commit 28c3fcd

Modified AWS directory
1 parent 94ece16 commit 28c3fcd

86 files changed

Lines changed: 6740 additions & 34 deletions


AWS/Submodule_00_background.ipynb

Lines changed: 3 additions & 3 deletions
@@ -109,7 +109,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "display_quiz(\"Transcriptome-Assembly-Refinement-and-Applications/quiz-material/00-pc1.json\")"
+    "display_quiz(\"../quiz-material/00-pc1.json\")"
    ]
   },
   {
@@ -281,7 +281,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "display_quiz(\"Transcriptome-Assembly-Refinement-and-Applications/quiz-material/00-cp1.json\", shuffle_questions = True)"
+    "display_quiz(\"../quiz-material/00-cp1.json\", shuffle_questions = True)"
    ]
   },
   {
@@ -407,7 +407,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "display_quiz(\"Transcriptome-Assembly-Refinement-and-Applications/quiz-material/00-cp2.json\", shuffle_questions = True)"
+    "display_quiz(\"../quiz-material/00-cp2.json\", shuffle_questions = True)"
    ]
   },
   {
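
These hunks replace repo-rooted quiz paths with `../`-relative ones. A quick sketch of why the relative form resolves correctly once the notebooks live inside the `AWS/` directory (the directory names here are assumptions for illustration, not taken from the commit):

```python
import os

# The notebook's working directory is one level below the repo root,
# so "../quiz-material/..." climbs back up to the shared quiz folder.
notebook_dir = "/home/jupyter/AWS"  # hypothetical working directory
quiz_path = os.path.normpath(os.path.join(notebook_dir, "../quiz-material/00-pc1.json"))
print(quiz_path)  # /home/jupyter/quiz-material/00-pc1.json
```

Unlike the old repo-rooted path, this works regardless of what the cloned repository folder is named.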

AWS/Submodule_01_prog_setup.ipynb

Lines changed: 52 additions & 13 deletions
@@ -30,12 +30,12 @@
    "\n",
    "2. **Set up the necessary software:** Students will install and configure essential tools including:\n",
    "   * Java (a prerequisite for Nextflow).\n",
-   "   * Mambaforge (a package manager for bioinformatics tools).\n",
+   "   * Miniforge (a package manager for bioinformatics tools).\n",
    "   * `sra-tools`, `perl-dbd-sqlite`, and `perl-dbi` (specific bioinformatics packages).\n",
    "   * Nextflow (a workflow management system).\n",
-   "   * `aws s3` (for interacting with AWS S3 Storage).\n",
+   "   * `gsutil` (for interacting with Google Cloud Storage).\n",
    "\n",
-   "3. **Download and organize necessary data:** Students will download the TransPi transcriptome assembly software and its associated resources (databases, scripts, configuration files) from an S3 bucket. This includes understanding the directory structure and file organization.\n",
+   "3. **Download and organize necessary data:** Students will download the TransPi transcriptome assembly software and its associated resources (databases, scripts, configuration files) from a Google Cloud Storage bucket. This includes understanding the directory structure and file organization.\n",
    "\n",
    "4. **Manage file permissions:** Students will use the `chmod` command to set executable permissions for the necessary files and directories within the TransPi software.\n",
    "\n",
@@ -53,7 +53,7 @@
    "* **Shell Access:** The ability to execute shell commands from within the Jupyter Notebook environment (using `!` and `%`).\n",
    "* **Java Development Kit (JDK):** Required for Nextflow.\n",
    "* **Miniforge** A package manager for installing bioinformatics tools.\n",
-   "* **`aws s3`:** The AWS command-line tool. This is crucial for downloading data from an S3 storage bucket."
+   "* **`gsutil`:** The Google Cloud Storage command-line tool. This is crucial for downloading data from Google Cloud Storage."
   ]
  },
  {
@@ -104,7 +104,7 @@
    "## Time to begin!\n",
    "\n",
    "**Step 1:** To start, make sure that you are in the right starting place with a `cd`.\n",
-   "> `pwd` prints our current local working directory. Make sure the output from the command is: `/home/ec2-user/SageMaker`"
+   "> `pwd` prints our current local working directory. Make sure the output from the command is: `/home/jupyter`"
   ]
  },
  {
@@ -114,7 +114,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "%cd /home/ec2-user/SageMaker"
+   "%cd /home/jupyter"
   ]
  },
  {
@@ -147,12 +147,50 @@
    "! java -version"
   ]
  },
+ {
+  "cell_type": "markdown",
+  "id": "7b3ffb16-3395-4c01-9774-ee568e815490",
+  "metadata": {},
+  "source": [
+   "**Step 3:** Install Miniforge (a package manager), which is needed to support the information held within the TransPi databases."
+  ]
+ },
+ {
+  "cell_type": "code",
+  "execution_count": null,
+  "id": "ac5b204a-f0db-4ceb-bf37-57eca6d77974",
+  "metadata": {},
+  "outputs": [],
+  "source": [
+   "! curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh\n",
+   "! bash Miniforge3-$(uname)-$(uname -m).sh -b -p $HOME/miniforge"
+  ]
+ },
+ {
+  "cell_type": "markdown",
+  "id": "c5584e2e",
+  "metadata": {},
+  "source": [
+   "Next, add it to the path."
+  ]
+ },
+ {
+  "cell_type": "code",
+  "execution_count": null,
+  "id": "ad030cd1",
+  "metadata": {},
+  "outputs": [],
+  "source": [
+   "import os\n",
+   "os.environ[\"PATH\"] += os.pathsep + os.environ[\"HOME\"]+\"/miniforge/bin\""
+  ]
+ },
  {
   "cell_type": "markdown",
   "id": "7b930ad7",
   "metadata": {},
   "source": [
-   "**Step 3:** Using Mamba and bioconda, install the tools that will be used in this tutorial."
+   "Next, using Miniforge and bioconda, install the tools that will be used in this tutorial."
   ]
  },
  {
@@ -201,7 +239,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "! aws s3 cp --recursive s3://nigms-sandbox/nosi-inbremaine-storage/TransPi ./TransPi"
+   "! gsutil -m cp -r gs://nigms-sandbox/nosi-inbremaine-storage/TransPi ./"
   ]
  },
  {
@@ -211,10 +249,10 @@
   "source": [
    "<div class=\"alert alert-block alert-success\">\n",
    "    <i class=\"fa fa-hand-paper-o\" aria-hidden=\"true\"></i>\n",
-   "    <b>Note: </b> aws\n",
+   "    <b>Note: </b> gsutil\n",
    "</div>\n",
    "\n",
-   ">`aws s3` is a tool that allows you to interact with S3 Storage through the command line."
+   ">`gsutil` is a tool that allows you to interact with Google Cloud Storage through the command line."
   ]
  },
  {
@@ -239,7 +277,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "! aws s3 cp --recursive s3://nigms-sandbox/nosi-inbremaine-storage/resources ./resources"
+   "! gsutil -m cp -r gs://nigms-sandbox/nosi-inbremaine-storage/resources ./"
   ]
  },
  {
@@ -264,7 +302,8 @@
    "> - They can also be stacked so `../../` will take you two layers up.\n",
    ">\n",
    ">- If you were to type `!ls ./nextWeek/` it would return the contents of the `nextWeek` directory which is one layer down from the current directory, so it would return `manyThings.txt`.\n",
-   ">"
+   ">\n",
+   ">**This means that in the second line of the code cell above, the file `TransPi.nf` will be copied from the Google Cloud Storage bucket to the current directory.**"
   ]
  },
  {
@@ -338,7 +377,7 @@
   "outputs": [],
   "source": [
    "from jupyterquiz import display_quiz\n",
-   "display_quiz(\"Transcriptome-Assembly-Refinement-and-Applications/quiz-material/01-cp1.json\", shuffle_questions = True)"
+   "display_quiz(\"../quiz-material/01-cp1.json\", shuffle_questions = True)"
   ]
  },
  {
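
The new cell appends the Miniforge `bin` directory to `PATH` from Python, so that later shell cells (e.g. `! mamba install ...`) can find the freshly installed binaries. A minimal sketch of that pattern, assuming the `~/miniforge` install prefix passed via `-p` above:

```python
import os

# Append the Miniforge bin directory so subprocesses inherit the updated PATH.
miniforge_bin = os.path.join(os.environ.get("HOME", "/tmp"), "miniforge", "bin")
os.environ["PATH"] += os.pathsep + miniforge_bin
print(miniforge_bin in os.environ["PATH"].split(os.pathsep))  # True
```

Note this only affects the current kernel process and its children; it does not persist across notebook restarts.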

AWS/Submodule_02_basic_assembly.ipynb

Lines changed: 6 additions & 3 deletions
@@ -180,8 +180,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!NXF_VER=22.10.1 ./nextflow run ./TransPi/TransPi.nf \\\n",
-   "-profile docker --k 17,25,43 --maxReadLen 50 --all "
+   "!nextflow run main.nf --input test_samplesheet.csv -profile test,docker -c conf/test_params.config --run_mode full "
   ]
  },
  {
@@ -345,7 +344,11 @@
   ]
  }
 ],
- "metadata": {},
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
 "nbformat": 4,
 "nbformat_minor": 5
}
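
The new run command takes `--input test_samplesheet.csv`. A minimal sketch of building such a samplesheet, assuming the common nf-core paired-end column layout (`sample,fastq_1,fastq_2` is an assumption based on nf-core conventions, and the file names are placeholders, not taken from this commit):

```python
import csv

# Hypothetical paired-end samplesheet; one row per sample.
rows = [
    {"sample": "SAMPLE1", "fastq_1": "s1_R1.fastq.gz", "fastq_2": "s1_R2.fastq.gz"},
]
with open("test_samplesheet.csv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=["sample", "fastq_1", "fastq_2"])
    writer.writeheader()
    writer.writerows(rows)
```

Check the pipeline's own docs for the exact required columns before running.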

AWS/Submodule_03_annotation_only.ipynb

Lines changed: 11 additions & 11 deletions
@@ -108,7 +108,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!pwd"
+   "! pwd"
   ]
  },
  {
@@ -141,7 +141,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!grep -c \">\" ./resources/trans/Oncorhynchus_mykiss_GGBN01.1.fa"
+   "! grep -c \">\" ./resources/trans/Oncorhynchus_mykiss_GGBN01.1.fa"
   ]
  },
  {
@@ -176,7 +176,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!perl -i.allloc -pe 's/basicRun/onlyAnnRun/g' ./TransPi/nextflow.config"
+   "! perl -i.allloc -pe 's/basicRun/onlyAnnRun/g' ./TransPi/nextflow.config"
   ]
  },
  {
@@ -229,7 +229,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!NXF_VER=22.10.1 ./nextflow run ./TransPi/TransPi.nf \\\n",
+   "! NXF_VER=22.10.1 ./nextflow run ./TransPi/TransPi.nf \\\n",
    "-profile docker --onlyAnn "
   ]
  },
@@ -248,7 +248,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!ls -l ./onlyAnnRun/output"
+   "! ls -l ./onlyAnnRun/output"
   ]
  },
  {
@@ -266,7 +266,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!cat ./onlyAnnRun/output/RUN_INFO.txt"
+   "! cat ./onlyAnnRun/output/RUN_INFO.txt"
   ]
  },
  {
@@ -297,7 +297,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!docker images"
+   "! docker images"
   ]
  },
  {
@@ -321,7 +321,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!docker run -it --rm --volume /home:/home quay.io/biocontainers/busco:5.4.3--pyhdfd78af_0 busco --help"
+   "! docker run -it --rm --volume /home:/home quay.io/biocontainers/busco:5.4.3--pyhdfd78af_0 busco --help"
   ]
  },
  {
@@ -365,7 +365,7 @@
   "source": [
    "numthreads=!lscpu | grep '^CPU(s)'| awk '{print $2-1}'\n",
    "THREADS = int(numthreads[0])\n",
-   "!echo $THREADS"
+   "! echo $THREADS"
   ]
  },
  {
@@ -375,7 +375,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!docker run -it --rm --volume /home:/home quay.io/biocontainers/busco:5.4.3--pyhdfd78af_0 busco \\\n",
+   "! docker run -it --rm --volume /home:/home quay.io/biocontainers/busco:5.4.3--pyhdfd78af_0 busco \\\n",
    "-i /home/jupyter/resources/trans/Oncorhynchus_mykiss_GGBN01.1.fa \\\n",
    "-l vertebrata_odb10 -o GGBN01_busco_vertebrata \\\n",
    "--out_path /home/jupyter/buscoOutput -m tran -c $THREADS"
@@ -425,7 +425,7 @@
   "outputs": [],
   "source": [
    "from jupytercards import display_flashcards\n",
-   "display_flashcards('Transcriptome-Assembly-Refinement-and-Applications/quiz-material/03-cp1-1.json')"
+   "display_flashcards('../quiz-material/03-cp1-1.json')"
   ]
  },
  {
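
The unchanged context in the `@@ -365` hunk computes a thread count by parsing `lscpu` output. A portable Python equivalent of that logic (a sketch, not the notebook's own code) avoids shelling out entirely:

```python
import os

# CPU count minus one, leaving a core free for the notebook itself;
# max(1, ...) guards against single-core machines.
threads = max(1, (os.cpu_count() or 1) - 1)
print(threads)
```

`os.cpu_count()` can return `None` in unusual environments, hence the `or 1` fallback.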

AWS/Submodule_04_google_batch_assembly.ipynb

Lines changed: 26 additions & 4 deletions
@@ -193,12 +193,31 @@
    "myBucketName"
   ]
  },
+ {
+  "cell_type": "markdown",
+  "id": "73e9985d",
+  "metadata": {},
+  "source": [
+   "**Step 4b:** Replace names in files with personal bucket and project names."
+  ]
+ },
+ {
+  "cell_type": "code",
+  "execution_count": null,
+  "id": "860238b5",
+  "metadata": {},
+  "outputs": [],
+  "source": [
+   "! sed -i \"s/<YOUR-PROJECT-ID>/$myProject/g\" nextflow.config\n",
+   "! sed -i \"s/<YOUR-BUCKET-NAME>/$myBucketName/g\" conf/test_params.config"
+  ]
+ },
  {
   "cell_type": "markdown",
   "id": "5676e5d2-f556-463e-93c2-5add30f1fff8",
   "metadata": {},
   "source": [
-   "**Step 4b:** Create a new GCS bucket. *If you are using an existing bucket, you can skip this step.*\n",
+   "**Step 4c:** Create a new GCS bucket. *If you are using an existing bucket, you can skip this step.*\n",
    "> To do this, we can use a new `gsutil` command: `mb` which stands for make bucket."
   ]
  },
@@ -397,8 +416,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-   "!NXF_VER=22.10.1 ./nextflow run ./TransPi/TransPi.nf \\\n",
-   "-profile gcb --k 17,25,43 --maxReadLen 50 --all -resume"
+   "!nextflow run main.nf --input test_samplesheet.csv -profile test,docker,gbatch -c conf/test_params.config --run_mode full "
   ]
  },
  {
@@ -598,7 +616,11 @@
   ]
  }
 ],
- "metadata": {},
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
 "nbformat": 4,
 "nbformat_minor": 5
}
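
The new Step 4b cells use `sed -i` to substitute `<YOUR-PROJECT-ID>` and `<YOUR-BUCKET-NAME>` placeholders in the config files. A minimal sketch of that substitution pattern on a throwaway file (the file name and variable value here are illustrative, not from the repo; note GNU `sed -i`, as on Linux):

```shell
# Create a demo config containing a placeholder, then substitute it in place.
myProject="my-gcp-project"
printf 'project = "<YOUR-PROJECT-ID>"\n' > demo.config
sed -i "s/<YOUR-PROJECT-ID>/$myProject/g" demo.config
cat demo.config
```

The `/g` flag replaces every occurrence on each line, so a placeholder appearing multiple times is handled in one pass.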

AWS/denovotranscript/CHANGELOG.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+# nf-core/denovotranscript: Changelog
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## v1.2.0 - [2025-01-30]
+
+- [PR #25](https://github.com/nf-core/denovotranscript/pull/25) - Bump version to 1.2.0dev
+- [PR #27](https://github.com/nf-core/denovotranscript/pull/27) - Update spades resources and usage instructions
+- [PR #28](https://github.com/nf-core/denovotranscript/pull/28) - Template update for nf-core/tools v3.1.0
+- [PR #29](https://github.com/nf-core/denovotranscript/pull/29) - Template update for nf-core/tools v3.1.1
+- [PR #30](https://github.com/nf-core/denovotranscript/pull/30) - Fix ch_transcripts and ch_pubids for AWS
+- [PR #32](https://github.com/nf-core/denovotranscript/pull/32) - Fix transrate dependencies issue
+- [PR #33](https://github.com/nf-core/denovotranscript/pull/33) - Bump version to 1.2.0
+- [PR #36](https://github.com/nf-core/denovotranscript/pull/36) - Template update for nf-core/tools v3.2.0
+
+## v1.1.0 - [2024-11-28]
+
+Enhancements & fixes:
+
+- [PR #14](https://github.com/nf-core/denovotranscript/pull/14) - Template update for nf-core/tools v3.0.2
+- [PR #15](https://github.com/nf-core/denovotranscript/pull/15) - Fix logic for skip_assembly parameter
+- [PR #16](https://github.com/nf-core/denovotranscript/pull/16) - Increase resources for Trinity
+- [PR #19](https://github.com/nf-core/denovotranscript/pull/21) - Fix Salmon quantification when running in skip_assembly mode
+- [PR #24](https://github.com/nf-core/denovotranscript/pull/24) - Update Trinity module and fix dependency issue
+
+### Software dependencies
+
+| Dependency | Old version | New version |
+| ---------- | ----------- | ----------- |
+| `Trinity`  | 2.15.1      | 2.15.2      |
+
+## v1.0.0 - [2024-08-15]
+
+Initial release of nf-core/denovotranscript, created with the [nf-core](https://nf-co.re/) template.
+
+This pipeline was created using code from [avani-bhojwani/TransFuse](https://github.com/avani-bhojwani/TransFuse) and [timslittle/nf-core-transcriptcorral](https://github.com/timslittle/nf-core-transcriptcorral/). It also incorporates modules and subworkflows developed by the nf-core community.
+
+The main features of this pipeline are:
+
+- Pre-processing of paired-end RNA-seq reads with FastQC, fastp, and SortMeRNA
+- _De novo_ transcriptome assembly with Trinity and rnaSPAdes
+- Redundancy reduction with EvidentialGene
+- Quality assessment of the assembly with BUSCO, rnaQUAST, and TransRate
+- Transcript-level quantification with Salmon
+- Reporting of the results with MultiQC