PSEUDO PR - DO NOT MERGE by ewels · Pull Request #1 · nf-core/ssds

ewels · 2021-12-20T20:16:00Z

DO NOT MERGE THIS PULL REQUEST

This PR is a pseudo-PR designed to be used for a community review of the pipeline prior to first release. Feel free to add comments and review content here - as new commits are pushed to dev the PR will update automatically.

mashehu

I had a first look through it. Good work! A few smaller things to address and then I'll give my 👍🏻. The output docs could use a bit more love: it is not clear to me which files are the important ones or where in the workflow they come from, and some more explanation of the SSDS plots in the MultiQC report would be helpful.

mashehu · 2022-01-03T11:48:01Z


+* [Trimgalore](https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/)
+
+* [BWA](https://github.com/lh3/bwa)


Please check whether there is really no citable publication for these tools. BWA, for example, lists some in its README.

mashehu · 2022-01-03T11:48:28Z

 ## Introduction

-<!-- TODO nf-core: Write a 1-2 sentence summary of what data the pipeline is for and what it does -->
 **nf-core/ssds** is a bioinformatics best-practice analysis pipeline for Single-Stranded DNA Sequencing (SSDS) pipeline.


Suggested change

-**nf-core/ssds** is a bioinformatics best-practice analysis pipeline for Single-Stranded DNA Sequencing (SSDS) pipeline.
+**nf-core/ssds** is a bioinformatics best-practice analysis pipeline for Single-Stranded DNA Sequencing (SSDS).

mashehu · 2022-01-03T11:56:02Z

    }
+
+    withName:SAMPLESHEET_CHECK {
+        executor = 'local'


why do you need this executor?

mashehu · 2022-01-03T11:57:29Z

+    }
+
+    withName:PICARD_SORTSAM_QRY {
+        cpus   = { check_max( 12    * task.attempt, 'cpus'    ) }


Why don't you use the CPU and memory specifications from the process_* labels?
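
For illustration, a minimal sketch of what the label-based approach could look like. The label name `process_high` and the memory and time values are assumptions chosen for the example, not taken from this PR; `check_max` is the helper already used in the diff above.

```groovy
// Sketch of a label block in a base config file (assumed location: conf/base.config).
// Resources are declared once per label instead of once per process name.
process {
    withLabel:process_high {
        cpus   = { check_max( 12    * task.attempt, 'cpus'   ) }  // same CPU request as the withName block under review
        memory = { check_max( 72.GB * task.attempt, 'memory' ) }  // assumed value, for illustration only
        time   = { check_max( 16.h  * task.attempt, 'time'   ) }  // assumed value, for illustration only
    }
}

// The process itself would then carry the directive `label 'process_high'`,
// and the withName:PICARD_SORTSAM_QRY resource override could be dropped.
```

With this layout, any other process needing the same resources only has to declare the label, so the numbers live in one place.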

mashehu · 2022-01-03T11:59:36Z

@@ -0,0 +1,121 @@
+/*


Is this a remnant of an older nf-core template? It should be merged with modules.config, imo.

mashehu · 2022-01-03T14:35:03Z

+                },
+                "parse_extra_types": {
+                    "type": "boolean",
+                    "description": "SSDS identifies ssDNA using the micro-homology structure at the 5' end of reads. When set to true, the SSDS pipeline also infers three additional, but low confidence fragment types; importantly, reads of these thress types are a best guess, but could be incorrect (i.e. fragments classified as dsDNA may instead come from ssDNA in a sufficiently enriched library).",


Suggested change

-"description": "SSDS identifies ssDNA using the micro-homology structure at the 5' end of reads. When set to true, the SSDS pipeline also infers three additional, but low confidence fragment types; importantly, reads of these thress types are a best guess, but could be incorrect (i.e. fragments classified as dsDNA may instead come from ssDNA in a sufficiently enriched library).",
+"description": "SSDS identifies ssDNA using the micro-homology structure at the 5' end of reads. When set to true, the SSDS pipeline also infers three additional, but low confidence fragment types; importantly, reads of these three types are a best guess, but could be incorrect (i.e. fragments classified as dsDNA may instead come from ssDNA in a sufficiently enriched library).",

mashehu · 2022-01-03T14:42:13Z

+                    "format": "file-path",
+                    "mimetype": "text/plain",
+                    "description": "Path to the BWA index folder.",
+                    "help_text": "This parameter is *mandatory* if `--genome` is not specified. If you don't have a BWA index available this will be generated for you automatically. Combine with `--save_reference` to save BWA index for future runs.",


I don't see the save_reference parameter implemented in your workflow. Either update this section (and the one for --fasta) or don't mention it here.

mashehu · 2022-01-03T14:50:58Z

@@ -0,0 +1,42 @@
+//


I guess this file is a leftover and can be deleted? 🙂

mashehu · 2022-01-03T14:52:40Z

+    ch_versions = ch_versions.mix(UCSC_BEDGRAPHTOBIGWIG.out.versions.first())
+
+    emit:
+    bigwig   = UCSC_BEDGRAPHTOBIGWIG.out.bigwig    // channel: [ val(meta), [ bai ] ]


Suggested change

-bigwig = UCSC_BEDGRAPHTOBIGWIG.out.bigwig // channel: [ val(meta), [ bai ] ]
+bigwig = UCSC_BEDGRAPHTOBIGWIG.out.bigwig // channel: [ val(meta), [ bigwig ] ]

mashehu · 2022-01-03T15:03:37Z

+    emit:
+    bam       = PICARD_MARKDUPLICATES.out.bam              // channel: [ val(meta), [ bam ] ]
+    bai       = SAMTOOLS_INDEX.out.bai                     // channel: [ val(meta), [ bai ] ]
+    mdmetrics = PICARD_MARKDUPLICATES.out.metrics          // channel: [ val(meta), [ bam ] ]


Suggested change

-mdmetrics = PICARD_MARKDUPLICATES.out.metrics // channel: [ val(meta), [ bam ] ]
+mdmetrics = PICARD_MARKDUPLICATES.out.metrics // channel: [ val(meta), [ metrics ] ]

kevbrick added 6 commits December 15, 2021 13:38

Working pipeline v1.0

b28b596

SPoT intervals update

17cd76b

Clean up prefixes & other small edge cases

b7c82bf

Update multiqc report name

4d6d739

Change parse_extra_types to param

6c945e0

Increase runtime for BEDGRAPHS

b211a44

ewels added the WIP Work in progress label Dec 20, 2021

kevbrick added 7 commits December 20, 2021 21:46

Update citations & add genome idx to sort in calculateSPOT

e4ed659

Update citations

b314475

Add categories for filtered read-pairs

b39d2f8

Update time for bedgraph making

d22984a

Restore parse module

5632dc4

update multiqcSSDS image

8d47f97

Update samplesheet.csv

0cc3ee6

mashehu suggested changes Jan 3, 2022

ewels wants to merge 13 commits into initial_commit from dev

3 participants

