Skip to content

BANKSY notebook text: PR 1/2#1000

Open
sjspielman wants to merge 20 commits into
masterfrom
sjspielman/banksy-text
Open

BANKSY notebook text: PR 1/2#1000
sjspielman wants to merge 20 commits into
masterfrom
sjspielman/banksy-text

Conversation

@sjspielman
Copy link
Copy Markdown
Member

This the first of 2 PRs to flesh out the text in the BANKSY notebook. I'm particularly looking for feedback on the level of details I presented about how BANKSY itself works. I do figure some of the finer points will appear in forthcoming slides as well, e.g. how exactly lambda is used, so I didn't want to get into all of the weeds here (just some weeds!), so I tried to achieve a middle ground - let me know what you might add or remove!

The next PR will pick up from the "with AGF" section and go through the end, but please let me know if you'd prefer I split that up for review ease.

As needed for reference:

@sjspielman
Copy link
Copy Markdown
Member Author

Happy to see whatever was going on with #991 seems to have sorted itself out with no intervention.

@sjspielman sjspielman requested a review from jashapiro May 20, 2026 17:34
Copy link
Copy Markdown
Member

@jashapiro jashapiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, and I think you have the level of detail just about right.

As far as text goes, I think the biggest thing I suggested is removing the interpretation of the non-spatial plots/clustering. Part of that is that I somewhat disagree with one of your statements, but the bigger thing is that I don't think we really need to say anything definitive there; we can try to make it more interactive. For the in-person training, I expect that to work well, though it is always a bit more a challenge in virtual trainings!

The other thing that we might want to explore just a bit more is how the feature selection affects results. I would suggest we should try to do the same thing for both PCA/clustering analyses so we can have a more apples-to-apples comparison. I'd probably try both not filtering the non-spatial PCA (just pass in the full gene list, I think?) and filtering the BANKSY input. For either case, you probably want to note that what we are doing is "non-standard" for one or the other, but it makes the comparison more fair.

Comment thread spatial/02-spatial_clustering.Rmd Outdated
Comment thread spatial/02-spatial_clustering.Rmd Outdated
Comment thread spatial/02-spatial_clustering.Rmd Outdated
Comment thread spatial/02-spatial_clustering.Rmd Outdated
Comment thread spatial/02-spatial_clustering.Rmd
Comment thread spatial/02-spatial_clustering.Rmd
Comment thread spatial/02-spatial_clustering.Rmd Outdated
Comment thread spatial/02-spatial_clustering.Rmd Outdated
Comment thread spatial/02-spatial_clustering.Rmd
Comment thread spatial/02-spatial_clustering.Rmd Outdated
Comment thread spatial/02-spatial_clustering.Rmd Outdated
@sjspielman
Copy link
Copy Markdown
Member Author

Thank you for the thorough review, including noting parameters like jaccard which I had been forgetting to include! With this added in finally, I am quite pleased with where this is landing with HVGs as well. Should be ready for another look.

Here's the HTML to see the plots, but again the coordindate plots tend to land at different sizing in the Rmd view vs rendered view (🤷‍♀️) and I optimized plot sizing for Rmd viewing. 02-spatial_clustering.nb.html

Some notes:

  • Right now for BANKSY feature selection, I am wholesale subsetting the SPE and resaving it to spe, which was a choice! The honest answer for why I saved to the same variable is for typo avoidance, but there's a great argument to put it in spe_subset or so and I will work on typing skills ;). To pair with this choice, I changed the export to be a TSV with columns for clusters only. Let me know if you'd prefer an RDS export (or a TSV with more columns?) and/or different variable handling.
  • Not specifically part of this PR, but wanted to bring it up: As part of revising this to include jaccard & svgs, we now have a few more clusters in the output and the heatmap is less straightforward to interpret just because there are more rows/cols and more blobs of color. How would you feel about just plotting the col1a1 expression and removing the heatmap, or alternatively any other (non-tricky) heatmap plot ideas?
  • I plan to final.finalize the section headers in the next PR.

@sjspielman sjspielman requested a review from jashapiro May 22, 2026 14:12
@jashapiro
Copy link
Copy Markdown
Member

Quick response before looking at code:

  • Right now for BANKSY feature selection, I am wholesale subsetting the SPE and resaving it to spe, which was a choice! The honest answer for why I saved to the same variable is for typo avoidance, but there's a great argument to put it in spe_subset or so and I will work on typing skills ;).

You already know what I am going to say here. How about hvg_spe for this?

To pair with this choice, I changed the export to be a TSV with columns for clusters only. Let me know if you'd prefer an RDS export (or a TSV with more columns?) and/or different variable handling.

I think a TSV output is fine here. Haven't looked specifically at the code, but I might just save out the whole colData to make reading it in and joining with an existing spe as simple as possible?

@sjspielman
Copy link
Copy Markdown
Member Author

You already know what I am going to say here. How about hvg_spe for this?

😂 facts

I think a TSV output is fine here. Haven't looked specifically at the code, but I might just save out the whole colData to make reading it in and joining with an existing spe as simple as possible?

this plus renamed SPE incoming!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants