SarderLab/DigitAb_Master

DigitAb_Master

Overview

DigitAb_Master is a computational pipeline for analyzing multiplexed microscopy images using nuclear features derived from DAPI staining.

The workflow processes tissue images through the following steps:

  1. DAPI Segmentation – Detect and segment nuclei in Phenocycler images
  2. Feature Extraction – Compute intensity and spatial features for each nucleus
  3. Clustering – Group nuclei based on feature similarity
  4. Registration – Spatially align images across modalities
  5. Segmentation – Segment and classify nuclei in histology

This pipeline supports digital pathology and spatial biology analysis, enabling quantitative characterization of cellular populations in tissue images.


Pipeline Overview

Raw Microscopy Images
        │
        ▼
DAPI Segmentation
        │
        ▼
Feature Extraction
        │
        ▼
Clustering
        │
        ▼
Registration
        │
        ▼
Segmentation

Installation

1. Clone the Repository

git clone https://github.com/njlucare/DigitAb_Master.git
cd DigitAb_Master

2. Create a Python Environment

Using conda (recommended):

conda create -n digitab python=3.9
conda activate digitab

Or using venv:

python -m venv digitab_env
source digitab_env/bin/activate

3. Install Dependencies

Dependencies include:

  • python==3.8
  • torch==1.10.0
  • torchvision==0.11.0
  • torchaudio==0.10.0
  • cudatoolkit=11.3
  • deepcell
  • deepcell_toolbox
  • numpy
  • scipy
  • opencv-python
  • scikit-image
  • pandas
  • scikit-learn
  • matplotlib
  • tifffile
  • tiffslide
  • pystackreg
  • pyyaml
  • lxml
  • tqdm
  • glob (part of the Python standard library; no separate installation is needed)
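
Several of the packages above are imported under a name that differs from the one used to install them (for example, opencv-python is imported as cv2). The following is a minimal sketch, not part of the repository, that reports which modules cannot be found in the active environment. The helper name missing_modules and the selection of packages are assumptions made for illustration.

```python
import importlib.util

# Install name -> import name. Only a subset of the dependency list is
# shown; extend the mapping as needed (hypothetical helper, not repo code).
IMPORT_NAMES = {
    "opencv-python": "cv2",
    "scikit-image": "skimage",
    "scikit-learn": "sklearn",
    "pyyaml": "yaml",
    "torch": "torch",
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "tifffile": "tifffile",
}


def missing_modules(import_names):
    """Return the install names whose import module cannot be located."""
    return sorted(
        pkg
        for pkg, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_modules(IMPORT_NAMES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Running the script inside the activated environment prints the packages that still need to be installed.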

Data Requirements

The pipeline expects Phenocycler images with DAPI as the first channel.

Supported formats include:

  • .tif
  • .qptiff

Example directory structure:

data/
   sample1.svs
   sample1.tif
   sample1_ChannelKey.csv
   sample2.svs
   sample2.tif
   sample2_ChannelKey.csv
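
Before running the pipeline, it can help to confirm that every sample has all three input files shown in the layout above. The sketch below is an illustration only: the function name find_incomplete_samples is hypothetical, and it assumes that each sample needs a .svs, a .tif, and a _ChannelKey.csv file sharing one prefix. Files ending in the output suffixes used later in this README are skipped.

```python
from pathlib import Path

# Inputs each sample is assumed to need, based on the example layout.
REQUIRED_SUFFIXES = (".svs", ".tif", "_ChannelKey.csv")
# Pipeline outputs that also end in .tif and should not count as inputs.
OUTPUT_SUFFIXES = ("_nuclei.tif", "_Registered.tif", "_prediction.tif")


def find_incomplete_samples(data_dir):
    """Map each sample name to the list of required files it is missing."""
    present = {}
    for path in Path(data_dir).iterdir():
        if path.name.endswith(OUTPUT_SUFFIXES):
            continue
        for suffix in REQUIRED_SUFFIXES:
            if path.name.endswith(suffix):
                sample = path.name[: -len(suffix)]
                present.setdefault(sample, set()).add(suffix)
    return {
        sample: [s for s in REQUIRED_SUFFIXES if s not in found]
        for sample, found in sorted(present.items())
        if len(found) < len(REQUIRED_SUFFIXES)
    }
```

An empty dictionary means that every sample found in the folder is complete.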

Step 1 — DAPI Segmentation

Purpose

DAPI staining highlights DNA, allowing identification of cell nuclei.

This step segments nuclei from the DAPI channel.

Processing Steps

Segmentation pipeline:

  1. Image binarization
  2. Deepcell segmentation
  3. Logical "AND" operation
  4. Size threshold
  5. Hole filling
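
The post-processing steps above can be sketched with NumPy and SciPy as follows. This is not the code in predict.py: it assumes the DeepCell output is already available as an integer label image, uses a fixed intensity threshold for binarization, and introduces the hypothetical parameters intensity_thresh and min_area.

```python
import numpy as np
from scipy import ndimage as ndi


def refine_nuclei_mask(dapi, deepcell_labels, intensity_thresh, min_area):
    """Combine a thresholded DAPI image with a DeepCell label image."""
    # 1. Binarize the DAPI channel (a fixed threshold is assumed here).
    binary = dapi > intensity_thresh
    # 2-3. Keep only pixels that are foreground in both masks (logical AND).
    combined = np.logical_and(binary, deepcell_labels > 0)
    # 4. Drop connected components smaller than min_area pixels.
    components, _ = ndi.label(combined)
    areas = np.bincount(components.ravel())
    keep = areas >= min_area
    keep[0] = False  # component 0 is the background
    sized = keep[components]
    # 5. Fill holes enclosed by the remaining nuclei.
    return ndi.binary_fill_holes(sized)
```

The function returns a boolean mask; suitable values for the threshold and minimum area depend on the image resolution and staining intensity.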

Example Command

First, set your DeepCell access token. Then update the glob call in the script with the path to your files.

python predict.py

Output

data/
   sample1_nuclei.tif

Step 2 — Feature Extraction

Purpose

After segmentation, features describing each nucleus are calculated.

These features capture intensity and spatial properties.

Example Features

Feature Type    Examples
Intensity       mean intensity, max intensity
Spatial         centroid position
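
As an illustration of the features listed above, the following sketch computes the mean intensity, maximum intensity, and centroid of each labeled nucleus for one image channel. It is a simplified example rather than the contents of codex_features_extract.py; the function name nucleus_features and the output keys are assumptions.

```python
import numpy as np
from scipy import ndimage as ndi


def nucleus_features(channel, labels):
    """Return one dict of features per nonzero label in `labels`."""
    ids = np.unique(labels)
    ids = ids[ids != 0]  # label 0 is the background
    if ids.size == 0:
        return []
    means = ndi.mean(channel, labels=labels, index=ids)
    maxima = ndi.maximum(channel, labels=labels, index=ids)
    # Uniform weights give the geometric centroid of each nucleus.
    centroids = ndi.center_of_mass(np.ones(labels.shape), labels=labels, index=ids)
    return [
        {
            "label": int(i),
            "mean_intensity": float(m),
            "max_intensity": float(x),
            "centroid_row": float(c[0]),
            "centroid_col": float(c[1]),
        }
        for i, m, x, c in zip(ids, means, maxima, centroids)
    ]
```

Each dictionary corresponds to one nucleus, so the list can be written out as one row per nucleus.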

Example Command

First, update the glob call in the script with the path to your CODEX images.

python codex_features_extract.py

Output

dapi/
    sample1.csv

Each row corresponds to one nucleus.


Step 3 — Clustering

Purpose

Clustering groups nuclei with similar feature profiles to identify distinct cellular populations.
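
To make the idea concrete, the sketch below standardizes a feature matrix and groups its rows with plain k-means. This is only a Python stand-in chosen for illustration: the repository performs clustering in the R script clustering_DigitAb.R, whose actual method may differ, and the function names here are hypothetical.

```python
import numpy as np


def zscore(features):
    """Scale each feature column to zero mean and unit variance."""
    features = np.asarray(features, dtype=float)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (features - features.mean(axis=0)) / std


def farthest_point_init(features, k):
    """Start from the first row, then add the row farthest from all centers."""
    centers = [features[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centers], axis=0
        )
        centers.append(features[np.argmax(dists)])
    return np.array(centers)


def kmeans(features, k, n_iter=100):
    """Lloyd's k-means; returns one integer cluster label per row."""
    centers = farthest_point_init(features, k)
    labels = np.zeros(len(features), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

Given a matrix with one row per nucleus and one column per feature, kmeans(zscore(matrix), k) returns a cluster label for every nucleus.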

Example Command

R CMD BATCH clustering_DigitAb.R

Output

data/data/
    clusters.csv
    clusters.rds
    fullObject.rds
    slide_labels.csv
    slide_labels.rds
data/plots/
    UMAP.png
data/plots/violin/
    LRP2_mean.png
    UMOD_mean.png
    ...
data/plots/bar/
    label_0.png
    label_1.png
    ...

Each nucleus receives a cluster label.


Step 4 — Image Registration

Purpose

This step spatially aligns Phenocycler segmentations with the corresponding histological sections.

Registration Steps

  1. Segment nuclei from the hematoxylin channel obtained by color deconvolution
  2. Estimate the spatial transformation
  3. Warp the DAPI segmentation image into the histological domain
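
As a simplified illustration of the transformation-estimation step, the sketch below recovers a pure translation between two masks using phase correlation. It is a stand-in for the general idea only: the repository's scripts may use a different method (pystackreg appears in the dependency list), and real slides typically need more than a translation. The function name estimate_translation is hypothetical.

```python
import numpy as np


def estimate_translation(fixed, moving):
    """Return (row, col) shift that aligns `moving` to `fixed` via np.roll."""
    f = np.fft.fft2(np.asarray(fixed, dtype=float))
    m = np.fft.fft2(np.asarray(moving, dtype=float))
    cross = f * np.conj(m)
    cross /= np.abs(cross) + 1e-12  # keep phase only
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative (wrapped) shifts.
    return tuple(
        int(p) if p <= size // 2 else int(p) - size
        for p, size in zip(peak, corr.shape)
    )
```

Applying np.roll(moving, shift, axis=(0, 1)) with the returned shift brings the moving image into the frame of the fixed image, up to wrap-around at the borders.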

Example Command

First, update the glob paths to the location of your .svs images. Next, update mapping.xlsx with the correct paths for the items listed in its columns. The flip and rotate columns specify the flip (flipud=1, fliplr=2) and rotation (1 = 90 degrees counterclockwise) required to align the DAPI segmentation with the hematoxylin orientation. exp_factor is a scale factor that matches the size of the DAPI segmentation to that of the hematoxylin segmentation.

python hematoxylin_segmentation.py
python map_cells_IU.py
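
The flip, rotate, and exp_factor values in mapping.xlsx can be read as the following operations on the DAPI segmentation. This is a sketch under stated assumptions, not the code in map_cells_IU.py: it assumes that 0 means no flip, that rotate counts 90 degree counterclockwise turns, that operations are applied in the order flip, rotate, scale, and that scaling uses nearest-neighbor sampling so label values are preserved.

```python
import numpy as np


def orient_and_scale(mask, flip=0, rotate=0, exp_factor=1.0):
    """Flip, rotate, and rescale a 2D label mask."""
    if flip == 1:
        mask = np.flipud(mask)
    elif flip == 2:
        mask = np.fliplr(mask)
    # np.rot90 turns the array counterclockwise by k * 90 degrees.
    mask = np.rot90(mask, k=rotate)
    h, w = mask.shape
    out_h = max(1, int(round(h * exp_factor)))
    out_w = max(1, int(round(w * exp_factor)))
    # Nearest-neighbor sampling: map each output index to a source index.
    rows = np.minimum((np.arange(out_h) / exp_factor).astype(int), h - 1)
    cols = np.minimum((np.arange(out_w) / exp_factor).astype(int), w - 1)
    return mask[np.ix_(rows, cols)]
```

For example, orient_and_scale(mask, flip=1, rotate=1, exp_factor=2) flips the mask vertically, rotates it by 90 degrees counterclockwise, and doubles its size.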

Output

data/
    sample1_Registered.tif

Finally, the cluster values can be remapped according to your label structure.
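
One way to remap cluster values is a lookup table indexed by the old label. The sketch below is illustrative: the function name remap_labels and the example mapping are hypothetical, and it assumes the registered image contains non-negative integer labels with 0 as background.

```python
import numpy as np


def remap_labels(label_image, mapping, default=0):
    """Replace each old label with mapping[old]; unmapped labels become `default`."""
    max_label = int(label_image.max())
    lut = np.full(max_label + 1, default, dtype=label_image.dtype)
    for old, new in mapping.items():
        if 0 <= old <= max_label:
            lut[old] = new
    # Integer-array indexing applies the table to every pixel at once.
    return lut[label_image]


# Hypothetical example: merge clusters 1 and 2 into class 5, send cluster 3 to class 1.
example = remap_labels(np.array([[0, 1], [2, 3]]), {1: 5, 2: 5, 3: 1})
```

Labels that are absent from the mapping fall back to the default value, so the background stays 0 unless it is mapped explicitly.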


Step 5 — Segmentation / Cell Classification

Purpose

The final stage trains a segmentation model, using the labels registered to your histological images as ground truth.

Data Preparation

  1. Place each slide and its matching _Registered.tif file in the same folder (only matching pairs need to share a folder)
  2. Update the glob path with these locations, then run prep_annotations.py
  3. Update the exps/citys_semi372/config_semi_wsi_DA.py file with the path to the directory containing the following txt files:
       • labeled.txt = paths to labeled .svs or .tif slides for training
       • unlabeled.txt = paths to unlabeled .svs or .tif slides for training
       • val.txt = paths to labeled .svs or .tif slides for validation
       • pred.txt = paths to .svs or .tif slides for testing
  4. Update config file with num_classes (background counts as a class)
  5. Update config file with desired parameters
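
The four txt files named in step 3 can be generated with a short script such as the one below. It is a sketch that assumes each file lists one slide path per line, which should be checked against the format the training code expects; the function name write_split_lists is hypothetical.

```python
from pathlib import Path

SPLIT_NAMES = ("labeled", "unlabeled", "val", "pred")


def write_split_lists(out_dir, splits):
    """Write <name>.txt for each split, one path per line (assumed format)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for name in SPLIT_NAMES:
        target = out_dir / f"{name}.txt"
        # Splits that are not provided produce an empty file.
        target.write_text("".join(f"{p}\n" for p in splits.get(name, [])))
        written[name] = target
    return written
```

The splits argument maps each split name to a list of slide paths, for example {"labeled": ["data/sample1.svs"], "val": ["data/sample2.svs"]}.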

Commands

Training

sbatch run.sh

Testing

sbatch s_run.sh

Output

data/
   sample1_prediction.tif

Citation

If you use this repository in your research, please cite the associated publication.

About

All of the code needed for DigitAb cell mapping, clustering, and segmentation.
