DigitAb_Master is a computational pipeline for analyzing multiplexed microscopy images using nuclear features derived from DAPI staining.
The workflow processes tissue images through the following steps:
- DAPI Segmentation – Detect and segment nuclei in Phenocycler images
- Feature Extraction – Compute intensity features
- Clustering – Group nuclei based on feature similarity
- Registration – Spatially align images across modalities
- Segmentation – Segment and classify nuclei in histology
This pipeline supports digital pathology and spatial biology analysis, enabling quantitative characterization of cellular populations in tissue images.
Raw Microscopy Images
│
▼
DAPI Segmentation
│
▼
Feature Extraction
│
▼
Clustering
│
▼
Registration
│
▼
Segmentation
git clone https://github.com/njlucare/DigitAb_Master.git
cd DigitAb_Master
Using conda (recommended):
conda create -n digitab python=3.9
conda activate digitab
Or using venv:
python -m venv digitab_env
source digitab_env/bin/activate
Dependencies include:
- python==3.8
- torch==1.10.0
- torchvision==0.11.0
- torchaudio==0.10.0
- cudatoolkit=11.3
- deepcell
- deepcell_toolbox
- numpy
- scipy
- opencv-python
- scikit-image
- pandas
- scikit-learn
- matplotlib
- tifffile
- tiffslide
- pystackreg
- pyyaml
- lxml
- tqdm
- glob (part of the Python standard library; no separate installation needed)
The pipeline expects Phenocycler images with DAPI as the first channel.
Supported formats include:
- .tif
- .qptiff
Example directory structure:
data/
sample1.svs
sample1.tif
sample1_ChannelKey.csv
sample2.svs
sample2.tif
sample2_ChannelKey.csv
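Before running the pipeline, it can help to confirm that every sample has all three files shown above. The following is a minimal sketch, not part of the repository: the helper name `missing_inputs` is hypothetical, and it assumes the naming pattern in the example layout (`<sample>.svs`, `<sample>.tif`, `<sample>_ChannelKey.csv`) and a flat `data/` folder.

```python
from pathlib import Path


def missing_inputs(data_dir):
    """Return {sample_name: [missing file names]} for a flat data folder.

    Hypothetical helper: assumes the naming pattern from the example layout.
    """
    data_dir = Path(data_dir)
    key_suffix = "_ChannelKey.csv"

    # Collect sample names from any of the three file types.
    samples = {p.stem for p in data_dir.glob("*.svs")}
    samples |= {p.stem for p in data_dir.glob("*.tif")}
    samples |= {p.name[: -len(key_suffix)] for p in data_dir.glob("*" + key_suffix)}

    report = {}
    for sample in sorted(samples):
        expected = [f"{sample}.svs", f"{sample}.tif", f"{sample}{key_suffix}"]
        missing = [name for name in expected if not (data_dir / name).exists()]
        if missing:
            report[sample] = missing
    return report
```

An empty dictionary means every sample is complete. If your Phenocycler images are .qptiff files, the expected names would need adjusting.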
DAPI staining highlights DNA, allowing identification of cell nuclei.
This step segments nuclei from the DAPI channel.
Segmentation pipeline:
- Image binarization
- Deepcell segmentation
- Logical "AND" operation
- Size threshold
- Hole filling
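The post-processing steps in this list can be sketched with NumPy and SciPy as follows. This is an illustration under stated assumptions, not the code in predict.py: the function name `refine_nuclei_mask`, the mean-intensity threshold, and the `min_area` default are placeholders, and the DeepCell prediction is assumed to be available already as a label image.

```python
import numpy as np
from scipy import ndimage as ndi


def refine_nuclei_mask(dapi, deepcell_mask, threshold=None, min_area=20):
    """Combine a thresholded DAPI image with a DeepCell mask (illustrative)."""
    dapi = np.asarray(dapi, dtype=float)
    if threshold is None:
        threshold = dapi.mean()  # placeholder choice; Otsu is a common alternative

    binary = dapi > threshold                          # image binarization
    combined = binary & (np.asarray(deepcell_mask) > 0)  # logical AND with DeepCell

    # Size threshold: drop connected components smaller than min_area pixels.
    labels, _ = ndi.label(combined)
    areas = np.bincount(labels.ravel())
    keep = areas >= min_area
    keep[0] = False                                    # label 0 is background
    sized = keep[labels]

    return ndi.binary_fill_holes(sized)                # hole filling
```

The result is a boolean mask; taking the intersection first means a pixel survives only if both the intensity threshold and DeepCell agree it belongs to a nucleus.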
First, set your DeepCell access token. Then update the glob pattern in the script with the path to your files.
python predict.py
Output:
data/
sample1_nuclei.tif
After segmentation, features describing each nucleus are calculated.
These features capture intensity and spatial properties.
| Feature Type | Examples |
|---|---|
| Intensity | mean intensity, max intensity |
| Spatial | centroid position |
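To make the table concrete, the sketch below computes the listed features for each nucleus from a 2D label image and one intensity channel. It is not the code in codex_features_extract.py; the function name `nucleus_features` and the output keys are hypothetical, and label 0 is assumed to be background.

```python
import numpy as np
from scipy import ndimage as ndi


def nucleus_features(labels, channel):
    """Return one dict per nucleus with intensity and centroid features."""
    labels = np.asarray(labels)
    channel = np.asarray(channel, dtype=float)

    ids = np.unique(labels)
    ids = ids[ids != 0]          # assume 0 is background
    if ids.size == 0:
        return []

    means = ndi.mean(channel, labels=labels, index=ids)
    maxima = ndi.maximum(channel, labels=labels, index=ids)
    # Unweighted centroid: centre of mass of a constant image per label.
    centroids = ndi.center_of_mass(np.ones(labels.shape), labels=labels, index=ids)

    rows = []
    for i, nucleus_id in enumerate(ids):
        row_c, col_c = centroids[i]
        rows.append({
            "label": int(nucleus_id),
            "mean_intensity": float(means[i]),
            "max_intensity": float(maxima[i]),
            "centroid_row": float(row_c),
            "centroid_col": float(col_c),
        })
    return rows
```

The list of dictionaries can be passed directly to `pandas.DataFrame(rows)` to obtain a table with one row per nucleus.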
First, update the glob pattern with the path to your CODEX images.
python codex_features_extract.py
Output:
dapi/
sample1.csv
Each row corresponds to one nucleus.
Clustering groups nuclei with similar feature profiles to identify distinct cellular populations.
R CMD BATCH clustering_DigitAb.R
Output:
data/data/
clusters.csv
clusters.rds
fullObject.rds
slide_labels.csv
slide_labels.rds
data/plots/
UMAP.png
data/plots/violin/
LRP2_mean.png
UMOD_mean.png
...
data/plots/bar/
label_0.png
label_1.png
...
Each nucleus receives a cluster label.
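The repository performs clustering in R, and the snippet below is not that script. Purely to illustrate what "grouping nuclei by feature similarity" means, here is a small NumPy k-means sketch, a swapped-in technique chosen for brevity. The function name `kmeans_labels` and its parameters are hypothetical.

```python
import numpy as np


def kmeans_labels(features, k, n_iter=50, seed=0):
    """Assign each row of `features` to one of k clusters (plain k-means)."""
    x = np.asarray(features, dtype=float)

    # Standardize each feature so no single feature dominates the distance.
    std = x.std(axis=0)
    x = (x - x.mean(axis=0)) / np.where(std > 0, std, 1.0)

    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]

    for _ in range(n_iter):
        # Distance from every nucleus to every cluster centre.
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

The cluster numbers themselves are arbitrary; only the grouping is meaningful.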
This step spatially aligns Phenocycler segmentations with the corresponding histological sections:
- Segment the hematoxylin channel obtained by color deconvolution
- Estimate the spatial transformation
- Warp the DAPI segmentation image to the histological domain
First, update the glob paths to the location of your .svs images. Next, update mapping.xlsx with the correct paths for the items listed in its columns. The flip and rotate columns specify the flip (flipud=1, fliplr=2) and the rotation (1 = 90 degrees counterclockwise) required to align the DAPI segmentation with the hematoxylin orientation. exp_factor is a scale factor that matches the size of the DAPI segmentation to that of the hematoxylin segmentation.
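As a rough illustration of how the flip, rotate, and exp_factor values could act on a segmentation image, consider the sketch below. It is not the code in map_cells_IU.py: the function name `orient_mask` is hypothetical, applying the flip before the rotation is an assumption, and so is treating rotate values above 1 as further 90-degree steps.

```python
import numpy as np
from scipy import ndimage as ndi


def orient_mask(mask, flip=0, rotate=0, exp_factor=1.0):
    """Flip, rotate, and rescale a label image (illustrative only)."""
    out = np.asarray(mask)

    if flip == 1:
        out = np.flipud(out)          # flip top to bottom
    elif flip == 2:
        out = np.fliplr(out)          # flip left to right

    # np.rot90 rotates counterclockwise; k counts 90-degree steps.
    out = np.rot90(out, k=rotate)

    if exp_factor != 1.0:
        # order=0 is nearest-neighbour, so label values are never blended.
        out = ndi.zoom(out, exp_factor, order=0)
    return out
```

Nearest-neighbour interpolation matters here: higher-order interpolation would create label values that do not correspond to any nucleus or cluster.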
python hematoxylin_segmentation.py
python map_cells_IU.py
Output:
data/
sample1_Registered.tif
Finally, the cluster values can be remapped according to your label structure.
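One way to do such a remapping is with a lookup table, sketched below. The function name `remap_clusters` and the example mapping are hypothetical. The sketch assumes a non-empty image of non-negative integer cluster values.

```python
import numpy as np


def remap_clusters(label_image, mapping, default=0):
    """Replace each cluster value using `mapping`; unmapped values get `default`."""
    label_image = np.asarray(label_image)

    # Build a lookup table indexed by the original cluster value.
    lut = np.full(int(label_image.max()) + 1, default, dtype=label_image.dtype)
    for src, dst in mapping.items():
        if 0 <= src < lut.size:
            lut[src] = dst

    return lut[label_image]


# Example: merge clusters 1 and 2 into class 2, and rename cluster 3 to class 1.
remapped = remap_clusters(np.array([[0, 1], [2, 3]]), {1: 2, 2: 2, 3: 1})
```

Indexing the table with the whole image remaps every pixel in one vectorized step.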
The final stage trains a segmentation model on the ground truths registered to your histological image.
- Place each slide and its matching _Registered.tif file in the same folder (only matching files need to share a folder)
- Update the glob path with these locations, then run prep_annotations.py
- Update the exps/citys_semi372/config_semi_wsi_DA.py file with the path to the directory containing the following txt files:
  - labeled.txt = paths to labeled .svs or .tif slides for training
  - unlabeled.txt = paths to unlabeled .svs or .tif slides for training
  - val.txt = paths to labeled .svs or .tif slides for validation
  - pred.txt = paths to .svs or .tif slides for testing
- Update config file with num_classes (background counts as a class)
- Update config file with desired parameters
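The file lists mentioned above can be assembled by hand or with a short script. The sketch below is not part of the repository. It pairs each `_Registered.tif` file with a slide of the same base name in the same folder and writes the slide paths to a text file. The helper names are hypothetical, and the one-path-per-line layout is an assumption to check against what the training code expects.

```python
from pathlib import Path

MASK_SUFFIX = "_Registered.tif"


def find_labeled_pairs(folder, slide_exts=(".svs", ".tif")):
    """Return (slide, mask) path pairs for masks that have a matching slide."""
    folder = Path(folder)
    pairs = []
    for mask in sorted(folder.glob("*" + MASK_SUFFIX)):
        stem = mask.name[: -len(MASK_SUFFIX)]
        for ext in slide_exts:
            slide = folder / f"{stem}{ext}"
            if slide.exists():
                pairs.append((slide, mask))
                break
    return pairs


def write_path_list(paths, out_file):
    """Write one path per line (assumed format)."""
    Path(out_file).write_text("".join(f"{p}\n" for p in paths))
```

For example, `write_path_list([slide for slide, _ in find_labeled_pairs("data")], "labeled.txt")` would list every slide in `data/` that has a registered ground truth.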
Training
sbatch run.sh
Testing
sbatch s_run.sh
Output:
data/
sample1_prediction.tif
If you use this repository in your research, please cite the associated publication.