Detecting AI-Generated Images via Local Distributional Shifts

Overview

MDMF (Micro-Defects expose Macro-Fakes) reframes AI-generated image detection as a local distributional problem rather than an image-level classification one. Instead of compressing each image into a single representation that tends to over-rely on global semantics, MDMF treats every image as a collection of patches and projects each patch into a learnable forensic latent space, the Patch Forensic Signature (PFS). An image-level score is then obtained by measuring the Maximum Mean Discrepancy (MMD) between the test image's PFS distribution and that of a small reference bank of clean real images.

We release the source code, the trained MDMF checkpoint, and the scripts needed to reproduce the main paper results.
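To make the scoring idea concrete, here is a minimal NumPy sketch of an RBF-kernel MMD² estimate between two patch-feature sets. This is illustrative only (random Gaussian features stand in for PFS vectors, and the bandwidth heuristic is our own assumption), not the released implementation:

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Pairwise squared distances between rows of a and b, then Gaussian kernel.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(x, y, gamma=None):
    """Biased MMD^2 estimate between feature sets x (K, d) and y (M, d)."""
    if gamma is None:
        gamma = 1.0 / x.shape[1]  # simple 1/d bandwidth heuristic (an assumption)
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
real_ref  = rng.normal(0.0, 1.0, size=(256, 16))  # stand-in for reference-bank PFS vectors
test_real = rng.normal(0.0, 1.0, size=(64, 16))   # patch features of a "real" test image
test_fake = rng.normal(0.6, 1.0, size=(64, 16))   # a mean shift mimics generator artifacts

print(mmd2(test_fake, real_ref) > mmd2(test_real, real_ref))  # → True
```

Because the score aggregates all patch-to-bank kernel evaluations, a single noisy patch cannot flip the decision the way it can under per-patch hard voting.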

Key Highlights

  • Patch Forensic Signature (PFS) — A learnable forensic reparameterization of frozen DINOv2 patch tokens that suppresses semantic invariances and amplifies generation-induced statistical irregularities.
  • MDMF detector — A distribution-aware detection framework that aggregates patch-level evidence into a stable image-level score via MMD between PFS distributions, avoiding the per-patch decision boundary that destabilizes hard-voting baselines.
  • Theory-grounded — We prove that patch-wise PFS modeling yields a provably larger MMD signal than global pooling whenever localized forensic cues are present, and we derive a finite-sample separation guarantee with a finite optimal patch count $K^\star$.
  • Strong cross-generator generalization — Trained on a single 4-class ProGAN split, MDMF reaches 95.65 average AUROC on the ImageNet benchmark (9 generators spanning diffusion, GAN, and AR families) and remains state-of-the-art on LSUN-Bedroom, GenImage, WildRF, LDMFakeDetect, and an OpenSora video-frame stress test, with markedly gentler degradation under JPEG, blur, and noise post-processing than the strongest training-based baseline.

Installation

```bash
git clone https://github.com/ZBox1005/MDMF.git
cd MDMF
conda create -n mdmf python=3.10 -y
conda activate mdmf
pip install -r requirements.txt
```

The code targets Python 3.10 + PyTorch 2.0+ on a single GPU. The DINOv2 ViT-L/14 backbone is fetched on first use via torch.hub.
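As a quick sanity check on the backbone geometry: a ViT-L/14 backbone tokenizes an H×W input into (H/14)·(W/14) patch tokens, so input sides must be multiples of 14. A tiny illustrative helper (not part of the released scripts):

```python
def dinov2_patch_grid(height, width, patch=14):
    """Number of ViT patch tokens a DINOv2 ViT-L/14 backbone produces per image."""
    if height % patch or width % patch:
        raise ValueError(f"input sides must be multiples of {patch}")
    return (height // patch) * (width // patch)

print(dinov2_patch_grid(224, 224))  # → 256 (a 16 x 16 token grid)
```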

Quickstart

We organize the workflow into three steps: (1) precompute DINOv2 patch embeddings, (2) train the MDMF detector, and (3) evaluate on unseen generators. Each step uses one self-contained script under src/.

Dataset Layout

Organize datasets following the structure below; the scripts read from these paths via CLI flags.

```
data/
├── train/
│   ├── 0_real/        # training real images (e.g., LSUN classes used by ProGAN)
│   └── 1_fake/        # training fake images (e.g., 4-class ProGAN)
├── val/
│   ├── 0_real/
│   └── 1_fake/
├── ref/
│   └── 0_real/        # reference real images (3K is enough; 5K for the strongest setting)
└── test/
    ├── real/
    ├── adm/           # one folder per test generator
    ├── ldm/
    └── ...
```

We follow CNNDetection for the training split (4-class ProGAN) and the standard cross-generator evaluation suites for testing (ImageNet, LSUN-Bedroom, GenImage, WildRF, LDMFakeDetect).
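Before precomputing embeddings, it can save time to verify the layout above is in place. A minimal illustrative helper (the CLI flags of the released scripts remain the authoritative interface):

```python
from pathlib import Path

def check_layout(root):
    """Return the expected split folders that are missing under `root`."""
    expected = [
        "train/0_real", "train/1_fake",
        "val/0_real", "val/1_fake",
        "ref/0_real",
        "test/real",
    ]
    return [p for p in expected if not (Path(root) / p).is_dir()]

missing = check_layout("data")
if missing:
    print("missing folders:", ", ".join(missing))
```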

Step 1 — Precompute Patch Embeddings

```bash
# Training pairs
python src/precompute_embeddings.py \
    --real_dir   data/train/0_real \
    --fake_dir   data/train/1_fake \
    --output     embeddings/train_patch32.pkl \
    --patch_size 32 --batch_size 256

# Validation pairs
python src/precompute_embeddings.py \
    --real_dir   data/val/0_real \
    --fake_dir   data/val/1_fake \
    --output     embeddings/val_patch32.pkl \
    --patch_size 32 --batch_size 256

# Reference bank (real-only)
python src/precompute_embeddings.py \
    --real_dir   data/ref/0_real \
    --output     embeddings/ref_3k_patch32.pkl \
    --patch_size 32 --batch_size 256
```

Step 2 — Train MDMF

Edit config.json to point at the embeddings produced in Step 1, then:

```bash
python src/train_MDMF.py \
    --config     config.json \
    --output_dir checkpoints/mdmf_imagenet
```

We provide the trained checkpoint at checkpoints/model.pth so Step 2 can be skipped.

Step 3 — Evaluate

```bash
python src/test_MDMF.py \
    --model           checkpoints/model.pth \
    --ref_embeddings  embeddings/ref_3k_patch32.pkl \
    --test_real       data/test/real \
    --test_fake       data/test/adm data/test/ldm data/test/biggan ... \
    --output_dir      results/mdmf_imagenet \
    --batch_size      256
```

Pass any number of fake generator directories to --test_fake; the script reports per-generator AUROC, AP, and FPR@95TPR plus the average over all listed generators.
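For reference, all three reported metrics can be computed from scores and labels alone (higher score = more likely fake, label 1 = fake). A compact NumPy sketch, which assumes untied scores and may differ from the exact implementation in `src/test_MDMF.py`:

```python
import numpy as np

def auroc(labels, scores):
    # Rank-based Mann-Whitney AUROC (assumes no tied scores).
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def average_precision(labels, scores):
    # Precision averaged over each true positive, scanning scores high to low.
    y = labels[np.argsort(-scores)]
    precision = np.cumsum(y) / np.arange(1, len(y) + 1)
    return precision[y == 1].mean()

def fpr_at_95tpr(labels, scores):
    # Lowest threshold that still catches >= 95% of fakes, and the FPR it induces.
    fake = np.sort(scores[labels == 1])
    k = int(np.ceil(0.95 * len(fake)))   # fakes that must score above the threshold
    thr = fake[len(fake) - k]
    return (scores[labels == 0] >= thr).mean()

labels = np.array([0] * 5 + [1] * 5)
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.9, 0.6, 0.7, 0.8, 0.95, 0.99])
print(auroc(labels, scores), fpr_at_95tpr(labels, scores))  # → 0.88 0.2
```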

Main Results

MDMF reaches an average AUROC of 95.65 / AP of 97.07 on the 9-generator ImageNet benchmark, outperforming every training-based detector we compare against. The advantage is consistent across diffusion, GAN, and AR generator families, and the per-generator gains are largest precisely on the hardest diffusion targets (ADM, ADMG, LDM, DiT-XL/2). See the paper for results on LSUN-Bedroom, GenImage, WildRF, LDMFakeDetect, and the OpenSora video-frame stress test.

Repository Structure

```
MDMF/
├── README.md
├── LICENSE
├── requirements.txt
├── config.json                       # training hyperparameters template
├── src/
│   ├── precompute_embeddings.py      # DINOv2 patch-token extraction
│   ├── train_MDMF.py                 # MDMF training loop (PFS + MMD objective)
│   └── test_MDMF.py                  # evaluation against reference bank
├── checkpoints/
│   └── model.pth                     # released MDMF checkpoint
└── assets/
    ├── logo_full.png
    ├── framework.png
    └── main_table.png
```

Citation

If you find this work useful, please cite:

```bibtex
@article{zhang2026mdmf,
  title={Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts},
  author={Zhang, Boxuan and Zhu, Jianing and Wang, Qifan and Liu, Jiang and Tang, Ruixiang},
  journal={arXiv preprint arXiv:2605.09296},
  year={2026}
}
```

License

Code released under the MIT License. Datasets used in this paper are released under their own original licenses; please follow the terms of the corresponding releases (ImageNet, LSUN-Bedroom, GenImage, WildRF, LDMFakeDetect, OpenSora, MSR-VTT).
