We've compiled a list of related work in the relatively niche field of multi-label classification. We hope it is useful to researchers working in this area.

| Dataset | # Class | # Training data | # Test data | # ALP |
|---|---|---|---|---|
| Pascal VOC 2007 (VOC07) | 20 | 5,011 | 4,952 | 2.5 |
| Pascal VOC 2012 (VOC12) | 20 | 11,540 | 10,991 | 2.5 |
| MS-COCO 2014 (COCO) | 80 | 82,081 | 40,504 | 2.9 |
| MS-COCO 2017 (COCO17) | 80 | 118,287 | 40,670 | - |
| NUS-WIDE (NW) | 81 | 125,449 | 83,898 | 2.4 |
| WIDER Attribute (WA) | 14 | 28,345 | 29,179 | - |
| PA-100K (PA) | 26 | 9,000 | 1,000 | - |
| Fashion550K (F550) | 66 | 3,300 | 2,000 | - |
| Charades (Cha) | 157 | 8,000 | 1,800 | 6.8 |
| Visual Genome (VG500) | 500 | 75,774 | 32,475 | - |
| Visual Genome (VG256) | 256 | 75,774 | 32,475 | - |
| IAPRTC-12 (IA12) | 275 | 13,989 | 6,011 | - |
| DeepFashion (DF) | 26 | 16,000 | 4,000 | - |
| CUB-200-2011 (CUB) | 200 | 5,994 | 5,794 | - |


| Title | Venue | Year | Datasets | Code |
|---|---|---|---|---|
| [PLA] Orderless Recurrent Models for Multi-label Classification | CVPR | 2020 | NW,COCO,WA,PA | Official |
| [TSGCN] Joint Input and Output Space Learning for Multi-Label Image Classification | TMM | 2020 | COCO,VOC07 | - |
| [CMA] Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification | AAAI | 2020 | NW,COCO | - |
| [KSSNet] Multi-Label Classification with Label Graph Superimposing | AAAI | 2020 | COCO,Cha | Official |
| [A-GCN] Learning Class Correlations for Multi-label Image Recognition with Graph Networks | PRL | 2020 | COCO,F550 | Official |
| [ADD-GCN] Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition | ECCV | 2020 | COCO,VOC07,VOC12 | Official |
| [F-GCN] Fast Graph Convolution Network Based Multi-label Image Recognition via Cross-modal Fusion | CIKM | 2020 | COCO,VOC07 | - |
| [MSSAF] Multi-scale Cross-modal Spatial Attention Fusion for Multi-label Image Recognition | ICANN | 2020 | - | |


| Title | Venue | Year | Datasets | Code |
|---|---|---|---|---|
| [FFTran] Transformer-driven Feature Fusion Network and Visual Feature Coding for Multi-label Image Classification | PR | 2025 | NW,COCO,VOC07,VOC12 | - |
| [SpliceMix] SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for Multi-label Image Classification | TMM | 2025 | COCO,VOC07 | Official |
| [TsSAN] Two-stream Semantic Alignment Networks for Multi-label Image Classification | ICASSP | 2025 | NW,COCO,VOC07 | - |
| [-] Multi-Label Few-Shot Image Classification via Pairwise Feature Augmentation and Flexible Prompt Learning | AAAI | 2025 | COCO,VOC12 | - |
| [HP-DVAL] Dual-View Alignment Learning With Hierarchical-Prompt for Class-Imbalance Multi-Label Image Classification | TIP | 2025 | COCO,VOC07 | Official |
| [DRTN] DRTN: Dual Relation Transformer Network with feature erasure and contrastive learning for multi-label image classification | NN | 2025 | NW,COCO,VOC07 | - |
| Official |