
Add TabICL-v2 solver and UEA multivariate dataset loader #19

Open
ievred wants to merge 3 commits into benchopt:main from ievred:main


ievred commented May 29, 2026

Adds a TabICL-v2 solver (tabicl.py) that uses the TabICL tabular foundation model for in-context learning classification on UCR datasets.

  • Each UCR sample has shape (T, 1); the N samples are flattened into an (N, T) tabular matrix and fed directly to TabICLClassifier
  • Model loading (and checkpoint download) happens in set_objective — excluded from benchmark timing
  • sampling_strategy = "run_once" — classification is solved in a single ICL forward pass
  • CUDA/CPU device auto-detected via torch.cuda.is_available(), matching Mantis
  • Validated on ECG5000: accuracy 94.6 %, f1_weighted 0.940 in 74 s on CPU
  • 4 unit tests added in test_tabicl.py
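
For reviewers less familiar with the setup/run split described above, here is a minimal sketch of the pattern. It does not import benchopt or tabicl: `SketchSolver` and `StandInClassifier` are hypothetical names, the method signatures are assumptions made for illustration, and placing `predict` inside `run` is also an assumption rather than something this PR states.

```python
class StandInClassifier:
    """Hypothetical stand-in for TabICLClassifier; predicts the majority label."""

    def fit(self, X, y):
        self.majority_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


class SketchSolver:
    # Mirrors the attribute named in this PR; nothing here imports benchopt.
    sampling_strategy = "run_once"

    def set_objective(self, X_train, y_train, X_test):
        # Setup phase: store the data and build the model (the untimed part).
        self.X_train, self.y_train, self.X_test = X_train, y_train, X_test
        self.clf = StandInClassifier()

    def run(self, _):
        # Timed phase: a single fit plus predict, with no iteration loop.
        self.clf.fit(self.X_train, self.y_train)
        self.y_pred = self.clf.predict(self.X_test)

    def get_result(self):
        return {"y_pred": self.y_pred}


solver = SketchSolver()
solver.set_objective([[0.1], [0.2], [0.9]], [0, 0, 1], [[0.3], [0.8]])
solver.run(None)
result = solver.get_result()
```

Keeping model construction out of `run` means only the fit and prediction work lands in the timed phase.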


ievred commented May 29, 2026

UPD:

Changes

tabicl.py — TabICL-v2 solver for UCR/UEA classification

  • Uses tabicl (ICML 2026), a tabular in-context learning foundation model
  • Each UCR/UEA sample has shape (T, C); the N samples are flattened into an (N, T×C) tabular matrix. Fixed-length archives need no interpolation
  • Model instantiation and checkpoint download happen in set_objective (excluded from timing); fit runs in run
  • sampling_strategy = "run_once" — classification is a single ICL forward pass
  • CUDA/CPU device auto-detected via torch.cuda.is_available(), matching the Mantis solver pattern
  • Checkpoint reload guard prevents re-downloading across consecutive datasets
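
The flattening step in the list above can be sketched with NumPy as follows. `flatten_series` is a hypothetical helper name, and the time-major column order produced by a plain C-order reshape is an assumption; the solver in this PR may order columns differently.

```python
import numpy as np


def flatten_series(X):
    """Flatten an (N, T, C) batch of series into (N, T*C) tabular rows."""
    X = np.asarray(X)
    if X.ndim != 3:
        raise ValueError(f"expected an (N, T, C) array, got shape {X.shape}")
    # C-order reshape: columns run t0c0, t0c1, ..., t1c0, t1c1, ...
    return X.reshape(X.shape[0], -1)


# Illustrative shapes only: a univariate batch and a BasicMotions-like batch.
X_uni = np.zeros((5, 8, 1))
X_multi = np.zeros((4, 100, 6))
flat_uni = flatten_series(X_uni)
flat_multi = flatten_series(X_multi)
```

With T=100 and C=6, each sample becomes a row of 600 features, so the univariate case (C=1) reduces to the (N, T) layout from the first version of this PR.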

uea.py — UEA multivariate dataset loader

  • Mirrors ucr.py exactly, backed by the same tslearn.datasets.UCR_UEA_datasets loader
  • Default dataset: BasicMotions (T=100, C=6, 4 classes)
  • Covers all 30 multivariate UEA datasets available via tslearn

test_tabicl.py — 4 unit tests (offline, fake tabicl)

  • fit shapes, predict shapes, classifier reuse across datasets, skip on unsupported tasks
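
As a rough illustration of the "offline, fake tabicl" approach, one common way to stub a package is to register a stand-in module in `sys.modules` before the code under test imports it. The fake class below is hypothetical and is not the one in test_tabicl.py; a real test suite would more likely use a fixture such as pytest's `monkeypatch` so the stub is removed afterwards.

```python
import sys
import types


class FakeTabICLClassifier:
    """Hypothetical fake: records constructor kwargs, predicts the first label."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def fit(self, X, y):
        self.n_features_ = len(X[0])
        self.first_label_ = y[0]
        return self

    def predict(self, X):
        return [self.first_label_ for _ in X]


# Register the fake under the real package name so imports resolve offline.
fake_module = types.ModuleType("tabicl")
fake_module.TabICLClassifier = FakeTabICLClassifier
sys.modules["tabicl"] = fake_module

from tabicl import TabICLClassifier  # resolves to the fake registered above

clf = TabICLClassifier(n_estimators=4).fit([[0.0, 1.0], [1.0, 0.0]], [0, 1])
preds = clf.predict([[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]])
```

Because the fake never touches the network, shape checks like the ones listed above can run without downloading a checkpoint.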

Results

| Dataset | Accuracy | Balanced Acc | F1 weighted | Time |
| --- | --- | --- | --- | --- |
| ECG5000 (UCR, univariate) | 0.946 | 0.601 | 0.941 | 74 s (CPU) |
| BasicMotions (UEA, multivariate) | 1.000 | 1.000 | 1.000 | 2 s (cached) |

ievred closed this May 29, 2026
ievred reopened this May 29, 2026
ievred changed the title from "TabICL-v2 solver for UCR classification" to "Add TabICL-v2 solver and UEA multivariate dataset loader" May 29, 2026
- Add solvers/tabpfn.py: TabPFN-v2 local-checkpoint solver for UCR/UEA
  classification. Uses ModelVersion.V2 (Apache 2.0 weights), device=auto,
  ignore_pretraining_limits=True. Follows same structure as tabicl.py.
- Add tests/solvers/test_tabpfn.py: 5 offline unit tests (fake tabpfn
  module) covering univariate/multivariate shapes, classifier reuse,
  and task-skip logic.
- Fix solvers/tabicl.py: reload guard now keys on (checkpoint_version,
  n_estimators) so changing n_estimators correctly triggers reinstantiation.
- Fix tests/solvers/test_tabicl.py: update reuse test docstring; stub
  benchopt and torch so tests run without those packages installed.
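
The reload-guard fix in the commit message above can be sketched as a small cache keyed on both settings. `CachedClassifierHolder`, the factory callable, and the `"v2"` version string are hypothetical and only show the keying idea; they are not the code in solvers/tabicl.py.

```python
class CachedClassifierHolder:
    """Hypothetical holder that rebuilds the model only when its key changes."""

    def __init__(self, factory):
        self._factory = factory
        self._key = None
        self._model = None
        self.builds = 0  # counts how often the factory actually ran

    def get(self, checkpoint_version, n_estimators):
        key = (checkpoint_version, n_estimators)
        if self._model is None or key != self._key:
            # Either setting changed (or nothing is cached yet): rebuild.
            self._model = self._factory(checkpoint_version, n_estimators)
            self._key = key
            self.builds += 1
        return self._model


# A dict stands in for the real classifier object.
holder = CachedClassifierHolder(lambda v, n: {"version": v, "n_estimators": n})
a = holder.get("v2", 8)
b = holder.get("v2", 8)   # same key: reuses the cached object
c = holder.get("v2", 16)  # n_estimators changed: triggers a rebuild
```

A guard keyed on the checkpoint version alone would have returned the stale 8-estimator object for the third call.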