A3M Router Submission for LLMRouterBench #2

@Das-rebel

A3M Router Submission - LLMRouterBench (ACL'26)

Summary

A3M Router is a parallel multi-LLM gateway that achieves:

  • 62.9% cost savings vs all-premium routing
  • 67% exact tier accuracy, and 96% accuracy within ±1 tier
  • 0.8524 robustness score (highest on RouterArena)
  • 40% latency reduction via Quickselect (average-case O(n) selection)

NPM: npm install adaptive-memory-multi-model-router@2.14.23

Key Features

1. Parallel Multi-LLM Execution

A3M executes multiple providers simultaneously and uses confidence-weighted voting to select the best response. This is fundamentally different from sequential fallback routers.
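
A minimal sketch of how parallel fan-out with confidence-weighted voting could look. The result shape `{ answer, confidence }`, the function names `weightedVote` and `routeParallel`, and the choice to sum confidences per distinct answer are all assumptions for illustration; the package's actual implementation may differ.

```javascript
// Hypothetical result shape: { answer: string, confidence: number in [0, 1] }.
// Sums confidence per distinct answer and returns the highest-scoring one.
function weightedVote(results) {
  const totals = new Map();
  for (const { answer, confidence } of results) {
    totals.set(answer, (totals.get(answer) ?? 0) + confidence);
  }
  let best = null;
  for (const [answer, score] of totals) {
    if (best === null || score > best.score) best = { answer, score };
  }
  return best; // null when no provider returned a result
}

// Hypothetical: each provider is an async function (query) => result.
// All providers are called at once; rejected calls are dropped, not retried.
async function routeParallel(providers, query) {
  const settled = await Promise.allSettled(providers.map((p) => p(query)));
  const results = settled
    .filter((s) => s.status === "fulfilled")
    .map((s) => s.value);
  return weightedVote(results);
}
```

Under this scheme, two moderately confident providers that agree can outvote a single highly confident one, which is the behavior a sequential fallback chain cannot express.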

2. Memory-Enhanced Routing

  • MemoryTree - Hierarchical context storage
  • EMA Updates - No retraining needed, learns from routing history
  • Adaptive provider selection - Tracks provider performance
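
As a rough illustration of how EMA updates can track provider performance without retraining, consider the sketch below. The smoothing factor `alpha`, the `ProviderStats` class, and the success/failure signal are assumptions; the source does not state which quantities A3M averages or which value of alpha it uses.

```javascript
// Exponential moving average: new = alpha * observation + (1 - alpha) * previous.
// alpha is an assumed smoothing factor, not a value taken from A3M.
function emaUpdate(previous, observation, alpha = 0.2) {
  return alpha * observation + (1 - alpha) * previous;
}

// Hypothetical tracker that keeps one EMA success score per provider.
class ProviderStats {
  constructor(alpha = 0.2) {
    this.alpha = alpha;
    this.scores = new Map();
  }

  // success is a boolean outcome of one routed request.
  record(provider, success) {
    const obs = success ? 1 : 0;
    const prev = this.scores.get(provider);
    // The first observation seeds the average directly.
    const next = prev === undefined ? obs : emaUpdate(prev, obs, this.alpha);
    this.scores.set(provider, next);
  }

  // Returns the provider with the highest current score, or null if empty.
  best() {
    let bestName = null;
    let bestScore = -Infinity;
    for (const [name, score] of this.scores) {
      if (score > bestScore) {
        bestScore = score;
        bestName = name;
      }
    }
    return bestName;
  }
}
```

Each update is O(1) and only needs the previous score, so routing history never has to be replayed.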

3. Research-Backed Complexity Scoring

Five complexity signals derived from academic research, each with the boost it adds to the complexity score:

  • Jargon Density (+15%) - professional terminology detection
  • Task Formality (+10%) - protocol, audit, brief identification
  • Depth Markers (+8%) - comprehensive, expert-level signals
  • Stakes Language (+5%) - critical, liability, regulatory language
  • Multi-Step Structure (+5%) - sequential reasoning patterns
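
The five signals above can be sketched as a keyword-based scorer. The weights come from the list; the term lists for jargon and multi-step structure, the whole-word matching, and the additive combination are assumptions made for illustration, not the package's documented behavior.

```javascript
// Weights are from the list above. Terms marked "hypothetical" are
// illustrative placeholders, not A3M's actual vocabularies.
const SIGNALS = [
  { name: "jargonDensity", weight: 0.15, terms: ["idempotent", "amortization"] }, // hypothetical
  { name: "taskFormality", weight: 0.10, terms: ["protocol", "audit", "brief"] },
  { name: "depthMarkers", weight: 0.08, terms: ["comprehensive", "expert-level"] },
  { name: "stakesLanguage", weight: 0.05, terms: ["critical", "liability", "regulatory"] },
  { name: "multiStep", weight: 0.05, terms: ["first", "then", "finally"] }, // hypothetical
];

// Returns the summed boost and the names of the signals that fired.
// Assumption: a signal fires if any of its terms appears as a whole word.
function complexityBoost(query) {
  const tokens = new Set(query.toLowerCase().match(/[a-z]+(?:-[a-z]+)*/g) ?? []);
  const fired = SIGNALS.filter((s) => s.terms.some((t) => tokens.has(t)));
  const boost = fired.reduce((sum, s) => sum + s.weight, 0);
  return { boost, fired: fired.map((s) => s.name) };
}
```

With these assumptions, a query such as "Write a comprehensive audit of regulatory liability" fires the formality, depth, and stakes signals, while "What is quantum entanglement?" fires none.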

4. Mathematical Optimization

  • Thompson Sampling - Bayesian exploration/exploitation
  • UCB1 Bandits - Optimal exploration bounds
  • Pareto Optimization - Multi-objective cost-accuracy-robustness
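
To make the bandit component concrete, here is a minimal sketch of standard UCB1 arm selection. The arm shape `{ name, pulls, totalReward }` and the assumption that rewards lie in [0, 1] are illustrative; how A3M defines rewards or combines UCB1 with Thompson Sampling is not stated in this submission.

```javascript
// Standard UCB1: pick the arm maximizing mean + sqrt(2 * ln(N) / n_i),
// where N is the total number of pulls and n_i the pulls of arm i.
// Hypothetical arm shape: { name, pulls, totalReward }, rewards in [0, 1].
function ucb1Select(arms) {
  // Every arm is tried once before the bound is applied.
  const untried = arms.find((a) => a.pulls === 0);
  if (untried) return untried.name;

  const totalPulls = arms.reduce((n, a) => n + a.pulls, 0);
  let best = null;
  let bestScore = -Infinity;
  for (const a of arms) {
    const mean = a.totalReward / a.pulls;
    const bonus = Math.sqrt((2 * Math.log(totalPulls)) / a.pulls);
    if (mean + bonus > bestScore) {
      bestScore = mean + bonus;
      best = a.name;
    }
  }
  return best;
}
```

The exploration bonus shrinks as an arm accumulates pulls, so a rarely used provider is periodically re-tested even when its observed mean is lower.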

Benchmark Results

Metric               Value
Exact Tier           67%
±1 Tier              96%
Cost Savings         62.9%
Robustness           0.8524
Premium Accuracy     57.5%
Free Tier Accuracy   96%
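
For readers unfamiliar with the tier metrics, the sketch below shows one plausible way to compute exact and ±1 tier accuracy. The integer tier encoding and the function name `tierAccuracy` are assumptions; the evaluation script in the package may define these metrics differently.

```javascript
// Assumption: tiers are encoded as consecutive integers
// (e.g. 0 = free ... 3 = premium). The encoding is hypothetical.
function tierAccuracy(predicted, expected) {
  const n = predicted.length;
  if (n === 0 || n !== expected.length) {
    throw new Error("predicted and expected must be non-empty and equal length");
  }
  let exact = 0;
  let withinOne = 0;
  for (let i = 0; i < n; i++) {
    const diff = Math.abs(predicted[i] - expected[i]);
    if (diff === 0) exact++;
    if (diff <= 1) withinOne++; // exact matches also count as within ±1
  }
  return { exact: exact / n, withinOne: withinOne / n };
}
```

Because every exact match is also within ±1, the ±1 figure is always at least as large as the exact figure.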

Cost Comparison

Router     Cost/1K   Accuracy
A3M        $0.05     67%
RouteLLM   $0.27     63%
GPT-4o     $10.02    85%

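At $0.05 versus $0.27 per 1K, A3M is 5.4× cheaper than RouteLLM while achieving higher accuracy (67% vs. 63%). GPT-4o remains the most accurate at 85%, but at roughly 200× the cost of A3M.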
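
The comparison can be checked with a small Pareto-dominance sketch over the three rows of the table. Only cost and accuracy are used, since the table does not list robustness per router; the helper names `dominates` and `paretoFront` are illustrative.

```javascript
// Figures copied from the cost comparison table (cost per 1K, accuracy in %).
const routers = [
  { name: "A3M", cost: 0.05, accuracy: 67 },
  { name: "RouteLLM", cost: 0.27, accuracy: 63 },
  { name: "GPT-4o", cost: 10.02, accuracy: 85 },
];

// a dominates b if it is no worse on both objectives and strictly better on one.
function dominates(a, b) {
  return (
    a.cost <= b.cost &&
    a.accuracy >= b.accuracy &&
    (a.cost < b.cost || a.accuracy > b.accuracy)
  );
}

// Keeps only the items that no other item dominates.
function paretoFront(items) {
  return items.filter((x) => !items.some((y) => dominates(y, x)));
}

const front = paretoFront(routers).map((r) => r.name);
const ratio = routers[1].cost / routers[0].cost; // RouteLLM cost / A3M cost

console.log(front, ratio.toFixed(1));
```

On these numbers RouteLLM is dominated by A3M (more expensive and less accurate), while A3M and GPT-4o both sit on the front because neither beats the other on both cost and accuracy.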

How to Test

npm install adaptive-memory-multi-model-router@2.14.23

# Route a query
node -e "const {routeQuery} = require('adaptive-memory-multi-model-router'); console.log(routeQuery('What is quantum entanglement?'));"

# Run benchmark
node eval/run_eval.js

Comparison with Other Routers

A3M is the only router that:

  1. Executes multiple providers in parallel
  2. Uses confidence-weighted voting for ensemble decisions
  3. Has adaptive memory that learns from routing history
  4. Achieves the highest robustness score while maintaining low cost

Request

We request evaluation on LLMRouterBench to demonstrate A3M's cost-accuracy-robustness balance, especially for the parallel ensemble approach that differentiates it from sequential fallback routers.
