Skip to content

Releases: ReadyPixels/AI_Models_Matrix

Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers

18 Jun 06:45

Choose a tag to compare

Release v3.29

Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers

Latest Changelog Entry

[2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers

Added

  • Kimi K2.7 Code (Frontier Models, Free-Source Models, Coding Models, Open-Source Coding Models, Autonomous Coding Agents): Open-weight coding-focused agentic model by Moonshot AI; 1T total params, 32B active (MoE); 262K context; 30% fewer thinking tokens vs K2.6; SWE-bench Verified 78.2%, AIME 2025 91.5%; $0.95/$4.00 per 1M tokens; Modified MIT license; released 2026-06-12.
  • MiniMax M3 (Free-Source Models, Open-Source Coding Models): Open weights now released; frontier coding (SWE-Bench Pro 59.0%, SWE-Bench Verified 80.5%), 1M context, native multimodality; MIT license. Previously listed as API-only.

Updated

  • Claude Fable 5 / Mythos 5 (Frontier Models, Coding Models): Marked with ⚠️ suspension notice. US government export control directive on June 12, 2026 forced Anthropic to disable global access to both models. Access may be restored for US-based users around July 1, 2026. All other Claude models unaffected.
  • Arena Elo scores (Frontier Models table): Updated Claude Opus 4.8 to 1483, Claude Opus 4.7 to 1502, GPT-5.5 to 1481, Gemini 3.1 Pro to 1486 based on arena.ai June 2026 data.
  • SWE-bench Verified Leaderboard: Added Kimi K2.7 Code (#21, 78.2%), MiniMax M3 (#13, 80.5%). Re-ranked entries #13-#26 accordingly.
  • AIME 2025 Leaderboard: Updated with BenchLM.ai June 2026 data. MAI-Thinking-1 now #1 (97%), Kimi K2.5 variants at #2 (96.1%).
  • Top Models by Category: Coding #1 changed to Claude Opus 4.8 (Fable 5 suspended); Agentic Performance #1 changed to Claude Opus 4.8; Open Source #1 changed to MiniMax M3.
  • Codex CLI (Autonomous Coding Agents): Updated description to reflect Rust-native rewrite, v0.140+, /goal persistent workflows, Chrome extension support.
  • Gemini CLI deprecation note (Assisted CLI Tools): Google confirmed shutdown date of June 18, 2026; migrate to Antigravity CLI (agy).

Removed / Deprecated

  • Stale ⭐ markers removed: Removed ⭐ from entries older than 30 days from 2026-06-17: Gemini 3.5 Flash, Gemini Omni Flash, Qwen3.7-Max, Qwen3.7-Plus-Preview, Grok 4 Fast, Grok 4.20, MiniMax M3 (re-added as still within 30 days of GA), NVIDIA Nemotron 3 Ultra, MAI-Thinking-1, MAI-Code-1-Flash, Kimi Code CLI. Retained ⭐ on: MiniMax M3 (2026-06-01, still within 30 days), Kimi K2.7 Code (2026-06-12), Claude Fable 5 (2026-06-09), Claude Mythos 5 (2026-06-09).
  • Document Version: 3.28 -> 3.29; Last Updated badge updated to 2026-06-17.

Files in Release

  • docs/readme.md
  • docs/readme.pdf

Generated on 2026-06-18T06:44:50.318Z

Release v3.26: [2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

07 Jun 23:01

Choose a tag to compare

Release vv3.26

Release v3.26: [2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

Latest Changelog Entry

[2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

Added

Updated

  • Kimi K2.6 (Frontier Models, Output Limits): Context window corrected from 256K to 262K tokens (262,144); max output updated to 262K.
  • GPT-5.5 (Frontier Models, Benchmark Reference): Added Arena Elo 1495, SWE-bench Verified 88.5%, AIME 2025 99.9% from leaderboard data; corrected GPQA Diamond to 93.6%.
  • OpenRouter (Unified APIs): Updated model count to 370+; added Speech & Transcription APIs, Model Fusion, private models, enterprise workspace controls (June 4, 2026).
  • Microsoft Agent Framework (Multi-Agent Platforms): Renamed from AutoGen 1.0 to reflect merger with Semantic Kernel; v1.0 GA April 2026.
  • Stale ⭐ removed: GPT-5.5 Instant (2026-05-05), Grok 4.3 (2026-05-06), Mistral Medium 3.5 (2026-04-29) — all older than 30 days from 2026-06-08.

Files in Release

  • docs/readme.md
  • docs/readme.pdf

Generated on 2026-06-07T23:01:20.917Z

Release v3.25: [2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

07 Jun 19:59

Choose a tag to compare

Release vv3.25

Release v3.25: [2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

Latest Changelog Entry

[2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

Added

Updated

  • GitHub Copilot Agent Mode (IDE Add-ons): Added note that it auto-opens PRs with self-review.

Files in Release

  • docs/readme.md
  • docs/readme.pdf

Generated on 2026-06-07T19:58:57.745Z

Release v3.15: [2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

29 May 10:54

Choose a tag to compare

Release vv3.15

Release v3.15: [2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

Latest Changelog Entry

[2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

Added

  • Claude Opus 4.8: New entry across all relevant tables — Frontier Models, Cached & Batch Pricing, Benchmark Reference, SWE-bench Verified Leaderboard, Commercial Coding Models, and Cost Analysis. Released 2026-05-28 by Anthropic; $5.00/$25.00 per 1M tokens (same as Opus 4.7), fast mode $10/$50 (3x cheaper than Opus 4.7 fast mode). SWE-bench Verified 88.6%, SWE-bench Pro 69.2%. Added to Sort by Latest Update comparison table and Primary Sources.
  • GPU Clouds pricing table: Replaced the bare provider listing with detailed A100 80GB and H100 80GB hourly pricing for RunPod, Vast.ai, Lambda Labs, DigitalOcean, Vultr, Hyperbolic, Cerebrium, Modal Labs, Replicate, and other providers.

Changed

  • Grok 4.20 pricing: Updated from $2.00/$6.00 to $1.25/$2.50 across Frontier Models, Cached & Batch Pricing, Regional Availability, Output Token Limits, and Cost Analysis tables (83% output price cut per xAI developer docs).
  • Grok 4 Fast pricing: Output price updated from $1.50 to $0.50 (aliased to Grok 4.3 per xAI API). Updated across Frontier Models, Cached & Batch Pricing, Output Token Limits, Commercial Coding Models, and Sort by Price tables. Note: Grok 4 Fast is now an alias for Grok 4.3 with unified pricing.
  • xAI price cuts effective 2026-05-28: Grok 4.20 (reasoning, non-reasoning, multi-agent), Grok 4.3, and Grok Build all now at $1.25/$2.50 per 1M tokens; cached input $0.20/1M.
  • Frontier Models Top Table: Updated Coding #1 to Claude Opus 4.8; Agentic Performance #1 to Claude Opus 4.8; Free & Budget reordered with cheapest first (Gemini 3.1 Flash-Lite $0.25, Grok 4.3 $1.25, Claude Haiku 4.5 $1.00).
  • Model Labs (Direct): Added GPT-5.5, GPT-5.5 Instant, GPT-5.5 Pro to OpenAI row; added Claude Opus 4.8 to Anthropic row.
  • Google Gemini API free tier: Updated references from Gemini 2.5 Pro/Flash/Flash-Lite to Gemini 3.1 Pro/3 Flash/3.1 Flash-Lite in Free AI APIs for Coding section.
  • Project Mariner pricing: Updated Google AI Ultra reference from $249.99 to $100/mo (5× limits tier added at I/O 2026).
  • Cost Analysis Pricing Tiers: Added Grok 4.3 to Mid-range tier ($0.60–$15.00/1M); updated Premium tier to include GPT-5.5 Pro and Claude Opus 4.8/4.7; added note about Grok 4 Fast aliasing to 4.3 in Budget tier.
  • Sort by Latest Update: Added Claude Opus 4.8 entry; added "Grok 4.20 / 4.3 / 4 Fast" row noting major price cuts.
  • Release Windows table: Updated Grok entry to note price cuts; expanded Claude Opus 4.8 entry with SWE-bench and fast mode details.
  • Document metadata: Version 3.14 → 3.15; no change to Last Updated (already 2026-05-28).

Fixed

  • Grok 4.20 context note: Updated from "Largest context window among Western closed models" to note all variants at $1.25/$2.50 pricing.
  • Grok 4 Fast context: Updated from "2M context" note to "128K context; aliased to Grok 4.3; output $0.50/1M" across all relevant tables.

Sources (verification, 2026-05-28 UTC)

Changed

  • Document metadata: docs/readme.md document version 3.13 → 3.14; Last Updated set to 2026-05-28 00:00 UTC.
  • Pricing format standardization: Corrected $X / $X format to $X.XX / $X.XX per CLAUDE.md formatting standards for:
    • Gemini 3.1 Pro ($2 / $12 → $2.00 / $12.00)
    • Claude Mythos Preview ($25 / $125 → $25.00 / $125.00)
  • ⭐ flag corrections: Removed incorrect ⭐ flag from Claude Mythos Preview (2026-04-07 is more than 30 days before 2026-05-28).

Files in Release

  • docs/readme.md
  • docs/readme.pdf

Generated on 2026-05-29T10:54:34.249Z

Version 3.0 Major Update

05 May 14:37

Choose a tag to compare

🚀 Version 3.0 Major Update

Added

  • Browser Automation Enhancements: Added API Access, Multi-Agent, and Parallel Sessions columns to Standalone AI Browsers table
  • New Browser: Sidekick Browser (Windows, macOS, Linux) with AI assistant and natural language tab management
  • New Developer Libraries: AgentQL (natural language web querying), ScrapeGraphAI (natural language scraping), WebVoyager (autonomous browsing research)
  • New Local Agent: Khoj (supports Ollama, LM Studio, OpenAI API)
  • New Cloud Agents: Perplexity Computer (Pro /mo), OpenAI Computer Use (API: /M input, /M output)
  • Multi-Agent & Parallel Execution Summary: New dedicated section categorizing tools by parallel execution capability
  • Pricing Column: Added to Developer Libraries table with detailed pricing

Changed

  • Document Version: 2.9 → 3.0
  • SWE-bench Rankings: Updated with GPT-5.5 Pro (#1, 92.3%), GPT-5.5 (#2, 88.5%), Claude Opus 4.7 (#3, 87.6%)
  • Reasoning Benchmarks: Updated with GPT-5.5 Pro (#1, 100% AIME, 78.5% ARC-AGI-2)
  • Model Updates: Kimi K2.5 → Kimi K2.6, DeepSeek-V3.1 → DeepSeek-V3.2
  • IDE Updates: Cursor 3.1 → 3.2, Windsurf 1.9552+ → 2.0.0
  • Claude Code: Updated to 2.2.1+ (added multi-session support, Opus 4.7 support)
  • Commercial Coding Models: Added GPT-5.5 Pro (.00/.00 per 1M, 92.3% SWE-bench)

Sources

  • Official provider documentation (OpenAI, Anthropic, Microsoft, Perplexity)
  • SWE-bench Verified benchmark results (May 2026)
  • AIME 2025 and ARC-AGI-2 benchmark data

Full Changelog: https://github.com/Shaerif/AI_Models_Matrix/blob/main/CHANGELOG.md

Awesome List Compliance Update

03 Apr 02:54

Choose a tag to compare

Changes

  • GitHub Actions workflows added (link-check, lint, awesome-lint)
  • 7 model entries flagged as unverified
  • GPT-5.3-Codex pricing corrected
  • CHANGELOG.md, CODE_OF_CONDUCT.md, SECURITY.md added
  • CONTRIBUTING.md expanded

v2.1 - Model Specifications & Awesome-List Compliance

02 Apr 03:59

Choose a tag to compare

What's New in v2.1

Model Specifications Section

  • Output token limits for all frontier models
  • Cached & batch pricing columns
  • Speed & latency metrics (tokens/sec, TTFT)
  • Training data cutoff dates
  • Multilingual support details
  • Structured output & function calling support
  • Regional availability info

Awesome-List Compliance

  • Repository logo added
  • TOC flattened to max 1 level of nesting
  • TOC renamed from 'Table of Contents' to 'Contents'
  • License section removed (badge + LICENSE file sufficient)

Data Quality

  • Fixed model naming inconsistencies across all tables
  • Removed Perplexity pricing link (data rows remain)
  • Cleaned up all cross-references

Previous (v2.0)

  • 10 new AI Infrastructure sections (Embedding, Video Gen, Speech/TTS, Safety, RAG, Fine-tuning, Eval/Observability, MCP, Routers, SLMs)
  • Comprehensive Benchmark Reference (12 benchmarks x 24 models)
  • Real benchmark scores replaced all subjective labels
  • GitHub Discussions enabled
  • CONTRIBUTING.md, issue templates, repo metadata