Releases: ReadyPixels/AI_Models_Matrix
Releases · ReadyPixels/AI_Models_Matrix
Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers
Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers
Latest
Release v3.29
Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers
Latest Changelog Entry
[2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers
Added
- Kimi K2.7 Code (Frontier Models, Free-Source Models, Coding Models, Open-Source Coding Models, Autonomous Coding Agents): Open-weight coding-focused agentic model by Moonshot AI; 1T total params, 32B active (MoE); 262K context; 30% fewer thinking tokens vs K2.6; SWE-bench Verified 78.2%, AIME 2025 91.5%; $0.95/$4.00 per 1M tokens; Modified MIT license; released 2026-06-12.
- MiniMax M3 (Free-Source Models, Open-Source Coding Models): Open weights now released; frontier coding (SWE-Bench Pro 59.0%, SWE-Bench Verified 80.5%), 1M context, native multimodality; MIT license. Previously listed as API-only.
Updated
- Claude Fable 5 / Mythos 5 (Frontier Models, Coding Models): Marked with
⚠️ suspension notice. US government export control directive on June 12, 2026 forced Anthropic to disable global access to both models. Access may be restored for US-based users around July 1, 2026. All other Claude models unaffected. - Arena Elo scores (Frontier Models table): Updated Claude Opus 4.8 to 1483, Claude Opus 4.7 to 1502, GPT-5.5 to 1481, Gemini 3.1 Pro to 1486 based on arena.ai June 2026 data.
- SWE-bench Verified Leaderboard: Added Kimi K2.7 Code (#21, 78.2%), MiniMax M3 (#13, 80.5%). Re-ranked entries #13-#26 accordingly.
- AIME 2025 Leaderboard: Updated with BenchLM.ai June 2026 data. MAI-Thinking-1 now #1 (97%), Kimi K2.5 variants at #2 (96.1%).
- Top Models by Category: Coding #1 changed to Claude Opus 4.8 (Fable 5 suspended); Agentic Performance #1 changed to Claude Opus 4.8; Open Source #1 changed to MiniMax M3.
- Codex CLI (Autonomous Coding Agents): Updated description to reflect Rust-native rewrite, v0.140+, /goal persistent workflows, Chrome extension support.
- Source: https://github.com/openai/codex
- Gemini CLI deprecation note (Assisted CLI Tools): Google confirmed shutdown date of June 18, 2026; migrate to Antigravity CLI (
agy).
Removed / Deprecated
- Stale ⭐ markers removed: Removed ⭐ from entries older than 30 days from 2026-06-17: Gemini 3.5 Flash, Gemini Omni Flash, Qwen3.7-Max, Qwen3.7-Plus-Preview, Grok 4 Fast, Grok 4.20, MiniMax M3 (re-added as still within 30 days of GA), NVIDIA Nemotron 3 Ultra, MAI-Thinking-1, MAI-Code-1-Flash, Kimi Code CLI. Retained ⭐ on: MiniMax M3 (2026-06-01, still within 30 days), Kimi K2.7 Code (2026-06-12), Claude Fable 5 (2026-06-09), Claude Mythos 5 (2026-06-09).
- Document Version: 3.28 -> 3.29; Last Updated badge updated to 2026-06-17.
Files in Release
- docs/readme.md
- docs/readme.pdf
Generated on 2026-06-18T06:44:50.318Z
Release v3.26: [2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes
Release vv3.26
Release v3.26: [2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes
Latest Changelog Entry
[2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes
Added
- NVIDIA Nemotron 3 Ultra (Frontier Models + Free-Source Models): 550B/55B active MoE; 262K context (BF16) / 1M (NVFP4 Blackwell); open weight; launched at Computex June 2026; orchestration-focused, Mamba+Transformer hybrid.
- Google Colab CLI (Assisted CLI Tools): Connect local terminal to remote Colab GPU/TPU runtimes; AI agent compatible; released 2026-06-06.
Updated
- Kimi K2.6 (Frontier Models, Output Limits): Context window corrected from 256K to 262K tokens (262,144); max output updated to 262K.
- GPT-5.5 (Frontier Models, Benchmark Reference): Added Arena Elo 1495, SWE-bench Verified 88.5%, AIME 2025 99.9% from leaderboard data; corrected GPQA Diamond to 93.6%.
- OpenRouter (Unified APIs): Updated model count to 370+; added Speech & Transcription APIs, Model Fusion, private models, enterprise workspace controls (June 4, 2026).
- Microsoft Agent Framework (Multi-Agent Platforms): Renamed from AutoGen 1.0 to reflect merger with Semantic Kernel; v1.0 GA April 2026.
- Stale ⭐ removed: GPT-5.5 Instant (2026-05-05), Grok 4.3 (2026-05-06), Mistral Medium 3.5 (2026-04-29) — all older than 30 days from 2026-06-08.
Files in Release
- docs/readme.md
- docs/readme.pdf
Generated on 2026-06-07T23:01:20.917Z
Release v3.25: [2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates
Release vv3.25
Release v3.25: [2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates
Latest Changelog Entry
[2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates
Added
- JetBrains AI Assistant for VS Code (IDE Add-ons / VS Code Specific): Public preview; multi-file edits, Mellum LLM; JetBrains AI in VS Code/Cursor.
- ReSharper for VS Code (IDE Add-ons / VS Code Specific): C# code analysis + refactoring inside VS Code/Cursor; released 2026-03-05.
- Mistral Medium 3.5 (Output Token Limits + Benchmark Reference): Added spec rows for 256K context, 8K output, $1.50/$7.50.
Updated
- GitHub Copilot Agent Mode (IDE Add-ons): Added note that it auto-opens PRs with self-review.
Files in Release
- docs/readme.md
- docs/readme.pdf
Generated on 2026-06-07T19:58:57.745Z
Release v3.15: [2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates
Release vv3.15
Release v3.15: [2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates
Latest Changelog Entry
[2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates
Added
- Claude Opus 4.8: New entry across all relevant tables — Frontier Models, Cached & Batch Pricing, Benchmark Reference, SWE-bench Verified Leaderboard, Commercial Coding Models, and Cost Analysis. Released 2026-05-28 by Anthropic; $5.00/$25.00 per 1M tokens (same as Opus 4.7), fast mode $10/$50 (3x cheaper than Opus 4.7 fast mode). SWE-bench Verified 88.6%, SWE-bench Pro 69.2%. Added to Sort by Latest Update comparison table and Primary Sources.
- GPU Clouds pricing table: Replaced the bare provider listing with detailed A100 80GB and H100 80GB hourly pricing for RunPod, Vast.ai, Lambda Labs, DigitalOcean, Vultr, Hyperbolic, Cerebrium, Modal Labs, Replicate, and other providers.
Changed
- Grok 4.20 pricing: Updated from $2.00/$6.00 to $1.25/$2.50 across Frontier Models, Cached & Batch Pricing, Regional Availability, Output Token Limits, and Cost Analysis tables (83% output price cut per xAI developer docs).
- Grok 4 Fast pricing: Output price updated from $1.50 to $0.50 (aliased to Grok 4.3 per xAI API). Updated across Frontier Models, Cached & Batch Pricing, Output Token Limits, Commercial Coding Models, and Sort by Price tables. Note: Grok 4 Fast is now an alias for Grok 4.3 with unified pricing.
- xAI price cuts effective 2026-05-28: Grok 4.20 (reasoning, non-reasoning, multi-agent), Grok 4.3, and Grok Build all now at $1.25/$2.50 per 1M tokens; cached input $0.20/1M.
- Frontier Models Top Table: Updated Coding #1 to Claude Opus 4.8; Agentic Performance #1 to Claude Opus 4.8; Free & Budget reordered with cheapest first (Gemini 3.1 Flash-Lite $0.25, Grok 4.3 $1.25, Claude Haiku 4.5 $1.00).
- Model Labs (Direct): Added GPT-5.5, GPT-5.5 Instant, GPT-5.5 Pro to OpenAI row; added Claude Opus 4.8 to Anthropic row.
- Google Gemini API free tier: Updated references from Gemini 2.5 Pro/Flash/Flash-Lite to Gemini 3.1 Pro/3 Flash/3.1 Flash-Lite in Free AI APIs for Coding section.
- Project Mariner pricing: Updated Google AI Ultra reference from $249.99 to $100/mo (5× limits tier added at I/O 2026).
- Cost Analysis Pricing Tiers: Added Grok 4.3 to Mid-range tier ($0.60–$15.00/1M); updated Premium tier to include GPT-5.5 Pro and Claude Opus 4.8/4.7; added note about Grok 4 Fast aliasing to 4.3 in Budget tier.
- Sort by Latest Update: Added Claude Opus 4.8 entry; added "Grok 4.20 / 4.3 / 4 Fast" row noting major price cuts.
- Release Windows table: Updated Grok entry to note price cuts; expanded Claude Opus 4.8 entry with SWE-bench and fast mode details.
- Document metadata: Version 3.14 → 3.15; no change to Last Updated (already 2026-05-28).
Fixed
- Grok 4.20 context note: Updated from "Largest context window among Western closed models" to note all variants at $1.25/$2.50 pricing.
- Grok 4 Fast context: Updated from "2M context" note to "128K context; aliased to Grok 4.3; output $0.50/1M" across all relevant tables.
Sources (verification, 2026-05-28 UTC)
- Claude Opus 4.8 announcement — anthropic.com/news/claude-opus-4-8 (released 2026-05-28, $5/$25, 88.6% SWE-bench, 69.2% SWE-bench Pro)
- xAI Developer Models — docs.x.ai/developers/models (Grok 4.3 $1.25/$2.50; Grok 4.20 variants $1.25/$2.50; Grok 4 Fast → Grok 4.3 alias)
- Grok 4 Fast pricing — docs.x.ai/developers/models/grok-4-fast (confirms alias to Grok 4.3)
- DeepSeek API Pricing — api-docs.deepseek.com/quick_start/pricing (75% discount V4-Pro promo until 2026-05-31; V4-Flash permanent low rates)
- RunPod, Vast.ai, Lambda Labs, DigitalOcean, Vultr, Hyperbolic, Cerebrium, Modal Labs, Reify, Together AI pricing pages (May 2026)
Changed
- Document metadata:
docs/readme.mddocument version 3.13 → 3.14; Last Updated set to 2026-05-28 00:00 UTC. - Pricing format standardization: Corrected
$X / $Xformat to$X.XX / $X.XXper CLAUDE.md formatting standards for:- Gemini 3.1 Pro ($2 / $12 → $2.00 / $12.00)
- Claude Mythos Preview ($25 / $125 → $25.00 / $125.00)
- ⭐ flag corrections: Removed incorrect ⭐ flag from Claude Mythos Preview (2026-04-07 is more than 30 days before 2026-05-28).
Files in Release
- docs/readme.md
- docs/readme.pdf
Generated on 2026-05-29T10:54:34.249Z
Version 3.0 Major Update
🚀 Version 3.0 Major Update
Added
- Browser Automation Enhancements: Added API Access, Multi-Agent, and Parallel Sessions columns to Standalone AI Browsers table
- New Browser: Sidekick Browser (Windows, macOS, Linux) with AI assistant and natural language tab management
- New Developer Libraries: AgentQL (natural language web querying), ScrapeGraphAI (natural language scraping), WebVoyager (autonomous browsing research)
- New Local Agent: Khoj (supports Ollama, LM Studio, OpenAI API)
- New Cloud Agents: Perplexity Computer (Pro /mo), OpenAI Computer Use (API: /M input, /M output)
- Multi-Agent & Parallel Execution Summary: New dedicated section categorizing tools by parallel execution capability
- Pricing Column: Added to Developer Libraries table with detailed pricing
Changed
- Document Version: 2.9 → 3.0
- SWE-bench Rankings: Updated with GPT-5.5 Pro (#1, 92.3%), GPT-5.5 (#2, 88.5%), Claude Opus 4.7 (#3, 87.6%)
- Reasoning Benchmarks: Updated with GPT-5.5 Pro (#1, 100% AIME, 78.5% ARC-AGI-2)
- Model Updates: Kimi K2.5 → Kimi K2.6, DeepSeek-V3.1 → DeepSeek-V3.2
- IDE Updates: Cursor 3.1 → 3.2, Windsurf 1.9552+ → 2.0.0
- Claude Code: Updated to 2.2.1+ (added multi-session support, Opus 4.7 support)
- Commercial Coding Models: Added GPT-5.5 Pro (.00/.00 per 1M, 92.3% SWE-bench)
Sources
- Official provider documentation (OpenAI, Anthropic, Microsoft, Perplexity)
- SWE-bench Verified benchmark results (May 2026)
- AIME 2025 and ARC-AGI-2 benchmark data
Full Changelog: https://github.com/Shaerif/AI_Models_Matrix/blob/main/CHANGELOG.md
Awesome List Compliance Update
Changes
- GitHub Actions workflows added (link-check, lint, awesome-lint)
- 7 model entries flagged as unverified
- GPT-5.3-Codex pricing corrected
- CHANGELOG.md, CODE_OF_CONDUCT.md, SECURITY.md added
- CONTRIBUTING.md expanded
v2.1 - Model Specifications & Awesome-List Compliance
What's New in v2.1
Model Specifications Section
- Output token limits for all frontier models
- Cached & batch pricing columns
- Speed & latency metrics (tokens/sec, TTFT)
- Training data cutoff dates
- Multilingual support details
- Structured output & function calling support
- Regional availability info
Awesome-List Compliance
- Repository logo added
- TOC flattened to max 1 level of nesting
- TOC renamed from 'Table of Contents' to 'Contents'
- License section removed (badge + LICENSE file sufficient)
Data Quality
- Fixed model naming inconsistencies across all tables
- Removed Perplexity pricing link (data rows remain)
- Cleaned up all cross-references
Previous (v2.0)
- 10 new AI Infrastructure sections (Embedding, Video Gen, Speech/TTS, Safety, RAG, Fine-tuning, Eval/Observability, MCP, Routers, SLMs)
- Comprehensive Benchmark Reference (12 benchmarks x 24 models)
- Real benchmark scores replaced all subjective labels
- GitHub Discussions enabled
- CONTRIBUTING.md, issue templates, repo metadata