18 Jun 06:45

Shaerif

1da0b4b

Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers Latest

Latest

Release v3.29

Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers

Latest Changelog Entry

[2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers

Added

Kimi K2.7 Code (Frontier Models, Free-Source Models, Coding Models, Open-Source Coding Models, Autonomous Coding Agents): Open-weight coding-focused agentic model by Moonshot AI; 1T total params, 32B active (MoE); 262K context; 30% fewer thinking tokens vs K2.6; SWE-bench Verified 78.2%, AIME 2025 91.5%; $0.95/$4.00 per 1M tokens; Modified MIT license; released 2026-06-12.
- Source: https://huggingface.co/moonshotai/Kimi-K2.7-Code
MiniMax M3 (Free-Source Models, Open-Source Coding Models): Open weights now released; frontier coding (SWE-Bench Pro 59.0%, SWE-Bench Verified 80.5%), 1M context, native multimodality; MIT license. Previously listed as API-only.
- Source: https://www.minimax.io/news/minimax-m3

Updated

Claude Fable 5 / Mythos 5 (Frontier Models, Coding Models): Marked with ⚠️ suspension notice. US government export control directive on June 12, 2026 forced Anthropic to disable global access to both models. Access may be restored for US-based users around July 1, 2026. All other Claude models unaffected.
- Source: https://www.anthropic.com/news/fable-mythos-access
Arena Elo scores (Frontier Models table): Updated Claude Opus 4.8 to 1483, Claude Opus 4.7 to 1502, GPT-5.5 to 1481, Gemini 3.1 Pro to 1486 based on arena.ai June 2026 data.
- Source: https://arena.ai/leaderboard/text
SWE-bench Verified Leaderboard: Added Kimi K2.7 Code (#21, 78.2%), MiniMax M3 (#13, 80.5%). Re-ranked entries #13-#26 accordingly.
- Source: https://benchlm.ai/benchmarks/sweVerified
AIME 2025 Leaderboard: Updated with BenchLM.ai June 2026 data. MAI-Thinking-1 now #1 (97%), Kimi K2.5 variants at #2 (96.1%).
- Source: https://benchlm.ai/benchmarks/aime2025
Top Models by Category: Coding #1 changed to Claude Opus 4.8 (Fable 5 suspended); Agentic Performance #1 changed to Claude Opus 4.8; Open Source #1 changed to MiniMax M3.
Codex CLI (Autonomous Coding Agents): Updated description to reflect Rust-native rewrite, v0.140+, /goal persistent workflows, Chrome extension support.
- Source: https://github.com/openai/codex
Gemini CLI deprecation note (Assisted CLI Tools): Google confirmed shutdown date of June 18, 2026; migrate to Antigravity CLI (agy).
- Source: https://blog.google/innovation-and-ai/technology/developers-tools/google-io-2026-developer-highlights/

Removed / Deprecated

Stale ⭐ markers removed: Removed ⭐ from entries older than 30 days from 2026-06-17: Gemini 3.5 Flash, Gemini Omni Flash, Qwen3.7-Max, Qwen3.7-Plus-Preview, Grok 4 Fast, Grok 4.20, MiniMax M3 (re-added as still within 30 days of GA), NVIDIA Nemotron 3 Ultra, MAI-Thinking-1, MAI-Code-1-Flash, Kimi Code CLI. Retained ⭐ on: MiniMax M3 (2026-06-01, still within 30 days), Kimi K2.7 Code (2026-06-12), Claude Fable 5 (2026-06-09), Claude Mythos 5 (2026-06-09).
Document Version: 3.28 -> 3.29; Last Updated badge updated to 2026-06-17.

Files in Release

docs/readme.md
docs/readme.pdf

Generated on 2026-06-18T06:44:50.318Z

Assets 4

07 Jun 23:01

Shaerif

vv3.26

94140fe

Release v3.26: [2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

Release vv3.26

Release v3.26: [2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

Latest Changelog Entry

[2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

Added

NVIDIA Nemotron 3 Ultra (Frontier Models + Free-Source Models): 550B/55B active MoE; 262K context (BF16) / 1M (NVFP4 Blackwell); open weight; launched at Computex June 2026; orchestration-focused, Mamba+Transformer hybrid.
- Source: https://developer.nvidia.com/blog/nvidia-nemotron-3-ultra-powers-faster-more-efficient-reasoning-for-long-running-agents/
Google Colab CLI (Assisted CLI Tools): Connect local terminal to remote Colab GPU/TPU runtimes; AI agent compatible; released 2026-06-06.
- Source: https://www.marktechpost.com/2026/06/06/googles-new-colab-cli-lets-developers-and-ai-agents-run-python-on-remote-colab-gpus-and-tpus-from-the-terminal/

Updated

Kimi K2.6 (Frontier Models, Output Limits): Context window corrected from 256K to 262K tokens (262,144); max output updated to 262K.
- Source: https://huggingface.co/moonshotai/Kimi-K2.6
GPT-5.5 (Frontier Models, Benchmark Reference): Added Arena Elo 1495, SWE-bench Verified 88.5%, AIME 2025 99.9% from leaderboard data; corrected GPQA Diamond to 93.6%.
- Source: https://benchlm.ai/benchmarks/sweVerified
OpenRouter (Unified APIs): Updated model count to 370+; added Speech & Transcription APIs, Model Fusion, private models, enterprise workspace controls (June 4, 2026).
- Source: https://openrouter.ai/announcements/all
Microsoft Agent Framework (Multi-Agent Platforms): Renamed from AutoGen 1.0 to reflect merger with Semantic Kernel; v1.0 GA April 2026.
- Source: https://uvik.net/blog/agentic-ai-frameworks/
Stale ⭐ removed: GPT-5.5 Instant (2026-05-05), Grok 4.3 (2026-05-06), Mistral Medium 3.5 (2026-04-29) — all older than 30 days from 2026-06-08.

Files in Release

docs/readme.md
docs/readme.pdf

Generated on 2026-06-07T23:01:20.917Z

Assets 4

07 Jun 19:59

Shaerif

vv3.25

b194e54

Release v3.25: [2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

Release vv3.25

Release v3.25: [2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

Latest Changelog Entry

[2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

Added

JetBrains AI Assistant for VS Code (IDE Add-ons / VS Code Specific): Public preview; multi-file edits, Mellum LLM; JetBrains AI in VS Code/Cursor.
- Source: https://www.jetbrains.com/aia-vscode/
ReSharper for VS Code (IDE Add-ons / VS Code Specific): C# code analysis + refactoring inside VS Code/Cursor; released 2026-03-05.
- Source: https://blog.jetbrains.com/dotnet/2026/03/05/resharper-for-visual-studio-code-cursor-and-compatible-editors-is-out/
Mistral Medium 3.5 (Output Token Limits + Benchmark Reference): Added spec rows for 256K context, 8K output, $1.50/$7.50.

Updated

GitHub Copilot Agent Mode (IDE Add-ons): Added note that it auto-opens PRs with self-review.

Files in Release

docs/readme.md
docs/readme.pdf

Generated on 2026-06-07T19:58:57.745Z

Assets 4

29 May 10:54

Shaerif

vv3.15

d553ca8

Release v3.15: [2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

Release vv3.15

Release v3.15: [2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

Latest Changelog Entry

[2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

Added

Claude Opus 4.8: New entry across all relevant tables — Frontier Models, Cached & Batch Pricing, Benchmark Reference, SWE-bench Verified Leaderboard, Commercial Coding Models, and Cost Analysis. Released 2026-05-28 by Anthropic; $5.00/$25.00 per 1M tokens (same as Opus 4.7), fast mode $10/$50 (3x cheaper than Opus 4.7 fast mode). SWE-bench Verified 88.6%, SWE-bench Pro 69.2%. Added to Sort by Latest Update comparison table and Primary Sources.
GPU Clouds pricing table: Replaced the bare provider listing with detailed A100 80GB and H100 80GB hourly pricing for RunPod, Vast.ai, Lambda Labs, DigitalOcean, Vultr, Hyperbolic, Cerebrium, Modal Labs, Replicate, and other providers.

Changed

Grok 4.20 pricing: Updated from $2.00/$6.00 to $1.25/$2.50 across Frontier Models, Cached & Batch Pricing, Regional Availability, Output Token Limits, and Cost Analysis tables (83% output price cut per xAI developer docs).
Grok 4 Fast pricing: Output price updated from $1.50 to $0.50 (aliased to Grok 4.3 per xAI API). Updated across Frontier Models, Cached & Batch Pricing, Output Token Limits, Commercial Coding Models, and Sort by Price tables. Note: Grok 4 Fast is now an alias for Grok 4.3 with unified pricing.
xAI price cuts effective 2026-05-28: Grok 4.20 (reasoning, non-reasoning, multi-agent), Grok 4.3, and Grok Build all now at $1.25/$2.50 per 1M tokens; cached input $0.20/1M.
Frontier Models Top Table: Updated Coding #1 to Claude Opus 4.8; Agentic Performance #1 to Claude Opus 4.8; Free & Budget reordered with cheapest first (Gemini 3.1 Flash-Lite $0.25, Grok 4.3 $1.25, Claude Haiku 4.5 $1.00).
Model Labs (Direct): Added GPT-5.5, GPT-5.5 Instant, GPT-5.5 Pro to OpenAI row; added Claude Opus 4.8 to Anthropic row.
Google Gemini API free tier: Updated references from Gemini 2.5 Pro/Flash/Flash-Lite to Gemini 3.1 Pro/3 Flash/3.1 Flash-Lite in Free AI APIs for Coding section.
Project Mariner pricing: Updated Google AI Ultra reference from $249.99 to $100/mo (5× limits tier added at I/O 2026).
Cost Analysis Pricing Tiers: Added Grok 4.3 to Mid-range tier ($0.60–$15.00/1M); updated Premium tier to include GPT-5.5 Pro and Claude Opus 4.8/4.7; added note about Grok 4 Fast aliasing to 4.3 in Budget tier.
Sort by Latest Update: Added Claude Opus 4.8 entry; added "Grok 4.20 / 4.3 / 4 Fast" row noting major price cuts.
Release Windows table: Updated Grok entry to note price cuts; expanded Claude Opus 4.8 entry with SWE-bench and fast mode details.
Document metadata: Version 3.14 → 3.15; no change to Last Updated (already 2026-05-28).

Fixed

Grok 4.20 context note: Updated from "Largest context window among Western closed models" to note all variants at $1.25/$2.50 pricing.
Grok 4 Fast context: Updated from "2M context" note to "128K context; aliased to Grok 4.3; output $0.50/1M" across all relevant tables.

Sources (verification, 2026-05-28 UTC)

Claude Opus 4.8 announcement — anthropic.com/news/claude-opus-4-8 (released 2026-05-28, $5/$25, 88.6% SWE-bench, 69.2% SWE-bench Pro)
xAI Developer Models — docs.x.ai/developers/models (Grok 4.3 $1.25/$2.50; Grok 4.20 variants $1.25/$2.50; Grok 4 Fast → Grok 4.3 alias)
Grok 4 Fast pricing — docs.x.ai/developers/models/grok-4-fast (confirms alias to Grok 4.3)
DeepSeek API Pricing — api-docs.deepseek.com/quick_start/pricing (75% discount V4-Pro promo until 2026-05-31; V4-Flash permanent low rates)
RunPod, Vast.ai, Lambda Labs, DigitalOcean, Vultr, Hyperbolic, Cerebrium, Modal Labs, Reify, Together AI pricing pages (May 2026)

Changed

Document metadata: docs/readme.md document version 3.13 → 3.14; Last Updated set to 2026-05-28 00:00 UTC.
Pricing format standardization: Corrected $X / $X format to $X.XX / $X.XX per CLAUDE.md formatting standards for:
- Gemini 3.1 Pro ($2 / $12 → $2.00 / $12.00)
- Claude Mythos Preview ($25 / $125 → $25.00 / $125.00)
⭐ flag corrections: Removed incorrect ⭐ flag from Claude Mythos Preview (2026-04-07 is more than 30 days before 2026-05-28).

Files in Release

docs/readme.md
docs/readme.pdf

Generated on 2026-05-29T10:54:34.249Z

Assets 4

05 May 14:37

Shaerif

v3.0

af4cbd9

Version 3.0 Major Update

🚀 Version 3.0 Major Update

Added

Browser Automation Enhancements: Added API Access, Multi-Agent, and Parallel Sessions columns to Standalone AI Browsers table
New Browser: Sidekick Browser (Windows, macOS, Linux) with AI assistant and natural language tab management
New Developer Libraries: AgentQL (natural language web querying), ScrapeGraphAI (natural language scraping), WebVoyager (autonomous browsing research)
New Local Agent: Khoj (supports Ollama, LM Studio, OpenAI API)
New Cloud Agents: Perplexity Computer (Pro /mo), OpenAI Computer Use (API: /M input, /M output)
Multi-Agent & Parallel Execution Summary: New dedicated section categorizing tools by parallel execution capability
Pricing Column: Added to Developer Libraries table with detailed pricing

Changed

Document Version: 2.9 → 3.0
SWE-bench Rankings: Updated with GPT-5.5 Pro (#1, 92.3%), GPT-5.5 (#2, 88.5%), Claude Opus 4.7 (#3, 87.6%)
Reasoning Benchmarks: Updated with GPT-5.5 Pro (#1, 100% AIME, 78.5% ARC-AGI-2)
Model Updates: Kimi K2.5 → Kimi K2.6, DeepSeek-V3.1 → DeepSeek-V3.2
IDE Updates: Cursor 3.1 → 3.2, Windsurf 1.9552+ → 2.0.0
Claude Code: Updated to 2.2.1+ (added multi-session support, Opus 4.7 support)
Commercial Coding Models: Added GPT-5.5 Pro (.00/.00 per 1M, 92.3% SWE-bench)

Sources

Official provider documentation (OpenAI, Anthropic, Microsoft, Perplexity)
SWE-bench Verified benchmark results (May 2026)
AIME 2025 and ARC-AGI-2 benchmark data

Full Changelog: https://github.com/Shaerif/AI_Models_Matrix/blob/main/CHANGELOG.md

Assets 2

03 Apr 02:54

Shaerif

v2.2

dee6a8f

Awesome List Compliance Update

Changes

GitHub Actions workflows added (link-check, lint, awesome-lint)
7 model entries flagged as unverified
GPT-5.3-Codex pricing corrected
CHANGELOG.md, CODE_OF_CONDUCT.md, SECURITY.md added
CONTRIBUTING.md expanded

Assets 2

02 Apr 03:59

Shaerif

v2.1

a72bfba

v2.1 - Model Specifications & Awesome-List Compliance

What's New in v2.1

Model Specifications Section

Output token limits for all frontier models
Cached & batch pricing columns
Speed & latency metrics (tokens/sec, TTFT)
Training data cutoff dates
Multilingual support details
Structured output & function calling support
Regional availability info

Awesome-List Compliance

Repository logo added
TOC flattened to max 1 level of nesting
TOC renamed from 'Table of Contents' to 'Contents'
License section removed (badge + LICENSE file sufficient)

Data Quality

Fixed model naming inconsistencies across all tables
Removed Perplexity pricing link (data rows remain)
Cleaned up all cross-references

Previous (v2.0)

10 new AI Infrastructure sections (Embedding, Video Gen, Speech/TTS, Safety, RAG, Fine-tuning, Eval/Observability, MCP, Routers, SLMs)
Comprehensive Benchmark Reference (12 benchmarks x 24 models)
Real benchmark scores replaced all subjective labels
GitHub Discussions enabled
CONTRIBUTING.md, issue templates, repo metadata

Assets 2

Releases: ReadyPixels/AI_Models_Matrix

Release v3.29: [2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers

Release v3.29

Latest Changelog Entry

[2026-06-17] - Version 3.29 - Fable/Mythos suspension, Kimi K2.7 Code, MiniMax M3 GA, Arena Elo refresh, stale markers

Added

Updated

Removed / Deprecated

Files in Release

Uh oh!

Release v3.26: [2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

Release vv3.26

Latest Changelog Entry

[2026-06-08] - Version 3.26 - Nemotron 3 Ultra, Colab CLI, OpenRouter Speech API, benchmark & staleness fixes

Added

Updated

Files in Release

Uh oh!

Release v3.25: [2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

Release vv3.25

Latest Changelog Entry

[2026-06-07] - Version 3.25 - IDE Add-ons, Output Specs, Benchmark Reference updates

Added

Updated

Files in Release

Uh oh!

Release v3.15: [2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

Release vv3.15

Latest Changelog Entry

[2026-05-28] - Version 3.15 - Claude Opus 4.8, Grok pricing cuts, GPU Cloud pricing, model data updates

Added

Changed

Fixed

Sources (verification, 2026-05-28 UTC)

Changed

Files in Release

Uh oh!

Version 3.0 Major Update

🚀 Version 3.0 Major Update

Added

Changed

Sources

Uh oh!

Awesome List Compliance Update

Changes

Uh oh!

v2.1 - Model Specifications & Awesome-List Compliance

What's New in v2.1

Model Specifications Section

Awesome-List Compliance

Data Quality

Previous (v2.0)

Uh oh!