4,223 12 hours ago

MiniMax M3: Coding & Agentic Frontier. 1M context window. Native Multimodality.

vision tools thinking cloud
Usage
high
Context
512K tokens
ollama run minimax-m3:cloud

Applications

Claude Code
Claude Code ollama launch claude --model minimax-m3:cloud
Codex App
Codex App ollama launch codex-app --model minimax-m3:cloud
OpenClaw
OpenClaw ollama launch openclaw --model minimax-m3:cloud
Hermes Agent
Hermes Agent ollama launch hermes --model minimax-m3:cloud
Codex
Codex ollama launch codex --model minimax-m3:cloud
OpenCode
OpenCode ollama launch opencode --model minimax-m3:cloud

Models

View all →

Readme

Ollama’s Cloud is officially licensed with MiniMax for commercial usage

In partnership with MiniMax, the M3 model on Ollama’s Cloud is US-based with zero data retention.

Highlights

  • MiniMax M3 achieves top-tier performance on coding and agentic benchmarks, with autonomous task decomposition, tool invocation, and multi-step reasoning capabilities — providing a reliable foundation for AI coding assistants and automated workflows.

  • Powered by the proprietary MiniMax Sparse Attention (MSA) architecture, M3 supports up to 1M tokens context window with a guaranteed minimum of 512K tokens. The 1M context is the infrastructure for long-range Agent tasks, long-range Coding, and long-video understanding.

  • A natively multimodal model. The entire data pipeline was rebuilt to scale pretraining data to 100T+, with multimodal training from step zero achieving deep alignment between textual and visual semantic spaces. Multimodal is a native core capability, not a superficial add-on.

  • On BrowseComp, M3 scores 83.5, surpassing Opus 4.7 (79.3), demonstrating strong autonomous browsing and information retrieval capabilities.

  • Until now, only a handful of closed-source models could simultaneously achieve frontier coding capabilities, million-token context, and Multimodal. M3 is the first to bring complete frontier capability to the open world.

Benchmark

Benchmark

Architecture

MiniMax Sparse Attention (MSA) Architecture

image.png

The MSA architecture enables native ultra-long context pretraining. M3 supports up to 1M tokens context window with a guaranteed minimum of 512K tokens, delivering excellent inference latency and throughput at extreme context lengths. The 1M context is the infrastructure for long-range Agent tasks, long-range Coding, and long-video understanding.

Full benchmarks

Reference

MiniMax M3 blog