minimax-m3

316K Downloads Updated 1 month ago

MiniMax M3: Coding & Agentic Frontier. 1M context window. Native Multimodality.

vision tools thinking cloud

Usage: high

Context: 512K tokens

ollama run minimax-m3:cloud

curl http://localhost:11434/api/chat \
  -d '{
    "model": "minimax-m3:cloud",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

from ollama import chat

response = chat(
    model='minimax-m3:cloud',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.message.content)

import ollama from 'ollama'

const response = await ollama.chat({
  model: 'minimax-m3:cloud',
  messages: [{role: 'user', content: 'Hello!'}],
})
console.log(response.message.content)
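
The model is listed with image input as well as text. Below is a minimal sketch of building a chat request body with one attached image. It assumes the `images` field of Ollama's `/api/chat` message format, which takes a list of base64-encoded images. The helper name `build_vision_request` is hypothetical and not part of any library.

```python
import base64


def build_vision_request(prompt: str, image_bytes: bytes,
                         model: str = "minimax-m3:cloud") -> dict:
    """Build a JSON-serializable body for POST /api/chat with one image.

    Hypothetical helper: assumes images are passed as base64 strings
    in the user message's `images` list.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": prompt, "images": [encoded]}
        ],
        "stream": False,  # request a single JSON response instead of a stream
    }
```

The returned dictionary can be serialized with `json.dumps` and sent to `http://localhost:11434/api/chat`, the same endpoint used in the curl example above.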

Applications

Claude Code

ollama launch claude --model minimax-m3:cloud

OpenCode

ollama launch opencode --model minimax-m3:cloud

Hermes Agent

ollama launch hermes --model minimax-m3:cloud

OpenClaw

ollama launch openclaw --model minimax-m3:cloud

Models

1 model

| Name | Size / Usage | Context | Input | Updated |
| --- | --- | --- | --- | --- |
| minimax-m3:cloud | High Usage | 512K | Text, Image | 1 month ago |

Readme

> Ollama's Cloud is officially licensed with MiniMax for commercial usage

> In partnership with MiniMax, the M3 model on Ollama's Cloud is US-based with zero data retention.

### Highlights

- MiniMax M3 achieves top-tier performance on coding and agentic benchmarks, with autonomous task decomposition, tool invocation, and multi-step reasoning capabilities, providing a reliable foundation for AI coding assistants and automated workflows.

- Powered by the proprietary MiniMax Sparse Attention (MSA) architecture, M3 supports a context window of up to 1M tokens, with a guaranteed minimum of 512K tokens. The 1M context underpins long-range agent tasks, long-range coding, and long-video understanding.

- M3 is natively multimodal. The entire data pipeline was rebuilt to scale pretraining data to 100T+, and multimodal training from step zero achieves deep alignment between textual and visual semantic spaces. Multimodality is a native core capability, not a superficial add-on.

- On BrowseComp, M3 scores 83.5, surpassing Opus 4.7 (79.3), demonstrating strong autonomous browsing and information retrieval capabilities.

- Until now, only a handful of closed-source models combined frontier coding capabilities, a million-token context, and multimodality. M3 is the first to bring complete frontier capability to the open world.
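
Tool invocation, mentioned in the highlights, generally means the model returns a tool name plus arguments and the caller executes them. Below is a minimal sketch of that dispatch step. It assumes tool calls arrive in the shape Ollama's chat API uses (`function.name` and `function.arguments`) and that results go back as `tool` role messages. The `get_weather` tool, the `TOOLS` registry, and `dispatch_tool_calls` are hypothetical names, and nothing here is specific to M3.

```python
from typing import Any, Callable


def get_weather(city: str) -> str:
    """Hypothetical tool: a stub standing in for a real weather lookup."""
    return f"Sunny in {city}"


# Registry mapping tool names to local Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run each requested tool and return `tool` role messages with results."""
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        args = call["function"].get("arguments", {})
        func = TOOLS.get(name)
        if func is None:
            # Report unknown tools back to the model instead of raising.
            content = f"error: unknown tool {name!r}"
        else:
            content = str(func(**args))
        results.append({"role": "tool", "tool_name": name, "content": content})
    return results
```

In an agent loop, the returned messages would be appended to the conversation and sent back in the next chat request so the model can continue from the tool output.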

### Benchmark
![Benchmark](/assets/library/minimax-m3/3d512ace-b6bf-4e00-8bc0-cbe9ff79627b)

### Architecture

MiniMax Sparse Attention (MSA) Architecture

![MiniMax Sparse Attention (MSA) architecture](/assets/library/minimax-m3/de272c65-0490-4e05-93e4-4150d5763817)

The MSA architecture enables native ultra-long context pretraining. M3 supports a context window of up to 1M tokens, with a guaranteed minimum of 512K tokens, and delivers excellent inference latency and throughput at extreme context lengths. The 1M context underpins long-range agent tasks, long-range coding, and long-video understanding.
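
The two context figures above suggest a simple planning check: does a request fit the guaranteed window, or does it rely on the larger maximum? Below is a minimal sketch under stated assumptions: "512K" and "1M" are read as 512,000 and 1,000,000 tokens, and prompt and output tokens are assumed to share one window. The function name and the return labels are hypothetical.

```python
GUARANTEED_CONTEXT = 512_000  # assumption: "512K" read as 512,000 tokens
MAX_CONTEXT = 1_000_000       # assumption: "1M" read as 1,000,000 tokens


def classify_context_need(prompt_tokens: int, output_tokens: int = 0) -> str:
    """Classify a request against the guaranteed and maximum context sizes."""
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: prompt and generated tokens count against the same window.
    total = prompt_tokens + output_tokens
    if total <= GUARANTEED_CONTEXT:
        return "within guaranteed window"
    if total <= MAX_CONTEXT:
        return "within maximum window only"
    return "exceeds maximum window"
```

A caller might use the result to decide whether to trim or summarize history before sending a long agent transcript.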

![Full benchmarks](/assets/library/minimax-m3/a998096b-a024-4971-be2b-5aeae3cacbe1)

### Reference

[MiniMax M3 blog](https://www.minimax.io/models/text/m3)
