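Model catalog

Choose the right MiMo model for each workload

The landing-page architecture for the models is set up first, so scenarios, examples, comparisons, and FAQs can be added to each model page over time without reworking the routing.

Model page skeletons already in place

Each card below maps to its own independent, extensible SEO route; content can be filled in directly later.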

MiMo-V2.5-Pro

Available

Flagship reasoning for coding and agent workflows

The primary model for complex agent execution, coding, long-context reasoning, and tool-heavy workflows.

Context window: 1M
Output window: 128K

Capability tags

Text generation, Deep reasoning, Streaming, Function calling, Structured output, Web search

MiMo-V2.5

Available

Full-modal understanding with 1M context

Built for applications that need to understand text, images, video, and audio in one model.

Context window: 1M
Output window: 128K

Capability tags

Full-modal understanding, Deep reasoning, Streaming, Function calling, Structured output, Web search

MiMo-V2.5-TTS

Available

Expressive text-to-speech with built-in voices

Generates natural speech from assistant messages, with style control through instructions and audio tags.

Context window: 8K
Output window: 8K

Capability tags

Text-to-speech, Audio output, Singing, Style control

MiMo-V2.5-TTS-VoiceClone

Available

Voice cloning from audio samples

Replicates a target voice from an audio sample and uses it for speech synthesis.

Context window: 8K
Output window: 8K

Capability tags

Text-to-speech, Voice cloning, Audio output

MiMo-V2.5-TTS-VoiceDesign

Available

Custom voice design from text descriptions

Creates a voice from a text description, then synthesizes speech in that custom voice.

Context window: 8K
Output window: 8K

Capability tags

Text-to-speech, Voice design, Audio output

MiMo-V2-Pro

Available

Reasoning and production-grade text generation

Built for agent workflows, structured output, and long-context reasoning tasks.

Context window: 1M
Output window: 128K

Capability tags

Text generation, Deep reasoning, Streaming, Function calling, Structured output, Web search

MiMo-V2-Omni

Available

Multimodal understanding for image, audio, and richer inputs

Designed for teams building assistants and applications that need multimodal perception.

Context window: 256K
Output window: 128K

Capability tags

Multimodal understanding, Deep reasoning, Streaming, Function calling, Web search

MiMo-V2-TTS

Available

Text-to-speech output for voice experiences

A focused speech model for teams adding natural voice output to products and workflows.

Context window: 8K
Output window: 8K

Capability tags

Text-to-speech, Audio output, Voice experiences

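Choosing a model from the cards above comes down to matching capability tags and window sizes against a workload. The following is a minimal sketch of that lookup in Python, not an official client. The names `ModelCard`, `CATALOG`, `parse_size`, and `pick_models` are hypothetical, the model names are the display names from the cards (the catalog does not give API identifiers), and treating "K" as 1,000 and "M" as 1,000,000 is an assumption, since the catalog does not state the unit.

```python
from dataclasses import dataclass

# Assumption: the catalog gives sizes such as "1M" and "8K" without a unit.
# They are compared here as plain counts with K = 1,000 and M = 1,000,000.
_SUFFIX = {"K": 1_000, "M": 1_000_000}


def parse_size(label: str) -> int:
    """Convert a catalog size label such as '256K' into an integer."""
    return int(label[:-1]) * _SUFFIX[label[-1]]


@dataclass(frozen=True)
class ModelCard:
    name: str          # display name from the catalog, not a confirmed API id
    context: str       # context window label as listed on the card
    output: str        # output window label as listed on the card
    tags: frozenset    # capability tags as listed on the card


def card(name, context, output, *tags):
    return ModelCard(name, context, output, frozenset(tags))


# Values transcribed from the cards in this catalog.
CATALOG = [
    card("MiMo-V2.5-Pro", "1M", "128K", "Text generation", "Deep reasoning",
         "Streaming", "Function calling", "Structured output", "Web search"),
    card("MiMo-V2.5", "1M", "128K", "Full-modal understanding", "Deep reasoning",
         "Streaming", "Function calling", "Structured output", "Web search"),
    card("MiMo-V2.5-TTS", "8K", "8K", "Text-to-speech", "Audio output",
         "Singing", "Style control"),
    card("MiMo-V2.5-TTS-VoiceClone", "8K", "8K", "Text-to-speech",
         "Voice cloning", "Audio output"),
    card("MiMo-V2.5-TTS-VoiceDesign", "8K", "8K", "Text-to-speech",
         "Voice design", "Audio output"),
    card("MiMo-V2-Pro", "1M", "128K", "Text generation", "Deep reasoning",
         "Streaming", "Function calling", "Structured output", "Web search"),
    card("MiMo-V2-Omni", "256K", "128K", "Multimodal understanding",
         "Deep reasoning", "Streaming", "Function calling", "Web search"),
    card("MiMo-V2-TTS", "8K", "8K", "Text-to-speech", "Audio output",
         "Voice experiences"),
]


def pick_models(required_tags, min_context="8K"):
    """Return names of models that carry every required tag and whose
    context window is at least min_context, in catalog order."""
    need = frozenset(required_tags)
    floor = parse_size(min_context)
    return [m.name for m in CATALOG
            if need <= m.tags and parse_size(m.context) >= floor]
```

For example, `pick_models({"Function calling", "Structured output"}, min_context="1M")` returns MiMo-V2.5-Pro, MiMo-V2.5, and MiMo-V2-Pro; MiMo-V2-Omni is excluded because its card lists neither Structured output nor a 1M context window.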
