Text-to-Speech
Generate natural speech from text using the MiMo-V2.5-TTS series.
MiMo-V2.5-TTS follows the official Xiaomi MiMo chat completions format.
Text-to-Speech is currently free for a limited time.
API Endpoint
POST https://api.mimo-v2.com/v1/chat/completions
Call Rules
- Put the text to synthesize in an assistant message.
- Use an optional user message for voice style instructions.
- Set output audio options in the audio object.
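The three rules above can be sketched as a small request builder. This is a minimal illustration, not part of the official SDK: the helper name build_tts_request, the placement of the user message before the assistant message, and the wording of the style instruction are all assumptions.

```python
from typing import Optional


def build_tts_request(
    text: str,
    style: Optional[str] = None,
    voice: str = "mimo_default",
    audio_format: str = "wav",
    model: str = "mimo-v2.5-tts",
) -> dict:
    """Assemble keyword arguments for client.chat.completions.create().

    Hypothetical helper: it only builds a dict and sends nothing.
    """
    messages = []
    if style:
        # Optional user message carrying voice style instructions.
        # Placing it before the assistant message is an assumption.
        messages.append({"role": "user", "content": style})
    # The text to synthesize goes in the assistant message.
    messages.append({"role": "assistant", "content": text})
    return {
        "model": model,
        "messages": messages,
        # Output audio options live in the audio object.
        "audio": {"format": audio_format, "voice": voice},
    }


request = build_tts_request(
    "Hello! Welcome aboard.",
    style="Speak slowly, in a warm and friendly tone.",  # illustrative wording
)
```

The resulting dict can be unpacked into the client call, for example client.chat.completions.create(**request).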
Example
import base64

from openai import OpenAI

# Point the OpenAI client at the MiMo endpoint.
client = OpenAI(
    api_key="your_mimo_api_key",
    base_url="https://api.mimo-v2.com/v1"
)

completion = client.chat.completions.create(
    model="mimo-v2.5-tts",
    messages=[
        {
            # The text to synthesize goes in the assistant message.
            "role": "assistant",
            "content": "Hello! Welcome to Mimo API Provider. We are glad to have you here."
        }
    ],
    audio={
        "format": "wav",
        "voice": "mimo_default"
    }
)

# The audio is returned base64-encoded; decode it and write it to disk.
audio_bytes = base64.b64decode(completion.choices[0].message.audio.data)
with open("output.wav", "wb") as f:
    f.write(audio_bytes)

Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Use mimo-v2.5-tts, mimo-v2.5-tts-voicedesign, or mimo-v2.5-tts-voiceclone. |
| messages | array | Yes | The speech text belongs in an assistant message. |
| audio.format | string | No | Output format. Use wav, mp3, or pcm16. |
| audio.voice | string | No | Built-in voice ID. Default: mimo_default. |
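A client-side check against the table above can catch typos before a request is sent. The sketch below is only an illustration: validate_tts_params is a hypothetical helper, and the allowed values are copied from the table, so the server may accept or reject values differently.

```python
# Values taken from the parameter table; the server remains the authority.
ALLOWED_MODELS = {
    "mimo-v2.5-tts",
    "mimo-v2.5-tts-voicedesign",
    "mimo-v2.5-tts-voiceclone",
}
ALLOWED_FORMATS = {"wav", "mp3", "pcm16"}


def validate_tts_params(model, messages, audio=None):
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    if model not in ALLOWED_MODELS:
        errors.append(f"unknown model: {model!r}")
    # The speech text must be in an assistant message with non-empty content.
    has_text = any(
        m.get("role") == "assistant" and m.get("content") for m in messages
    )
    if not has_text:
        errors.append("no assistant message with text to synthesize")
    # audio.format is optional, so only check it when it is present.
    fmt = (audio or {}).get("format")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        errors.append(f"unsupported audio.format: {fmt!r}")
    return errors


problems = validate_tts_params(
    "mimo-v2.5-tts",
    [{"role": "assistant", "content": "Hello!"}],
    {"format": "ogg"},
)
print(problems)
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.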
Compatibility
/v1/audio/speech is still supported for OpenAI speech clients, but the recommended official format is /v1/chat/completions.
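For clients that rely on the compatibility route, a request might be assembled roughly as follows. This sketch assumes the endpoint accepts the OpenAI speech request body (model, input, voice, response_format) and Bearer authentication; neither is confirmed by this page, and build_speech_request is a hypothetical helper that builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.mimo-v2.com/v1"


def build_speech_request(
    api_key: str,
    text: str,
    voice: str = "mimo_default",
    audio_format: str = "wav",
    model: str = "mimo-v2.5-tts",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/audio/speech."""
    # Field names follow the OpenAI speech API; this is an assumption here.
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": audio_format,
    }
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


speech_request = build_speech_request("your_mimo_api_key", "Hello!")
```

Passing speech_request to urllib.request.urlopen would send it. Under the same assumption, the response body would be the raw audio bytes rather than base64 text.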