# Model Parameters

Recommended parameters for the MiMo-V2 series models.

## Recommended Parameters
The following table lists the recommended parameter values for each MiMo-V2 model:
| Parameter | Description | MiMo-V2-Pro | MiMo-V2-Omni | MiMo-V2-Flash |
|---|---|---|---|---|
| temperature | Controls randomness. Higher = more creative | 1.0 | 1.0 | 1.0 |
| top_p | Nucleus sampling threshold | 0.95 | 0.95 | 0.95 |
| max_completion_tokens | Maximum tokens in the response | 1024-128000 | 1024-128000 | 1024-64000 |
| frequency_penalty | Penalizes repeated tokens | 0 | 0 | 0 |
| presence_penalty | Penalizes tokens already present | 0 | 0 | 0 |
| stream | Enable streaming output | true/false | true/false | true/false |
| stop | Stop sequences | null | null | null |
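Assuming an OpenAI-compatible chat completions endpoint (the model identifier and values below are illustrative placeholders, not verified names), a request body combining these parameters might look like:

```python
# Hypothetical request payload for an OpenAI-compatible chat completions
# endpoint; values mirror the recommendations in the table above.
payload = {
    "model": "MiMo-V2-Flash",           # placeholder model identifier
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 1.0,                  # recommended default
    "top_p": 0.95,                       # recommended default
    "max_completion_tokens": 4096,       # within the 1024-64000 range for Flash
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "stream": False,
    "stop": None,
}
```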
## Parameter Details

### temperature
Controls the randomness of the model's output. A value of 0 makes the output nearly deterministic, while higher values increase creativity and variation. The recommended default is 1.0 for all MiMo-V2 models.
- Range: 0.0 to 2.0
- Default: 1.0
- Tip: Use lower values (e.g., 0.2) for factual or deterministic tasks. Use higher values (e.g., 1.0-1.5) for creative writing or brainstorming.
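A minimal sketch of how temperature is typically applied: logits are divided by the temperature before the softmax, so low values sharpen the distribution toward the top token and high values flatten it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then softmax. Lower temperature
    sharpens the distribution; higher temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.2)  # near-deterministic
warm = softmax_with_temperature(logits, 1.5)  # more uniform
# The top token's probability rises as temperature falls.
```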
### top_p
Also known as nucleus sampling. The model considers tokens whose cumulative probability mass reaches top_p. A value of 0.95 means the model samples from the smallest set of tokens whose cumulative probability is at least 95%.
- Range: 0.0 to 1.0
- Default: 0.95
- Tip: Generally, adjust either temperature or top_p, but not both simultaneously.
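The candidate-set selection described above can be sketched as follows: sort tokens by probability and keep the smallest prefix whose cumulative mass reaches top_p.

```python
def nucleus_filter(probs, top_p):
    """Return indices of the smallest set of highest-probability tokens
    whose cumulative probability is at least top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]
nucleus_filter(probs, 0.95)  # keeps the first three tokens (0.5+0.3+0.15 = 0.95)
```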
### max_completion_tokens
The maximum number of tokens the model can generate in a single response. This includes both the visible output and any internal reasoning tokens when thinking mode is enabled.
- Range: Varies by model (see table above)
- Default: 1024
- Tip: Set this high enough to accommodate your expected output length. For complex reasoning tasks, consider using higher values to allow the model sufficient space for thinking.
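Conceptually, the limit acts as a hard cap on the generation loop; a toy sketch (not the actual serving implementation) of that behavior:

```python
def generate(next_token_fn, max_completion_tokens):
    """Toy generation loop: next_token_fn returns the next token or None
    at end of sequence; generation stops when the budget is exhausted."""
    out = []
    while len(out) < max_completion_tokens:
        token = next_token_fn(out)
        if token is None:  # model emitted end-of-sequence
            break
        out.append(token)
    return out

# A source that never emits end-of-sequence is truncated at the budget:
tokens = generate(lambda out: "x", max_completion_tokens=8)
len(tokens)  # 8
```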
### frequency_penalty
Penalizes tokens based on how frequently they appear in the generated text so far. Positive values reduce repetition.
- Range: -2.0 to 2.0
- Default: 0
- Tip: Use small positive values (e.g., 0.1-0.5) to reduce repetitive phrasing in longer outputs.
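A sketch of the mechanism: each token's logit is reduced in proportion to how many times that token has already been generated, so repetition becomes progressively less likely.

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty):
    """Subtract penalty * count(token) from each token's logit.
    Tokens lose probability in proportion to how often they recur."""
    counts = Counter(generated_tokens)
    return [l - penalty * counts.get(i, 0) for i, l in enumerate(logits)]

logits = [1.0, 1.0, 1.0]
apply_frequency_penalty(logits, [0, 0, 1], 0.5)
# token 0 appeared twice -> 0.0; token 1 once -> 0.5; token 2 unseen -> 1.0
```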
### presence_penalty
Penalizes tokens based on whether they have appeared in the generated text at all, regardless of frequency. Positive values encourage the model to introduce new topics.
- Range: -2.0 to 2.0
- Default: 0
- Tip: Use small positive values to encourage more diverse outputs and topic exploration.
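In contrast to the frequency penalty, the presence penalty is a flat, one-time reduction: a token that has appeared at all is penalized the same amount whether it appeared once or ten times. A sketch:

```python
def apply_presence_penalty(logits, generated_tokens, penalty):
    """Subtract a flat penalty from any token that has already appeared,
    regardless of how many times."""
    seen = set(generated_tokens)
    return [l - (penalty if i in seen else 0.0) for i, l in enumerate(logits)]

apply_presence_penalty([1.0, 1.0, 1.0], [0, 0, 1], 0.5)
# tokens 0 and 1 get the same flat penalty -> [0.5, 0.5, 1.0]
```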
### stream
When set to true, the model sends partial responses as server-sent events (SSE) as they are generated. This provides a better user experience for interactive applications by showing output incrementally.
- Values: true or false
- Default: false
- Tip: Enable streaming for chat interfaces and real-time applications. Disable it for batch processing or when you need the complete response at once.
### stop
A list of sequences where the model will stop generating further tokens. When the model encounters any of the specified stop sequences, it ends the response.
- Type: null or array of strings (up to 4 sequences)
- Default: null
- Tip: Use stop sequences to control output format, such as stopping at a specific delimiter or marker.
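The effect of a stop list can be sketched client-side as truncation at the earliest matching sequence (the stop sequence itself is excluded from the output, matching the behavior described above):

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop sequence;
    the stop sequence itself is not included in the result."""
    if not stop:
        return text
    cut = len(text)
    for sequence in stop:
        index = text.find(sequence)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]

truncate_at_stop("Answer: 42\n###\nextra", ["###"])  # "Answer: 42\n"
```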