MiMo API Docs
Model Parameters

Recommended parameters for MiMo-V2 series models

Recommended Parameters

The following table lists the recommended parameter values for each MiMo-V2 model:

| Parameter | Description | MiMo-V2-Pro | MiMo-V2-Omni | MiMo-V2-Flash |
|---|---|---|---|---|
| temperature | Controls randomness. Higher = more creative | 1.0 | 1.0 | 1.0 |
| top_p | Nucleus sampling threshold | 0.95 | 0.95 | 0.95 |
| max_completion_tokens | Maximum tokens in the response | 1024-128000 | 1024-128000 | 1024-64000 |
| frequency_penalty | Penalizes repeated tokens | 0 | 0 | 0 |
| presence_penalty | Penalizes tokens already present | 0 | 0 | 0 |
| stream | Enable streaming output | true/false | true/false | true/false |
| stop | Stop sequences | null | null | null |
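Taken together, the recommended defaults correspond to a request body like the sketch below. The message shape and field names follow common chat-completion conventions and are assumptions here; consult the API Reference for the exact schema.

```python
# Sketch of a chat request using the recommended defaults for
# MiMo-V2-Pro. Field names follow common chat-completion
# conventions; see the API Reference for the actual schema.
payload = {
    "model": "MiMo-V2-Pro",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 1.0,
    "top_p": 0.95,
    "max_completion_tokens": 1024,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "stream": False,
    "stop": None,
}
```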

Parameter Details

temperature

Controls the randomness of the model's output. A value of 0 makes the output nearly deterministic, while higher values increase creativity and variation. The recommended default is 1.0 for all MiMo-V2 models.

  • Range: 0.0 to 2.0
  • Default: 1.0
  • Tip: Use lower values (e.g., 0.2) for factual or deterministic tasks. Use higher values (e.g., 1.0-1.5) for creative writing or brainstorming.
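To see why low values are near-deterministic: temperature divides the model's logits before the softmax, so low temperatures sharpen the distribution toward the most likely token and high temperatures flatten it. The sketch below is a local illustration in plain Python, not the service's actual sampling code.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax.

    Low temperatures sharpen the distribution toward the top
    token; high temperatures flatten it. Illustration only.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax_with_temperature(logits, 0.2)  # top token dominates
flat = softmax_with_temperature(logits, 1.5)   # probabilities spread out
```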

top_p

Also known as nucleus sampling. The model considers tokens whose cumulative probability mass reaches top_p. A value of 0.95 means the model samples from the smallest set of tokens whose cumulative probability is at least 95%.

  • Range: 0.0 to 1.0
  • Default: 0.95
  • Tip: Generally, adjust either temperature or top_p, but not both simultaneously.
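The "smallest set" rule can be made concrete with a short sketch. This is a local illustration of nucleus sampling, not the API's internal implementation.

```python
def nucleus(probs, top_p):
    """Return the indices of the smallest set of tokens, taken in
    descending probability order, whose cumulative probability is
    at least top_p (nucleus sampling). Illustration only."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

# 0.6 + 0.3 = 0.9 < 0.95, so a third token is needed to reach 0.95.
kept = nucleus([0.6, 0.3, 0.08, 0.02], 0.95)  # keeps indices [0, 1, 2]
```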

max_completion_tokens

The maximum number of tokens the model can generate in a single response. This includes both the visible output and any internal reasoning tokens when thinking mode is enabled.

  • Range: Varies by model (see table above)
  • Default: 1024
  • Tip: Set this high enough to accommodate your expected output length. For complex reasoning tasks, consider using higher values to allow the model sufficient space for thinking.
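If your application serves several MiMo-V2 models, it can help to clamp the requested limit to each model's upper bound from the table above. The helper below is hypothetical, not part of any SDK; the per-model bounds are taken from the table.

```python
# Upper bounds on max_completion_tokens per model (from the
# recommended-parameters table above).
MODEL_MAX = {
    "MiMo-V2-Pro": 128000,
    "MiMo-V2-Omni": 128000,
    "MiMo-V2-Flash": 64000,
}

def clamp_max_tokens(model, requested):
    """Hypothetical helper: clamp a requested max_completion_tokens
    to the model's documented upper bound."""
    return min(requested, MODEL_MAX[model])

limit = clamp_max_tokens("MiMo-V2-Flash", 128000)
```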

frequency_penalty

Penalizes tokens based on how frequently they appear in the generated text so far. Positive values reduce repetition.

  • Range: -2.0 to 2.0
  • Default: 0
  • Tip: Use small positive values (e.g., 0.1-0.5) to reduce repetitive phrasing in longer outputs.

presence_penalty

Penalizes tokens based on whether they have appeared in the generated text at all, regardless of frequency. Positive values encourage the model to introduce new topics.

  • Range: -2.0 to 2.0
  • Default: 0
  • Tip: Use small positive values to encourage more diverse outputs and topic exploration.
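OpenAI-style APIs commonly combine the two penalties as `logit − count × frequency_penalty − (count > 0) × presence_penalty`. Whether MiMo-V2 applies exactly this formula is an assumption, but the sketch below shows how the two penalties differ: one scales with repetition count, the other fires once per token.

```python
from collections import Counter

def apply_penalties(logits, generated, frequency_penalty, presence_penalty):
    """Illustrative penalty rule (assumed, OpenAI-style):
    frequency_penalty scales with how often a token has already
    been generated; presence_penalty applies once if the token has
    appeared at all."""
    counts = Counter(generated)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted

# "the" appeared twice: penalty = 2 * 0.5 + 0.2; "cat" is untouched.
adjusted = apply_penalties({"the": 2.0, "cat": 2.0}, ["the", "the"], 0.5, 0.2)
```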

stream

When set to true, the model sends partial responses as server-sent events (SSE) as they are generated. This provides a better user experience for interactive applications by showing output incrementally.

  • Values: true or false
  • Default: false
  • Tip: Enable streaming for chat interfaces and real-time applications. Disable it for batch processing or when you need the complete response at once.
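A streaming response body arrives as `data:`-prefixed SSE lines. The sketch below parses a single line; the chunk's JSON shape and the terminal `[DONE]` marker follow common OpenAI-style conventions and are assumptions here, so check the API Reference for the actual stream format.

```python
import json

def parse_sse_line(line):
    """Parse one line of a streaming (SSE) response body.

    Returns the decoded JSON chunk, or None for non-data lines and
    the terminal "[DONE]" marker. The chunk shape and the [DONE]
    sentinel are assumptions based on common SSE streaming APIs.
    """
    if not line.startswith("data: "):
        return None  # comments, blank lines, keep-alives
    data = line[len("data: "):].strip()
    if data == "[DONE]":
        return None  # end-of-stream sentinel
    return json.loads(data)

chunk = parse_sse_line('data: {"delta": "Hi"}')
```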

stop

A list of sequences where the model will stop generating further tokens. When the model encounters any of the specified stop sequences, it ends the response.

  • Type: null or array of strings (up to 4 sequences)
  • Default: null
  • Tip: Use stop sequences to control output format, such as stopping at a specific delimiter or marker.
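The effect of stop sequences can be illustrated locally: generation ends at the earliest occurrence of any listed sequence, and the sequence itself is not returned. The helper below is an illustration of that behavior, not the service's implementation.

```python
def truncate_at_stop(text, stop):
    """Illustrate stop behavior: cut the text at the earliest
    occurrence of any stop sequence; the matched sequence itself
    is excluded from the output."""
    if not stop:
        return text
    cut = len(text)
    for seq in stop:
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

clipped = truncate_at_stop("Answer: 42\n###\nscratch work", ["###"])
```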
