MiMo API Docs

Video Understanding

Use MiMo-V2-Omni for video understanding and analysis.

MiMo-V2-Omni supports video understanding, allowing you to send video content for analysis, description, and visual question answering. Videos can be provided via URL or base64-encoded data.

Using Video URL

from openai import OpenAI

client = OpenAI(
    api_key="your_mimo_api_key",
    base_url="https://api.mimo-v2.com/v1"
)

completion = client.chat.completions.create(
    model="mimo-v2-omni",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is happening in this video?"},
                {
                    "type": "video_url",
                    "video_url": {"url": "https://example.com/video.mp4"}
                }
            ]
        }
    ]
)

print(completion.choices[0].message.content)

Using Base64 Encoded Video

from openai import OpenAI
import base64

client = OpenAI(
    api_key="your_mimo_api_key",
    base_url="https://api.mimo-v2.com/v1"
)

# Read the local file and base64-encode it so it can be embedded in a data URL.
with open("video.mp4", "rb") as f:
    video_data = base64.b64encode(f.read()).decode("ascii")

completion = client.chat.completions.create(
    model="mimo-v2-omni",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what happens in this video"},
                {
                    "type": "video_url",
                    "video_url": {"url": f"data:video/mp4;base64,{video_data}"}
                }
            ]
        }
    ]
)

print(completion.choices[0].message.content)
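
The encoding step above can be wrapped in a small helper. The following is a minimal sketch, not part of the MiMo SDK: the function name `video_to_data_url` and the `default_mime` parameter are hypothetical, and it assumes the API accepts the same `data:<mime>;base64,<data>` form for other video formats as it does for MP4, which this page does not confirm. Note that base64 encoding increases the payload size by roughly one third.

```python
import base64
import mimetypes
from pathlib import Path


def video_to_data_url(path: str, default_mime: str = "video/mp4") -> str:
    """Return a data URL for the file at `path` (hypothetical helper).

    The MIME type is guessed from the file extension; anything that is not
    recognized as a video type falls back to `default_mime`.
    """
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("video/"):
        mime = default_mime
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be passed as the "url" value of the "video_url" content part shown in the example above.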

Token Consumption

Video content consumes significantly more tokens than images or text because the model analyzes it frame by frame. Token usage depends on:

  • Video duration: Longer videos consume more tokens.
  • Resolution: Higher-resolution videos are sampled in more detail.
  • Frame rate: The model samples frames from the video at regular intervals, so the number of sampled frames affects token usage.
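
One way to see how these factors affect a particular request is to inspect the token counts returned with the response. The sketch below assumes MiMo returns an OpenAI-style `usage` object on the completion (with `prompt_tokens`, `completion_tokens`, and `total_tokens`), which this page does not confirm; `summarize_usage` is a hypothetical helper, and the numbers in the stand-in object are made up for illustration.

```python
from types import SimpleNamespace


def summarize_usage(usage) -> dict:
    """Collect token counts from an OpenAI-style usage object.

    Missing fields (or a missing usage object) are reported as None
    rather than raising an error.
    """
    return {
        "prompt_tokens": getattr(usage, "prompt_tokens", None),
        "completion_tokens": getattr(usage, "completion_tokens", None),
        "total_tokens": getattr(usage, "total_tokens", None),
    }


# Stand-in for `completion.usage`; these values are illustrative only.
example_usage = SimpleNamespace(
    prompt_tokens=1200, completion_tokens=80, total_tokens=1280
)
print(summarize_usage(example_usage))
```

With a real response, `summarize_usage(completion.usage)` could be logged for the same clip at different durations or resolutions to compare costs.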

To manage costs, consider using shorter clips or lower resolutions. For long videos, consider extracting key frames and sending them as images instead.
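
The key-frame approach can be sketched as follows. This is an illustration under stated assumptions rather than a documented workflow: `sample_timestamps` and `frames_to_content` are hypothetical helpers, the OpenAI-style "image_url" content part is assumed to be accepted by MiMo-V2-Omni, and the actual frame extraction (for example with a tool such as ffmpeg or OpenCV) is left out.

```python
def sample_timestamps(duration_s: float, max_frames: int) -> list[float]:
    """Return `max_frames` evenly spaced timestamps in seconds.

    The video is split into equal segments and the midpoint of each
    segment is chosen, so no frame sits exactly at the start or end.
    """
    if duration_s <= 0 or max_frames <= 0:
        return []
    step = duration_s / max_frames
    return [round(step * (i + 0.5), 3) for i in range(max_frames)]


def frames_to_content(prompt: str, jpeg_frames_b64: list[str]) -> list[dict]:
    """Build a message `content` list from a prompt and base64 JPEG frames.

    Assumes the API accepts OpenAI-style "image_url" parts with data URLs.
    """
    content = [{"type": "text", "text": prompt}]
    for frame in jpeg_frames_b64:
        content.append(
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{frame}"},
            }
        )
    return content


print(sample_timestamps(10, 4))
```

The timestamps tell a frame extractor where to grab stills; the resulting base64 JPEGs go into `frames_to_content`, whose return value replaces the "content" list in the request. Capping `max_frames` puts a fixed upper bound on how many images are sent regardless of video length.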
