Unified Multimodal: Text, image processing with real-time analysis
Million-Token Context: Handle massive documents and codebases
Advanced Reasoning: Multi-step problem-solving with RL optimization
High Performance: Sparse MoE architecture + Google TPU v6
qwen-image: It is a universal image generation model, mainly used to generate completely new images based on text, emphasizing the ability to create from scratch. It is suitable for scenarios such as creative generation, stylized drawing, and more.It is trained on large-scale vision-language models, supports multi-language prompts, but its core focus is on generation rather than editing.
k2-thinking: Moonshot AI's most advanced open reasoning model, extending the K2 series. It is a thinking model with universal Agentic capabilities and reasoning abilities. Supports 256K tokens context window.
k2-thinking-turbo: Based on k2-thinking, it provides faster response speeds and higher concurrency capabilities, supporting the same 256K context and reasoning functions, suitable for high-efficiency scenarios.
ā” Low Latency & High Throughput: Optimized for real-time, high-concurrency scenarios.
š§ Configurable Reasoning Depth: Supports "extended thinking" mode.
š Massive Context: Up to 200K input tokens, 8K output tokens.
š» Strong Code Capabilities: Code generation, debugging, tool calling.
š° Cost Advantage: ~1/3 the cost of Sonnet 4.
š§ Format Support: Claude native message format + chat format.
Zhipu AI's latest flagship model with 355B total params, 32B active.
š» Coding Excellence: Aligns with Claude Sonnet 4, best in China.
š Extended Context: Expanded from 128K to 200K tokens.
š§ Enhanced Reasoning: Supports tool calling during inference.
š Search Optimization: Improved tool calling and agent performance.
āļø Better Writing: Enhanced style, readability, and role-playing alignment.
š Multilingual: Boosted cross-language translation capabilities.
š§ Format Support: Chat format.
Google's latest AI video generation models for high-quality video creation.
š¬ High Resolution: 1080p video generation.
šµ Synchronized Audio: Dialogue, ambient sounds, effects with native lip-sync.
ā±ļø Video Length: Generate seamless clips up to 8 seconds.
šØ Creative Control: Reference image support, first/last frame setting, cinematic presets.
ā” Dual Variants: Veo3.1 (standard quality) + Veo3.1-Pro (maximum quality).
š§ Format Support: Async calls + chat format.
All models support chat format calls, with Claude models additionally supporting native message format for maximum integration flexibility!
World's most advanced reasoning models with 400k context window.
gpt-5-minimal: Lightning-fast for simple tasks.
gpt-5-low: Speed-optimized (212 tokens/sec).
gpt-5-medium: Balanced performance for general use.
gpt-5-high: Maximum "deep thinking" mode.
gpt-5-codex-low / gpt-5-codex-medium / gpt-5-codex-high: Specialized for coding & software engineering.
⨠Features: State-of-the-art coding, mathematics, visual perception & complex reasoning.