API Doc-CometAPI
HomeDashBoardModel_Price
HomeDashBoardModel_Price
Discord_Support
  1. GET START
  • GET START
    • Model New Release Announcement
    • Help Center
    • Quick Start
    • About Pricing
    • About Grouping
    • Interface Stability
    • Privacy policy
    • Terms of service
    • Error code description
    • Code example
    • Must see for use
    • Common Misconceptions
    • Confusion about use
    • Best Practices
      • CometAPI Account Balance Query API Usage Instructions
      • Retry Logic Documentation for CometAPI and OpenAI Official API
      • Midjourney Best Practices
      • Runway Best Practices
  • OpenAI Compatiable Endpoint
    • gpt-4o-image generates image
    • Chat
    • Recognizing Images
    • Models
    • Embeddings
    • Images
    • Realtime
    • Image Editing (gpt-image-1)
  • Audio
    • Create speech
    • Create transcription
    • Create translation
  • Anthropic Compatiable Endpoint
    • Anthropic Claude
  • Music Generation Endpoint
    • Suno
      • Setting suno Version
      • Generate lyrics
      • Generate music clip
      • Upload clip
      • Submit concatenation
      • Single task query
      • Batch query tasks
    • Udio(Temporarily unavailable)
      • Generate music
      • Task query
  • Image Generation Endpoint
    • Midjourney(images)
      • Task Fetching API
        • List by Condition
        • Fetch Single Task (most recommended)
      • Imagine
      • Change (UPSCALE; VARIATION; REROLL)
      • Action (UPSCALE; VARIATION; REROLL; ZOOM, etc.)
      • Blend (image -> image)
      • Describe (image -> text)
      • Modal (Area Redesign & Zoom)
    • Ideogram(images)
      • Official documentation (updated in real time)
      • Generate 3.0 (text to image)
      • Remix 3.0 (hybrid image)
      • Reframe 3.0(Reconstruction)
      • Replace Background 3.0(Background replacement)
      • Edit 3.0(Editing images)
      • ideogram Text Raw Image
      • ideogram Hybrid image
      • ideogram enlargement HD
      • ideogram describes the image
      • ideogram Edit image
    • Flux(images)
      • Generate image (replicate format)
      • flux fine-tune images(Temporarily unavailable)
      • flux generate image(Temporarily unavailable)
      • flux query
    • Replicate(image)
      • replicate Generate
      • replicate query
    • Recraft(images)
      • Appendix
      • Recraft Generate Image
      • Recraft Vectorize Image
      • Recraft Remove Background
      • Recraft Clarity Upscale
      • Recraft Create style
      • Recraft Generative Upscale
  • Video Generation Ednpoint
    • runway(video)
      • official format
        • runway images raw video
        • runway to get task details
      • Reverse Format
        • generate(text)
        • generate(Reference images)
        • Video to Video Style Redraw
        • Act-one Expression Migration
        • feed-get task
    • kling (video)
      • callback_url
      • Generating images
      • Text Generation Video
      • Image Generation Video
      • Video Extension
      • virtual try-on
      • lip sync
      • Individual queries (videos)
    • MiniMax Conch(video)
      • MiniMax Conch Official Documentation
      • MiniMax Conch Generation
      • MiniMax Conch Query
      • MiniMax Conch Download
    • luma (video)
      • Official api interface format
        • luma generate
        • luma search
    • PIKA(video)
      • pika feed
      • PIKA Reference Video Generation
      • PIKA Reference Image Generation
      • PIKA reference text generation
    • sora
      • Reverse Format
        • Create Video
        • Query Video Task
        • Create Video
  • Software Integration Guide
    • cometapi Site API Call Testing
    • OpenManus
    • Chatbox
    • CherryStudio
    • Cursor
    • ChatHub
    • cline
    • dify
    • gptme
    • Immersive Translation
    • Lobe-Chat
    • Zotero
    • LangChain
    • AnythingLLM
    • Eudic Translation
    • OpenAI Translator
    • ChatAll Translation
    • Pot Translation
    • GPT Academic Optimization (gpt_academic)
    • NEXT CHAT (ChatGPT Next Web)
    • Obsidian's Text Generator Plugin
    • Open WebUI
    • avante.nvim
    • librechat
    • Lazy Customer Service
    • utools-ChatGPT Friend
    • IntelliJ Translation Plugin
    • n8n
  1. GET START

Model New Release Announcement

🌟 2025.05.07#

🔹 Suno v4.5
Suno v4.5: v4.5 has more expressive music and richer vocals, designed to enhance the user's expression and intuition in music creation. This site now supports Suno 4.5, change the request parameter mv to chirp-auk
The above model follows the suno format, please refer to: https://apidoc.cometapi.com/api-13851480

🌟 2025.04.29#

🔹 qwen3-235b-a22b
qwen3-235b-a22b: This is the flagship model of the Qwen3 series, with 235 billion parameters, utilizing a Mixture of Experts (MoE) architecture.
Particularly suitable for complex tasks requiring high-performance inference, such as coding, mathematics, and multimodal applications.
🔹 qwen3-30b-a3b
qwen3-30b-a3b: With 30 billion parameters, it balances performance and resource requirements, suitable for enterprise-level applications.
This model may use MoE or other optimized architectures, applicable for scenarios requiring efficient processing of complex tasks, such as intelligent customer service and content generation.
🔹 qwen3-8b
qwen3-8b: A lightweight model with 800 million parameters, designed specifically for resource-constrained environments (such as mobile devices or low-configuration servers).
Its efficiency and fast response capability make it suitable for simple queries, real-time interaction, and lightweight applications.
These models follow the OpenAI Chat standard format for calls. For specific details, please refer to:
https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.04.27#

🔹 gpt-image-1
gpt-image-1 introduces native multimodal models to the API, built on GPT-4o's image generation capabilities, designed to provide developers with a powerful and flexible tool for generating high-quality, diverse images.
Features: High-fidelity images; diverse visual styles; rich world knowledge; consistent text rendering; unlocking practical applications across multiple domains.
This model follows the openai v1/images/generations format for calls, see details at: https://apidoc.cometapi.com/images-api-13851474 ;Here's an example of input parameters:
{
    "model": "gpt-image-1",
    "prompt": "A cute baby sea otter",
    "n": 1,
    "size": "1024x1024"
}

🌟 2025.04.20#

🔹 gemini-2.5-flash-preview-04-17
gemini-2.5-flash-preview-04-17, Gemini 2.5 Flash is an AI model developed by Google, designed to provide developers with fast and cost-effective solutions, especially suitable for applications requiring enhanced reasoning capabilities.
According to the Gemini 2.5 Flash preview announcement, the model's preview version was released on April 17, 2025, supports multimodal input, and has a context window of up to 1 million tokens.
This model follows the OpenAI chat standard format for calling, refer to:https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.04.17#

🔹 o4-mini
🔹 o4-mini-2025-04-16
o4-mini, o4-mini-2025-04-16: A smaller, faster, and more economical model, research shows it performs well in mathematics, coding, and visual tasks, designed to be efficient and responsive, suitable for developers. Released on April 16, 2025.
🔹 o3
🔹 o3-2025-04-16
o3, o3-2025-04-16: A reflective generative pre-trained transformer (GPT) model designed to handle problems requiring step-by-step logical reasoning.
Research shows it excels at mathematics, coding, and scientific tasks. It can also use tools such as web browsing and image generation, with a release date of April 16, 2025.
The above models follow the OpenAI chat standard format for calls, refer to: https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.04.15#

🔹 gpt-4.1
gpt-4.1: Major advancements in coding and instruction following; GPT-4.1 has become the leading model for coding.
Long context: On Video-MME, a benchmark for multimodal long context understanding, GPT-4.1 has created a new state-of-the-art result.
The GPT-4.1 model series delivers superior performance at lower cost.
🔹 gpt-4.1-mini
gpt-4.1-mini: Represents a significant leap in small model performance, even outperforming GPT-4o on many benchmarks.
It matches or exceeds GPT-4o in intelligence assessment while reducing latency by nearly half and costs by 83%.
🔹 gpt-4.1-nano
gpt-4.1-nano: Features a larger context window—supporting up to 1 million context tokens
And can better utilize this context through improved long context understanding. Has an updated knowledge cutoff date of June 2024.
These models follows the standard OpenAI chat format for API calls, for reference see: https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.04.14#

🔹 grok-3-deepersearch
grok-3-deepersearch: Features high data timeliness, excellent interactive experience, and thorough search thinking process; comprehensive webpage aggregation.
This model follows the OpenAI chat standard format for API calls, refer to: https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.04.13#

🔹 gemini-2.0-flash-exp-image-generation
This model supports conversation while enabling image generation and editing capabilities, outputting high-definition images.
This model follows the OpenAI chat standard format for API calls, refer to: https://apidoc.cometapi.com/api-15928299

🌟 2025.04.10#

🔹 grok-3-fast
🔹 grok-3-fast-latest
grok-3-fast, grok-3-fast-latest: grok-3 and grok-3-fast use exactly the same underlying model and provide the same response quality. However, grok-3-fast is served on faster infrastructure, delivering response times that are much quicker than the standard grok-3.
This model follows the OpenAI chat standard format for API calls, refer to: https://apidoc.cometapi.com/chat-api-13851472
🔹 grok-3-mini
🔹 grok-3-mini-latest
grok-3-mini, grok-3-mini-latest: A lightweight model that thinks before responding. Fast, intelligent, and ideal for logic-based tasks that don't require deep domain knowledge. The original thought traces are accessible.
This model follows the OpenAI chat standard format for API calls, refer to: https://apidoc.cometapi.com/chat-api-13851472
🔹 grok-3-mini-fast
🔹 grok-3-mini-fast-latest
grok-3-mini-fast, grok-3-mini-fast-latest: grok-3-mini and grok-3-mini-fast use exactly the same underlying model and provide the same response quality. However, grok-3-mini-fast is served on faster infrastructure, delivering response times that are much quicker than the standard grok-3-mini.
This model follows the OpenAI chat standard format for API calls, refer to: https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.04.07#

🔹 llama-4-maverick
llama-4-maverick, a high-capacity multimodal language model from Meta, supports multilingual text and image inputs and generates multilingual text and code output in 12 supported languages.
Maverick is optimized for visual language tasks and has instructions tuned for assistant-like behavior, image reasoning, and generic multimodal interaction.
Maverick features native multimodal early fusion and 1 million labeled context windows.
Maverick is released on April 5, 2025 under the Llama 4 Community License for research and commercial applications requiring advanced multimodal understanding and high model throughput.
The model follows the openai chat standard format call,cf:https://apidoc.cometapi.com/chat-api-13851472
🔹 llama-4-scout
llama-4-scout, is a mixed-expertise (MoE) language model developed by Meta. It supports native multimodal input (text and images) and multilingual output (text and code) for 12 supported languages.
Designed for assisted interaction and visual reasoning, Scout uses 16 experts per forward pass, a context length of 10 million words, and a training corpus of about 40 trillion words.
Designed for high efficiency and local or commercial deployment, llama-4-scout employs early fusion technology for seamless modal integration.
It is command-tuned for multilingual chat, subtitling, and image comprehension tasks.
It is released under the Llama 4 Community License, with last training data as of August 2024 and a public release on April 5, 2025.
The model follows the openai chat standard format for calls, cf:https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.03.29#

🔹 gpt-4o-all
gpt-4o-all has support for ChatGPT's latest generated image mode
The model follows the openai chat standard format for calls, cf:https://apidoc.cometapi.com/api-15928299
🔹 gpt-4o-image
gpt-4o-image This model is dedicated to image generation and editing, which enables image style conversion, preservation of original image features, superb consistency, and output of high-definition images.
The model follows the openai chat standard format for calls, cf:https://apidoc.cometapi.com/api-15928299

🌟 2025.03.27#

🔹 gemini-2.5-pro-exp-03-25
Features native multimodal processing capabilities, with an extensive context window of up to 1 million tokens, providing unprecedented powerful support for complex, long-sequence tasks.
The model follows the openai chat standard format for calls, cf:https://apidoc.cometapi.com/chat-api-13851472
🔹 gemini-2.5-pro-preview-03-25
According to Google's data, Gemini 2.5 Pro demonstrates particularly outstanding performance in handling complex tasks.
The model follows the openai chat standard format for calls, cf:https://apidoc.cometapi.com/chat-api-13851472

🌟 2025.03.24#

🔹 gpt-4.5-preview-2025-02-27
Preview Version: Showcasing the latest features of GPT-4.5, providing enhanced understanding and generation capabilities, suitable for various tasks, improving user experience.
🔹 gpt-4.5-preview
Preview Version: Deeply optimized algorithms and performance, delivering ultra-fast responses and precise outputs, perfectly suited for efficient decision-making scenarios.
🔹 gpt-4.5
Professional Standard Version: Stable and reliable, combining rich expression and multi-task processing capabilities, suitable for wide applications including business, education, creative, and technical fields.

🌟 2025.02.20#

🔹 claude-3-7-sonnet-thinking
Advanced model designed for complex reasoning and creative thinking, unleashing unlimited possibilities, empowering breakthrough problem-solving and innovation.
🔹 claude-3-7-sonnet-20250219
High-end version integrating the latest technological breakthroughs, handling complex tasks with superior performance, providing intelligent innovative solutions for users.
🔹 cometapi-3-7-sonnet
Outstanding multi-domain processing expert, delivering precise and smooth output experience, easily tackling various professional challenges.
🔹 cometapi-3-7-sonnet-thinking
Equipped with revolutionary algorithm architecture, significantly enhancing deep analysis and complex task management capabilities, making thinking more thorough and comprehensive.

🔗 Usage Guide:#

✅ All models have been added to the default group, allowing you to flexibly call them according to different usage scenarios and requirements, easily integrate, and maximize their application value.

🛠 Quick Start:#

Simple integration into your system unlocks powerful capabilities. Fully utilize each model's unique advantages to meet professional needs across different domains.
🔥 Experience the revolutionary performance improvements these breakthrough models bring right now! 🔥

For professional support or detailed consultation, please contact our customer service team or visit our technical documentation center. We look forward to your valuable feedback!
Next
Help Center
Built with