API Doc-CometAPI
HomeDashBoardModel Marketplace
HomeDashBoardModel Marketplace
Discord_Support
  1. Best Practices
  • GET START
    • Quick Start
    • Important Guidelines
    • Release Notes
  • API Reference
    • Error Codes & Handling
    • Text Models-openai format
      • Chat
      • response
      • gpt-4o-image generates image
      • Images
      • Image Editing (gpt-image-1)
      • Recognizing Images
      • Embeddings
      • Realtime
      • Models
      • Hunyuan3D
    • Anthropic Compatiable
      • Anthropic Claude
    • Image Models
      • Midjourney(images)
        • Quick Tutorial - Complete Process in One Go
        • Task Fetching API
          • List by Condition
          • Fetch Single Task (most recommended)
        • Imagine
        • Submit Video
        • Submit Editor
        • Action (UPSCALE; VARIATION; REROLL; ZOOM, etc.)
        • Blend (image -> image)
        • Describe (image -> text)
        • Modal (Area Redesign & Zoom)
      • Flux(images)
        • Generate image (replicate format)
        • Create Task - General
        • flux fine-tune images(Temporarily unavailable)
        • flux generate image(Temporarily unavailable)
        • flux query
      • Replicate(image)
        • Create Task - General
        • Create Task -flux-kontext-pro、max
        • Create Task -flux-1.1-pro
        • Create Task -flux-1.1-pro-ultra
        • replicate query
      • Recraft(images)
        • Appendix
        • Recraft Generate Image
        • Recraft Vectorize Image
        • Recraft Remove Background
        • Recraft Clarity Upscale
        • Recraft Create style
        • Recraft Generative Upscale
      • Ideogram(images)(Temporarily removed)
        • Official documentation (updated in real time)
        • Generate 3.0 (text to image)
        • Remix 3.0 (hybrid image)
        • Reframe 3.0(Reconstruction)
        • Replace Background 3.0(Background replacement)
        • Edit 3.0(Editing images)
        • ideogram Text Raw Image
        • ideogram Hybrid image
        • ideogram enlargement HD
        • ideogram describes the image
        • ideogram Edit image((legacy))
    • Music Models
      • Suno
        • Setting suno Version
        • Suno API Scenario Application Guide
        • Generate lyrics
        • Generate music clip
        • Upload clip
        • Submit concatenation
        • Full Track Audio Separation
        • Single Track Audio Separation
        • Create New Persona
        • Single task query
        • Generate mp4 mv video
        • Timing: lyrics, audio timeline
        • Get wav format file
        • Batch query tasks
      • Udio(Temporarily unavailable)
        • Generate music
        • Task query
    • Video Models
      • veo3
        • veo3-chat format
        • Submit video generation task
        • Query video generation status
      • runway(video)
        • official format
          • runway images raw video
          • Generate a video from a video
          • Generate an image from text
          • Upscale a video
          • Control a character
          • runway to get task details
        • Reverse Format
          • generate(text)
          • generate(Reference images)
          • Video to Video Style Redraw
          • Act-one Expression Migration
          • feed-get task
      • kling (video)
        • callback_url
        • testing
          • Multimodal Video Editing (In Testing)
            • Initialize Video for Editing
            • Add Video Selection
            • Delete Video Selection
            • Clear Video Selection
            • Preview Selected Video Area
            • Create Task
        • Generating images
        • Expanded
        • Text Generation Video
        • Image Generation Video
        • Multi-Image To Video
        • Multi-Image to Image
        • Video Extension
        • virtual try-on
        • lip sync
        • effects
        • Video to audio
        • Text to audio
        • Individual queries
      • bytedance
        • bytedance-video
        • bytedance-video get
        • bytedance-image-generation
        • bytedance-Image Editing
      • MiniMax Conch(video)
        • MiniMax Conch Official Documentation
        • MiniMax Conch Generation
        • MiniMax Conch Query
        • MiniMax Conch Download
      • luma (video)(temporarily dismantle)
        • Official api interface format
          • luma generate
          • luma search
      • PIKA(video)(temporarily dismantle)
        • pika feed
        • PIKA Reference Video Generation
        • PIKA Reference Image Generation
        • PIKA reference text generation
      • sora(temporarily dismantle)
        • Reverse Format
          • Create Video
          • Query Video Task
          • Create Video
    • Audio Models
      • Create speech
      • Create transcription
      • Create translation
  • CODE EXAMPLES
    • Code example
  • Guides & Tutorials
    • Integration Guides
      • COMET API API Call Testing
      • OpenManus
      • Chatbox
      • CherryStudio
      • Cursor
      • COZE
      • Cline
      • ChatHub
      • Dify
      • LiteLLM
      • zapier
      • n8n
      • n8n Local Deployment
      • AnythingLLM
      • Immersive Translation
      • NEXT CHAT (ChatGPT Next Web)
      • ChatAll Translation
      • FastGPT
      • Lobe-Chat
      • Zotero
      • LangChain
      • Open WebUI
      • OpenAI Translator
      • Pot Translation
      • Obsidian's Text Generator Plugin
      • GPT Academic Optimization (gpt_academic)
      • gptme
      • avante.nvim
      • Eudic Translation
      • librechat
      • utools-ChatGPT Friend
      • IntelliJ Translation Plugin
      • Lazy Customer Service
      • MAKE
    • Best Practices
      • Claude Code Installation and Usage Guide
      • Gemini CLI Installation and Usage Guide
      • CometAPI Account Balance Query API Usage Instructions
      • Retry Logic Documentation for CometAPI and OpenAI Official API
      • Midjourney Best Practices
      • Runway Best Practices
      • Guide to calling gemini-2.5-flash-image
  • Pricing & Billing
    • About Pricing
  • Support
    • Help Center
    • Confusion about use
    • Common Misconceptions
    • Terms of service
    • Privacy policy
    • Interface Stability
  1. Best Practices

Guide to calling gemini-2.5-flash-image

Use CometAPI to call gemini-2.5-flash-image-preview for image generation#

This document demonstrates how to use CometAPI to generate images with Google’s Gemini series image model gemini-2.5-flash-image-preview, covering three common invocation methods:
OpenAI-compatible Chat interface (text-to-image)
Gemini official generateContent text-to-image interface
Gemini official generateContent image-to-image interface (input and output both Base64)
Notes:
Replace sk-xxxx in the examples with your CometAPI key. For security, do not expose the key in clients or public repositories.
CometAPI Authorization typically uses the key value directly, i.e., Authorization: sk-xxxx.
The returned image is usually provided as Base64-encoded inline_data; you need to decode it on the client and save it as a file.
Basic information:
Base URL: https://api.CometAPI.com
Model name: gemini-2.5-flash-image-preview/gemini-2.5-flash-image

I. OpenAI Chat compatible interface (text-to-image)#

Call gemini-2.5-flash-image-preview via the OpenAI-style /v1/chat/completions endpoint. It is recommended to explicitly set model to gemini-2.5-flash-image-preview or gemini-2.5-flash-image.
Example (Windows shell, using ^ for line continuation):
Notes:
stream must be true; the response will be returned as a stream;
The response structure is wrapped by CometAPI for OpenAI compatibility. The response includes a Base64 image; decode and save it on the client as needed.
Prompt tips:
You can specify style, lighting, colors, lens, resolution, etc., in the prompt, for example: “cartoon style, soft color palette, 4K square image”.

II. Gemini official generateContent text-to-image#

Use the Gemini official-style generateContent endpoint for text-to-image. Put the text prompt in contents.parts[].text.
Example:
Response highlights:
Image data is usually included in response.candidates[0].content.parts:
Text description:
{ \"text\": \"...\" }
Image data:
{ \"inline_data\": { \"mime_type\": \"image/png\", \"data\": \"<base64>\" } }
Decode the data field (Base64 string) and save it as a file with the corresponding extension.

III. Gemini official generateContent image-to-image (input/output both Base64)#

This endpoint supports image-to-image: upload an input image (Base64) and receive a modified new image (also in Base64).
Example:
Notes:
Convert your source image file to a Base64 string and place it in inline_data.data (do not include prefixes like data:image/jpeg;base64,).
The output is also in candidates[0].content.parts and includes:
An optional text part (explanation or hint)
An image part inline_data (where data is the Base64 of the output image)

How to extract and save Base64 images from the response#

Using a Gemini-style response as an example, the pseudo-structure is as follows (illustrative only):
{
  "candidates": [
    {
      "content": {
        "parts": [
          { "text": "..." },
          {
            "inline_data": {
              "mime_type": "image/png",
              "data": "<base64-string>"
            }
          }
        ]
      }
    }
  ]
}
Extract inline_data.data and choose the file extension based on its mime_type (e.g., .png / .jpg / .webp).
Decode the Base64 on your client and save it as a binary file.
Python example:
import base64

b64 = "<base64-string>"
with open("output.png", "wb") as f:
    f.write(base64.b64decode(b64))
Node.js example:

FAQs and recommendations#

Authorization header format
Use Authorization: sk-xxxx. Refer to the CometAPI documentation as the source of truth.
Streaming responses
Set "stream": true for the OpenAI-compatible interface to return SSE. If you use curl to observe the stream, ensure your client correctly handles data: lines until [DONE].
Prompt optimization
Specifying style keywords (e.g., “cyberpunk, film texture, low contrast, high saturation”), aspect ratio (square/landscape/portrait), subject, background, lighting, and level of detail helps improve results.
Base64 notes
Do not include prefixes like data:image/png;base64, in the data field; keep only the raw Base64 payload.
Troubleshooting
4xx usually indicates request parameter or authentication issues (check the key, model name, JSON format); 5xx indicates server-side issues (try again later or contact support).
The above covers the three invocation methods and key points for using gemini-2.5-flash-image-preview via CometAPI. Choose the interface style that suits your use case, and handle Base64 image persistence and display on the client.
Official API documentation reference link: https://ai.google.dev/gemini-api/docs/image-generation?hl=zh-cn#gemini-image-editing
Previous
Runway Best Practices
Next
About Pricing
Built with