
# Generate images with Gemini

> Generate images with Gemini models through CometAPI. Supports 4K output, multi-turn editing, up to 14 reference images, and Thinking.

<Tip>
  For a short first request, start with the [Gemini image API quickstart](/quickstarts/image/gemini-image-api). For step-by-step examples, see [Use Gemini image models](/api/image/gemini/generate-image-guide).
</Tip>

<Warning>
  Gemini image generation options can change as Google updates image models and `generateContent`. Check the [Gemini image generation documentation](https://ai.google.dev/gemini-api/docs/image-generation) for the latest complete parameter list and provider-specific behavior.
</Warning>

<Note>
  `gemini-2.5-flash-image` is scheduled for shutdown by the provider on 2026-10-02. For new projects, use `gemini-3.1-flash-image-preview` or `gemini-3-pro-image-preview`. See Google's [deprecation schedule](https://ai.google.dev/gemini-api/docs/deprecations) for details.
</Note>

<Note>
  Gemini image responses can include intermediate image parts where `thought` is `true`. These are not the final output. When saving generated images, skip `thought: true` parts and use the last image part where `inlineData` exists and `thought` is not `true`.
</Note>
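
The selection rule in the note above can be sketched against the raw JSON response. This is a minimal sketch, assuming the response body has already been parsed into a Python dict and that only the first candidate matters; the helper name `pick_final_image` is hypothetical and not part of any SDK.

```python
import base64
from typing import Optional


def pick_final_image(response: dict) -> Optional[bytes]:
    """Return the decoded bytes of the last non-thought image part, or None.

    Hypothetical helper: `response` is assumed to be the parsed JSON body
    of a generateContent call.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return None
    parts = candidates[0].get("content", {}).get("parts", [])

    final_data = None
    for part in parts:
        if part.get("thought") is True:
            continue  # intermediate thought part, not final output
        inline = part.get("inlineData")
        if inline and inline.get("data"):
            final_data = inline["data"]  # overwrite so the last image wins
    return base64.b64decode(final_data) if final_data else None


# Illustrative response with one thought image, one text part, one final image.
sample_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [
                {"thought": True,
                 "inlineData": {"mimeType": "image/png",
                                "data": base64.b64encode(b"draft").decode()}},
                {"text": "Here is your image."},
                {"inlineData": {"mimeType": "image/png",
                                "data": base64.b64encode(b"final").decode()}},
            ],
        },
    }],
}

image_bytes = pick_final_image(sample_response)
```

Writing `image_bytes` to disk with `open("output.png", "wb")` gives the final image. The function returns `None` when the response holds no usable image part, so callers can tell that apart from an empty file.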


## OpenAPI

````yaml api/openapi/image/gemini/post-gemini-generates-image.openapi.json POST /v1beta/models/{model}:generateContent
openapi: 3.1.0
info:
  title: Gemini Image Generation API
  version: 1.0.0
servers:
  - url: https://api.cometapi.com
security:
  - bearerAuth: []
paths:
  /v1beta/models/{model}:generateContent:
    post:
      summary: Gemini Image Generation
      operationId: gemini_generates_image
      parameters:
        - name: model
          in: path
          required: true
          description: >-
            The Gemini image model to use. See the [Models
            page](/overview/models) for current options.
          schema:
            type: string
            default: gemini-3.1-flash-image-preview
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required:
                - contents
              properties:
                contents:
                  type: array
                  description: >-
                    Conversation turns. Each item has a `role` (`user` or
                    `model`) and a `parts` array containing text and/or image
                    data.
                  items:
                    type: object
                    properties:
                      role:
                        type: string
                        description: Role of the message sender.
                        enum:
                          - user
                          - model
                      parts:
                        type: array
                        description: >-
                          Content blocks — text prompts and/or inline image
                          data.
                        items:
                          type: object
                          properties:
                            text:
                              type: string
                              description: Text content (prompt or instruction).
                            inline_data:
                              type: object
                              description: >-
                                Inline image data for image-to-image or
                                multi-image input.
                              properties:
                                mime_type:
                                  type: string
                                  description: MIME type of the image.
                                  enum:
                                    - image/jpeg
                                    - image/png
                                    - image/webp
                                data:
                                  type: string
                                  description: >-
                                    Raw Base64-encoded image data. Do not
                                    include the `data:image/...;base64,` prefix.
                generationConfig:
                  type: object
                  description: >-
                    Controls generation behavior — output modalities, image
                    resolution, thinking, etc.
                  properties:
                    responseModalities:
                      type: array
                      description: >-
                        Output types to return. Use `["TEXT", "IMAGE"]` for
                        mixed output, or `["IMAGE"]` to force image-only output.
                      items:
                        type: string
                        enum:
                          - TEXT
                          - IMAGE
                      default:
                        - TEXT
                        - IMAGE
                    imageConfig:
                      type: object
                      description: Image output configuration.
                      properties:
                        aspectRatio:
                          type: string
                          description: >-
                            All models: `1:1` `2:3` `3:2` `3:4` `4:3` `4:5`
                            `5:4` `9:16` `16:9` `21:9`. Gemini 3.1 Flash also
                            supports `1:4` `4:1` `1:8` `8:1`.
                          default: '1:1'
                        imageSize:
                          type: string
                          description: >-
                            Output resolution. Gemini 3 models only;
                            `gemini-2.5-flash-image` always outputs 1024px. Use
                            an uppercase `K` (for example `2K`, not `2k`).
                          enum:
                            - 512px
                            - 1K
                            - 2K
                            - 4K
                          default: 1K
                    thinkingConfig:
                      type: object
                      description: >-
                        Controls the Thinking process (Gemini 3 models).
                        Thinking generates interim images before the final
                        output.
                      properties:
                        thinkingLevel:
                          type: string
                          description: >-
                            Thinking effort level. Only configurable for
                            `gemini-3.1-flash-image-preview`;
                            `gemini-3-pro-image-preview` always uses high
                            thinking.
                          enum:
                            - minimal
                            - high
                          default: minimal
                        includeThoughts:
                          type: boolean
                          description: Whether to include thought parts in the response.
                          default: false
                tools:
                  type: array
                  description: >-
                    Optional tools. Pass `[{"google_search": {}}]` to enable
                    Google Search grounding for real-time information in
                    generated images.
                  items:
                    type: object
                    properties:
                      google_search:
                        type: object
                        description: Enables Google Search grounding.
            examples:
              text-to-image:
                summary: Text to Image
                value:
                  contents:
                    - role: user
                      parts:
                        - text: >-
                            A Monarch butterfly anatomical sketch on textured
                            parchment, Da Vinci style
                  generationConfig:
                    responseModalities:
                      - TEXT
                      - IMAGE
                    imageConfig:
                      aspectRatio: '1:1'
                      imageSize: 4K
              image-to-image:
                summary: Image to Image
                value:
                  contents:
                    - role: user
                      parts:
                        - text: Transform this into a watercolor painting
                        - inline_data:
                            mime_type: image/jpeg
                            data: <base64-encoded-image>
                  generationConfig:
                    responseModalities:
                      - TEXT
                      - IMAGE
              multi-image-composition:
                summary: Multi-Image Composition
                value:
                  contents:
                    - role: user
                      parts:
                        - text: Blend these images into one scene
                        - inline_data:
                            mime_type: image/jpeg
                            data: <base64-image-1>
                        - inline_data:
                            mime_type: image/jpeg
                            data: <base64-image-2>
                  generationConfig:
                    responseModalities:
                      - TEXT
                      - IMAGE
              force-image-only:
                summary: Force Image Output
                value:
                  contents:
                    - role: user
                      parts:
                        - text: A photo-realistic sunset over the ocean
                  generationConfig:
                    responseModalities:
                      - IMAGE
                    imageConfig:
                      aspectRatio: '16:9'
                      imageSize: 2K
              with-thinking:
                summary: With Thinking (3.1 Flash)
                value:
                  contents:
                    - role: user
                      parts:
                        - text: >-
                            A futuristic city inside a glass bottle floating in
                            space
                  generationConfig:
                    responseModalities:
                      - IMAGE
                    thinkingConfig:
                      thinkingLevel: high
                      includeThoughts: true
              with-search-grounding:
                summary: With Google Search Grounding
                value:
                  contents:
                    - role: user
                      parts:
                        - text: >-
                            Visualize the current weather forecast for San
                            Francisco as a chart
                  generationConfig:
                    responseModalities:
                      - TEXT
                      - IMAGE
                  tools:
                    - google_search: {}
      responses:
        '200':
          description: Success
          content:
            application/json:
              schema:
                type: object
                properties:
                  candidates:
                    type: array
                    items:
                      type: object
                      properties:
                        content:
                          type: object
                          properties:
                            role:
                              type: string
                              description: Always `model` for responses.
                            parts:
                              type: array
                              description: >-
                                Response parts — may contain text, images,
                                intermediate thought images, or a mix of these.
                                Skip image parts where `thought` is `true` when
                                saving the final output.
                              items:
                                type: object
                                properties:
                                  text:
                                    type: string
                                    description: Text content from the model.
                                  inlineData:
                                    type: object
                                    description: Generated image data.
                                    properties:
                                      mimeType:
                                        type: string
                                        description: Image MIME type, typically `image/png`.
                                      data:
                                        type: string
                                        description: Base64-encoded image data.
                                  thought:
                                    type: boolean
                                    description: >-
                                      Whether this part is an intermediate
                                      thought part rather than final user-facing
                                      output. Skip generated image parts where
                                      this value is `true`.
                                  thoughtSignature:
                                    type: string
                                    description: >-
                                      Provider signature metadata for
                                      thought-related response parts. Do not use
                                      this as the primary final-image selector;
                                      use `thought !== true` and save the last
                                      remaining image part.
                        finishReason:
                          type: string
                          description: Reason generation stopped.
                          enum:
                            - STOP
                            - MAX_TOKENS
                            - SAFETY
                            - RECITATION
                        index:
                          type: integer
                        safetyRatings:
                          type: array
                          items:
                            type: object
                            properties:
                              category:
                                type: string
                              probability:
                                type: string
                  usageMetadata:
                    type: object
                    properties:
                      promptTokenCount:
                        type: integer
                      candidatesTokenCount:
                        type: integer
                      totalTokenCount:
                        type: integer
                      thoughtsTokenCount:
                        type: integer
                        description: >-
                          Token count for thinking process (Gemini 3 models
                          only).
      x-codeSamples:
        - lang: Python
          label: Text to Image
          source: |
            import os
            from google import genai
            from google.genai import types

            client = genai.Client(
                http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
                api_key=os.environ.get("COMETAPI_KEY"),
            )

            response = client.models.generate_content(
                model="gemini-3.1-flash-image-preview",
                contents="A Monarch butterfly anatomical sketch on textured parchment",
                config=types.GenerateContentConfig(
                    response_modalities=["TEXT", "IMAGE"],
                    image_config=types.ImageConfig(aspect_ratio="1:1", image_size="4K"),
                ),
            )

            final_image = None
            for part in response.parts:
                if getattr(part, "thought", False):
                    continue
                if part.text:
                    print(part.text)
                elif image := part.as_image():
                    final_image = image

            if final_image:
                final_image.save("output.png")
        - lang: Python
          label: Image to Image
          source: |
            import os
            from google import genai
            from google.genai import types
            from PIL import Image

            client = genai.Client(
                http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
                api_key=os.environ.get("COMETAPI_KEY"),
            )

            source_image = Image.open("source.jpg")

            response = client.models.generate_content(
                model="gemini-3.1-flash-image-preview",
                contents=["Transform this into a watercolor painting", source_image],
                config=types.GenerateContentConfig(
                    response_modalities=["TEXT", "IMAGE"],
                ),
            )

            final_image = None
            for part in response.parts:
                if getattr(part, "thought", False):
                    continue
                if image := part.as_image():
                    final_image = image

            if final_image:
                final_image.save("output.png")
        - lang: Python
          label: Multi-turn Chat
          source: |
            import os
            from google import genai
            from google.genai import types

            client = genai.Client(
                http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
                api_key=os.environ.get("COMETAPI_KEY"),
            )

            chat = client.chats.create(
                model="gemini-3.1-flash-image-preview",
                config=types.GenerateContentConfig(
                    response_modalities=["TEXT", "IMAGE"],
                ),
            )

            def save_final_image(response, path):
                final_image = None
                for part in response.parts:
                    if getattr(part, "thought", False):
                        continue
                    if image := part.as_image():
                        final_image = image
                if final_image:
                    final_image.save(path)

            # First turn: generate
            response = chat.send_message("Create an infographic explaining photosynthesis")
            save_final_image(response, "v1.png")

            # Second turn: refine
            response = chat.send_message("Translate this to Spanish, keep other elements unchanged")
            save_final_image(response, "v2.png")
        - lang: Python
          label: With Thinking
          source: |
            import os
            from google import genai
            from google.genai import types

            client = genai.Client(
                http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
                api_key=os.environ.get("COMETAPI_KEY"),
            )

            response = client.models.generate_content(
                model="gemini-3.1-flash-image-preview",
                contents="A futuristic city inside a glass bottle floating in space",
                config=types.GenerateContentConfig(
                    response_modalities=["IMAGE"],
                    thinking_config=types.ThinkingConfig(
                        thinking_level="HIGH",
                        include_thoughts=True,
                    ),
                ),
            )

            final_image = None
            for part in response.parts:
                if getattr(part, "thought", False):
                    continue
                if image := part.as_image():
                    final_image = image

            if final_image:
                final_image.save("output.png")
        - lang: Python
          label: With Search Grounding
          source: |
            import os
            from google import genai
            from google.genai import types

            client = genai.Client(
                http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
                api_key=os.environ["COMETAPI_KEY"],
            )

            response = client.models.generate_content(
                model="gemini-3.1-flash-image-preview",
                contents="Visualize the current weather in San Francisco as a chart",
                config=types.GenerateContentConfig(
                    response_modalities=["TEXT", "IMAGE"],
                    tools=[types.Tool(google_search=types.GoogleSearch())],
                ),
            )

            final_image = None
            for part in response.parts:
                if getattr(part, "thought", False):
                    continue
                if image := part.as_image():
                    final_image = image

            if final_image:
                final_image.save("weather.png")
        - lang: Python
          label: Force Image Only
          source: |
            import os
            from google import genai
            from google.genai import types

            client = genai.Client(
                http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
                api_key=os.environ["COMETAPI_KEY"],
            )

            response = client.models.generate_content(
                model="gemini-3.1-flash-image-preview",
                contents="A photo-realistic sunset over the ocean",
                config=types.GenerateContentConfig(
                    response_modalities=["IMAGE"],
                    image_config=types.ImageConfig(aspect_ratio="16:9", image_size="2K"),
                ),
            )

            final_image = None
            for part in response.parts:
                if getattr(part, "thought", False):
                    continue
                if image := part.as_image():
                    final_image = image

            if final_image:
                final_image.save("sunset.png")
        - lang: JavaScript
          label: Text to Image
          source: |
            import { GoogleGenAI } from "@google/genai";
            import * as fs from "fs";

            const ai = new GoogleGenAI({
              apiKey: process.env.COMETAPI_KEY,
              httpOptions: { apiVersion: "v1beta", baseUrl: "https://api.cometapi.com" },
            });

            const response = await ai.models.generateContent({
              model: "gemini-3.1-flash-image-preview",
              contents: "A Monarch butterfly anatomical sketch on textured parchment",
              config: {
                responseModalities: ["TEXT", "IMAGE"],
                imageConfig: { aspectRatio: "1:1", imageSize: "4K" },
              },
            });

            let finalImagePart;
            for (const part of response.candidates[0].content.parts) {
              if (part.thought === true) {
                continue;
              }
              if (part.text) {
                console.log(part.text);
              }
              if (part.inlineData) {
                finalImagePart = part;
              }
            }

            if (finalImagePart) {
              fs.writeFileSync("output.png", Buffer.from(finalImagePart.inlineData.data, "base64"));
            }
        - lang: JavaScript
          label: Image to Image
          source: |
            import { GoogleGenAI } from "@google/genai";
            import * as fs from "fs";

            const ai = new GoogleGenAI({
              apiKey: process.env.COMETAPI_KEY,
              httpOptions: { apiVersion: "v1beta", baseUrl: "https://api.cometapi.com" },
            });

            const imageData = fs.readFileSync("source.jpg").toString("base64");

            const response = await ai.models.generateContent({
              model: "gemini-3.1-flash-image-preview",
              contents: [
                { text: "Transform this into a watercolor painting" },
                { inlineData: { mimeType: "image/jpeg", data: imageData } },
              ],
              config: { responseModalities: ["TEXT", "IMAGE"] },
            });

            const imageParts = response.candidates[0].content.parts.filter(
              (part) => part.inlineData && part.thought !== true,
            );
            const finalImagePart = imageParts.at(-1);

            if (finalImagePart) {
              fs.writeFileSync("output.png", Buffer.from(finalImagePart.inlineData.data, "base64"));
            }
        - lang: JavaScript
          label: Multi-turn Chat
          source: |
            import { GoogleGenAI } from "@google/genai";
            import * as fs from "fs";

            const ai = new GoogleGenAI({
              apiKey: process.env.COMETAPI_KEY,
              httpOptions: { apiVersion: "v1beta", baseUrl: "https://api.cometapi.com" },
            });

            const chat = ai.chats.create({
              model: "gemini-3.1-flash-image-preview",
              config: { responseModalities: ["TEXT", "IMAGE"] },
            });

            function saveFinalImage(response, path) {
              const imageParts = response.candidates[0].content.parts.filter(
                (part) => part.inlineData && part.thought !== true,
              );
              const finalImagePart = imageParts.at(-1);
              if (finalImagePart) {
                fs.writeFileSync(path, Buffer.from(finalImagePart.inlineData.data, "base64"));
              }
            }

            // First turn: generate
            const r1 = await chat.sendMessage("Create an infographic explaining photosynthesis");
            saveFinalImage(r1, "v1.png");

            // Second turn: refine
            const r2 = await chat.sendMessage("Translate this to Spanish, keep other elements unchanged");
            saveFinalImage(r2, "v2.png");
        - lang: JavaScript
          label: With Search Grounding
          source: |
            import { GoogleGenAI } from "@google/genai";
            import fs from "node:fs";

            const ai = new GoogleGenAI({
                apiKey: process.env.COMETAPI_KEY,
                httpOptions: { apiVersion: "v1beta", baseUrl: "https://api.cometapi.com" },
            });

            const response = await ai.models.generateContent({
                model: "gemini-3.1-flash-image-preview",
                contents: "Visualize the current weather in San Francisco as a chart",
                config: {
                    responseModalities: ["TEXT", "IMAGE"],
                    tools: [{ googleSearch: {} }],
                },
            });

            const imageParts = response.candidates[0].content.parts.filter(
                (part) => part.inlineData && part.thought !== true,
            );
            const finalImagePart = imageParts.at(-1);

            if (finalImagePart) {
                fs.writeFileSync("weather.png", Buffer.from(finalImagePart.inlineData.data, "base64"));
            }
        - lang: JavaScript
          label: With Thinking
          source: |
            import { GoogleGenAI } from "@google/genai";
            import fs from "node:fs";

            const ai = new GoogleGenAI({
                apiKey: process.env.COMETAPI_KEY,
                httpOptions: { apiVersion: "v1beta", baseUrl: "https://api.cometapi.com" },
            });

            const response = await ai.models.generateContent({
                model: "gemini-3.1-flash-image-preview",
                contents: "A futuristic city inside a glass bottle floating in space",
                config: {
                    responseModalities: ["IMAGE"],
                    thinkingConfig: { thinkingLevel: "HIGH", includeThoughts: true },
                },
            });

            const imageParts = response.candidates[0].content.parts.filter(
                (part) => part.inlineData && part.thought !== true,
            );
            const finalImagePart = imageParts.at(-1);

            if (finalImagePart) {
                fs.writeFileSync("city.png", Buffer.from(finalImagePart.inlineData.data, "base64"));
            }
        - lang: JavaScript
          label: Force Image Only
          source: |
            import { GoogleGenAI } from "@google/genai";
            import fs from "node:fs";

            const ai = new GoogleGenAI({
                apiKey: process.env.COMETAPI_KEY,
                httpOptions: { apiVersion: "v1beta", baseUrl: "https://api.cometapi.com" },
            });

            const response = await ai.models.generateContent({
                model: "gemini-3.1-flash-image-preview",
                contents: "A photo-realistic sunset over the ocean",
                config: {
                    responseModalities: ["IMAGE"],
                    imageConfig: { aspectRatio: "16:9", imageSize: "2K" },
                },
            });

            const imageParts = response.candidates[0].content.parts.filter(
                (part) => part.inlineData && part.thought !== true,
            );
            const finalImagePart = imageParts.at(-1);

            if (finalImagePart) {
                fs.writeFileSync("sunset.png", Buffer.from(finalImagePart.inlineData.data, "base64"));
            }
        - lang: Shell
          label: Text to Image
          source: |
            curl -s -X POST \
              "https://api.cometapi.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
              -H "x-goog-api-key: $COMETAPI_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "contents": [{"parts": [{"text": "A Monarch butterfly anatomical sketch on textured parchment"}]}],
                "generationConfig": {
                  "responseModalities": ["TEXT", "IMAGE"],
                  "imageConfig": {"aspectRatio": "1:1", "imageSize": "4K"}
                }
              }'
        - lang: Shell
          label: Force Image Only
          source: |
            curl -s -X POST \
              "https://api.cometapi.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
              -H "x-goog-api-key: $COMETAPI_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "contents": [{"parts": [{"text": "A photo-realistic sunset over the ocean"}]}],
                "generationConfig": {
                  "responseModalities": ["IMAGE"],
                  "imageConfig": {"aspectRatio": "16:9", "imageSize": "2K"}
                }
              }'
        - lang: Shell
          label: With Search Grounding
          source: |
            curl -s -X POST \
              "https://api.cometapi.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
              -H "x-goog-api-key: $COMETAPI_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "contents": [{"parts": [{"text": "Visualize the current weather in San Francisco as a chart"}]}],
                "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
                "tools": [{"google_search": {}}]
              }'
        - lang: Shell
          label: Image to Image
          source: |
            IMAGE_B64=$(base64 < source.jpg | tr -d '\n')

            curl -s -X POST \
              "https://api.cometapi.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
              -H "x-goog-api-key: $COMETAPI_KEY" \
              -H "Content-Type: application/json" \
              --data-binary @- <<EOF
            {
              "contents": [{
                "parts": [
                  {"text": "Transform this into a watercolor painting"},
                  {"inlineData": {"mimeType": "image/jpeg", "data": "${IMAGE_B64}"}}
                ]
              }],
              "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]}
            }
            EOF
        - lang: Shell
          label: Multi-turn Chat
          source: |
            curl -s -X POST \
              "https://api.cometapi.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
              -H "x-goog-api-key: $COMETAPI_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "contents": [
                  {"role": "user", "parts": [{"text": "Create an infographic explaining photosynthesis"}]},
                  {"role": "model", "parts": [{"text": "Here is the infographic."}]},
                  {"role": "user", "parts": [{"text": "Translate this to Spanish, keep other elements unchanged"}]}
                ],
                "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]}
              }'
        - lang: Shell
          label: With Thinking
          source: |
            curl -s -X POST \
              "https://api.cometapi.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
              -H "x-goog-api-key: $COMETAPI_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                "contents": [{"parts": [{"text": "A futuristic city inside a glass bottle floating in space"}]}],
                "generationConfig": {
                  "responseModalities": ["IMAGE"],
                  "thinkingConfig": {"thinkingLevel": "HIGH", "includeThoughts": true}
                }
              }'
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: Use your CometAPI API key.

````
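
The request schema above can also be assembled without an SDK. The following is a sketch under stated assumptions: `build_payload` and `strip_data_uri` are hypothetical helper names, the aspect-ratio and size sets are copied from the `aspectRatio` and `imageSize` descriptions, and the model-name prefix check is only an illustrative way to decide when the extra Gemini 3.1 Flash ratios apply. It builds the JSON body only; sending it is left to the HTTP client of your choice.

```python
import json

# Values copied from the schema descriptions above.
COMMON_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
FLASH_31_EXTRA_RATIOS = {"1:4", "4:1", "1:8", "8:1"}
IMAGE_SIZES = {"512px", "1K", "2K", "4K"}


def strip_data_uri(data: str) -> str:
    """Drop a leading `data:image/...;base64,` prefix if one is present."""
    if data.startswith("data:") and "," in data:
        return data.split(",", 1)[1]
    return data


def build_payload(prompt, images=(), aspect_ratio="1:1", image_size=None,
                  model="gemini-3.1-flash-image-preview"):
    """Build a generateContent request body (hypothetical helper).

    `images` is an iterable of (mime_type, base64_string) pairs.
    """
    allowed = set(COMMON_RATIOS)
    # Assumption: the extra ratios apply only to Gemini 3.1 Flash image models.
    if model.startswith("gemini-3.1-flash"):
        allowed |= FLASH_31_EXTRA_RATIOS
    if aspect_ratio not in allowed:
        raise ValueError(f"aspect ratio {aspect_ratio!r} not supported for {model}")
    if image_size is not None and image_size not in IMAGE_SIZES:
        raise ValueError(f"image size {image_size!r} is not one of {sorted(IMAGE_SIZES)}")

    parts = [{"text": prompt}]
    for mime_type, b64 in images:
        parts.append({"inline_data": {"mime_type": mime_type,
                                      "data": strip_data_uri(b64)}})

    image_config = {"aspectRatio": aspect_ratio}
    if image_size is not None:
        image_config["imageSize"] = image_size

    return {
        "contents": [{"role": "user", "parts": parts}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            "imageConfig": image_config,
        },
    }


payload = build_payload(
    "Blend these images into one scene",
    images=[("image/png", "data:image/png;base64,QUJD")],
    aspect_ratio="16:9",
    image_size="2K",
)
body = json.dumps(payload)  # ready to send as the POST body
```

Validating before the request is sent surfaces typos such as `2k` or an unsupported ratio locally instead of as a provider error. The allowed values can change as Google updates the models, so treat the sets as a snapshot of this page rather than a fixed list.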