Guide to calling gemini-2.5-flash-image (Nano Banana)

This document shows how to call Google Gemini's image model, gemini-2.5-flash-image (also available as gemini-2.5-flash-image-preview), through cometapi to generate images. It covers three uses of Gemini's official generateContent API:

Text-to-image generation

Image-to-image generation (both input and output are Base64)

Multi-image-to-image generation (Base64 input and output)

Important Notes:

Replace sk-xxxx in the examples with your cometapi key. For security, do not expose your key in client-side code or public repositories.

The Authorization header for cometapi typically uses the key value directly, e.g., Authorization: sk-xxxx.

The returned image is usually provided as a Base64-encoded inline_data part (the key may also appear as inlineData, depending on the response). Decode it on the client side and save it as a file.

Basic Information:

Base URL: https://api.cometapi.com

Model Name: gemini-2.5-flash-image-preview / gemini-2.5-flash-image
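As a minimal sketch of how the base URL, model name, and key fit together, the Python helpers below build a request target and headers. The `/v1beta/models/{model}:generateContent` path follows Google's REST convention and is an assumption here, as are the helper names; confirm the exact path in the cometapi documentation.

```python
BASE_URL = "https://api.cometapi.com"


def build_endpoint(model: str, base_url: str = BASE_URL) -> str:
    """Return the generateContent URL for a model.

    The /v1beta/models/... path is assumed from Google's REST convention.
    """
    return f"{base_url.rstrip('/')}/v1beta/models/{model}:generateContent"


def build_headers(api_key: str) -> dict:
    """Pass the key value directly in Authorization, as the notes above describe."""
    return {"Authorization": api_key, "Content-Type": "application/json"}
```

In practice the key would be loaded from an environment variable or secret store rather than written into source code.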

I. Gemini's Official `generateContent` for Text-to-Image

Use Gemini's official generateContent endpoint for text-to-image generation. Place the text prompt in contents.parts[].text.
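The request body can be sketched in Python as follows. This is an illustrative sketch using only the standard library, not the guide's original example: the function names are hypothetical, and the endpoint URL is passed in by the caller rather than assumed.

```python
import json
import urllib.request


def text_to_image_payload(prompt: str) -> dict:
    """Place the text prompt in contents[].parts[].text."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def post_json(url: str, api_key: str, payload: dict, timeout: float = 120.0) -> dict:
    """POST the payload and return the parsed JSON response.

    This performs a real network call when invoked.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `post_json(url, "sk-xxxx", text_to_image_payload("a watercolor fox"))`, where `url` is the generateContent endpoint for the chosen model.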

Example (Windows shell, using ^ for line continuation):

Response Highlights:

The image data is typically found in response.candidates[0].content.parts, which can contain:

Text description:

{ "text": "..." }

Image data:

{ "inline_data": { "mime_type": "image/png", "data": "<base64>" } }

Decode the data field (the Base64 string) and save it as a file with the corresponding extension.
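A hedged sketch of walking the parts list in Python is shown below. It assumes the response has already been parsed into a dict, and it accepts both the snake_case keys shown above and the camelCase spelling (inlineData, mimeType) that some responses use. The function name is hypothetical.

```python
def split_parts(response: dict):
    """Return (texts, images) from candidates[0].content.parts.

    images is a list of (mime_type, base64_string) tuples, still encoded.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return [], []
    parts = candidates[0].get("content", {}).get("parts", [])
    texts, images = [], []
    for part in parts:
        if "text" in part:
            texts.append(part["text"])
        # Accept either key spelling for the image part.
        inline = part.get("inline_data") or part.get("inlineData")
        if inline:
            mime = inline.get("mime_type") or inline.get("mimeType")
            images.append((mime, inline["data"]))
    return texts, images
```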

II. Gemini's Official `generateContent` for Image-to-Image (Base64 I/O)

This endpoint supports "image-to-image" generation: upload an input image (as Base64) and receive a modified new image (also in Base64 format).

Example:

Description:

First, convert your source image file into a Base64 string and place it in inline_data.data. Do not include prefixes like data:image/jpeg;base64,.

The output is also located in candidates[0].content.parts and includes:

An optional text part (description or prompt).

The image part as inline_data (where data is the Base64 of the output image).
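The input preparation described above can be sketched in Python as follows: encode the raw image bytes, strip any data-URI prefix, and put the result in inline_data.data next to the text instruction. The helper names and the default MIME type are illustrative assumptions.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Base64-encode raw image bytes as an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def to_plain_base64(value: str) -> str:
    """Drop a 'data:<mime>;base64,' prefix if one is present."""
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    return value.strip()


def image_to_image_payload(prompt: str, image_b64: str, mime_type: str = "image/jpeg") -> dict:
    """Build a request body with one text part and one inline image part."""
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": to_plain_base64(image_b64)}},
                ]
            }
        ]
    }
```

To encode a file, read it in binary mode first, for example `encode_image_bytes(open("input.jpg", "rb").read())`.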

For multiple images, you can append them directly, for example:

III. Gemini's Official `generateContent` for Multi-Image-to-Image (Base64 I/O)

This endpoint supports "multi-image to image" generation: upload multiple input images (Base64) and it will return a new, modified image (also in Base64 format).

Method 1: Combine multiple images into a single collage, as shown in the example below

Example input description:

A model is posing and leaning against a pink bmw. She is wearing the following items, the scene is against a light grey background. The green alien is a keychain and it's attached to the pink handbag. The model also has a pink parrot on her shoulder. There is a pug sitting next to her wearing a pink collar and gold headphones.

Returned Base64 converted back to an image:

Example:

Notes:

First, convert your source image file to a Base64 string and insert it into inline_data.data (do not include prefixes like data:image/jpeg;base64,).

The output is also located in candidates[0].content.parts and contains:

An optional text part (description or prompt)

An image part inline_data (where data is the Base64 of the output image)

Method 2: Pass multiple images via Base64 parameters (up to three images)

Example:
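A minimal Python sketch of building such a request body, assuming each image is sent as its own inline_data part after the text prompt. The three-image cap comes from the method description above; the function and constant names are hypothetical.

```python
import base64

MAX_IMAGES = 3  # limit stated for Method 2 in this guide


def multi_image_payload(prompt: str, images: list) -> dict:
    """Build a request body from a prompt and a list of (mime_type, raw_bytes)."""
    if not 1 <= len(images) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} images, got {len(images)}")
    parts = [{"text": prompt}]
    for mime_type, raw in images:
        parts.append(
            {
                "inline_data": {
                    "mime_type": mime_type,
                    # Plain Base64 only, with no data-URI prefix.
                    "data": base64.b64encode(raw).decode("ascii"),
                }
            }
        )
    return {"contents": [{"parts": parts}]}
```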

API Input Parameters:

Example of returned image:

How to Extract and Save the Base64 Image from the Response

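A Gemini-style response has roughly the following structure (simplified for illustration). The keys may appear in camelCase (inlineData, mimeType), as here, or in snake_case (inline_data, mime_type), as earlier in this guide, so handle both when parsing: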

{
  "candidates": [
    {
      "content": {
        "parts": [
          { "text": "..." },
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "<base64-string>"
            }
          }
        ]
      }
    }
  ]
}

Extract the inlineData.data string and choose the file extension based on its mimeType (e.g., .png, .jpg, .webp).

In your client application, decode the Base64 string and save it as a binary file.

Python Example:
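The sketch below is one way to do this with the standard library, not necessarily the guide's original example. It assumes the response is already a parsed dict, accepts both key spellings, and falls back to a .bin extension for MIME types outside its small, illustrative mapping. The function name and output file naming are hypothetical.

```python
import base64
from pathlib import Path

# Illustrative mapping; extend it for other MIME types as needed.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def save_images(response: dict, out_dir: str = ".", stem: str = "output") -> list:
    """Decode every inline image in candidates[0] and write it to out_dir."""
    saved = []
    parts = response["candidates"][0]["content"]["parts"]
    for part in parts:
        inline = part.get("inlineData") or part.get("inline_data")
        if not inline:
            continue  # skip text parts
        mime = inline.get("mimeType") or inline.get("mime_type") or ""
        ext = EXTENSIONS.get(mime, ".bin")
        path = Path(out_dir) / f"{stem}_{len(saved)}{ext}"
        path.write_bytes(base64.b64decode(inline["data"]))
        saved.append(path)
    return saved
```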

Node.js Example:

FAQ & Suggestions

Authorization Header Format

Use Authorization: sk-xxxx. Please refer to the official cometapi documentation for confirmation.

Prompt Optimization

Specifying style keywords (e.g., "cyberpunk, film grain, low contrast, high saturation"), aspect ratio (square/landscape/portrait), subject, background, lighting, and level of detail can help improve the results.
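As a small illustration, a prompt can be assembled from these elements with a helper like the one below. The field labels and their ordering are arbitrary choices for this sketch, not something the model requires.

```python
def build_prompt(subject, style=None, aspect_ratio=None, background=None,
                 lighting=None, detail=None) -> str:
    """Join the subject with whichever optional descriptors are provided."""
    pieces = [subject]
    optional = [
        ("style", style),
        ("aspect ratio", aspect_ratio),
        ("background", background),
        ("lighting", lighting),
        ("detail", detail),
    ]
    for label, value in optional:
        if value:
            pieces.append(f"{label}: {value}")
    return ", ".join(pieces)
```

For example, `build_prompt("a lighthouse at dusk", style="cyberpunk, film grain", aspect_ratio="landscape")` yields a single comma-separated prompt string.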

Base64 Note

Do not include a prefix like data:image/png;base64, in the data field; only include the pure Base64 data string.

Troubleshooting

4xx errors usually indicate issues with request parameters or authentication (check your key, model name, JSON format). 5xx errors are server-side problems (you can retry later or contact support).
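The distinction above can be sketched as a retry wrapper in Python: retry only on 5xx, with exponential backoff, and return 4xx responses immediately so the request can be fixed. The `send` callable, the attempt count, and the delays are assumptions for illustration. Note that this sketch does not retry 429 (rate limit) responses, which some services treat as retryable.

```python
import time


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse category."""
    if 200 <= status < 300:
        return "ok"
    if 400 <= status < 500:
        return "client_error"  # check key, model name, JSON format
    if 500 <= status < 600:
        return "server_error"  # retry later or contact support
    return "unexpected"


def call_with_retry(send, max_attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() -> (status, body), retrying only on server errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if classify_status(status) != "server_error" or attempt == max_attempts:
            return status, body
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```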

The above covers the methods and key points for using gemini-2.5-flash-image (or gemini-2.5-flash-image-preview) via cometapi. Choose the method that suits your needs, and decode and display the Base64 image on the client side.

Official API documentation for reference:

Image Generation

Image Understanding (Multimodal)
