Gemini Content Generation - CometAPI Documentation

POST /v1beta/models/{model}:{operator}

from google import genai

client = genai.Client(
    api_key="<COMETAPI_KEY>",
    http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
)

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain how AI works in a few words",
)

print(response.text)

{
  "candidates": [
    {
      "content": {
        "role": "<string>",
        "parts": [
          {
            "text": "<string>",
            "functionCall": {
              "name": "<string>",
              "args": {}
            },
            "inlineData": {
              "mimeType": "<string>",
              "data": "<string>"
            },
            "thought": true
          }
        ]
      },
      "safetyRatings": [
        {
          "category": "<string>",
          "probability": "<string>",
          "blocked": true
        }
      ],
      "citationMetadata": {
        "citationSources": [
          {
            "startIndex": 123,
            "endIndex": 123,
            "uri": "<string>",
            "license": "<string>"
          }
        ]
      },
      "tokenCount": 123,
      "avgLogprobs": 123,
      "groundingMetadata": {
        "groundingChunks": [
          {
            "web": {
              "uri": "<string>",
              "title": "<string>"
            }
          }
        ],
        "groundingSupports": [
          {
            "groundingChunkIndices": [
              123
            ],
            "confidenceScores": [
              123
            ],
            "segment": {
              "startIndex": 123,
              "endIndex": 123,
              "text": "<string>"
            }
          }
        ],
        "webSearchQueries": [
          "<string>"
        ]
      },
      "index": 123
    }
  ],
  "promptFeedback": {
    "safetyRatings": [
      {
        "category": "<string>",
        "probability": "<string>",
        "blocked": true
      }
    ]
  },
  "usageMetadata": {
    "promptTokenCount": 123,
    "candidatesTokenCount": 123,
    "totalTokenCount": 123,
    "trafficType": "<string>",
    "thoughtsTokenCount": 123,
    "promptTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "<string>",
        "tokenCount": 123
      }
    ]
  },
  "modelVersion": "<string>",
  "createTime": "<string>",
  "responseId": "<string>"
}

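CometAPI supports the Gemini native API format, so you can make full use of Gemini-specific features such as thinking control, Google Search grounding, and the native image generation modality. Use this endpoint when you need features that the OpenAI-compatible chat endpoint does not provide.

Authentication accepts either the x-goog-api-key header or the Authorization: Bearer header.

Quick start

To use the Gemini SDK or an HTTP client with CometAPI, replace the base URL and the API key: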

Setting	Google Default	CometAPI
Base URL	`generativelanguage.googleapis.com`	`api.cometapi.com`
API Key	`$GEMINI_API_KEY`	`$COMETAPI_KEY`
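
For plain HTTP clients, the substitution in the table comes down to changing the host and the key. The following is a minimal sketch using only Python's standard library. The helper name `build_request` is hypothetical, and the request is built but not sent:

```python
import json
import urllib.request

BASE_URL = "https://api.cometapi.com"  # replaces generativelanguage.googleapis.com


def build_request(model: str, operator: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a Gemini-native request aimed at CometAPI."""
    url = f"{BASE_URL}/v1beta/models/{model}:{operator}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # Authorization: Bearer <key> is also supported
        },
        method="POST",
    )


req = build_request(
    "gemini-3-flash-preview",
    "generateContent",
    "<COMETAPI_KEY>",  # placeholder, substitute a real key before sending
    {"contents": [{"parts": [{"text": "Explain how AI works in a few words"}]}]},
)
print(req.full_url)
```

With a real key in place, the request could be sent with `urllib.request.urlopen(req)` and the body decoded with `json.load`.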

Configure thinking (reasoning)

Gemini models can perform internal reasoning before generating a response. How you control it depends on the model generation.

Gemini 3 (thinkingLevel)

Gemini 3 models use thinkingLevel to control reasoning depth. Available levels: MINIMAL, LOW, MEDIUM, HIGH. Unless you specifically need a different Gemini 3 variant, use gemini-3-flash-preview as the default example model.

curl "https://api.cometapi.com/v1beta/models/gemini-3-flash-preview:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Explain quantum physics simply."}]}],
    "generationConfig": {
      "thinkingConfig": {"thinkingLevel": "LOW"}
    }
  }'

Gemini 2.5 (thinkingBudget)

Gemini 2.5 models use thinkingBudget for finer-grained, token-level control:

0: thinking disabled
-1: dynamic (the model decides; default)
> 0: a specific token budget (for example, 1024 or 2048)

curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Solve this logic puzzle step by step."}]}],
    "generationConfig": {
      "thinkingConfig": {"thinkingBudget": 2048}
    }
  }'

Using thinkingLevel with a Gemini 2.5 model, or thinkingBudget with a Gemini 3 model, may cause an error. Use the parameter that matches your model version.
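
One way to avoid mixing the two parameters is to pick the thinkingConfig field from the model ID before building the request. The sketch below is illustrative only: `thinking_config` is a hypothetical helper, and inferring the generation from the model ID prefix is an assumption of this example, not a rule of the API.

```python
def thinking_config(model: str, *, level: str = "LOW", budget: int = -1) -> dict:
    """Return a thinkingConfig dict that matches the model generation.

    Assumption: the generation is inferred from the model ID prefix.
    """
    if model.startswith("gemini-3"):
        if level not in {"MINIMAL", "LOW", "MEDIUM", "HIGH"}:
            raise ValueError(f"unknown thinkingLevel: {level}")
        return {"thinkingLevel": level}
    if model.startswith("gemini-2.5"):
        if budget < -1:
            raise ValueError("thinkingBudget must be -1, 0, or a positive token count")
        return {"thinkingBudget": budget}
    raise ValueError(f"no thinking parameter known for model: {model}")


model = "gemini-2.5-flash"
body = {
    "contents": [{"parts": [{"text": "Solve this logic puzzle step by step."}]}],
    "generationConfig": {"thinkingConfig": thinking_config(model, budget=2048)},
}
print(body["generationConfig"])
```

Failing early on the client keeps a mismatched parameter from reaching the endpoint at all.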

Stream responses

To receive Server-Sent Events while the model generates content, use streamGenerateContent?alt=sse as the operator. Each SSE event contains a data: line that holds a JSON GenerateContentResponse object.

curl "https://api.cometapi.com/v1beta/models/gemini-3-flash-preview:streamGenerateContent?alt=sse" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  --no-buffer \
  -d '{
    "contents": [{"parts": [{"text": "Write a short poem about the stars"}]}]
  }'
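
On the client side, consuming the stream means picking out the data: lines and decoding each one as JSON. The sketch below assumes each event's JSON fits on a single data: line, as described above. `iter_sse_json` is a hypothetical helper, and the sample lines are made up to match the response schema on this page:

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Hypothetical stream content, shaped like GenerateContentResponse chunks.
sample = [
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "Stars "}]}}]}',
    "",
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "shine."}]}}]}',
    "",
]

# Concatenate the text parts of every chunk in arrival order.
text = "".join(
    part.get("text", "")
    for chunk in iter_sse_json(sample)
    for cand in chunk.get("candidates", [])
    for part in cand.get("content", {}).get("parts", [])
)
print(text)  # Stars shine.
```

In a real client, `lines` would be the decoded lines of the HTTP response body rather than an in-memory list.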

Set system instructions

Use systemInstruction to guide the model's behavior across the entire conversation:

curl "https://api.cometapi.com/v1beta/models/gemini-3-flash-preview:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "What is 2+2?"}]}],
    "systemInstruction": {
      "parts": [{"text": "You are a math tutor. Always show your work."}]
    }
  }'

Request JSON output

To force structured JSON output, set responseMimeType to application/json. You can optionally provide a responseSchema for strict schema validation:

curl "https://api.cometapi.com/v1beta/models/gemini-3-flash-preview:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "List 3 planets with their distances from the sun"}]}],
    "generationConfig": {
      "responseMimeType": "application/json"
    }
  }'
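
The request above sets only responseMimeType. As a sketch of the optional responseSchema, the body below uses the Gemini API's OpenAPI-style schema with uppercase type names. The property names `name` and `distance_au`, the helper `parse_json_output`, and the sample response are illustrative assumptions, not output from the endpoint:

```python
import json

body = {
    "contents": [{"parts": [{"text": "List 3 planets with their distances from the sun"}]}],
    "generationConfig": {
        "responseMimeType": "application/json",
        "responseSchema": {
            "type": "ARRAY",
            "items": {
                "type": "OBJECT",
                "properties": {
                    "name": {"type": "STRING"},
                    "distance_au": {"type": "NUMBER"},
                },
                "required": ["name", "distance_au"],
            },
        },
    },
}


def parse_json_output(response: dict):
    """Decode the JSON document carried in the first candidate's text parts."""
    parts = response["candidates"][0]["content"]["parts"]
    text = "".join(p.get("text", "") for p in parts if not p.get("thought"))
    return json.loads(text)


# Hypothetical response, used only to exercise the parser.
sample_response = {
    "candidates": [
        {
            "content": {
                "role": "model",
                "parts": [{"text": '[{"name": "Mercury", "distance_au": 0.39}]'}],
            }
        }
    ]
}
print(parse_json_output(sample_response)[0]["name"])  # Mercury
```

The structured result arrives as a JSON string inside a text part, so it still needs a `json.loads` step on the client.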

Ground with Google Search

To enable real-time web search, add the Google Search tool, written as google_search in the request below:

curl "https://api.cometapi.com/v1beta/models/gemini-3-flash-preview:generateContent" \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: $COMETAPI_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Who won the euro 2024?"}]}],
    "tools": [{"google_search": {}}]
  }'

The response includes groundingMetadata, which contains source URLs and confidence scores.
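
A short sketch of reading those fields, based on the groundingMetadata shape in the response schema on this page (groundingChunks[].web and groundingSupports[].confidenceScores). The helper `grounding_sources` is hypothetical, and the sample values and URLs are placeholders:

```python
def grounding_sources(response: dict) -> list[tuple[str, str]]:
    """Collect (title, uri) pairs from every candidate's groundingChunks."""
    sources = []
    for cand in response.get("candidates", []):
        meta = cand.get("groundingMetadata", {})
        for chunk in meta.get("groundingChunks", []):
            web = chunk.get("web")
            if web:
                sources.append((web.get("title", ""), web.get("uri", "")))
    return sources


# Hypothetical response fragment; titles, URLs, and scores are placeholders.
sample = {
    "candidates": [
        {
            "content": {"role": "model", "parts": [{"text": "..."}]},
            "groundingMetadata": {
                "groundingChunks": [
                    {"web": {"uri": "https://example.com/a", "title": "Example A"}},
                    {"web": {"uri": "https://example.com/b", "title": "Example B"}},
                ],
                "groundingSupports": [
                    {
                        "groundingChunkIndices": [0, 1],
                        "confidenceScores": [0.9, 0.7],
                        "segment": {"startIndex": 0, "endIndex": 3, "text": "..."},
                    }
                ],
                "webSearchQueries": ["euro 2024 winner"],
            },
        }
    ],
}

for title, uri in grounding_sources(sample):
    print(f"{title}: {uri}")
```

Each entry in groundingChunkIndices refers back to a position in groundingChunks, which is how a text segment can be tied to its sources.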

Response example

A typical response returned by CometAPI's Gemini endpoint looks like this:

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [{"text": "Hello"}]
      },
      "finishReason": "STOP",
      "avgLogprobs": -0.0023
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 5,
    "candidatesTokenCount": 1,
    "totalTokenCount": 30,
    "trafficType": "ON_DEMAND",
    "thoughtsTokenCount": 24,
    "promptTokensDetails": [{"modality": "TEXT", "tokenCount": 5}],
    "candidatesTokensDetails": [{"modality": "TEXT", "tokenCount": 1}]
  },
  "modelVersion": "gemini-3-flash-preview",
  "createTime": "2026-03-25T04:21:43.756483Z",
  "responseId": "CeynaY3LDtvG4_UP0qaCuQY"
}

The thoughtsTokenCount field in usageMetadata shows how many tokens the model spent on internal reasoning, even when the thinking output is not included in the response.
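
In the sample above, the prompt, candidate, and thought counts add up to totalTokenCount (5 + 1 + 24 = 30). The sketch below reads the visible text and that breakdown from a trimmed copy of the sample. The helpers `visible_text` and `token_breakdown` are hypothetical, and the sum is an observation about this sample rather than a documented guarantee:

```python
def visible_text(response: dict) -> str:
    """Join the text parts of the first candidate, skipping thought parts."""
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(p.get("text", "") for p in parts if not p.get("thought"))


def token_breakdown(usage: dict) -> dict:
    """Split usageMetadata into prompt, output, and thinking token counts."""
    prompt = usage.get("promptTokenCount", 0)
    output = usage.get("candidatesTokenCount", 0)
    thoughts = usage.get("thoughtsTokenCount", 0)
    return {
        "prompt": prompt,
        "output": output,
        "thoughts": thoughts,
        "sum": prompt + output + thoughts,
    }


# Trimmed copy of the sample response shown above.
response = {
    "candidates": [
        {"content": {"role": "model", "parts": [{"text": "Hello"}]}, "finishReason": "STOP"}
    ],
    "usageMetadata": {
        "promptTokenCount": 5,
        "candidatesTokenCount": 1,
        "totalTokenCount": 30,
        "thoughtsTokenCount": 24,
    },
}

print(visible_text(response))  # Hello
print(token_breakdown(response["usageMetadata"])["sum"])  # 30
```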

Comparison with the OpenAI-compatible endpoint

Feature	Gemini native (`/v1beta/models/...`)	OpenAI-compatible (`/v1/chat/completions`)
Thinking control	`thinkingLevel` / `thinkingBudget` in `thinkingConfig`	Not available
Google Search grounding	`tools: [{"google_search": {}}]`	Not available
Google Maps grounding	`tools: [{"googleMaps": {}}]`	Not available
Image generation modality	`responseModalities: ["IMAGE"]`	Not available
Authentication header	`x-goog-api-key` or `Bearer`	`Bearer` only
Response format	Gemini native (`candidates`, `parts`)	OpenAI format (`choices`, `message`)

Authentication

x-goog-api-key

string

header

required

Your CometAPI key passed via the x-goog-api-key header. Bearer token authentication (Authorization: Bearer <key>) is also supported.

Path parameters

model

string

required

Gemini model ID. Example: gemini-3-flash-preview, gemini-2.5-pro. See the Models page for current options.

operator

enum<string>

required

The operation to perform. Use generateContent for synchronous responses, or streamGenerateContent?alt=sse for Server-Sent Events streaming.

Available options: generateContent, streamGenerateContent?alt=sse

Body

application/json

contents

object[]

systemInstruction

object

System instructions that guide the model's behavior across the entire conversation. Text only.

tools

object[]

Tools the model may use to generate responses. Supports function declarations, Google Search, Google Maps, and code execution.

toolConfig

object

Configuration for tool usage, such as function calling mode.

safetySettings

object[]

Safety filter settings. Override default thresholds for specific harm categories.

generationConfig

object

Configuration for model generation behavior including temperature, output length, and response format.

cachedContent

string

The name of cached content to use as context. Format: cachedContents/{id}. See the Gemini context caching documentation for details.

Response

200 - application/json

Successful response. For streaming requests, the response is a stream of SSE events, each containing a GenerateContentResponse JSON object prefixed with data:.

candidates

object[]

The generated response candidates.

promptFeedback

object

Feedback on the prompt, including safety blocking information.

usageMetadata

object

Token usage statistics for the request.

modelVersion

string

The model version that generated this response.

createTime

string

The timestamp when this response was created (ISO 8601 format).

responseId

string

Unique identifier for this response.
