Use the native Gemini API format through CometAPI for text generation, multimodal input, thinking/reasoning, function calling, grounding with Google Search, JSON mode, and streaming.
from google import genai

client = genai.Client(
    api_key="<COMETAPI_KEY>",
    http_options={"api_version": "v1beta", "base_url": "https://api.cometapi.com"},
)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain how AI works in a few words",
)
print(response.text)
{
"candidates": [
{
"content": {
"role": "<string>",
"parts": [
{
"text": "<string>",
"functionCall": {
"name": "<string>",
"args": {}
},
"inlineData": {
"mimeType": "<string>",
"data": "<string>"
},
"thought": true
}
]
},
"finishReason": "STOP",
"safetyRatings": [
{
"category": "<string>",
"probability": "<string>",
"blocked": true
}
],
"citationMetadata": {
"citationSources": [
{
"startIndex": 123,
"endIndex": 123,
"uri": "<string>",
"license": "<string>"
}
]
},
"tokenCount": 123,
"avgLogprobs": 123,
"groundingMetadata": {
"groundingChunks": [
{
"web": {
"uri": "<string>",
"title": "<string>"
}
}
],
"groundingSupports": [
{
"groundingChunkIndices": [
123
],
"confidenceScores": [
123
],
"segment": {
"startIndex": 123,
"endIndex": 123,
"text": "<string>"
}
}
],
"webSearchQueries": [
"<string>"
]
},
"index": 123
}
],
"promptFeedback": {
"blockReason": "SAFETY",
"safetyRatings": [
{
"category": "<string>",
"probability": "<string>",
"blocked": true
}
]
},
"usageMetadata": {
"promptTokenCount": 123,
"candidatesTokenCount": 123,
"totalTokenCount": 123,
"trafficType": "<string>",
"thoughtsTokenCount": 123,
"promptTokensDetails": [
{
"modality": "<string>",
"tokenCount": 123
}
],
"candidatesTokensDetails": [
{
"modality": "<string>",
"tokenCount": 123
}
]
},
"modelVersion": "<string>",
"createTime": "<string>",
"responseId": "<string>"
}
| Setting | Google default | CometAPI |
|---|---|---|
| Base URL | generativelanguage.googleapis.com | api.cometapi.com |
| API key | $GEMINI_API_KEY | $COMETAPI_KEY |
Both the x-goog-api-key header and Authorization: Bearer are supported for authentication.

Use thinkingLevel to control reasoning depth. Available levels: MINIMAL, LOW, MEDIUM, HIGH.

curl "https://api.cometapi.com/v1beta/models/gemini-3.1-pro-preview:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $COMETAPI_KEY" \
-d '{
"contents": [{"parts": [{"text": "Explain quantum physics simply."}]}],
"generationConfig": {
"thinkingConfig": {"thinkingLevel": "LOW"}
}
}'
Use thinkingBudget for finer, token-level control:

- 0: disable thinking
- -1: dynamic (the model decides; this is the default)
- > 0: a specific token budget (for example, 1024 or 2048)

curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $COMETAPI_KEY" \
-d '{
"contents": [{"parts": [{"text": "Solve this logic puzzle step by step."}]}],
"generationConfig": {
"thinkingConfig": {"thinkingBudget": 2048}
}
}'
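The budget semantics above can be sketched as a small request builder. `build_thinking_request` is a hypothetical helper for illustration, not part of any SDK:

```python
# Sketch: build a generateContent request body with a thinkingBudget.
# build_thinking_request is a hypothetical helper, shown only to make
# the 0 / -1 / positive-budget semantics concrete.

def build_thinking_request(prompt: str, thinking_budget: int) -> dict:
    """thinking_budget: 0 disables thinking, -1 lets the model decide,
    and any positive value caps reasoning at that many tokens."""
    if thinking_budget < -1:
        raise ValueError("thinkingBudget must be 0, -1, or a positive token count")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        },
    }

body = build_thinking_request("Solve this logic puzzle step by step.", 2048)
print(body["generationConfig"]["thinkingConfig"])  # {'thinkingBudget': 2048}
```

The resulting dict can be serialized with json.dumps and sent as the `-d` payload of the curl call above.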
Using thinkingLevel with Gemini 2.5 models (or thinkingBudget with Gemini 3 models) may cause errors; use the parameter that matches your model version.

Use streamGenerateContent?alt=sse as the operation to receive Server-Sent Events as the model generates content. Each SSE event contains a data: line with a GenerateContentResponse JSON object.
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $COMETAPI_KEY" \
--no-buffer \
-d '{
"contents": [{"parts": [{"text": "Write a short poem about the stars"}]}]
}'
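Consuming the stream amounts to splitting on SSE `data:` lines and decoding each JSON payload. The sketch below parses a hand-made captured sample rather than a live connection; `raw` stands in for the bytes a real client would read from the HTTP response:

```python
import json

# Sketch: parse Server-Sent Events from streamGenerateContent?alt=sse.
# `raw` is a hand-made sample; a real client would iterate over the
# HTTP response line by line instead.
raw = """data: {"candidates": [{"content": {"parts": [{"text": "Twinkle"}]}}]}

data: {"candidates": [{"content": {"parts": [{"text": ", twinkle"}]}}]}
"""

def iter_sse_chunks(stream_text: str):
    """Yield the text of each GenerateContentResponse in an SSE stream."""
    for line in stream_text.splitlines():
        if line.startswith("data: "):
            event = json.loads(line[len("data: "):])
            for part in event["candidates"][0]["content"]["parts"]:
                if "text" in part:
                    yield part["text"]

print("".join(iter_sse_chunks(raw)))  # Twinkle, twinkle
```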
Guide the model's behavior across the whole conversation with systemInstruction:
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $COMETAPI_KEY" \
-d '{
"contents": [{"parts": [{"text": "What is 2+2?"}]}],
"systemInstruction": {
"parts": [{"text": "You are a math tutor. Always show your work."}]
}
}'
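systemInstruction sits alongside contents rather than inside it; for multi-turn chat, each prior turn is appended to contents with its role. A minimal sketch of such a request body:

```python
# Sketch: a multi-turn request body with a system instruction.
# Roles alternate between "user" and "model"; systemInstruction
# itself carries no role and applies to the whole conversation.
body = {
    "systemInstruction": {
        "parts": [{"text": "You are a math tutor. Always show your work."}]
    },
    "contents": [
        {"role": "user", "parts": [{"text": "What is 2+2?"}]},
        {"role": "model", "parts": [{"text": "2+2 = 4."}]},
        {"role": "user", "parts": [{"text": "And 2+3?"}]},
    ],
}
print(len(body["contents"]))  # 3
```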
Enable JSON mode by setting responseMimeType to application/json. Optionally, also specify responseSchema for strict schema validation:
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $COMETAPI_KEY" \
-d '{
"contents": [{"parts": [{"text": "List 3 planets with their distances from the sun"}]}],
"generationConfig": {
"responseMimeType": "application/json"
}
}'
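The curl above enables JSON mode only. To constrain the shape of the output, a responseSchema (the OpenAPI-style subset the Gemini API accepts, with uppercase type names) can be added to generationConfig. A sketch of such a configuration for the planets prompt:

```python
# Sketch: generationConfig with a responseSchema for strict JSON output.
# Type names (ARRAY, OBJECT, STRING, NUMBER) follow the Gemini API's
# uppercase Type enum; field names here are illustrative.
generation_config = {
    "responseMimeType": "application/json",
    "responseSchema": {
        "type": "ARRAY",
        "items": {
            "type": "OBJECT",
            "properties": {
                "planet": {"type": "STRING"},
                "distance_au": {"type": "NUMBER"},
            },
            "required": ["planet", "distance_au"],
        },
    },
}
```

With this in place, the model's reply is guaranteed to parse as a JSON array of objects with those two fields.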
Enable grounding with Google Search by adding the google_search tool:
curl "https://api.cometapi.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $COMETAPI_KEY" \
-d '{
"contents": [{"parts": [{"text": "Who won the euro 2024?"}]}],
"tools": [{"google_search": {}}]
}'
Grounded responses include groundingMetadata with source URLs and confidence scores.
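Extracting citations amounts to joining each groundingSupports segment with the groundingChunks it indexes. The sketch below runs against a hand-made sample in the groundingMetadata shape; the URL and scores are illustrative:

```python
# Sketch: map grounded text segments to their source URLs.
# `sample` is a hand-made fragment in the groundingMetadata shape;
# its URI and confidence score are invented for illustration.
sample = {
    "groundingChunks": [
        {"web": {"uri": "https://example.com/euro2024", "title": "Euro 2024 final"}}
    ],
    "groundingSupports": [
        {
            "groundingChunkIndices": [0],
            "confidenceScores": [0.97],
            "segment": {"startIndex": 0, "endIndex": 20, "text": "Spain won Euro 2024."},
        }
    ],
}

def citations(meta: dict) -> list:
    """Return (segment text, source URI) pairs from groundingMetadata."""
    out = []
    for support in meta.get("groundingSupports", []):
        for idx in support["groundingChunkIndices"]:
            uri = meta["groundingChunks"][idx]["web"]["uri"]
            out.append((support["segment"]["text"], uri))
    return out

print(citations(sample))
# [('Spain won Euro 2024.', 'https://example.com/euro2024')]
```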
{
"candidates": [
{
"content": {
"role": "model",
"parts": [{"text": "Hello"}]
},
"finishReason": "STOP",
"avgLogprobs": -0.0023
}
],
"usageMetadata": {
"promptTokenCount": 5,
"candidatesTokenCount": 1,
"totalTokenCount": 30,
"trafficType": "ON_DEMAND",
"thoughtsTokenCount": 24,
"promptTokensDetails": [{"modality": "TEXT", "tokenCount": 5}],
"candidatesTokensDetails": [{"modality": "TEXT", "tokenCount": 1}]
},
"modelVersion": "gemini-2.5-flash",
"createTime": "2026-03-25T04:21:43.756483Z",
"responseId": "CeynaY3LDtvG4_UP0qaCuQY"
}
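In the sample above the totals reconcile: prompt (5) + candidates (1) + thoughts (24) = 30 total tokens. A quick check using the usageMetadata values from that response:

```python
# Sketch: verify token accounting from a usageMetadata block.
# Values are copied from the sample response above.
usage = {
    "promptTokenCount": 5,
    "candidatesTokenCount": 1,
    "totalTokenCount": 30,
    "thoughtsTokenCount": 24,
}

billed = (
    usage["promptTokenCount"]
    + usage["candidatesTokenCount"]
    + usage.get("thoughtsTokenCount", 0)  # absent when thinking is disabled
)
print(billed == usage["totalTokenCount"])  # True
```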
The thoughtsTokenCount field in usageMetadata shows how many tokens the model spent on internal reasoning, even when the thinking output itself is not included in the response.

| Capability | Gemini Native (/v1beta/models/...) | OpenAI-Compatible (/v1/chat/completions) |
|---|---|---|
| Thinking control | thinkingConfig with thinkingLevel / thinkingBudget | Not available |
| Grounding with Google Search | tools: [{"google_search": {}}] | Not available |
| Grounding with Google Maps | tools: [{"googleMaps": {}}] | Not available |
| Image generation modality | responseModalities: ["IMAGE"] | Not available |
| Authorization header | x-goog-api-key or Bearer | Bearer only |
| Response format | Gemini native (candidates, parts) | OpenAI format (choices, message) |
Your CometAPI key, passed via the x-goog-api-key header. Bearer token authentication (Authorization: Bearer <key>) is also supported.
The Gemini model ID to use. See the Models page for current Gemini model IDs.
"gemini-2.5-flash"
The operation to perform. Use generateContent for synchronous responses, or streamGenerateContent?alt=sse for Server-Sent Events streaming.
generateContent, streamGenerateContent?alt=sse
"generateContent"
The conversation history and current input. For single-turn queries, provide a single item. For multi-turn conversations, include all previous turns.
System instructions that guide the model's behavior across the entire conversation. Text only.
Tools the model may use to generate responses. Supports function declarations, Google Search, Google Maps, and code execution.
Configuration for tool usage, such as function calling mode.
Safety filter settings. Override default thresholds for specific harm categories.
Configuration for model generation behavior including temperature, output length, and response format.
The name of cached content to use as context. Format: cachedContents/{id}. See the Gemini context caching documentation for details.
Successful response. For streaming requests, the response is a stream of SSE events, each containing a GenerateContentResponse JSON object prefixed with data:.
The generated response candidates.
Feedback on the prompt, including safety blocking information.
Token usage statistics for the request.
The model version that generated this response.
The timestamp when this response was created (ISO 8601 format).
Unique identifier for this response.