Chat Completions - CometAPI Documentation

POST /v1/chat/completions

from openai import OpenAI
client = OpenAI(
    base_url="https://api.cometapi.com/v1",
    api_key="<COMETAPI_KEY>",
)

completion = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)

print(completion.choices[0].message)

{
  "id": "chatcmpl-DNA27oKtBUL8TmbGpBM3B3zhWgYfZ",
  "object": "chat.completion",
  "created": 1774412483,
  "model": "gpt-4.1-nano-2025-04-14",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Four",
        "refusal": null,
        "annotations": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 29,
    "completion_tokens": 2,
    "total_tokens": 31,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": "fp_490a4ad033"
}

CometAPI routes Chat Completions requests to multiple providers, including OpenAI, Claude, and Gemini, through a single OpenAI-compatible interface. Switch between models by changing the model parameter; most OpenAI-compatible SDKs work once base_url is set to https://api.cometapi.com/v1.

Different models support different subsets of parameters and return slightly different response fields. For example, reasoning_effort applies only to reasoning models (o-series, GPT-5.1+), and some models do not support logprobs or n > 1.

For OpenAI Pro models, o-series reasoning models, and Codex models, use the Responses endpoint instead. These model families have more complete support on the Responses API.

Message roles

Role	Description
`system`	Sets the assistant's behavior and persona. Placed at the start of the conversation.
`developer`	Replaces `system` for newer models (o1+). Provides instructions the model should follow regardless of user input.
`user`	Messages from the end user.
`assistant`	Previous model responses, used to preserve conversation history.
`tool`	Results from tool/function calls. Must include a `tool_call_id` that matches the original tool call.

For newer models (GPT-4.1, GPT-5 series, o-series), prefer developer over system for instruction messages. Both work, but developer gives stronger instruction adherence.
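As a minimal sketch of the role choice described above, the helper below builds a messages list with the instruction message first. The function name build_messages and its use_developer flag are illustrative assumptions, not part of the CometAPI or OpenAI SDKs.

```python
def build_messages(instructions, user_text, use_developer=True):
    """Return a messages list whose first entry carries the instructions.

    use_developer is a hypothetical switch: True emits the `developer`
    role (preferred for newer models), False emits the `system` role.
    """
    role = "developer" if use_developer else "system"
    return [
        {"role": role, "content": instructions},
        {"role": "user", "content": user_text},
    ]


msgs = build_messages("You are a helpful assistant.", "Hello!")
legacy_msgs = build_messages("You are a helpful assistant.", "Hello!", use_developer=False)
```

The resulting list can be passed as the messages argument of client.chat.completions.create.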

Send multimodal input

Many models accept images and audio alongside text. To send multimodal messages, use the array format for content:

{
  "role": "user",
  "content": [
    {"type": "text", "text": "Describe this image"},
    {
      "type": "image_url",
      "image_url": {
        "url": "https://example.com/image.png",
        "detail": "high"
      }
    }
  ]
}

The detail parameter controls the depth of image analysis:

low: faster, uses fewer tokens (fixed cost)
high: detailed analysis, consumes more tokens
auto: lets the model decide (default)
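The message shape above can be produced by a small helper. This is a sketch only: image_message and ALLOWED_DETAIL are hypothetical names, and the validation simply mirrors the three detail values listed above.

```python
ALLOWED_DETAIL = {"low", "high", "auto"}  # values listed in this section


def image_message(text, image_url, detail="auto"):
    """Build a user message that pairs a text part with an image part."""
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url, "detail": detail}},
        ],
    }


message = image_message(
    "Describe this image", "https://example.com/image.png", detail="high"
)
```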

Stream responses

Set stream to true to receive incremental output. The response is delivered as Server-Sent Events (SSE), where each event contains a chat.completion.chunk object:

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

To include token usage statistics in streaming responses, set stream_options.include_usage to true. The usage data appears in the final chunk before [DONE].
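To illustrate how the event stream above can be reassembled, here is a minimal sketch that works on already-received SSE lines, with no network access. The function name collect_stream is an assumption, and the sketch assumes a single choice (n = 1); an SDK's streaming iterator does the equivalent work for you.

```python
import json


def collect_stream(lines):
    """Accumulate text from SSE lines; returns (text, finish_reason, usage)."""
    parts, finish_reason, usage = [], None, None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]  # present only with include_usage
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, usage


sample = [
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text, reason, usage = collect_stream(sample)
print(text, reason)  # Hello! stop
```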

Request structured output

Use response_format to force the model to return valid JSON that conforms to a specific schema:

{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "result",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "answer": {"type": "string"},
          "confidence": {"type": "number"}
        },
        "required": ["answer", "confidence"],
        "additionalProperties": false
      }
    }
  }
}

JSON Schema mode (json_schema) guarantees that the output matches your schema exactly. JSON Object mode (json_object) guarantees only that the output is valid JSON; no structure is enforced.
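On the client side, the message content still arrives as a JSON string that has to be parsed. The sketch below parses it and re-checks the two fields from the example schema, which is mostly useful with json_object mode, where no structure is enforced. The names parse_result and REQUIRED_FIELDS are illustrative assumptions.

```python
import json

# Field names and types mirror the example schema in this section.
REQUIRED_FIELDS = {"answer": str, "confidence": (int, float)}


def parse_result(content):
    """Parse the message content and check the expected fields."""
    data = json.loads(content)
    for key, expected in REQUIRED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"unexpected type for field: {key}")
    return data


result = parse_result('{"answer": "Four", "confidence": 0.9}')
```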

Call tools and functions

Provide tool definitions to let the model call external functions:

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"}
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}

When the model decides to call a tool, the response has finish_reason: "tool_calls" and the message.tool_calls array contains the function name and arguments. You then run the function and send the result back as a tool message with the matching tool_call_id.
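The round trip described above can be sketched as follows. The get_weather body, the TOOLS registry, the run_tool_calls helper, and the call ID "call_123" are hypothetical; the assistant message is written as a plain dict in the OpenAI-compatible shape, where function.arguments is a JSON-encoded string.

```python
import json


def get_weather(location):
    # Hypothetical stand-in for a real weather lookup.
    return {"location": location, "forecast": "sunny"}


TOOLS = {"get_weather": get_weather}  # tool name -> local callable


def run_tool_calls(assistant_message):
    """Execute each requested tool and build the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        func = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the original call
            "content": json.dumps(func(**args)),
        })
    return results


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"location": "Paris"}'},
    }],
}
tool_messages = run_tool_calls(assistant_message)
```

For the follow-up request, append assistant_message and then tool_messages to the conversation before calling the endpoint again.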

Cross-provider notes

Parameter support across providers

Parameter	OpenAI GPT	Claude (via compat)	Gemini (via compat)
`temperature`	0–2	0–1	0–2
`top_p`	0–1	0–1	0–1
`n`	1–128	1 only	1–8
`stop`	Up to 4	Up to 4	Up to 5
`tools`	✅	✅	✅
`response_format`	✅	✅ (json_schema)	✅
`logprobs`	✅	❌	❌
`reasoning_effort`	o-series, GPT-5.1+	❌	❌ (use `thinking` for Gemini native)
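One way to use the table above is to adjust a request before sending it to a given provider family. The sketch below is an illustrative assumption, not CometAPI behavior: LIMITS transcribes the table, the provider keys and adapt_params are hypothetical names, and silently clamping values is only one possible policy (raising an error is another). For OpenAI, reasoning_effort is kept as is, although the table restricts it to o-series and GPT-5.1+ models.

```python
# Limits transcribed from the table above; the keys are hypothetical labels.
LIMITS = {
    "openai": {"temperature": (0, 2), "n": (1, 128), "max_stop": 4,
               "logprobs": True, "reasoning_effort": True},
    "claude": {"temperature": (0, 1), "n": (1, 1), "max_stop": 4,
               "logprobs": False, "reasoning_effort": False},
    "gemini": {"temperature": (0, 2), "n": (1, 8), "max_stop": 5,
               "logprobs": False, "reasoning_effort": False},
}


def adapt_params(provider, params):
    """Return a copy of params clamped to the provider's documented limits."""
    limits = LIMITS[provider]
    out = dict(params)
    for name in ("temperature", "n"):
        if name in out:
            low, high = limits[name]
            out[name] = min(max(out[name], low), high)
    if isinstance(out.get("stop"), list):
        out["stop"] = out["stop"][: limits["max_stop"]]
    if not limits["logprobs"]:
        out.pop("logprobs", None)
        out.pop("top_logprobs", None)  # only meaningful with logprobs
    if not limits["reasoning_effort"]:
        out.pop("reasoning_effort", None)
    return out


adapted = adapt_params(
    "claude",
    {"temperature": 1.5, "n": 3, "logprobs": True, "stop": ["a", "b", "c", "d", "e"]},
)
```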

max_tokens and max_completion_tokens

max_tokens: Legacy parameter. Works with most models but is deprecated for newer OpenAI models.
max_completion_tokens: Recommended parameter for GPT-4.1, GPT-5 series, and o-series models. Required for reasoning models because it covers both output tokens and reasoning tokens.

CometAPI handles the mapping automatically when routing to different providers.

system and developer roles

system: Traditional instruction role. Works with all models.
developer: Introduced with the o1 models. Provides stronger instruction following for newer models. Falls back to system behavior on older models.

Use developer for new projects that target GPT-4.1+ or o-series models.

FAQ

How do I handle rate limits?

When you receive 429 Too Many Requests, apply exponential backoff:

import time
import random
from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.cometapi.com/v1",
    api_key="<COMETAPI_KEY>",
)

def chat_with_retry(messages, max_retries=3):
    for i in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-5.4",
                messages=messages,
            )
        except RateLimitError:
            if i < max_retries - 1:
                wait_time = (2 ** i) + random.random()
                time.sleep(wait_time)
            else:
                raise

How do I maintain conversation context?

Include the full conversation history in the messages array:

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Python?"},
    {"role": "assistant", "content": "Python is a high-level programming language..."},
    {"role": "user", "content": "What are its main advantages?"},
]
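Because the endpoint is stateless, the history has to be extended on the client after every turn. A minimal sketch, assuming two hypothetical helpers ask and record_reply that append to a plain list:

```python
def ask(messages, user_text):
    """Append a user turn; the list is then sent as `messages`."""
    messages.append({"role": "user", "content": user_text})
    return messages


def record_reply(messages, assistant_text):
    """Append the model's reply so the next request includes it."""
    messages.append({"role": "assistant", "content": assistant_text})
    return messages


history = [{"role": "developer", "content": "You are a helpful assistant."}]
ask(history, "What is Python?")
record_reply(history, "Python is a high-level programming language...")
ask(history, "What are its main advantages?")
roles = [m["role"] for m in history]
```

In a real loop, the text passed to record_reply would come from completion.choices[0].message.content.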

What does `finish_reason` mean?

Value	Meaning
`stop`	Completed naturally or a stop sequence was triggered.
`length`	The `max_tokens` or `max_completion_tokens` limit was reached.
`tool_calls`	The model made one or more tool/function calls.
`content_filter`	The output was filtered because of content policy.
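A client usually branches on these values. The sketch below maps each finish_reason from the table to a follow-up action; the action labels and the next_action helper are illustrative assumptions, not API values.

```python
# Hypothetical action labels keyed by the finish_reason values above.
ACTIONS = {
    "stop": "use_output",            # complete answer
    "length": "raise_token_limit",   # output was truncated
    "tool_calls": "run_tools",       # execute tools, then call again
    "content_filter": "show_notice", # output was filtered
}


def next_action(choice):
    """Return the follow-up action for one entry of the choices array."""
    return ACTIONS.get(choice.get("finish_reason"), "unknown")


action = next_action({"index": 0, "finish_reason": "length"})
```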

How do I control costs?

Use max_completion_tokens to cap output length.
Choose cost-efficient models (e.g., gpt-5.4-mini or gpt-5.4-nano for simpler tasks).
Keep prompts concise and avoid unnecessary context.
Monitor token usage in the usage response field.
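Monitoring the usage field can be as simple as summing it across calls. The sketch below uses a hypothetical UsageTracker class and plain dicts shaped like the usage object in the example response; it tracks tokens only, since prices are not stated here.

```python
from collections import Counter


class UsageTracker:
    """Running totals of the token counts reported in `usage`."""

    FIELDS = ("prompt_tokens", "completion_tokens", "total_tokens")

    def __init__(self):
        self.totals = Counter()

    def record(self, usage):
        # Missing fields count as zero so partial usage objects are tolerated.
        for field in self.FIELDS:
            self.totals[field] += usage.get(field, 0)


tracker = UsageTracker()
tracker.record({"prompt_tokens": 29, "completion_tokens": 2, "total_tokens": 31})
tracker.record({"prompt_tokens": 29, "completion_tokens": 2, "total_tokens": 31})
```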

Authorizations

Authorization

string

header

required

Bearer token authentication. Use your CometAPI key.
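For clients that do not use an SDK, the header can be set by hand. The sketch below only constructs the request object with the standard library and does not send it; build_request is a hypothetical helper, and the full URL assumes the endpoint path is the base URL followed by /chat/completions.

```python
import json
import urllib.request

# Assumed full endpoint: base URL from this page plus /chat/completions.
ENDPOINT = "https://api.cometapi.com/v1/chat/completions"


def build_request(api_key, payload):
    """Build (but do not send) an authenticated POST request."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "<COMETAPI_KEY>",
    {"model": "gpt-5.4", "messages": [{"role": "user", "content": "Hello!"}]},
)
```

Passing req to urllib.request.urlopen would perform the call.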

Body

application/json

model

string

default: gpt-5.4

required

Model ID to use for this request. See the Models page for current options.

Example:

"gpt-4.1"

messages

object[]

required

A list of messages forming the conversation. Each message has a role (system, developer, user, assistant, or tool) and content (text string or multimodal content array).

stream

boolean

If true, partial response tokens are delivered incrementally via server-sent events (SSE). The stream ends with a data: [DONE] message.

temperature

number

default: 1

Sampling temperature between 0 and 2. Higher values (e.g., 0.8) produce more random output; lower values (e.g., 0.2) make output more focused and deterministic. Adjust either this or top_p, but not both.

Required range: 0 <= x <= 2

top_p

number

default: 1

Nucleus sampling parameter. The model considers only the tokens whose cumulative probability reaches top_p. For example, 0.1 means only the tokens comprising the top 10% of probability mass are considered. Adjust either this or temperature, but not both.

Required range: 0 <= x <= 1

n

integer

default: 1

Number of completion choices to generate for each input message. Defaults to 1.

stop

string

Up to 4 sequences where the API will stop generating further tokens. Can be a string or an array of strings.

max_tokens

integer

Maximum number of tokens to generate in the completion. The total of input + output tokens is capped by the model's context length.

presence_penalty

number

default: 0

Number between -2.0 and 2.0. Positive values penalize tokens based on whether they have already appeared, encouraging the model to explore new topics.

Required range: -2 <= x <= 2

frequency_penalty

number

default: 0

Number between -2.0 and 2.0. Positive values penalize tokens proportionally to how often they have appeared, reducing verbatim repetition.

Required range: -2 <= x <= 2

logit_bias

object

A JSON object mapping token IDs to bias values from -100 to 100. The bias is added to the model's logits before sampling. Values between -1 and 1 subtly adjust likelihood; -100 or 100 effectively ban or force selection of a token.

user

string

A unique identifier for your end-user. Helps with abuse detection and monitoring.

max_completion_tokens

integer

An upper bound for the number of tokens to generate, including visible output tokens and reasoning tokens. Use this instead of max_tokens for GPT-4.1+, GPT-5 series, and o-series models.

response_format

object

Specifies the output format. Use {"type": "json_object"} for JSON mode, or {"type": "json_schema", "json_schema": {...}} for strict structured output.

tools

object[]

A list of tools the model may call. Currently supports function type tools.

tool_choice

default: auto

Controls how the model selects tools. auto (default): model decides. none: no tools. required: must call a tool.

logprobs

boolean

default: false

Whether to return log probabilities of the output tokens.

top_logprobs

integer

Number of most likely tokens to return at each position (0-20). Requires logprobs to be true.

Required range: 0 <= x <= 20

reasoning_effort

enum<string>

Controls the reasoning effort for o-series and GPT-5.1+ models.

Available options:

low,

medium,

high

stream_options

object

Options for streaming. Only valid when stream is true.

service_tier

enum<string>

Specifies the processing tier.

Available options:

auto,

default,

flex,

priority

Response

Successful chat completion response.

id

string

Unique completion identifier.

Example:

"chatcmpl-abc123"

object

enum<string>

Available options:

chat.completion

Example:

"chat.completion"

created

integer

Unix timestamp of creation.

Example:

1774412483

model

string

The model used (may include version suffix).

Example:

"gpt-5.4-2025-07-16"

choices

object[]

Array of completion choices.

usage

object

service_tier

string

Example:

"default"

system_fingerprint

string | null

Example:

"fp_490a4ad033"
