Create a message

POST

messages

import os
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.cometapi.com",
    api_key=os.environ["COMETAPI_KEY"],
)

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Hello, world"}
    ],
)

print(message.content[0].text)

{
  "id": "<string>",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "text": "<string>",
      "thinking": "<string>",
      "signature": "<string>",
      "id": "<string>",
      "name": "<string>",
      "input": {}
    }
  ],
  "model": "<string>",
  "stop_sequence": "<string>",
  "usage": {
    "input_tokens": 123,
    "output_tokens": 123,
    "cache_creation_input_tokens": 123,
    "cache_read_input_tokens": 123,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 123,
      "ephemeral_1h_input_tokens": 123
    },
    "output_tokens_details": {
      "thinking_tokens": 123
    }
  }
}

CometAPI natively supports the Anthropic Messages API, giving you direct access to Claude models and their Anthropic-specific capabilities. Use this endpoint for Claude features such as adaptive thinking, prompt caching, and effort control.

Treat the official Anthropic Messages API reference as the authoritative source for the full parameter list, the response schema, and Claude-specific behavior. This CometAPI page explains how to send that request format through CometAPI.

Anthropic request parameters and response fields may change as Claude's capabilities evolve. Check the Anthropic Messages API documentation for the most current full parameter list and provider-specific behavior.

Many newer Claude models reject non-default values of temperature, top_p, and top_k on the Messages API. Omit these sampling fields unless you have confirmed that the selected model supports them. If the model returns an error about an unsupported or deprecated parameter, remove that field from the request.
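One way to follow this advice is to strip the sampling fields from a request before sending it. The helper below is a minimal sketch; `strip_sampling_params` and `SAMPLING_FIELDS` are hypothetical names introduced for illustration and are not part of the Anthropic SDK or CometAPI.

```python
# Sampling fields that many newer Claude models reject when set to
# non-default values.
SAMPLING_FIELDS = ("temperature", "top_p", "top_k")


def strip_sampling_params(params: dict) -> dict:
    """Return a copy of the request parameters without sampling fields."""
    return {k: v for k, v in params.items() if k not in SAMPLING_FIELDS}


request = {
    "model": "claude-sonnet-5",
    "max_tokens": 1024,
    "temperature": 0.2,
    "messages": [{"role": "user", "content": "Hello!"}],
}

cleaned = strip_sampling_params(request)
print(sorted(cleaned))
```

The cleaned dictionary can then be passed on as `client.messages.create(**cleaned)`. The original dictionary is left unchanged, so the removed fields can be restored for models that accept them.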

Both the x-api-key and Authorization: Bearer headers are supported for authentication. The official Anthropic SDKs use x-api-key by default.
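For callers that do not use an SDK, the two header styles can be sketched with the Python standard library. This is an illustration only: `build_headers` and `build_request` are hypothetical helpers, the URL assumes the `/v1/messages` path on the CometAPI base URL, and the request is constructed but not sent.

```python
import json
import urllib.request

# Assumption: the Messages endpoint lives at /v1/messages on the base URL.
API_URL = "https://api.cometapi.com/v1/messages"


def build_headers(api_key: str, use_bearer: bool = False) -> dict:
    """Build request headers using either x-api-key or Authorization: Bearer."""
    headers = {
        "content-type": "application/json",
        "anthropic-version": "2023-06-01",
    }
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["x-api-key"] = api_key
    return headers


def build_request(api_key: str, payload: dict, use_bearer: bool = False):
    """Prepare (but do not send) a POST request for the Messages endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers=build_headers(api_key, use_bearer),
        method="POST",
    )


req = build_request(
    "example-key",
    {
        "model": "claude-sonnet-5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a key or network access.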

Quick start

To use the official Anthropic SDK with CometAPI, set the base URL:

import os
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.cometapi.com",
    api_key=os.environ["COMETAPI_KEY"],
)

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
    apiKey: process.env.COMETAPI_KEY,
    baseURL: "https://api.cometapi.com",
});

const message = await client.messages.create({
    model: "claude-sonnet-5",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello!" }],
});
console.log(message.content[0].text);

Controlling adaptive thinking

Use adaptive thinking with output_config.effort to control how much work Claude does on a response. Newer Claude models reject the legacy manual form, thinking={"type": "enabled", "budget_tokens": ...}.

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "xhigh"},
    messages=[
        {
            "role": "user",
            "content": "Analyze the trade-offs between a monolithic architecture and microservices for a small engineering team.",
        }
    ],
)

for block in message.content:
    if block.type == "text":
        print(block.text)

Thinking tokens count toward your max_tokens limit. When using higher effort levels, set max_tokens high enough to cover both the thinking and the final answer.
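To see how much of the budget went to thinking, the usage block can be inspected after a call. The sketch below works on a plain dictionary shaped like the response schema on this page. It assumes that `output_tokens` includes the tokens reported in `output_tokens_details.thinking_tokens`, which should be verified against real responses. `split_output_tokens` and `is_truncated` are hypothetical helper names, and the sample values are made up.

```python
def split_output_tokens(usage: dict) -> dict:
    """Split output tokens into thinking and visible-answer portions.

    Assumption: output_tokens already includes thinking tokens.
    """
    details = usage.get("output_tokens_details") or {}
    thinking = details.get("thinking_tokens") or 0
    total = usage.get("output_tokens") or 0
    return {"thinking": thinking, "answer": max(total - thinking, 0)}


def is_truncated(response: dict) -> bool:
    """True when generation stopped because max_tokens was reached."""
    return response.get("stop_reason") == "max_tokens"


# Illustrative values only, not a captured response.
sample = {
    "stop_reason": "max_tokens",
    "usage": {
        "output_tokens": 4096,
        "output_tokens_details": {"thinking_tokens": 3900},
    },
}

print(split_output_tokens(sample["usage"]), is_truncated(sample))
```

A response that is truncated with most of its tokens spent on thinking is a sign that max_tokens should be raised or the effort level lowered.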

Prompt caching

To reduce latency and cost on subsequent requests, cache large system prompts or conversation prefixes. Add cache_control to the content blocks you want cached:

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an expert code reviewer. [Long detailed instructions...]",
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Review this code..."}],
)

Cache usage is reported in the usage field of the response:

cache_creation_input_tokens: tokens written to the cache (billed at a higher rate)
cache_read_input_tokens: tokens read from the cache (billed at a reduced rate)

Prompt caching requires at least 1,024 tokens in the cached content block. Shorter content is not cached.
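A quick way to confirm that caching is working is to compare the usage fields across two consecutive calls. The helper below is a sketch over plain dictionaries; `cache_report` is a hypothetical name, the token counts are made up, and the code assumes that `input_tokens` counts only the uncached part of the prompt.

```python
def cache_report(usage: dict) -> dict:
    """Summarize cache writes and reads from a response usage block."""
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    # Assumption: input_tokens excludes tokens written to or read from cache.
    uncached = usage.get("input_tokens") or 0
    total = written + read + uncached
    return {
        "cache_written": written,
        "cache_read": read,
        "cache_hit": read > 0,
        "read_fraction": read / total if total else 0.0,
    }


# Illustrative values: the first call writes the cache, the second reads it.
first_call = {
    "input_tokens": 12,
    "cache_creation_input_tokens": 2000,
    "cache_read_input_tokens": 0,
}
second_call = {
    "input_tokens": 12,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 2000,
}

print(cache_report(first_call)["cache_hit"])
print(cache_report(second_call)["cache_hit"])
```

If the second call still reports zero cached reads, the cached block may be below the minimum size or may differ between the two requests.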

Streaming responses

To stream responses using Server-Sent Events (SSE), set stream: true. Events arrive in this order:

message_start: contains the message metadata and the initial usage
content_block_start: marks the start of each content block
content_block_delta: incremental text fragments (text_delta)
content_block_stop: marks the end of each content block
message_delta: the final stop_reason and the complete usage
message_stop: signals the end of the stream

with client.messages.stream(
    model="claude-sonnet-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")
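When consuming the raw SSE stream without the SDK helper, the event sequence listed above can be handled by reading the `data:` lines and acting on the event type. The sketch below parses a hand-written sample stream, not a captured one; `collect_stream` is a hypothetical helper, and only text deltas and the final stop_reason are handled.

```python
import json

# Hand-written sample that follows the event order described above.
SAMPLE_STREAM = """\
event: message_start
data: {"type": "message_start", "message": {"id": "msg_example", "usage": {"input_tokens": 8}}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn"}, "usage": {"output_tokens": 2}}

event: message_stop
data: {"type": "message_stop"}
"""


def collect_stream(lines):
    """Accumulate text deltas and the final stop_reason from SSE lines."""
    text_parts = []
    stop_reason = None
    for line in lines:
        # Only data lines carry JSON payloads; skip event names and blanks.
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):].strip())
        if event["type"] == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "text_delta":
                text_parts.append(delta["text"])
        elif event["type"] == "message_delta":
            stop_reason = event["delta"].get("stop_reason")
    return "".join(text_parts), stop_reason


text, stop_reason = collect_stream(SAMPLE_STREAM.splitlines())
print(text, stop_reason)
```

A production client would also need to handle other delta types and error events, which this sketch ignores.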

Effort control

To control how much effort Claude puts into generating a response, use output_config.effort:

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=4096,
    messages=[
        {"role": "user", "content": "Summarize this briefly."}
    ],
    output_config={"effort": "low"},  # "low", "medium", "high", "xhigh", or "max"
)
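Because an unsupported effort value is only rejected once the request reaches the API, a small client-side check can catch typos earlier. The sketch below uses the five levels named in the comment above; `effort_config` is a hypothetical helper, and which levels a given model accepts is not verified here.

```python
# Levels taken from the comment in the example above; model support may vary.
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")


def effort_config(level: str) -> dict:
    """Build an output_config dictionary after validating the effort level."""
    if level not in EFFORT_LEVELS:
        allowed = ", ".join(EFFORT_LEVELS)
        raise ValueError(f"unknown effort level {level!r}; expected one of: {allowed}")
    return {"effort": level}


print(effort_config("low"))
```

The result can be passed as `output_config=effort_config("low")` in `client.messages.create`.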

Using server tools

Claude supports server tools that run on Anthropic's infrastructure:

Web Fetch
Web Search

Fetch and analyze content from a URL:

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Analyze the content at https://arxiv.org/abs/1512.03385"}
    ],
    tools=[
        {"type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5}
    ],
)

Search the web for real-time information:

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What are the latest developments in AI?"}
    ],
    tools=[
        {"type": "web_search_20250305", "name": "web_search", "max_uses": 5}
    ],
)
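Responses that involve server tools usually contain several content blocks, so reading only `content[0].text` may miss the answer or fail when the first block is not text. The sketch below gathers text from a list of block dictionaries. `collect_text` is a hypothetical helper, and the sample blocks are illustrative and abbreviated rather than a real response.

```python
def collect_text(content_blocks) -> str:
    """Join the text of all text blocks, skipping tool-related blocks."""
    return "".join(
        block.get("text", "")
        for block in content_blocks
        if block.get("type") == "text"
    )


# Illustrative, abbreviated content list for a web search response.
sample_content = [
    {"type": "text", "text": "Let me look that up. "},
    {"type": "server_tool_use", "id": "srvtoolu_example", "name": "web_search",
     "input": {"query": "latest developments in AI"}},
    {"type": "web_search_tool_result", "tool_use_id": "srvtoolu_example",
     "content": []},
    {"type": "text", "text": "Here is a summary."},
]

print(collect_text(sample_content))
```

With the SDK, the same filter can be applied to `message.content` by checking `block.type == "text"` on each block, as in the adaptive thinking example.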

Example response

A typical response from the Anthropic endpoint on CometAPI:

{
  "id": "msg_bdrk_01UjHdmSztrL7QYYm7CKBDFB",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-sonnet-5",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 19,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 0,
      "ephemeral_1h_input_tokens": 0
    },
    "output_tokens": 4
  }
}

Comparison with the OpenAI-compatible endpoint

Feature	Anthropic Messages (`/v1/messages`)	OpenAI-Compatible (`/v1/chat/completions`)
Adaptive thinking	`thinking` with `type: "adaptive"` and `output_config.effort`	Not available
Prompt caching	`cache_control` on `content` blocks	Not available
Effort control	`output_config.effort`	Not available
Web fetch/search	Server tools (`web_fetch`, `web_search`)	Not available
Authentication header	`x-api-key` or `Bearer`	`Bearer` only
Response format	Anthropic format (`content` blocks)	OpenAI format (`choices`, `message`)
Models	Claude only	Multiple providers (GPT, Claude, Gemini, and others)

Authorizations

x-api-key

string

header

required

Your CometAPI key passed via the x-api-key header. Authorization: Bearer $COMETAPI_KEY is also supported.

Headers

anthropic-version

string

default: 2023-06-01

The Anthropic API version to use. Defaults to 2023-06-01.

Example:

"2023-06-01"

anthropic-beta

string

Comma-separated list of beta features to enable. Examples: max-tokens-3-5-sonnet-2024-07-15, pdfs-2024-09-25, output-128k-2025-02-19.

Body

application/json

model

string

required

The Claude model to use. See the Models page for available Claude model IDs.

Example:

"claude-sonnet-5"

messages

object[]

required

The conversation messages. Must alternate between user and assistant roles. Each message's content can be a string or an array of content blocks (text, image, document, tool_use, tool_result). There is a limit of 100,000 messages per request.


max_tokens

integer

required

The maximum number of tokens to generate. The model may stop before reaching this limit. When using thinking, the thinking tokens count towards this limit.

Required range: x >= 1

Example:

1024

system

System prompt providing context and instructions to Claude. Can be a plain string or an array of content blocks (useful for prompt caching).

temperature

number

default: 1

Model-dependent sampling control. Many newer Claude models reject non-default temperature values on the Messages API. Omit this field unless you have verified that the selected model accepts it; if the model returns an unsupported or deprecated-parameter error, remove the field instead of substituting another sampling value.

Required range: 0 <= x <= 1

Example:

1

top_p

number

Model-dependent nucleus sampling control. Many newer Claude models reject non-default top_p values on the Messages API. Omit this field unless you have verified support for the selected model. Do not set temperature and top_p together.

Required range: 0 <= x <= 1

Example:

1

top_k

integer

Model-dependent top-k sampling control. Many newer Claude models reject non-default top_k values on the Messages API. Omit this field unless you have verified support for the selected model.

Required range: x >= 0

Example:

0

stream

boolean

default: false

If true, stream the response incrementally using Server-Sent Events (SSE). Events include message_start, content_block_start, content_block_delta, content_block_stop, message_delta, and message_stop.

stop_sequences

string[]

Custom strings that cause the model to stop generating when encountered. The stop sequence is not included in the response.

thinking

object

Controls Claude thinking when the selected model supports a configurable thinking mode. For newer adaptive-thinking models, use {"type":"adaptive"} with output_config.effort, or omit thinking when adaptive thinking is already the model default. Manual {"type":"enabled","budget_tokens":...} is supported only by older models and is rejected by newer Claude models.


tools

object[]

Tools the model may use. Supports client-defined functions, web search (web_search_20250305), web fetch (web_fetch_20250910), code execution (code_execution_20250522), and more.


tool_choice

object

Controls how the model uses tools.


metadata

object

Request metadata for tracking and analytics.


output_config

object

Configuration for response effort and output format. Field support depends on the selected Claude model.


service_tier

enum<string>

The service tier to use. auto tries priority capacity first, standard_only uses only standard capacity.

Available options: auto, standard_only

Response

200 - application/json

Successful response. When stream is true, the response is a stream of SSE events.

id

string

Unique identifier for this message (e.g., msg_01XFDUDYJgAACzvnptvVoYEL).

type

enum<string>

Always message.

Available options: message

role

enum<string>

Always assistant.

Available options: assistant

content

object[]

The response content blocks. May include text, thinking, tool_use, and other block types.


model

string

The specific model version that generated this response, such as claude-sonnet-5.

stop_reason

enum<string>

Why the model stopped generating. refusal can be returned as a successful HTTP response when the model declines a request.

Available options: end_turn, max_tokens, stop_sequence, tool_use, pause_turn, refusal

stop_sequence

string | null

The stop sequence that caused the model to stop, if applicable.

usage

object

Token usage statistics.
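Since stop_reason determines what a caller should do next, it can help to branch on it explicitly instead of assuming every response is complete. The mapping below is a sketch: `next_action` and its action labels are hypothetical and chosen only for illustration.

```python
def next_action(stop_reason):
    """Map a stop_reason value to a hypothetical follow-up action label."""
    actions = {
        "end_turn": "done",           # the model finished its answer
        "stop_sequence": "done",      # a custom stop sequence was hit
        "max_tokens": "truncated",    # output was cut off at max_tokens
        "tool_use": "run_tool",       # the model requested a client tool call
        "pause_turn": "continue",     # the turn was paused and can be resumed
        "refusal": "refused",         # the model declined the request
    }
    return actions.get(stop_reason, "unknown")


for reason in ("end_turn", "max_tokens", "refusal"):
    print(reason, "->", next_action(reason))
```

Returning "unknown" for unrecognized values keeps the caller working if new stop reasons are added later.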
