Create a message

POST /v1/messages

import os
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.cometapi.com",
    api_key=os.environ["COMETAPI_KEY"],
)

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Hello, world"}
    ],
)

print(message.content[0].text)

{
  "id": "<string>",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "text": "<string>",
      "thinking": "<string>",
      "signature": "<string>",
      "id": "<string>",
      "name": "<string>",
      "input": {}
    }
  ],
  "model": "<string>",
  "stop_sequence": "<string>",
  "usage": {
    "input_tokens": 123,
    "output_tokens": 123,
    "cache_creation_input_tokens": 123,
    "cache_read_input_tokens": 123,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 123,
      "ephemeral_1h_input_tokens": 123
    },
    "output_tokens_details": {
      "thinking_tokens": 123
    }
  }
}

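CometAPI natively supports the Anthropic Messages API, giving you direct access to Claude models with Anthropic-specific features. Use this endpoint for Claude capabilities such as adaptive thinking, prompt caching, and effort control.

For the complete parameter list, response schema, and Claude-specific behavior, treat the official Anthropic Messages API reference as the source of truth. This CometAPI page explains how to send that request format through CometAPI.

Anthropic request parameters and response fields can change as Claude's capabilities evolve. Check the Anthropic Messages API documentation for the current, complete parameter list and provider-specific behavior.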

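Many newer Claude models do not accept non-default values for temperature, top_p, or top_k on the Messages API. Omit these sampling fields unless you have verified that the selected model supports them. If the model returns an unsupported or deprecated-parameter error, remove the field from the request.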
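
One way to apply this advice is to strip the sampling fields from a request payload before sending it. The sketch below is illustrative only: strip_sampling_fields and SAMPLING_FIELDS are hypothetical names, not part of the Anthropic SDK or CometAPI.

```python
# Hypothetical helper (not an SDK function): drop sampling controls
# that a newer Claude model may reject.
SAMPLING_FIELDS = ("temperature", "top_p", "top_k")


def strip_sampling_fields(payload: dict) -> dict:
    """Return a copy of the request payload without sampling fields."""
    return {key: value for key, value in payload.items() if key not in SAMPLING_FIELDS}


payload = {
    "model": "claude-sonnet-5",
    "max_tokens": 1024,
    "temperature": 0.2,
    "messages": [{"role": "user", "content": "Hello!"}],
}

cleaned = strip_sampling_fields(payload)
print(sorted(cleaned))  # ['max_tokens', 'messages', 'model']
```

The original payload is left untouched, so the same dictionary can still be sent to a model that is known to accept these fields.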

You can authenticate with either the x-api-key header or an Authorization: Bearer header. The official Anthropic SDKs use x-api-key by default.
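
For callers that do not use the SDK, the two header styles can be sketched with the Python standard library. This is a minimal sketch, not a definitive client: build_headers and build_request are hypothetical helper names, the /v1/messages path is taken from the comparison table on this page, and the anthropic-version value is the default listed in the header reference.

```python
import json
import urllib.request

BASE_URL = "https://api.cometapi.com"


def build_headers(api_key: str, use_bearer: bool = False) -> dict:
    """Build request headers using either x-api-key or Authorization: Bearer."""
    headers = {
        "content-type": "application/json",
        "anthropic-version": "2023-06-01",  # default listed in the header reference
    }
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["x-api-key"] = api_key
    return headers


def build_request(api_key: str, payload: dict, use_bearer: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the Messages endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers=build_headers(api_key, use_bearer),
        method="POST",
    )


request = build_request(
    "sk-placeholder",  # placeholder key for illustration
    {
        "model": "claude-sonnet-5",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
# urllib.request.urlopen(request) would send it; the call is omitted here.
print(request.get_method(), request.full_url)
```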

Quickstart

To use the official Anthropic SDK with CometAPI, set the base URL.

import os
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.cometapi.com",
    api_key=os.environ["COMETAPI_KEY"],
)

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
    apiKey: process.env.COMETAPI_KEY,
    baseURL: "https://api.cometapi.com",
});

const message = await client.messages.create({
    model: "claude-sonnet-5",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello!" }],
});
console.log(message.content[0].text);

Control adaptive thinking

Use adaptive thinking with output_config.effort to control how much work Claude puts into a response. Newer Claude models do not accept the legacy manual thinking format, thinking={"type": "enabled", "budget_tokens": ...}.

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "xhigh"},
    messages=[
        {
            "role": "user",
            "content": "Analyze the trade-offs between a monolithic architecture and microservices for a small engineering team.",
        }
    ],
)

for block in message.content:
    if block.type == "text":
        print(block.text)

At higher effort levels, thinking tokens also count toward the max_tokens limit. Set max_tokens high enough to cover both the thinking and the final answer.
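
To see how much of the output budget went to thinking, you can inspect the usage object. The sketch below works on a plain dictionary shaped like the usage schema shown at the top of this page. It assumes that output_tokens_details.thinking_tokens is counted inside output_tokens, which this page does not state, and that the field may be missing; split_output_tokens is a hypothetical helper name and the numbers are made up.

```python
def split_output_tokens(usage: dict) -> tuple:
    """Return (thinking_tokens, remaining_output_tokens) from a usage dict.

    Assumption: thinking_tokens is a subset of output_tokens.
    """
    total = usage.get("output_tokens", 0)
    details = usage.get("output_tokens_details") or {}
    thinking = details.get("thinking_tokens", 0)
    return thinking, max(total - thinking, 0)


# Made-up usage values for illustration.
usage = {
    "input_tokens": 40,
    "output_tokens": 1800,
    "output_tokens_details": {"thinking_tokens": 1250},
}

thinking, answer = split_output_tokens(usage)
print(thinking, answer)  # 1250 550
```

If the thinking share regularly approaches max_tokens, that suggests raising max_tokens or lowering the effort level.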

Cache prompts

You can cache large system prompts or conversation prefixes to reduce latency and cost on subsequent requests. Add cache_control to each content block you want to cache.

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an expert code reviewer. [Long detailed instructions...]",
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Review this code..."}],
)

Cache usage is reported in the response's usage field.

cache_creation_input_tokens: tokens written to the cache (billed at a higher rate)
cache_read_input_tokens: tokens read from the cache (billed at a discounted rate)

Prompt caching requires at least 1,024 tokens in the cached content block. Shorter content is not cached.
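
To check whether a request hit the cache, you can summarize these usage fields. The sketch below operates on a plain dictionary. It assumes that input_tokens counts only the uncached part of the prompt, so total input is the sum of the three fields; cache_summary is a hypothetical helper name and the numbers are made up.

```python
def cache_summary(usage: dict) -> dict:
    """Summarize prompt-cache activity from a usage dict.

    Assumption: input_tokens excludes tokens written to or read from the cache.
    """
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    total = written + read + uncached
    return {
        "total_input_tokens": total,
        "cache_hit": read > 0,
        "read_fraction": read / total if total else 0.0,
    }


# Made-up usage for a second request that reuses a cached system prompt.
usage = {
    "input_tokens": 50,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 1950,
}

summary = cache_summary(usage)
print(summary["cache_hit"], summary["total_input_tokens"])  # True 2000
```

On the first request with a new prefix you would expect cache_creation_input_tokens to be non-zero and cache_read_input_tokens to be zero; later requests with the same prefix should show the reverse.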

Stream responses

Set stream: true to stream the response using Server-Sent Events (SSE). Events arrive in the following order:

message_start: contains message metadata and initial usage
content_block_start: marks the start of each content block
content_block_delta: incremental text chunks (text_delta)
content_block_stop: marks the end of each content block
message_delta: the final stop_reason and complete usage
message_stop: marks the end of the stream

with client.messages.stream(
    model="claude-sonnet-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")
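
If you consume the stream without the SDK, the event sequence listed above can be handled with a small parser. The sketch below runs against a hard-coded sample stream rather than a live connection. The payloads are simplified illustrations of each event type, not captured output, and the parser assumes each data field fits on one line.

```python
import json

# Simplified, hand-written sample of the event sequence (not captured output).
SAMPLE_STREAM = """\
event: message_start
data: {"type": "message_start", "message": {"id": "msg_example", "usage": {"input_tokens": 8}}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": ", world"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn"}, "usage": {"output_tokens": 4}}

event: message_stop
data: {"type": "message_stop"}
"""


def parse_sse(raw: str):
    """Yield (event_name, data) pairs; assumes single-line data fields."""
    event_name = None
    for line in raw.splitlines():
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            yield event_name, json.loads(line[len("data:"):].strip())


def collect_text(raw: str) -> tuple:
    """Concatenate text_delta chunks and capture the final stop_reason."""
    parts = []
    stop_reason = None
    for name, data in parse_sse(raw):
        if name == "content_block_delta" and data["delta"].get("type") == "text_delta":
            parts.append(data["delta"]["text"])
        elif name == "message_delta":
            stop_reason = data["delta"].get("stop_reason")
    return "".join(parts), stop_reason


text, stop_reason = collect_text(SAMPLE_STREAM)
print(text, stop_reason)  # Hello, world end_turn
```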

Control effort

Use output_config.effort to control how much effort Claude spends generating a response.

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=4096,
    messages=[
        {"role": "user", "content": "Summarize this briefly."}
    ],
    output_config={"effort": "low"},  # "low", "medium", "high", "xhigh", or "max"
)

Use server tools

Claude supports server-side tools that run on Anthropic's infrastructure.

Web Fetch

Fetch and analyze content from a URL.

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Analyze the content at https://arxiv.org/abs/1512.03385"}
    ],
    tools=[
        {"type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5}
    ],
)

Web Search

Search the web for real-time information.

message = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What are the latest developments in AI?"}
    ],
    tools=[
        {"type": "web_search_20250305", "name": "web_search", "max_uses": 5}
    ],
)
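
When tools or thinking are involved, the response content can hold several block types, so the first block is not necessarily text. A minimal sketch for reading such a response as plain dictionaries follows. The sample blocks are hand-written for illustration, and extract_text and block_types are hypothetical helper names.

```python
def extract_text(content: list) -> str:
    """Join the text of every text block, skipping other block types."""
    return "".join(block.get("text", "") for block in content if block.get("type") == "text")


def block_types(content: list) -> list:
    """List block types in order, which helps when debugging tool use."""
    return [block.get("type") for block in content]


# Hand-written sample content; real responses may carry more fields.
content = [
    {
        "type": "server_tool_use",
        "id": "srvtoolu_example",
        "name": "web_search",
        "input": {"query": "latest AI developments"},
    },
    {"type": "web_search_tool_result", "tool_use_id": "srvtoolu_example", "content": []},
    {"type": "text", "text": "Here is a summary."},
]

print(block_types(content))
print(extract_text(content))  # Here is a summary.
```

With the SDK, the loop over message.content that checks block.type == "text", as in the adaptive thinking example, serves the same purpose.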

Response example

A typical response from CometAPI's Anthropic endpoint:

{
  "id": "msg_bdrk_01UjHdmSztrL7QYYm7CKBDFB",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-sonnet-5",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 19,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 0,
      "ephemeral_1h_input_tokens": 0
    },
    "output_tokens": 4
  }
}
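
The stop_reason field tells the caller what to do next. A minimal sketch of dispatching on it follows. The stop_reason values are the ones listed in the response reference on this page; the action labels and the next_action helper are hypothetical and should be adapted to your application.

```python
# Hypothetical action labels keyed by the documented stop_reason values.
NEXT_ACTION = {
    "end_turn": "done",
    "stop_sequence": "done",
    "max_tokens": "retry_with_larger_max_tokens",
    "tool_use": "run_tool_and_send_tool_result",
    "pause_turn": "send_response_back_to_continue",
    "refusal": "handle_refusal",
}


def next_action(response: dict) -> str:
    """Map a response's stop_reason to a follow-up action label."""
    return NEXT_ACTION.get(response.get("stop_reason"), "inspect_response")


# Hand-written sample of a truncated response.
response = {
    "type": "message",
    "stop_reason": "max_tokens",
    "content": [{"type": "text", "text": "Partial answer"}],
}

print(next_action(response))  # retry_with_larger_max_tokens
```

Because refusal can arrive with a successful HTTP status, checking stop_reason is more reliable than relying on the status code alone.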

Comparison with the OpenAI-compatible endpoint

Feature	Anthropic Messages (`/v1/messages`)	OpenAI-compatible (`/v1/chat/completions`)
Adaptive thinking	`thinking` with `type: "adaptive"` and `output_config.effort`	Not available
Prompt caching	`cache_control` on content blocks	Not available
Effort control	`output_config.effort`	Not available
Web fetch/search	Server tools (`web_fetch`, `web_search`)	Not available
Auth header	`x-api-key` or `Bearer`	`Bearer` only
Response format	Anthropic format (`content` blocks)	OpenAI format (`choices`, `message`)
Models	Claude only	Multi-provider (GPT, Claude, Gemini, etc.)
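
If an application calls both endpoints, the difference in response format can be hidden behind one small function. The sketch below assumes the two shapes named in the table: content blocks for the Anthropic format, and choices with a message for the OpenAI format. text_from_response is a hypothetical helper and the sample bodies are hand-written.

```python
def text_from_response(body: dict) -> str:
    """Extract assistant text from either response shape."""
    if "choices" in body:
        # OpenAI-compatible shape: choices[0].message.content
        return body["choices"][0]["message"].get("content") or ""
    # Anthropic shape: a list of typed content blocks
    return "".join(
        block.get("text", "")
        for block in body.get("content", [])
        if block.get("type") == "text"
    )


# Hand-written sample bodies for illustration.
anthropic_body = {"role": "assistant", "content": [{"type": "text", "text": "Hello!"}]}
openai_body = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}

print(text_from_response(anthropic_body) == text_from_response(openai_body))  # True
```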

Authorizations

x-api-key

string

header

required

Your CometAPI key passed via the x-api-key header. Authorization: Bearer $COMETAPI_KEY is also supported.

Headers

anthropic-version

string

default:2023-06-01

The Anthropic API version to use. Defaults to 2023-06-01.

Example:

"2023-06-01"

anthropic-beta

string

Comma-separated list of beta features to enable. Examples: max-tokens-3-5-sonnet-2024-07-15, pdfs-2024-09-25, output-128k-2025-02-19.

Body

application/json

model

string

required

The Claude model to use. See the Models page for available Claude model IDs.

Example:

"claude-sonnet-5"

messages

object[]

required

The conversation messages. Must alternate between user and assistant roles. Each message's content can be a string or an array of content blocks (text, image, document, tool_use, tool_result). There is a limit of 100,000 messages per request.

max_tokens

integer

required

The maximum number of tokens to generate. The model may stop before reaching this limit. When using thinking, the thinking tokens count towards this limit.

Required range: x >= 1

Example:

1024

system

System prompt providing context and instructions to Claude. Can be a plain string or an array of content blocks (useful for prompt caching).

temperature

number

default:1

Model-dependent sampling control. Many newer Claude models reject non-default temperature values on the Messages API. Omit this field unless you have verified that the selected model accepts it; if the model returns an unsupported or deprecated-parameter error, remove the field instead of substituting another sampling value.

Required range: 0 <= x <= 1

Example:

1

top_p

number

Model-dependent nucleus sampling control. Many newer Claude models reject non-default top_p values on the Messages API. Omit this field unless you have verified support for the selected model. Do not set temperature and top_p together.

Required range: 0 <= x <= 1

Example:

1

top_k

integer

Model-dependent top-k sampling control. Many newer Claude models reject non-default top_k values on the Messages API. Omit this field unless you have verified support for the selected model.

Required range: x >= 0

Example:

0

stream

boolean

default:false

If true, stream the response incrementally using Server-Sent Events (SSE). Events include message_start, content_block_start, content_block_delta, content_block_stop, message_delta, and message_stop.

stop_sequences

string[]

Custom strings that cause the model to stop generating when encountered. The stop sequence is not included in the response.

thinking

object

Controls Claude thinking when the selected model supports a configurable thinking mode. For newer adaptive-thinking models, use {"type":"adaptive"} with output_config.effort, or omit thinking when adaptive thinking is already the model default. Manual {"type":"enabled","budget_tokens":...} is supported only by older models and is rejected by newer Claude models.

tools

object[]

Tools the model may use. Supports client-defined functions, web search (web_search_20250305), web fetch (web_fetch_20250910), code execution (code_execution_20250522), and more.

tool_choice

object

Controls how the model uses tools.

metadata

object

Request metadata for tracking and analytics.

output_config

object

Configuration for response effort and output format. Field support depends on the selected Claude model.

service_tier

enum<string>

The service tier to use. auto tries priority capacity first, standard_only uses only standard capacity.

Available options:

auto,

standard_only

Response

200 - application/json

Successful response. When stream is true, the response is a stream of SSE events.

id

string

Unique identifier for this message (e.g., msg_01XFDUDYJgAACzvnptvVoYEL).

type

enum<string>

Always message.

Available options:

message

role

enum<string>

Always assistant.

Available options:

assistant

content

object[]

The response content blocks. May include text, thinking, tool_use, and other block types.

model

string

The specific model version that generated this response, such as claude-sonnet-5.

stop_reason

enum<string>

Why the model stopped generating. refusal can be returned as a successful HTTP response when the model declines a request.

Available options:

end_turn,

max_tokens,

stop_sequence,

tool_use,

pause_turn,

refusal

stop_sequence

string | null

The stop sequence that caused the model to stop, if applicable.

usage

object

Token usage statistics.

