# Anthropic Messages

> Use the Anthropic Messages API through CometAPI to access Claude models with extended thinking, prompt caching, tool use, web search/fetch, streaming, and effort control.

CometAPI exposes the Anthropic Messages API at `/v1/messages`, so requests and responses use Anthropic's native format. Use this endpoint for Claude-specific capabilities such as extended thinking, prompt caching, and effort control.

<Note>
  Both `x-api-key` and `Authorization: Bearer` headers are supported for authentication. The official Anthropic SDKs use `x-api-key` by default.
</Note>
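If you call the endpoint without an SDK, the header is the only difference between the two authentication styles. The sketch below builds (but does not send) a `POST /v1/messages` request with Python's standard library. `build_headers` and `build_request` are hypothetical helper names, and the `anthropic-version` value is the default listed in the OpenAPI section of this page.

```python
import json
import urllib.request

BASE_URL = "https://api.cometapi.com"


def build_headers(api_key, use_bearer=False):
    """Return request headers using either x-api-key or Authorization: Bearer."""
    headers = {
        "content-type": "application/json",
        "anthropic-version": "2023-06-01",  # default version from the OpenAPI spec
    }
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["x-api-key"] = api_key
    return headers


def build_request(api_key, payload, use_bearer=False):
    """Build a POST /v1/messages request object without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers=build_headers(api_key, use_bearer),
        method="POST",
    )


payload = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}],
}
request = build_request("YOUR_COMETAPI_KEY", payload, use_bearer=True)
print(request.get_method(), request.full_url)
```

To send the request, pass the object to `urllib.request.urlopen` and decode the JSON body of the response.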

## Quick start

To use the official Anthropic SDK with CometAPI, set the base URL:

<CodeGroup>
  ```python Python
  import os
  import anthropic

  client = anthropic.Anthropic(
      base_url="https://api.cometapi.com",
      api_key=os.environ["COMETAPI_KEY"],
  )

  message = client.messages.create(
      model="claude-sonnet-4-6",
      max_tokens=1024,
      messages=[{"role": "user", "content": "Hello!"}],
  )
  print(message.content[0].text)
  ```

  ```javascript JavaScript
  import Anthropic from "@anthropic-ai/sdk";

  const client = new Anthropic({
      apiKey: process.env.COMETAPI_KEY,
      baseURL: "https://api.cometapi.com",
  });

  const message = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(message.content[0].text);
  ```
</CodeGroup>

## Enable extended thinking

Enable Claude's step-by-step reasoning with the `thinking` parameter. The response includes `thinking` content blocks showing Claude's internal reasoning before the final answer.

```python
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000,
    },
    messages=[
        {"role": "user", "content": "Prove that there are infinitely many primes."}
    ],
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
    elif block.type == "text":
        print(f"Answer: {block.text}")
```

<Warning>
  Thinking requires a minimum `budget_tokens` of **1,024**. Thinking tokens count toward your `max_tokens` limit, so set `max_tokens` high enough to cover both the thinking budget and the final response.
</Warning>
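One way to catch budget mistakes before sending a request is a small pre-flight check. The helper below is a hypothetical sketch, not part of the SDK. It enforces the 1,024-token minimum and, as an assumption that follows from thinking tokens counting toward `max_tokens`, also requires the budget to stay below `max_tokens` so the answer has room.

```python
MIN_THINKING_BUDGET = 1024  # minimum budget_tokens stated in the warning above


def thinking_kwargs(max_tokens, budget_tokens):
    """Return keyword arguments that enable thinking, or raise ValueError."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
    if budget_tokens >= max_tokens:
        # Assumption: leave at least one token of max_tokens for the final answer.
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


kwargs = thinking_kwargs(max_tokens=16000, budget_tokens=10000)
# Tokens that remain for the final answer after the thinking budget.
print(kwargs["max_tokens"] - kwargs["thinking"]["budget_tokens"])
```

The returned dict can be unpacked into the call, for example `client.messages.create(model=..., messages=..., **kwargs)`.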

***

## Cache prompts

To reduce latency and cost on subsequent requests, cache large system prompts or conversation prefixes. Add `cache_control` to content blocks that should be cached:

```python
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an expert code reviewer. [Long detailed instructions...]",
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Review this code..."}],
)
```

Cache usage is reported in the response `usage` field:

* `cache_creation_input_tokens`: tokens written to the cache (billed at a higher rate than regular input tokens)
* `cache_read_input_tokens`: tokens read from the cache (billed at a lower rate than regular input tokens)

<Info>
  Prompt caching requires a minimum of **1,024 tokens** in the cached content block. Content shorter than this will not be cached.
</Info>
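To confirm whether a request wrote to or read from the cache, you can inspect the two counters above. The function below is an illustrative sketch that works on the `usage` object as a plain dict. `cache_summary` is a hypothetical name, the sample numbers are made up, and treating `input_tokens` as the uncached remainder is an assumption you should verify against your own responses.

```python
def cache_summary(usage):
    """Classify a response's usage dict as a cache write, a cache hit, or neither."""
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    # Assumption: the three counters are disjoint and sum to the full prompt size.
    total = written + read + uncached
    if read:
        status = "hit"      # at least part of the prompt came from the cache
    elif written:
        status = "write"    # the prompt was cached for later requests
    else:
        status = "none"     # nothing cached, e.g. the block was too short
    return {"status": status, "cached_fraction": read / total if total else 0.0}


# Made-up usage values for a first request and an identical follow-up.
first = {"input_tokens": 12, "cache_creation_input_tokens": 2048, "cache_read_input_tokens": 0}
second = {"input_tokens": 12, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 2048}
print(cache_summary(first)["status"], cache_summary(second)["status"])
```

A status of `none` on a request that includes `cache_control` is a hint that the cached block fell below the minimum size.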

***

## Stream responses

To stream responses using Server-Sent Events (SSE), set `stream` to `true` in the request body. For a response with a single text block, events arrive in this order:

1. `message_start`: contains the message metadata and initial usage
2. `content_block_start`: marks the beginning of each content block
3. `content_block_delta`: incremental chunks of the block's content (`text_delta` for text)
4. `content_block_stop`: marks the end of each content block
5. `message_delta`: final `stop_reason` and complete `usage`
6. `message_stop`: signals the end of the stream

Steps 2 through 4 repeat for each content block in the response.

```python
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")
```
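The SDK helper hides the individual events. To see how the event sequence maps to output, the sketch below parses a hand-written SSE transcript (sample data, not a captured response) and rebuilds the text from `text_delta` chunks. `iter_sse_events` and `collect_text` are hypothetical helpers, and the parser handles only the `event:` and `data:` fields.

```python
import json

# Hand-written sample transcript for illustration only.
SAMPLE_STREAM = """\
event: message_start
data: {"type": "message_start", "message": {"role": "assistant", "content": []}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo!"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn"}, "usage": {"output_tokens": 4}}

event: message_stop
data: {"type": "message_stop"}
"""


def iter_sse_events(lines):
    """Yield (event_name, parsed_json) pairs; a blank line ends each event."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            if data:
                yield event, json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if data:  # flush a final event that has no trailing blank line
        yield event, json.loads("\n".join(data))


def collect_text(events):
    """Join text_delta chunks and capture the stop_reason from message_delta."""
    parts, stop_reason = [], None
    for name, payload in events:
        if name == "content_block_delta" and payload["delta"].get("type") == "text_delta":
            parts.append(payload["delta"]["text"])
        elif name == "message_delta":
            stop_reason = payload["delta"].get("stop_reason")
    return "".join(parts), stop_reason


text, stop_reason = collect_text(iter_sse_events(SAMPLE_STREAM.splitlines()))
print(text, stop_reason)
```

The same two functions can consume lines read from a live HTTP response, provided the lines are decoded to `str` first.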

***

## Control effort

To control how much effort Claude puts into generating a response, use `output_config.effort`:

```python
message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[
        {"role": "user", "content": "Summarize this briefly."}
    ],
    output_config={"effort": "low"},  # "low", "medium", or "high"
)
```
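If the effort level comes from configuration or user input, validating it locally gives a clearer error than a rejected request. The helper below is a hypothetical sketch: it accepts only the three values listed in the example above and returns a new dict of keyword arguments without modifying the original.

```python
EFFORT_LEVELS = ("low", "medium", "high")


def with_effort(request, effort):
    """Return a copy of the request kwargs with output_config.effort set."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    merged = dict(request)
    # Preserve any other output_config keys the caller already set.
    merged["output_config"] = {**request.get("output_config", {}), "effort": effort}
    return merged


base = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "messages": [{"role": "user", "content": "Summarize this briefly."}],
}
print(with_effort(base, "low")["output_config"])
```

The result can be unpacked into the SDK call as `client.messages.create(**with_effort(base, "low"))`.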

***

## Use server tools

Claude supports server-side tools that run on Anthropic's infrastructure:

<Tabs>
  <Tab title="Web Fetch">
    Fetch and analyze content from URLs:

    ```python
    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "Analyze the content at https://arxiv.org/abs/1512.03385"}
        ],
        tools=[
            {"type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5}
        ],
    )
    ```
  </Tab>

  <Tab title="Web Search">
    Search the web for real-time information:

    ```python
    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "What are the latest developments in AI?"}
        ],
        tools=[
            {"type": "web_search_20250305", "name": "web_search", "max_uses": 5}
        ],
    )
    ```
  </Tab>
</Tabs>
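Both tools share the same shape: a versioned `type`, a `name`, and a `max_uses` cap. If you enable them in several places, a small builder keeps the version strings in one spot. `server_tool` is a hypothetical helper, the version strings are the ones shown in the tabs above, and combining both tools in one request is an assumption to verify against your account.

```python
# Versioned type strings taken from the examples on this page.
SERVER_TOOL_TYPES = {
    "web_fetch": "web_fetch_20250910",
    "web_search": "web_search_20250305",
}


def server_tool(name, max_uses=5):
    """Build one entry for the `tools` list of a Messages request."""
    if name not in SERVER_TOOL_TYPES:
        raise ValueError(f"unknown server tool: {name!r}")
    if max_uses < 1:
        raise ValueError("max_uses must be at least 1")
    return {"type": SERVER_TOOL_TYPES[name], "name": name, "max_uses": max_uses}


tools = [server_tool("web_search", max_uses=3), server_tool("web_fetch")]
print(tools)
```

Pass the list as `tools=tools` in `client.messages.create(...)`.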

***

## Response example

A typical response from CometAPI's Anthropic endpoint:

```json
{
  "id": "msg_bdrk_01UjHdmSztrL7QYYm7CKBDFB",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 19,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 0,
      "ephemeral_1h_input_tokens": 0
    },
    "output_tokens": 4
  }
}
```
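When you work with the raw JSON rather than SDK objects, keep in mind that `content` is a list of blocks, and a response that includes thinking or tool use may not start with a text block. A minimal sketch, using a trimmed copy of the example above and a hypothetical `summarize` helper:

```python
import json

# Trimmed copy of the example response shown above.
RAW_RESPONSE = """
{
  "id": "msg_bdrk_01UjHdmSztrL7QYYm7CKBDFB",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-sonnet-4-6",
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 19, "output_tokens": 4}
}
"""


def summarize(response):
    """Collect the text blocks, stop reason, and token counts from a response dict."""
    text = "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )
    usage = response["usage"]
    return {
        "text": text,
        "stop_reason": response["stop_reason"],
        "input_tokens": usage["input_tokens"],
        "output_tokens": usage["output_tokens"],
    }


summary = summarize(json.loads(RAW_RESPONSE))
print(summary)
```

Filtering on `type` instead of indexing `content[0]` keeps the helper usable when other block types precede the text.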

***

## Compare with OpenAI-compatible endpoint

| Feature           | Anthropic Messages (`/v1/messages`)       | OpenAI-Compatible (`/v1/chat/completions`) |
| ----------------- | ----------------------------------------- | ------------------------------------------ |
| Extended thinking | `thinking` parameter with `budget_tokens` | Not available                              |
| Prompt caching    | `cache_control` on content blocks         | Not available                              |
| Effort control    | `output_config.effort`                    | Not available                              |
| Web fetch/search  | Server tools (`web_fetch`, `web_search`)  | Not available                              |
| Auth header       | `x-api-key` or `Bearer`                   | `Bearer` only                              |
| Response format   | Anthropic format (`content` blocks)       | OpenAI format (`choices`, `message`)       |
| Models            | Claude only                               | Multi-provider (GPT, Claude, Gemini, etc.) |

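The response format row matters most when one code path has to handle both endpoints. The sketch below reads the reply text from either shape. `extract_reply` is a hypothetical helper, the two sample dicts are hand-written and trimmed to the fields it reads, and detecting the format by the presence of `choices` is an assumption about how you route responses.

```python
def extract_reply(response):
    """Return the assistant text from an Anthropic-style or OpenAI-style response dict."""
    if "choices" in response:
        # OpenAI format: text lives in choices[0].message.content
        return response["choices"][0]["message"]["content"]
    # Anthropic format: text is spread across `text` content blocks
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


# Hand-written samples trimmed to the relevant fields.
anthropic_style = {"content": [{"type": "text", "text": "Hello!"}]}
openai_style = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}

print(extract_reply(anthropic_style) == extract_reply(openai_style))
```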

## OpenAPI

````yaml /api/openapi/text/post-anthropic-messages.openapi.json POST /v1/messages
openapi: 3.1.0
info:
  title: Anthropic Messages API
  version: 1.0.0
servers:
  - url: https://api.cometapi.com
security:
  - apiKeyAuth: []
paths:
  /v1/messages:
    post:
      summary: Anthropic Messages
      description: >-
        Send structured messages to Claude models using the Anthropic native API
        format. Supports text and multimodal inputs, multi-turn conversations,
        extended thinking, tool use, prompt caching, streaming, and web search.
        CometAPI proxies requests to the Anthropic API — use this endpoint when
        you need Claude-specific features like extended thinking or prompt
        caching that aren't available through the OpenAI-compatible endpoint.
      operationId: anthropic_messages
      parameters:
        - name: anthropic-version
          in: header
          required: false
          description: The Anthropic API version to use. Defaults to `2023-06-01`.
          schema:
            type: string
            default: '2023-06-01'
            example: '2023-06-01'
        - name: anthropic-beta
          in: header
          required: false
          description: >-
            Comma-separated list of beta features to enable. Examples:
            `max-tokens-3-5-sonnet-2024-07-15`, `pdfs-2024-09-25`,
            `output-128k-2025-02-19`.
          schema:
            type: string
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required:
                - model
                - messages
                - max_tokens
              properties:
                model:
                  type: string
                  description: >-
                    The Claude model to use. See the [Models
                    page](/overview/models) for current Claude model IDs.
                  example: claude-sonnet-4-6
                messages:
                  type: array
                  description: >-
                    The conversation messages. Must alternate between `user` and
                    `assistant` roles. Each message's `content` can be a string
                    or an array of content blocks (text, image, document,
                    tool_use, tool_result). There is a limit of 100,000 messages
                    per request.
                  items:
                    type: object
                    required:
                      - role
                      - content
                    properties:
                      role:
                        type: string
                        description: The role of the message author.
                        enum:
                          - user
                          - assistant
                      content:
                        description: >-
                          The message content. Either a plain string or an array
                          of content blocks for multimodal input.
                        oneOf:
                          - type: string
                          - type: array
                            items:
                              type: object
                              properties:
                                type:
                                  type: string
                                  description: The content block type.
                                  enum:
                                    - text
                                    - image
                                    - document
                                    - tool_use
                                    - tool_result
                                text:
                                  type: string
                                  description: Text content (for `text` type blocks).
                                source:
                                  type: object
                                  description: >-
                                    Source data for `image` or `document`
                                    blocks.
                                  properties:
                                    type:
                                      type: string
                                      enum:
                                        - base64
                                        - url
                                    media_type:
                                      type: string
                                    data:
                                      type: string
                                    url:
                                      type: string
                                cache_control:
                                  type: object
                                  description: >-
                                    Cache control for prompt caching. Set
                                    `{"type": "ephemeral"}` to cache this
                                    content block.
                                  properties:
                                    type:
                                      type: string
                                      enum:
                                        - ephemeral
                                    ttl:
                                      type: string
                                      description: >-
                                        Cache TTL. Options: `5m` (5 minutes)
                                        or `1h` (1 hour).
                max_tokens:
                  type: integer
                  description: >-
                    The maximum number of tokens to generate. The model may stop
                    before reaching this limit. When using `thinking`, the
                    thinking tokens count towards this limit.
                  minimum: 1
                  example: 1024
                system:
                  description: >-
                    System prompt providing context and instructions to Claude.
                    Can be a plain string or an array of content blocks (useful
                    for prompt caching).
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: object
                        properties:
                          type:
                            type: string
                            enum:
                              - text
                          text:
                            type: string
                          cache_control:
                            type: object
                            properties:
                              type:
                                type: string
                                enum:
                                  - ephemeral
                temperature:
                  type: number
                  description: >-
                    Controls randomness in the response. Range: 0.0–1.0. Use
                    lower values for analytical tasks and higher values for
                    creative tasks. Defaults to `1.0`.
                  minimum: 0
                  maximum: 1
                  default: 1
                top_p:
                  type: number
                  description: >-
                    Nucleus sampling threshold. Only tokens with cumulative
                    probability up to this value are considered. Range: 0.0–1.0.
                    Use either `temperature` or `top_p`, not both.
                  minimum: 0
                  maximum: 1
                top_k:
                  type: integer
                  description: >-
                    Only sample from the top K most probable tokens. Recommended
                    for advanced use cases only.
                  minimum: 0
                stream:
                  type: boolean
                  description: >-
                    If `true`, stream the response incrementally using
                    Server-Sent Events (SSE). Events include `message_start`,
                    `content_block_start`, `content_block_delta`,
                    `content_block_stop`, `message_delta`, and `message_stop`.
                  default: false
                stop_sequences:
                  type: array
                  description: >-
                    Custom strings that cause the model to stop generating when
                    encountered. The stop sequence is not included in the
                    response.
                  items:
                    type: string
                thinking:
                  type: object
                  description: >-
                    Enable extended thinking — Claude's step-by-step reasoning
                    process. When enabled, the response includes `thinking`
                    content blocks before the answer. Requires a minimum
                    `budget_tokens` of 1,024.
                  properties:
                    type:
                      type: string
                      description: Set to `enabled` to turn on extended thinking.
                      enum:
                        - enabled
                        - disabled
                    budget_tokens:
                      type: integer
                      description: >-
                        Maximum number of tokens for the thinking process.
                        Minimum: 1,024. These tokens count towards the
                        `max_tokens` limit.
                      minimum: 1024
                tools:
                  type: array
                  description: >-
                    Tools the model may use. Supports client-defined functions,
                    web search (`web_search_20250305`), web fetch
                    (`web_fetch_20250910`), code execution
                    (`code_execution_20250522`), and more.
                  items:
                    type: object
                    properties:
                      name:
                        type: string
                        description: The tool name.
                      description:
                        type: string
                        description: A description of what the tool does.
                      input_schema:
                        type: object
                        description: JSON Schema defining the tool's input parameters.
                      type:
                        type: string
                        description: >-
                          The tool type. Required for server tools (e.g.,
                          `web_search_20250305`, `web_fetch_20250910`,
                          `code_execution_20250522`).
                      max_uses:
                        type: integer
                        description: >-
                          Maximum number of times this tool can be used in a
                          single request.
                tool_choice:
                  type: object
                  description: Controls how the model uses tools.
                  properties:
                    type:
                      type: string
                      description: >-
                        The tool choice mode: `auto` (model decides), `any`
                        (must use a tool), `tool` (must use a specific tool), or
                        `none` (no tools).
                      enum:
                        - auto
                        - any
                        - tool
                        - none
                    name:
                      type: string
                      description: >-
                        The specific tool name to use. Required when `type` is
                        `tool`.
                    disable_parallel_tool_use:
                      type: boolean
                      description: >-
                        If `true`, prevent the model from calling multiple tools
                        in parallel.
                metadata:
                  type: object
                  description: Request metadata for tracking and analytics.
                  properties:
                    user_id:
                      type: string
                      description: >-
                        An external identifier for the user making the request.
                        Used for abuse detection.
                output_config:
                  type: object
                  description: Configuration for output behavior.
                  properties:
                    effort:
                      type: string
                      description: >-
                        Controls Claude's effort level in generating responses.
                        Lower effort means faster, more concise responses.
                      enum:
                        - low
                        - medium
                        - high
                    format:
                      type: object
                      description: >-
                        Output format configuration. Use `{"type":
                        "json_schema", "schema": {...}}` for structured JSON
                        output.
                service_tier:
                  type: string
                  description: >-
                    The service tier to use. `auto` tries priority capacity
                    first, `standard_only` uses only standard capacity.
                  enum:
                    - auto
                    - standard_only
            examples:
              Default:
                summary: Basic message
                value:
                  model: claude-sonnet-4-6
                  max_tokens: 1024
                  system: You are a helpful assistant.
                  messages:
                    - role: user
                      content: Hello, world
              Prompt Cache:
                summary: With prompt caching
                value:
                  model: claude-sonnet-4-6
                  max_tokens: 1024
                  system:
                    - type: text
                      text: >-
                        You are an expert code reviewer. Analyze code for
                        correctness, performance, security, and maintainability.
                        Follow SOLID principles and provide actionable
                        suggestions. [Long system prompt content that exceeds
                        1024 tokens to enable caching...]
                      cache_control:
                        type: ephemeral
                  messages:
                    - role: user
                      content: |-
                        Please review this Python code:

                        def calculate_order_total(items):
                            total = 0
                            for item in items:
                                total += item['price'] * item['quantity']
                            return total
              Thinking Control:
                summary: With extended thinking
                value:
                  model: claude-sonnet-4-6
                  max_tokens: 16000
                  thinking:
                    type: enabled
                    budget_tokens: 10000
                  messages:
                    - role: user
                      content: >-
                        Are there an infinite number of prime numbers such that
                        n mod 4 == 3?
              Streaming:
                summary: Streaming response
                value:
                  model: claude-sonnet-4-6
                  max_tokens: 256
                  stream: true
                  messages:
                    - role: user
                      content: Hello
              Web Fetch:
                summary: With web fetch tool
                value:
                  model: claude-sonnet-4-6
                  max_tokens: 1024
                  messages:
                    - role: user
                      content: >-
                        Please analyze the content at
                        https://arxiv.org/abs/1512.03385
                  tools:
                    - type: web_fetch_20250910
                      name: web_fetch
                      max_uses: 5
      responses:
        '200':
          description: >-
            Successful response. When `stream` is `true`, the response is a
            stream of SSE events.
          content:
            application/json:
              schema:
                type: object
                properties:
                  id:
                    type: string
                    description: >-
                      Unique identifier for this message (e.g.,
                      `msg_01XFDUDYJgAACzvnptvVoYEL`).
                  type:
                    type: string
                    description: Always `message`.
                    enum:
                      - message
                  role:
                    type: string
                    description: Always `assistant`.
                    enum:
                      - assistant
                  content:
                    type: array
                    description: >-
                      The response content blocks. May include `text`,
                      `thinking`, `tool_use`, and other block types.
                    items:
                      type: object
                      properties:
                        type:
                          type: string
                          description: The content block type.
                          enum:
                            - text
                            - thinking
                            - tool_use
                        text:
                          type: string
                          description: The generated text (for `text` blocks).
                        thinking:
                          type: string
                          description: >-
                            The model's thinking process (for `thinking` blocks,
                            when extended thinking is enabled).
                        signature:
                          type: string
                          description: Cryptographic signature for the thinking block.
                        id:
                          type: string
                          description: Tool use ID (for `tool_use` blocks).
                        name:
                          type: string
                          description: Tool name (for `tool_use` blocks).
                        input:
                          type: object
                          description: Tool input arguments (for `tool_use` blocks).
                  model:
                    type: string
                    description: >-
                      The specific model version that generated this response
                      (e.g., `claude-sonnet-4-6`).
                  stop_reason:
                    type: string
                    description: Why the model stopped generating.
                    enum:
                      - end_turn
                      - max_tokens
                      - stop_sequence
                      - tool_use
                      - pause_turn
                  stop_sequence:
                    type:
                      - string
                      - 'null'
                    description: >-
                      The stop sequence that caused the model to stop, if
                      applicable.
                  usage:
                    type: object
                    description: Token usage statistics.
                    properties:
                      input_tokens:
                        type: integer
                        description: >-
                          Number of input tokens (prompt + conversation
                          history).
                      output_tokens:
                        type: integer
                        description: Number of output tokens generated.
                      cache_creation_input_tokens:
                        type: integer
                        description: >-
                          Number of input tokens used to create the prompt
                          cache.
                      cache_read_input_tokens:
                        type: integer
                        description: Number of input tokens read from the prompt cache.
                      cache_creation:
                        type: object
                        description: Detailed cache creation token breakdown by TTL tier.
                        properties:
                          ephemeral_5m_input_tokens:
                            type: integer
                            description: Tokens written to 5-minute ephemeral cache.
                          ephemeral_1h_input_tokens:
                            type: integer
                            description: Tokens written to 1-hour ephemeral cache.
      x-codeSamples:
        - lang: Python
          label: Basic
          source: |
            import os
            import anthropic

            client = anthropic.Anthropic(
                base_url="https://api.cometapi.com",
                api_key=os.environ["COMETAPI_KEY"],
            )

            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                system="You are a helpful assistant.",
                messages=[
                    {"role": "user", "content": "Hello, world"}
                ],
            )

            print(message.content[0].text)
        - lang: Python
          label: Prompt Cache
          source: |
            import os
            import anthropic

            client = anthropic.Anthropic(
                base_url="https://api.cometapi.com",
                api_key=os.environ["COMETAPI_KEY"],
            )

            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                system=[
                    {
                        "type": "text",
                        "text": "You are an expert code reviewer. Analyze code, identify issues, "
                                "and suggest improvements following SOLID principles...",
                        "cache_control": {"type": "ephemeral"},
                    }
                ],
                messages=[
                    {
                        "role": "user",
                        "content": "Please review this Python code:\n\n"
                                   "def calculate_order_total(items):\n"
                                   "    total = 0\n"
                                   "    for item in items:\n"
                                   "        total += item['price'] * item['quantity']\n"
                                   "    return total",
                    }
                ],
            )

            print(message.content[0].text)
        - lang: Python
          label: Extended Thinking
          source: |
            import os
            import anthropic

            client = anthropic.Anthropic(
                base_url="https://api.cometapi.com",
                api_key=os.environ["COMETAPI_KEY"],
            )

            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=16000,
                thinking={
                    "type": "enabled",
                    "budget_tokens": 10000,
                },
                messages=[
                    {
                        "role": "user",
                        "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?",
                    }
                ],
            )

            for block in message.content:
                if block.type == "thinking":
                    print(f"Thinking: {block.thinking[:200]}...")
                elif block.type == "text":
                    print(f"Answer: {block.text}")
        - lang: Python
          label: Effort Control
          source: |
            import os
            import anthropic

            client = anthropic.Anthropic(
                base_url="https://api.cometapi.com",
                api_key=os.environ["COMETAPI_KEY"],
            )

            message = client.messages.create(
                model="claude-opus-4-6",
                max_tokens=4096,
                messages=[
                    {
                        "role": "user",
                        "content": "Analyze the trade-offs between microservices and monolithic architectures",
                    }
                ],
                output_config={"effort": "medium"},
            )

            print(message.content[0].text)
        - lang: Python
          label: Streaming
          source: |
            import os
            import anthropic

            client = anthropic.Anthropic(
                base_url="https://api.cometapi.com",
                api_key=os.environ["COMETAPI_KEY"],
            )

            with client.messages.stream(
                model="claude-sonnet-4-6",
                max_tokens=256,
                messages=[
                    {"role": "user", "content": "Hello"}
                ],
            ) as stream:
                for text in stream.text_stream:
                    print(text, end="", flush=True)
        - lang: Python
          label: Web Fetch
          source: |
            import os
            import anthropic

            client = anthropic.Anthropic(
                base_url="https://api.cometapi.com",
                api_key=os.environ["COMETAPI_KEY"],
            )

            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=[
                    {
                        "role": "user",
                        "content": "Please analyze the content at https://arxiv.org/abs/1512.03385",
                    }
                ],
                tools=[
                    {
                        "type": "web_fetch_20250910",
                        "name": "web_fetch",
                        "max_uses": 5,
                    }
                ],
            )

            print(message.content)
        - lang: JavaScript
          label: Basic
          source: |
            import Anthropic from "@anthropic-ai/sdk";

            const client = new Anthropic({
                apiKey: process.env.COMETAPI_KEY,
                baseURL: "https://api.cometapi.com",
            });

            const message = await client.messages.create({
                model: "claude-sonnet-4-6",
                max_tokens: 1024,
                system: "You are a helpful assistant.",
                messages: [
                    { role: "user", content: "Hello, world" },
                ],
            });

            console.log(message.content[0].text);
        - lang: JavaScript
          label: Extended Thinking
          source: |
            import Anthropic from "@anthropic-ai/sdk";

            const client = new Anthropic({
                apiKey: process.env.COMETAPI_KEY,
                baseURL: "https://api.cometapi.com",
            });

            const message = await client.messages.create({
                model: "claude-sonnet-4-6",
                max_tokens: 16000,
                thinking: {
                    type: "enabled",
                    budget_tokens: 10000,
                },
                messages: [
                    {
                        role: "user",
                        content: "Are there an infinite number of prime numbers such that n mod 4 == 3?",
                    },
                ],
            });

            for (const block of message.content) {
                if (block.type === "thinking") {
                    console.log(`Thinking: ${block.thinking.slice(0, 200)}...`);
                } else if (block.type === "text") {
                    console.log(`Answer: ${block.text}`);
                }
            }
        - lang: JavaScript
          label: Effort Control
          source: |
            import Anthropic from "@anthropic-ai/sdk";

            const client = new Anthropic({
                apiKey: process.env.COMETAPI_KEY,
                baseURL: "https://api.cometapi.com",
            });

            const message = await client.messages.create({
                model: "claude-opus-4-6",
                max_tokens: 4096,
                messages: [
                    {
                        role: "user",
                        content: "Analyze the trade-offs between microservices and monolithic architectures",
                    },
                ],
                output_config: { effort: "medium" },
            });

            console.log(message.content[0].text);
        - lang: JavaScript
          label: Streaming
          source: |
            import Anthropic from "@anthropic-ai/sdk";

            const client = new Anthropic({
                apiKey: process.env.COMETAPI_KEY,
                baseURL: "https://api.cometapi.com",
            });

            const stream = await client.messages.create({
                model: "claude-sonnet-4-6",
                max_tokens: 256,
                messages: [
                    { role: "user", content: "Hello" },
                ],
                stream: true,
            });

            for await (const event of stream) {
                if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
                    process.stdout.write(event.delta.text);
                }
            }
        - lang: JavaScript
          label: Web Fetch
          source: |
            import Anthropic from "@anthropic-ai/sdk";

            const client = new Anthropic({
                apiKey: process.env.COMETAPI_KEY,
                baseURL: "https://api.cometapi.com",
            });

            const message = await client.messages.create({
                model: "claude-sonnet-4-6",
                max_tokens: 1024,
                messages: [
                    {
                        role: "user",
                        content: "Please analyze the content at https://arxiv.org/abs/1512.03385",
                    },
                ],
                tools: [
                    {
                        type: "web_fetch_20250910",
                        name: "web_fetch",
                        max_uses: 5,
                    },
                ],
            });

            console.log(message.content);
        - lang: Shell
          label: Basic
          source: |
            curl https://api.cometapi.com/v1/messages \
              -H "Content-Type: application/json" \
              -H "x-api-key: $COMETAPI_KEY" \
              -H "anthropic-version: 2023-06-01" \
              -d '{
                "model": "claude-sonnet-4-6",
                "max_tokens": 1024,
                "system": "You are a helpful assistant.",
                "messages": [
                  {"role": "user", "content": "Hello, world"}
                ]
              }'
        - lang: Shell
          label: Extended Thinking
          source: |
            curl https://api.cometapi.com/v1/messages \
              -H "Content-Type: application/json" \
              -H "x-api-key: $COMETAPI_KEY" \
              -H "anthropic-version: 2023-06-01" \
              -d '{
                "model": "claude-sonnet-4-6",
                "max_tokens": 16000,
                "thinking": {
                  "type": "enabled",
                  "budget_tokens": 10000
                },
                "messages": [
                  {"role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"}
                ]
              }'
        - lang: Shell
          label: Effort Control
          source: |
            curl https://api.cometapi.com/v1/messages \
              -H "Content-Type: application/json" \
              -H "x-api-key: $COMETAPI_KEY" \
              -H "anthropic-version: 2023-06-01" \
              -d '{
                "model": "claude-opus-4-6",
                "max_tokens": 4096,
                "messages": [
                  {"role": "user", "content": "Analyze the trade-offs between microservices and monolithic architectures"}
                ],
                "output_config": {"effort": "medium"}
              }'
        - lang: Shell
          label: Streaming
          source: |
            curl -N https://api.cometapi.com/v1/messages \
              -H "Content-Type: application/json" \
              -H "x-api-key: $COMETAPI_KEY" \
              -H "anthropic-version: 2023-06-01" \
              -d '{
                "model": "claude-sonnet-4-6",
                "max_tokens": 256,
                "stream": true,
                "messages": [
                  {"role": "user", "content": "Hello"}
                ]
              }'
        - lang: Shell
          label: Prompt Cache
          source: |
            curl https://api.cometapi.com/v1/messages \
              -H "Content-Type: application/json" \
              -H "x-api-key: $COMETAPI_KEY" \
              -H "anthropic-version: 2023-06-01" \
              -d '{
                "model": "claude-sonnet-4-6",
                "max_tokens": 1024,
                "system": [
                  {
                    "type": "text",
                    "text": "You are an expert code reviewer...",
                    "cache_control": {"type": "ephemeral"}
                  }
                ],
                "messages": [
                  {"role": "user", "content": "Please review this code..."}
                ]
              }'
        - lang: Shell
          label: Web Fetch
          source: |
            curl https://api.cometapi.com/v1/messages \
              -H "Content-Type: application/json" \
              -H "x-api-key: $COMETAPI_KEY" \
              -H "anthropic-version: 2023-06-01" \
              -d '{
                "model": "claude-sonnet-4-6",
                "max_tokens": 1024,
                "messages": [
                  {"role": "user", "content": "Please analyze the content at https://arxiv.org/abs/1512.03385"}
                ],
                "tools": [
                  {"type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5}
                ]
              }'
components:
  securitySchemes:
    apiKeyAuth:
      type: apiKey
      in: header
      name: x-api-key
      description: >-
        Your CometAPI key passed via the `x-api-key` header. `Authorization:
        Bearer $COMETAPI_KEY` is also supported.

````
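
The security scheme above names two accepted authentication headers. The following is a minimal sketch, using only the Python standard library, of how a request could be built with either one. The helper names `build_headers`, `build_request`, and `send` are hypothetical and exist only for this illustration. The endpoint, `anthropic-version` value, and request body are copied from the Shell samples in the spec.

```python
import json
import urllib.request

# Endpoint taken from the Shell samples in the spec above.
API_URL = "https://api.cometapi.com/v1/messages"


def build_headers(api_key: str, use_bearer: bool = False) -> dict:
    """Return request headers using one of the two supported auth styles.

    Hypothetical helper for illustration; not part of any SDK.
    """
    headers = {
        "Content-Type": "application/json",
        "anthropic-version": "2023-06-01",
    }
    if use_bearer:
        # Alternative style: Authorization: Bearer <key>
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Default style, also used by the official Anthropic SDKs.
        headers["x-api-key"] = api_key
    return headers


def build_request(api_key: str, use_bearer: bool = False) -> urllib.request.Request:
    """Build (but do not send) a basic Messages request."""
    body = json.dumps(
        {
            "model": "claude-sonnet-4-6",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": "Hello, world"}],
        }
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers=build_headers(api_key, use_bearer),
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, call `send(build_request(os.environ["COMETAPI_KEY"], use_bearer=True))` after importing `os`. Assuming the response has the same shape the SDK samples read from, the reply text would be at `["content"][0]["text"]`.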