Estimate request cost before calling a model

Estimate cost before a model call by combining the price listed in the model directory with the units that the endpoint bills: tokens, images, audio length, or video tasks. Treat the estimate as a budget guard, not an invoice; after the request completes, rely on the actual usage and billing records.

Estimate token-based calls

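The following Python example estimates token-based request cost from pricing values configured in environment variables. The input token count is approximated from the prompt length, so treat the result as an estimate rather than an exact maximum: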

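import math
import os

prompt = "Write a short product description for CometAPI."
# Output cap; send the same value as max_completion_tokens in the request.
max_output_tokens = 200

# Prices in dollars per 1M tokens, copied from the model directory.
# A missing variable raises KeyError, so the script fails before any call is made.
input_price_per_1m = float(os.environ["MODEL_INPUT_PRICE_PER_1M"])
output_price_per_1m = float(os.environ["MODEL_OUTPUT_PRICE_PER_1M"])

# Rough heuristic: about 4 characters per token for English text.
estimated_input_tokens = math.ceil(len(prompt) / 4)

# Assume the full output cap is consumed.
estimated_cost = (
    estimated_input_tokens * input_price_per_1m
    + max_output_tokens * output_price_per_1m
) / 1_000_000

print(f"Estimated maximum cost: ${estimated_cost:.6f}")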

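The result is a pre-call estimate. For example, with an input price of $0.25 and an output price of $0.60 per 1M tokens (illustrative values), the script prints: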

Estimated maximum cost: $0.000123
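To use the estimate as a budget guard, one option is to compare it with a per-request limit before sending anything. The following is a minimal sketch rather than a prescribed implementation: the function names `estimate_max_cost` and `within_budget`, the prices, and the budget value are all hypothetical and chosen for illustration.

```python
import math


def estimate_max_cost(prompt: str, max_output_tokens: int,
                      input_price_per_1m: float,
                      output_price_per_1m: float) -> float:
    """Estimate cost in dollars, using the ~4 characters per token heuristic."""
    estimated_input_tokens = math.ceil(len(prompt) / 4)
    return (
        estimated_input_tokens * input_price_per_1m
        + max_output_tokens * output_price_per_1m
    ) / 1_000_000


def within_budget(estimated_cost: float, budget: float) -> bool:
    """Return True when the estimate does not exceed the per-request budget."""
    return estimated_cost <= budget


# Illustrative prices and budget (assumptions, not real pricing).
cost = estimate_max_cost(
    "Write a short product description for CometAPI.", 200,
    input_price_per_1m=0.25, output_price_per_1m=0.60,
)

if within_budget(cost, budget=0.001):
    print(f"OK to send: estimated ${cost:.6f}")
else:
    print(f"Blocked: estimated ${cost:.6f} exceeds budget")
```

Keeping the check separate from the estimate makes it easy to swap in a different budget per user, per job, or per day.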

Set a maximum output budget

The following request caps generated output so the estimate has an upper bound:

curl https://api.cometapi.com/v1/chat/completions \
  -H "Authorization: Bearer $COMETAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-id",
    "messages": [
      {
        "role": "user",
        "content": "Write a short product description for CometAPI."
      }
    ],
    "max_completion_tokens": 200
  }'
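The estimate only holds if the cap sent in the request matches the cap used in the calculation. One way to keep them in sync is to build the request body from the same constant. The sketch below constructs the request with Python's standard library but does not send it; `MAX_OUTPUT_TOKENS`, `build_request`, and the placeholder key are hypothetical names used for illustration.

```python
import json
import urllib.request

MAX_OUTPUT_TOKENS = 200  # the same cap used in the cost estimate


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the capped chat completions request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_completion_tokens": MAX_OUTPUT_TOKENS,
    }
    return urllib.request.Request(
        "https://api.cometapi.com/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key and model ID; replace both before sending a real request.
request = build_request(
    "placeholder-key", "your-model-id",
    "Write a short product description for CometAPI.",
)
print(json.loads(request.data)["max_completion_tokens"])
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access or a valid key.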

After the model call, the response reports actual token usage in its `usage` object (other response fields are omitted here):

{
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 42,
    "total_tokens": 52
  }
}
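The reported usage can be turned into an actual cost and compared with the pre-call estimate. The following sketch assumes the same illustrative prices as before ($0.25 input and $0.60 output per 1M tokens) and a hypothetical helper named `actual_cost`; billing records remain the authoritative figure.

```python
import json

# Illustrative prices per 1M tokens (assumptions; copy real values
# from the model directory for the model ID you call).
INPUT_PRICE_PER_1M = 0.25
OUTPUT_PRICE_PER_1M = 0.60


def actual_cost(usage: dict) -> float:
    """Cost in dollars from the token counts reported in the usage object."""
    return (
        usage["prompt_tokens"] * INPUT_PRICE_PER_1M
        + usage["completion_tokens"] * OUTPUT_PRICE_PER_1M
    ) / 1_000_000


response = json.loads(
    '{"usage": {"prompt_tokens": 10, "completion_tokens": 42, "total_tokens": 52}}'
)
cost = actual_cost(response["usage"])
estimate = 0.000123  # pre-call estimate computed with the same illustrative prices

print(f"Actual: ${cost:.6f}, estimated maximum: ${estimate:.6f}")
```

Here the model stopped well short of the 200-token cap, so the actual cost is a fraction of the estimate. A large or persistent gap in the other direction suggests the wrong price or a missing billing unit.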

Estimate task-based calls

The following JavaScript example estimates the cost of a task-based workflow, such as image or video generation, from a configured price per task:

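const taskCount = 3;
const pricePerTask = Number(process.env.MODEL_PRICE_PER_TASK);

// Number(undefined) is NaN, so fail fast when the price is missing or invalid.
if (!Number.isFinite(pricePerTask)) {
  throw new Error("Set MODEL_PRICE_PER_TASK to a numeric price");
}

const estimatedCost = taskCount * pricePerTask;

console.log(`Estimated maximum cost: $${estimatedCost.toFixed(4)}`);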

The result is the task budget. For example, with a price of $0.15 per task (an illustrative value), the script prints:

Estimated maximum cost: $0.4500

Common errors

| Error | Fix |
| --- | --- |
| Using a price from the wrong model | Copy pricing from the same model ID in the model directory. |
| Ignoring output tokens | Set `max_completion_tokens` or the endpoint-specific output limit. |
| Treating estimates as invoices | Compare estimates with actual usage after the call. |
| Missing task multipliers | For image, audio, and video, check whether billing is per task, per second, or per generated asset. |
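The last row matters because the same formula applies to every billing unit, and only the unit count changes. The sketch below illustrates this with a hypothetical helper, `estimate_unit_cost`; the counts and prices are made-up values, so confirm the actual billing unit for each model in the model directory.

```python
def estimate_unit_cost(unit_count: float, price_per_unit: float) -> float:
    """Cost in dollars for billing that scales with a unit count."""
    if unit_count < 0 or price_per_unit < 0:
        raise ValueError("unit_count and price_per_unit must be non-negative")
    return unit_count * price_per_unit


# Illustrative values only (assumptions, not real pricing).
per_task = estimate_unit_cost(3, 0.15)       # 3 generation tasks
per_second = estimate_unit_cost(12.5, 0.02)  # 12.5 seconds of audio
per_asset = estimate_unit_cost(4, 0.04)      # 4 generated images

for label, cost in [
    ("per task", per_task),
    ("per second", per_second),
    ("per asset", per_asset),
]:
    print(f"{label}: ${cost:.4f}")
```

Using the wrong unit, for example counting one task when the endpoint bills each generated image, underestimates the cost by the missing multiplier.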
