Handle rate limits by controlling concurrency before requests leave your app. When CometAPI returns 429, retry with exponential backoff and jitter. If retries keep recurring, reduce burst traffic.
Limit concurrency
The following Python example caps concurrent chat requests with an async semaphore:
import asyncio
import os

from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key=os.environ["COMETAPI_KEY"],
    base_url="https://api.cometapi.com/v1",
)

# At most 5 requests are in flight at any time.
semaphore = asyncio.Semaphore(5)


async def ask(prompt):
    # Wait for a free slot before sending the request.
    async with semaphore:
        completion = await client.chat.completions.create(
            model="your-model-id",
            messages=[{"role": "user", "content": prompt}],
        )
        return completion.choices[0].message.content


async def main():
    prompts = ["Say hello.", "Write a title.", "Return one JSON key."]
    # gather() returns results in the same order as the prompts.
    results = await asyncio.gather(*(ask(prompt) for prompt in prompts))
    print(results)


asyncio.run(main())
The result is a list of model outputs in the same order as the prompts. The exact text depends on the model. Shown here as JSON:
[
  "Hello.",
  "A concise title",
  "{\"key\":\"value\"}"
]
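A semaphore is one way to cap concurrency. A fixed pool of workers pulling from a queue is another. The following is a minimal sketch of that alternative, not a CometAPI-specific API: `run_pool` and `fake_ask` are hypothetical names, and `fake_ask` stands in for the real request so the example runs offline.

```python
import asyncio


async def run_pool(items, handler, workers=5):
    """Process items with at most `workers` handler calls in flight."""
    queue = asyncio.Queue()
    for index, item in enumerate(items):
        queue.put_nowait((index, item))

    results = [None] * len(items)

    async def worker():
        while True:
            try:
                index, item = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # No work left; this worker exits.
            # Store by index so results keep the input order.
            results[index] = await handler(item)

    await asyncio.gather(*(worker() for _ in range(workers)))
    return results


async def fake_ask(prompt):
    # Stand-in for the real API call (hypothetical helper for this sketch).
    await asyncio.sleep(0.01)
    return prompt.upper()


if __name__ == "__main__":
    print(asyncio.run(run_pool(["a", "b", "c"], fake_ask, workers=2)))
```

To use it with the client above, pass `ask` without its semaphore as the `handler`. The pool size then plays the role of the semaphore limit.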
Retry rate limits
The following JavaScript example retries 429 responses with exponential backoff and jitter:
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.COMETAPI_KEY,
  baseURL: "https://api.cometapi.com/v1",
  // Disable the SDK's built-in retries so the loop below is the only retry layer.
  maxRetries: 0,
});

function sleep(milliseconds) {
  return new Promise((resolve) => setTimeout(resolve, milliseconds));
}

async function createCompletion() {
  for (let attempt = 0; attempt < 5; attempt += 1) {
    try {
      return await client.chat.completions.create({
        model: "your-model-id",
        messages: [{ role: "user", content: "Say hello." }],
      });
    } catch (error) {
      // Give up on non-429 errors and after the fifth attempt.
      if (error.status !== 429 || attempt === 4) {
        throw error;
      }
      // Exponential delay capped at 30 seconds, plus up to 1 second of jitter.
      const delay = Math.min(30000, 1000 * 2 ** attempt);
      await sleep(delay + Math.random() * 1000);
    }
  }
}

const completion = await createCompletion();
console.log(completion.choices[0].message.content);
A successful response is a normal chat completion, abridged here to the fields the example reads:
{
  "choices": [
    {
      "message": {
        "content": "Hello."
      }
    }
  ]
}
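To see what delays the retry loop above produces, the same formula can be sketched in Python. `backoff_delay_ms` and its parameters are hypothetical names chosen for this illustration. The defaults mirror the constants in the JavaScript example.

```python
import random


def backoff_delay_ms(attempt, base_ms=1000, cap_ms=30000, jitter_ms=1000,
                     rng=random.random):
    """Delay before the retry that follows failed attempt `attempt` (0-based)."""
    exponential = min(cap_ms, base_ms * 2 ** attempt)
    return exponential + rng() * jitter_ms


# With jitter disabled, the base schedule for the four retries is visible.
schedule = [backoff_delay_ms(attempt, rng=lambda: 0.0) for attempt in range(4)]
print(schedule)  # [1000.0, 2000.0, 4000.0, 8000.0]
```

With five attempts the loop sleeps at most four times, so the 30-second cap is never reached. It starts to matter at attempt index 5, where the uncapped delay would be 32 seconds.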
Common errors
| Error | Fix |
|---|---|
| Unlimited parallel requests | Add a semaphore, queue, or worker pool. |
| Retrying all failures | Retry only 429 and temporary server failures. |
| No per-model metrics | Log route, model ID, status, and latency for each request. |
| Retry storm | Add jitter and cap the maximum retry delay. |
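
Two of the fixes in the table, retrying only selected statuses and logging per-request metrics, can be sketched together. This is an illustration under assumptions: the set of 5xx codes treated as temporary, the helper names, and the log field names are all choices made for this example, not values defined by CometAPI.

```python
import json
import time

# Assumption for this sketch: 429 plus these 5xx codes count as temporary.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def is_retryable(status):
    """Return True only for statuses worth retrying."""
    return status in RETRYABLE_STATUSES


def log_request(route, model, status, started_at):
    """Emit one JSON log line per request; field names are illustrative."""
    record = {
        "route": route,
        "model": model,
        "status": status,
        "latency_ms": round((time.monotonic() - started_at) * 1000),
        "retryable": is_retryable(status),
    }
    print(json.dumps(record))
    return record


started = time.monotonic()
# The route string is an illustrative label for the chat completions call.
log_request("/v1/chat/completions", "your-model-id", 429, started)
```

Grouping these log lines by model ID and status shows which models hit rate limits most often, which helps decide where to lower concurrency first.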
Last updated: May 27, 2026