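Create transcription
Use the CometAPI endpoint POST /v1/audio/transcriptions to transcribe audio into text in its original language. The endpoint supports Whisper models and multiple output formats.
First request
Send a supported audio file and provide both model and file. Keep the first file short while you validate upload handling, authentication, and response parsing.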
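A first request along these lines can be sketched with the Python standard library alone. The base URL https://api.cometapi.com, the model name whisper-1, and the helper names build_transcription_request and send are assumptions for illustration; take the real values from your account and the Models page.

```python
import io
import urllib.request
import uuid

# Assumed base URL; confirm it against your CometAPI account settings.
BASE_URL = "https://api.cometapi.com"


def build_transcription_request(api_key, audio_bytes, filename, model, extra_fields=None):
    """Build (but do not send) a multipart/form-data POST for a transcription."""
    boundary = uuid.uuid4().hex
    fields = {"model": model}
    fields.update(extra_fields or {})  # optional form fields, if any

    body = io.BytesIO()
    # Plain form fields first.
    for name, value in fields.items():
        body.write(f"--{boundary}\r\n".encode())
        body.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        body.write(f"{value}\r\n".encode())
    # Then the audio file part.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())

    return urllib.request.Request(
        BASE_URL + "/v1/audio/transcriptions",
        data=body.getvalue(),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send(request):
    """Send the prepared request and return the raw response body as bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

To try it, read a short clip into memory, pass its bytes together with your key and a model name to build_transcription_request, and hand the result to send. Keeping request construction separate from sending makes it easy to inspect the headers and body before anything goes over the network.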
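Read the response
The default response contains the transcribed text. If you request a different response format, make sure your client parses that format instead of assuming a JSON structure.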
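One way to keep parsing aligned with the requested format is to branch on it explicitly. The sketch below is illustrative only: the helper name extract_transcript is hypothetical, and it assumes that verbose_json, like json, carries a top-level text field, while text, srt, and vtt come back as plain text.

```python
import json

# Formats assumed to return a JSON document with a top-level "text" field.
JSON_FORMATS = {"json", "verbose_json"}
# Formats assumed to return a plain-text body (subtitles keep their timestamps).
TEXT_FORMATS = {"text", "srt", "vtt"}


def extract_transcript(body: bytes, response_format: str = "json") -> str:
    """Return the transcript from a raw response body, based on the requested format."""
    decoded = body.decode("utf-8")
    if response_format in JSON_FORMATS:
        return json.loads(decoded)["text"]
    if response_format in TEXT_FORMATS:
        return decoded
    raise ValueError(f"unsupported response format: {response_format!r}")
```

For srt and vtt the returned string is the full subtitle document, so any cue numbers and timestamps remain in it; strip them separately if you only need the spoken words.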
Authorization
Bearer token authentication. Use your CometAPI key.
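As a small sketch of the bearer scheme, the helper below reads the key from an environment variable and builds the Authorization header. The variable name COMETAPI_KEY and the helper name auth_headers are hypothetical; use whatever secret storage your deployment already has.

```python
import os


def auth_headers(env_var: str = "COMETAPI_KEY") -> dict:
    """Build the Authorization header from a key stored in the environment."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early rather than sending an unauthenticated request.
        raise RuntimeError(f"set {env_var} to your CometAPI key")
    return {"Authorization": f"Bearer {key}"}
```

Reading the key from the environment keeps it out of source control; the early error avoids a round trip that would fail authentication anyway.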
Request body
file: The audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
model: The speech-to-text model to use. Choose a current speech model from the Models page.
language: The language of the input audio in ISO-639-1 format (e.g., en, zh, ja). Supplying the language improves accuracy and latency.
prompt: Optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
response_format: The output format for the transcription. Available options: json, text, srt, verbose_json, vtt.
temperature: Sampling temperature between 0 and 1. Higher values produce more random output; lower values are more focused. When set to 0, the model auto-adjusts temperature using log probability. Required range: 0 <= x <= 1.
Response
The transcription result.
text: The transcribed text.