文字起こし
CometAPI の POST /v1/audio/transcriptions を使用して、音声を元の言語のテキストに文字起こしします。複数の出力形式に対応した Whisper モデルをサポートしています。
Documentation Index
Fetch the complete documentation index at: https://apidoc.cometapi.com/llms.txt
Use this file to discover all available pages before exploring further.
承認
Bearer token authentication. Use your CometAPI key.
ボディ
The audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
The speech-to-text model to use. Choose a current speech model from the Models page.
The language of the input audio in ISO-639-1 format (e.g., en, zh, ja). Supplying the language improves accuracy and latency.
Optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
The output format for the transcription.
json, text, srt, verbose_json, vtt Sampling temperature between 0 and 1. Higher values produce more random output; lower values are more focused. When set to 0, the model auto-adjusts temperature using log probability.
0 <= x <= 1レスポンス
The transcription result.
The transcribed text.