Avatar
Generate avatar-driven videos from an image with the Kling Avatar API in CometAPI. Use POST /kling/v1/videos/avatar/image2video to create an image-to-video avatar quickly.
Use this endpoint to create a short talking-avatar video from one source image plus one audio source.
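Before you call
- Provide an avatar image, either as a public URL or as a raw base64 string
- Choose either audio_id or sound_file, and send only one of them
- Keep the first request simple: one face image, one audio clip, and a short prompt (the body reference below lists prompt as required)
- Start with mode: std unless you specifically need the higher-quality path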
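The checklist above maps onto a small JSON body. The following is a minimal sketch, assuming the field names image, sound_file, prompt, and mode used on this page; the function name first_request_body and the example.com URLs are illustrative placeholders, not part of the API.

```python
import json


def first_request_body(image_url, audio_url, prompt):
    """Return a minimal body: one image, one audio clip, a short prompt, std mode."""
    return {
        "image": image_url,
        "sound_file": audio_url,  # send sound_file OR audio_id, never both
        "prompt": prompt,
        "mode": "std",
    }


if __name__ == "__main__":
    body = first_request_body(
        "https://example.com/face.png",   # placeholder, not a real asset
        "https://example.com/clip.mp3",   # placeholder, not a real asset
        "Speak calmly and look at the camera.",
    )
    print(json.dumps(body, indent=2))
```

Keeping the first body this small makes it easier to tell whether a rejection comes from the image, the audio, or the prompt.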
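Audio source rules
- If you have already generated speech through the Kling TTS path, audio_id is the simplest option
- If you already have your own MP3, WAV, M4A, or AAC asset, use sound_file
- The documentation states that avatar audio can be 2 to 60 seconds long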
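The rules above can be checked on the client before a request is sent. This is a rough sketch, not an official validator: the helper name audio_source_fields is hypothetical, and the duration is assumed to be measured by the caller, since this page does not say how the server computes it.

```python
MIN_AUDIO_SECONDS = 2
MAX_AUDIO_SECONDS = 60


def audio_source_fields(audio_id=None, sound_file=None, duration_seconds=None):
    """Return the single audio field to merge into the request body.

    Raises ValueError when both or neither source is given, or when a
    known duration falls outside the 2 to 60 second range.
    """
    if (audio_id is None) == (sound_file is None):
        raise ValueError("provide exactly one of audio_id or sound_file")
    if duration_seconds is not None and not (
        MIN_AUDIO_SECONDS <= duration_seconds <= MAX_AUDIO_SECONDS
    ):
        raise ValueError("audio must be between 2 and 60 seconds long")
    if audio_id is not None:
        return {"audio_id": audio_id}
    return {"sound_file": sound_file}
```

For example, audio_source_fields(sound_file="https://example.com/clip.mp3", duration_seconds=12) returns a dictionary holding only the sound_file key, which can then be merged into the rest of the body.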
Task flow
Poll the task
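This page does not give the polling endpoint or the task status values, so the sketch below keeps both outside the helper. It assumes the caller supplies fetch_status, a function that looks up the task and returns its current status string. The default terminal states "succeed" and "failed" are assumptions and should be replaced with the values the task query endpoint actually returns.

```python
import time


def poll_until_done(fetch_status, interval_seconds=5.0, timeout_seconds=600.0,
                    done_states=("succeed", "failed"), sleep=time.sleep):
    """Call fetch_status() until it returns a terminal state or time runs out.

    fetch_status: caller-supplied function returning the task's status string.
    done_states: assumed terminal values; adjust to the real API response.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in done_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_seconds} s")
        sleep(interval_seconds)
```

Passing sleep as a parameter keeps the loop easy to test without real delays. If the request includes a webhook URL, polling can serve as a fallback rather than the main path.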
Authorization
Bearer token authentication. Use your CometAPI key.
Headers
Optional content type header.
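The authorization and header requirements can be combined into one request object. The sketch below uses only the Python standard library. Two things in it are assumptions not stated on this page: the base URL https://api.cometapi.com and the Content-Type value application/json. The helper name build_avatar_request is illustrative.

```python
import json
import urllib.request

# Assumption: the base URL is not given on this page; confirm it for your account.
BASE_URL = "https://api.cometapi.com"
AVATAR_PATH = "/kling/v1/videos/avatar/image2video"


def build_avatar_request(api_key, body, base_url=BASE_URL):
    """Return a POST request with Bearer auth and a JSON content type."""
    return urllib.request.Request(
        url=base_url + AVATAR_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: JSON body; the page only says the header is optional.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and decode the response as JSON. Reading the key from an environment variable keeps it out of source control.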
Body
- image: Avatar reference image. Accepts an image URL or a raw Base64 string (no data: prefix). Supported formats: JPG, JPEG, PNG. Maximum file size 10 MB. Each side must be at least 300 px, and the aspect ratio must be between 1:2.5 and 2.5:1.
- audio_id: Audio ID returned by the Kling TTS API. Only audio clips between 2 and 60 seconds long that were generated within the last 30 days are accepted. Mutually exclusive with sound_file; exactly one of the two must be provided.
- prompt: Text prompt to guide avatar actions, emotions, and camera movements. Maximum 2500 characters. Required: the API rejects requests without this field.
- sound_file: Audio file as a URL or Base64 string. Accepted formats: MP3, WAV, M4A, AAC. Maximum 5 MB, duration 2 to 60 seconds. Mutually exclusive with audio_id; exactly one of the two must be provided.
- mode: Generation mode. std (standard, faster and more cost-effective) or pro (professional, higher-quality output).
- Webhook URL for task status notifications. The server sends a callback when the task status changes.
- Optional user-defined task ID for your own tracking. Does not replace the system-generated task ID. Must be unique per account.
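Several of the limits above can be checked locally before uploading anything. The following is a minimal sketch with hypothetical helper names. It assumes the caller already knows the image's pixel dimensions and byte size, and it reads "10 MB" as 10 * 1024 * 1024 bytes, which this page does not confirm.

```python
import base64

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # "10 MB" read as mebibytes (assumption)
MAX_PROMPT_CHARS = 2500
MIN_SIDE_PX = 300


def to_raw_base64(data: bytes) -> str:
    """Encode bytes as Base64 with no data: URI prefix."""
    return base64.b64encode(data).decode("ascii")


def strip_data_uri_prefix(value: str) -> str:
    """Drop a leading 'data:...;base64,' prefix if one is present."""
    if value.startswith("data:") and "," in value:
        return value.split(",", 1)[1]
    return value


def check_image(width: int, height: int, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the image passes."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    problems = []
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("image larger than 10 MB")
    if min(width, height) < MIN_SIDE_PX:
        problems.append("each side must be at least 300 px")
    if not (1 / 2.5 <= width / height <= 2.5):
        problems.append("aspect ratio must be between 1:2.5 and 2.5:1")
    return problems


def check_prompt(prompt: str) -> list:
    """Return a list of problems with the prompt; empty means it passes."""
    if not prompt:
        return ["prompt is required"]
    if len(prompt) > MAX_PROMPT_CHARS:
        return ["prompt longer than 2500 characters"]
    return []
```

These checks only mirror the documented limits; the server remains the authority, so a request that passes locally can still be rejected.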