Create a Kling avatar video
Generate avatar-driven videos from images using the Kling Avatar API in CometAPI. Use POST /kling/v1/videos/avatar/image2video for fast image-to-video avatars.
Before you call it
- Provide one avatar image as a public URL or raw base64 string
- Use an avatar image that meets Kling pixel requirements; tiny thumbnails are rejected by the generation task
- Send exactly one of audio_id or sound_file
- Keep the first request simple: one face image, one audio clip, and a short optional prompt
- Include task_id when the referenced audio belongs to a prior task that must be linked
- Start with mode: std unless you specifically need the higher-quality path
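The checklist above can be sketched as a small payload builder that fails early on the most common mistakes. This is a minimal sketch, not the official client: the function name build_avatar_payload is hypothetical, and the prompt field name is an assumption based on the parameter description on this page.

```python
ALLOWED_MODES = ("std", "pro")


def build_avatar_payload(image, audio_id=None, sound_file=None,
                         prompt=None, task_id=None, mode="std"):
    """Assemble a request body for the avatar image2video route.

    Hypothetical helper: it only mirrors the rules listed on this page.
    """
    if not image:
        raise ValueError("image is required (public URL or raw base64 string)")
    # Exactly one audio source: reject both-missing and both-present.
    if (audio_id is None) == (sound_file is None):
        raise ValueError("send exactly one of audio_id or sound_file")
    if mode not in ALLOWED_MODES:
        raise ValueError("mode must be one of: " + ", ".join(ALLOWED_MODES))

    payload = {"image": image, "mode": mode}
    if audio_id is not None:
        payload["audio_id"] = audio_id
    else:
        payload["sound_file"] = sound_file
    if prompt:
        payload["prompt"] = prompt  # field name assumed, see lead-in
    if task_id:
        payload["task_id"] = task_id
    return payload
```

Calling build_avatar_payload("https://example.com/face.png", audio_id="a1") yields a body with image, mode, and audio_id only; optional fields are left out rather than sent as null.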
Audio source rules
- audio_id is the easiest path when you already generated speech through the Kling TTS route
- sound_file works when you already have your own MP3, WAV, M4A, or AAC asset
- Avatar audio is documented as 2 to 60 seconds long
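A local pre-check along the lines of these rules might look like the sketch below. It assumes you already know the clip duration (for example from your own media tooling), treats the 2 to 60 second range as inclusive, and uses the file extension as a rough stand-in for the audio format; all three are assumptions, and the helper name check_sound_file is hypothetical.

```python
import os
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac"}
MIN_SECONDS = 2.0   # documented lower bound, assumed inclusive
MAX_SECONDS = 60.0  # documented upper bound, assumed inclusive


def check_sound_file(url, duration_seconds):
    """Return a list of problems; an empty list means the clip looks usable."""
    problems = []
    # Take the extension from the URL path so query strings are ignored.
    extension = os.path.splitext(urlparse(url).path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported extension: " + (extension or "(none)"))
    if not (MIN_SECONDS <= duration_seconds <= MAX_SECONDS):
        problems.append("duration outside 2 to 60 seconds")
    return problems
```

The server remains the source of truth; this only catches obvious mismatches before a task is created.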
Task flow
Poll the task
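Since generation is asynchronous, a client typically creates the task and then polls until it reaches a final state. The sketch below is deliberately generic: this page does not name the query route or the response shape, so the caller passes in a fetch_status function, and the task_status key and the succeed / failed state names are assumptions to adjust against the actual response.

```python
import time


def poll_task(fetch_status, interval_seconds=5.0, timeout_seconds=600.0,
              done_states=("succeed", "failed"),
              sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the task reaches a final state.

    fetch_status is a caller-supplied function returning a dict.
    The "task_status" key and the done_states values are assumptions.
    """
    deadline = clock() + timeout_seconds
    while True:
        task = fetch_status()
        if task.get("task_status") in done_states:
            return task
        if clock() >= deadline:
            raise TimeoutError("task did not finish within the timeout")
        sleep(interval_seconds)
```

Injecting sleep and clock keeps the loop testable without real waiting; in production the defaults apply.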
Authorizations
Bearer token authentication. Use your CometAPI key.
Headers
Optional content type header.
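As an illustration of the authorization and header settings, the sketch below prepares (but does not send) a POST request using only the Python standard library. The base URL https://api.cometapi.com and the application/json content type are assumptions not stated on this page, and build_avatar_request is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.cometapi.com"  # assumption: confirm in your dashboard
AVATAR_PATH = "/kling/v1/videos/avatar/image2video"


def build_avatar_request(api_key, payload):
    """Prepare the POST request with Bearer auth and a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + AVATAR_PATH,
        data=body,
        headers={
            "Authorization": "Bearer " + api_key,  # your CometAPI key
            "Content-Type": "application/json",    # assumed content type
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the JSON response; keeping construction separate from sending makes the headers easy to inspect first.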
Body
- Option 1
- Option 2
Avatar image URL or base64 image string. Use an image that meets Kling pixel requirements; very small thumbnails are rejected.
Prompt describing the desired avatar performance.
Audio id from a prior Kling audio task.
Public audio URL when you provide your own audio.
Optional prior task id associated with the referenced audio asset.
Generation mode. Allowed values: std, pro. Omitted requests use std.