# Create a Kling avatar video

> Generate avatar-driven videos from a single image with the Kling Avatar API in CometAPI. Send `POST /kling/v1/videos/avatar/image2video` to create an image-to-video avatar task.

Use this endpoint to create talking-avatar clips from one source image plus one audio source.

## Before you call it

* Provide one avatar `image` as a public URL or raw base64 string
* Use an avatar image that meets Kling pixel requirements; tiny thumbnails are rejected by the generation task
* Send exactly one of `audio_id` or `sound_file`
* Keep the first request simple: one face image, one audio clip, and a short prompt (the request schema below lists `prompt` as required)
* Include `task_id` when the referenced audio belongs to a prior task that must be linked
* Start with `mode: std` unless you specifically need the higher-quality path
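
The checklist above can be enforced before any request is sent. The following is a minimal sketch, not an official client: `build_avatar_payload` is a hypothetical helper name, and it only mirrors the field names and rules stated on this page (exactly one of `audio_id` or `sound_file`, `mode` limited to `std` or `pro`).

```python
from typing import Optional


def build_avatar_payload(
    image: str,
    prompt: str,
    audio_id: Optional[str] = None,
    sound_file: Optional[str] = None,
    task_id: Optional[str] = None,
    mode: str = "std",
) -> dict:
    """Build a request body for the avatar route (hypothetical helper)."""
    # Exactly one audio source is allowed: reject both-missing and both-present.
    if (audio_id is None) == (sound_file is None):
        raise ValueError("send exactly one of audio_id or sound_file")
    if mode not in ("std", "pro"):
        raise ValueError("mode must be 'std' or 'pro'")

    payload = {"image": image, "prompt": prompt, "mode": mode}
    if audio_id is not None:
        payload["audio_id"] = audio_id
    else:
        payload["sound_file"] = sound_file
    # task_id is only included when the caller needs to link a prior task.
    if task_id is not None:
        payload["task_id"] = task_id
    return payload
```

Pass the returned dictionary as the JSON body of the `POST` request; validating locally avoids spending a request on a payload that carries two audio sources or none.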

## Audio source rules

* `audio_id` is the easiest path when you already generated speech through the Kling TTS route
* `sound_file` works when you already have your own MP3, WAV, M4A, or AAC asset
* Avatar audio is documented as 2 to 60 seconds long
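
If you supply your own `sound_file`, a quick local check can catch obvious mismatches with the rules above. This is a sketch under assumptions: `check_sound_file` is a hypothetical helper, the format is inferred from the URL's file extension only, the caller measures the clip duration separately, and the 2 and 60 second bounds are treated as inclusive.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

# Formats and duration range as listed on this page.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac"}
MIN_SECONDS = 2.0
MAX_SECONDS = 60.0  # assumption: both bounds are inclusive


def check_sound_file(url: str, duration_seconds: float) -> List[str]:
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    # Look only at the URL path so query strings do not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if not MIN_SECONDS <= duration_seconds <= MAX_SECONDS:
        problems.append(f"duration {duration_seconds}s is outside 2 to 60 seconds")
    return problems
```

An extension check is only a heuristic; the service may still reject a file whose contents do not match its name.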

## Task flow

<Steps>
  <Step title="Create the avatar task">
    Submit the image and one audio source, then save the returned `task_id`.
  </Step>

  <Step title="Poll the task">
    Continue with [Get a Kling task](./individual-queries) until the task reaches a terminal state.
  </Step>

  <Step title="Store the finished result">
    Copy the final asset into your own storage if you need to keep it beyond the lifetime of the provider's delivery URL.
  </Step>
</Steps>
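
The polling step can be sketched as a small loop. This is an illustration, not the documented client: `fetch_task` is a hypothetical callable you supply that calls the Get a Kling task route and returns the response's `data` object, and the terminal status names `succeed` and `failed` are assumptions to confirm against that page.

```python
import time
from typing import Callable

# Assumption: these are the terminal values of task_status.
TERMINAL_STATES = {"succeed", "failed"}


def wait_for_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task reaches a terminal state or attempts run out."""
    for attempt in range(max_attempts):
        data = fetch_task(task_id)
        if data.get("task_status") in TERMINAL_STATES:
            return data
        # Skip the pause after the final attempt so the timeout is raised promptly.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Bounding the number of attempts keeps a stuck task from blocking the caller forever; injecting `sleep` makes the loop easy to test without waiting.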

<Note>
  For the complete parameter reference, see the [official Kling Avatar documentation](https://kling.ai/document-api/apiReference/model/avatar).
</Note>


## OpenAPI

````yaml api/openapi/video/kling/post-avatar.openapi.json POST /kling/v1/videos/avatar/image2video
openapi: 3.1.0
info:
  title: Avatar API
  version: 1.0.0
  description: >-
    Create a Kling avatar video task from one source image plus one audio
    source.
servers:
  - url: https://api.cometapi.com
security:
  - bearerAuth: []
paths:
  /kling/v1/videos/avatar/image2video:
    post:
      summary: Create a Kling avatar task
      description: >-
        Submit one avatar image and exactly one audio source. Poll the returned
        task id through the generic Kling query route.
      operationId: avatar
      parameters:
        - name: Content-Type
          in: header
          required: false
          description: Optional content type header.
          schema:
            type: string
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required:
                - image
                - prompt
              oneOf:
                - required:
                    - audio_id
                - required:
                    - sound_file
              properties:
                image:
                  type: string
                  description: >-
                    Avatar image URL or base64 image string. Use an image that
                    meets Kling pixel requirements; very small thumbnails are
                    rejected.
                prompt:
                  type: string
                  description: Prompt describing the desired avatar performance.
                audio_id:
                  type: string
                  description: Audio id from a prior Kling audio task.
                sound_file:
                  type: string
                  description: Public audio URL when you provide your own audio.
                task_id:
                  type: string
                  description: >-
                    Optional prior task id associated with the referenced audio
                    asset.
                mode:
                  type: string
                  description: >-
                    Generation mode. Use `std` or `pro`; omitted requests use
                    `std`.
                  enum:
                    - std
                    - pro
              default:
                image: https://your-image-host/avatar.jpg
                prompt: The speaker talks naturally to camera
                sound_file: https://your-audio-host/speech.wav
                mode: std
            examples:
              Default:
                summary: Avatar image-to-video request
                value:
                  image: https://your-image-host/avatar.jpg
                  prompt: The speaker talks naturally to camera
                  sound_file: https://your-audio-host/speech.wav
                  mode: std
      responses:
        '200':
          description: Task accepted.
          content:
            application/json:
              schema:
                type: object
                required:
                  - code
                  - message
                  - data
                properties:
                  code:
                    type: integer
                  message:
                    type: string
                  data:
                    type: object
                    required:
                      - task_id
                      - task_status
                      - created_at
                      - updated_at
                    properties:
                      task_id:
                        type: string
                      task_status:
                        type: string
                      task_info:
                        type: object
                        additionalProperties: true
                      created_at:
                        type: integer
                      updated_at:
                        type: integer
      x-codeSamples:
        - lang: Shell
          label: Default
          source: |
            curl https://api.cometapi.com/kling/v1/videos/avatar/image2video \
              -H "Authorization: Bearer $COMETAPI_KEY" \
              -H "Content-Type: application/json" \
              -d '{
                  "image": "https://your-image-host/avatar.jpg",
                  "prompt": "The speaker talks naturally to camera",
                  "sound_file": "https://your-audio-host/speech.wav",
                  "mode": "std"
                }'
        - lang: Python
          label: Default
          source: |
            import os
            import requests

            # Submit the avatar task; the key is read from the COMETAPI_KEY env var.
            response = requests.post(
                "https://api.cometapi.com/kling/v1/videos/avatar/image2video",
                headers={"Authorization": "Bearer " + os.environ["COMETAPI_KEY"]},
                json={
                    "image": "https://your-image-host/avatar.jpg",
                    "prompt": "The speaker talks naturally to camera",
                    "sound_file": "https://your-audio-host/speech.wav",
                    "mode": "std",
                },
                timeout=30,  # requests has no default timeout; avoid hanging forever
            )

            result = response.json()
            # Keep the task id: it is needed to poll for the finished video.
            print(result.get("code"), result.get("data", {}).get("task_id"))
        - lang: JavaScript
          label: Default
          source: |
            const response = await fetch("https://api.cometapi.com/kling/v1/videos/avatar/image2video", {
              method: "POST",
              headers: {
                Authorization: `Bearer ${process.env.COMETAPI_KEY}`,
                "Content-Type": "application/json",
              },
              body: JSON.stringify({
                "image": "https://your-image-host/avatar.jpg",
                "prompt": "The speaker talks naturally to camera",
                "sound_file": "https://your-audio-host/speech.wav",
                "mode": "std"
              }),
            });

            const result = await response.json();
            console.log(result.code, result.data?.task_id);
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: Bearer token authentication. Use your CometAPI key.

````