Generate a digital human video

Generate lip-synced digital human videos by providing the URL of a driver video and the URL of an audio file.

🔹Step 1: Submit Generation Task

Endpoint: POST https://api.navtalk.ai/generate

Request Example:

curl -X POST "https://api.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -H "License: YOUR_LICENSE_KEY" \
  -d '{
    "video_url": "https://your-domain.com/driver_video.mp4",
    "audio_url": "https://your-domain.com/target_audio.mp3"
  }'

📌 Parameter Description:

| Parameter name | Type | Explanation |
|---|---|---|
| video_url | string | URL of the driver video. Prefer footage in which the face and mouth movements are clearly visible. |
| audio_url | string | URL of the input audio, in MP3 or WAV format. |

✅ Successful Response:

{
  "status": "started",
  "task_id": "14cb760f-05ac-4fd3-a82c-e841f2f005d0"
}
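
The same request can be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the function names `build_generate_request` and `submit_task` are illustrative, the 30-second timeout is arbitrary, and sending the license key as a `License` HTTP header is an assumption that should be checked against the Digital Human API documentation.

```python
import json
import urllib.request

GENERATE_URL = "https://api.navtalk.ai/generate"


def build_generate_request(license_key: str, video_url: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a generation task."""
    body = json.dumps({"video_url": video_url, "audio_url": audio_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: the license key is passed as a "License" header.
        "License": license_key,
    }
    return urllib.request.Request(GENERATE_URL, data=body, headers=headers, method="POST")


def submit_task(license_key: str, video_url: str, audio_url: str, timeout: float = 30.0) -> str:
    """Send the request and return the task_id from the JSON response."""
    request = build_generate_request(license_key, video_url, audio_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    if "task_id" not in payload:
        raise RuntimeError(f"Unexpected response: {payload}")
    return payload["task_id"]
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before any network call is made.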

🔹Step 2: Query Task Status and Results

Use the returned task_id to check the processing results:

✅ Successful Response:

📌 Status Description:

| Status value | Meaning |
|---|---|
| started | The task has been created and is being processed. |
| processing | The video is being composited. |
| done | The task completed successfully; the result can be downloaded. |
| failed | Synthesis failed. Retry the task or check the error message. |
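
Client code can model these four values explicitly so that an unexpected status is caught early. The sketch below is illustrative: `TaskStatus` and `parse_status` are hypothetical names, and it assumes the status response carries a `status` field like the submission response does.

```python
from enum import Enum


class TaskStatus(Enum):
    STARTED = "started"
    PROCESSING = "processing"
    DONE = "done"
    FAILED = "failed"

    @property
    def is_terminal(self) -> bool:
        """True once the task will no longer change state."""
        return self in (TaskStatus.DONE, TaskStatus.FAILED)


def parse_status(response: dict) -> TaskStatus:
    """Map the "status" field of a response to a TaskStatus.

    Raises ValueError if the field is missing or holds an unknown value.
    """
    return TaskStatus(response.get("status"))
```

For example, `parse_status({"status": "processing"}).is_terminal` is `False`, which tells the caller to keep waiting.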

Notes and Recommendations

  • The API response time is usually between 5 and 20 seconds, depending on file size and server load.

  • Poll the task status with the task_id every 2 to 3 seconds.
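
The polling recommendation can be sketched as a small loop. Because the status endpoint is not shown in this section, the sketch takes a caller-supplied `fetch_status` function (hypothetical) that returns the parsed JSON for a task_id. The name `wait_for_task`, the 2.5-second default interval, and the 120-second timeout are assumptions chosen for illustration.

```python
import time
from typing import Callable


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], dict],
    interval: float = 2.5,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task reports "done" or "failed", or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status(task_id)
        status = result.get("status")
        if status == "done":
            return result
        if status == "failed":
            raise RuntimeError(f"Task {task_id} failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} still '{status}' after {timeout} s")
        # "started" and "processing" both mean: wait and ask again.
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; production code can rely on the default `time.sleep`.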

📘 For more parameters and format descriptions, please refer to the Digital Human API documentation.
