Generate a digital human video

Generate a lip-synced digital human video by providing the URLs of a driver video and an audio file.

Step 1: Submit Generation Task

Request Example:

curl -X POST "https://api.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -H "License: YOUR_LICENSE_KEY" \
  -d '{
    "video_url": "https://your-domain.com/driver_video.mp4",
    "audio_url": "https://your-domain.com/target_audio.mp3"
  }'

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| video_url | string | | Publicly accessible URL of driver video (MP4 format) |
| audio_url | string | | Publicly accessible URL of target audio (MP3, WAV formats) |

Success Response:

{
  "status": "started",
  "task_id": "14cb760f-05ac-4fd3-a82c-e841f2f005d0"
}
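
As a rough sketch, the same submission could be made from Python using only the standard library. The helper names `build_generate_request` and `submit_task` are illustrative and not part of the API, and sending the license key as a `License` header is an assumption about how the service reads it. The `opener` parameter exists only so the call can be tried without network access.

```python
import json
import urllib.request

GENERATE_URL = "https://api.navtalk.ai/generate"


def build_generate_request(license_key, video_url, audio_url):
    """Build the Step 1 POST request without sending it."""
    body = json.dumps({"video_url": video_url, "audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        GENERATE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the license key travels in a "License" header.
            "License": license_key,
        },
    )


def submit_task(license_key, video_url, audio_url, opener=urllib.request.urlopen):
    """Send the request and return the task_id from the JSON response."""
    request = build_generate_request(license_key, video_url, audio_url)
    with opener(request, timeout=30) as response:
        payload = json.load(response)
    return payload["task_id"]
```

Calling `submit_task("YOUR_LICENSE_KEY", video_url, audio_url)` would return the `task_id` string to use in the status check.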

Step 2: Check Task Status

Request Example:

curl -X GET "https://api.navtalk.ai/query_status?license=YOUR_LICENSE_KEY&task_id=14cb760f-05ac-4fd3-a82c-e841f2f005d0"

Success Response:

{
    "code": 200,
    "message": "SUCCESS",
    "data": {
        "resultUrl": "https://easyaistorageaccount.blob.core.windows.net/easyai/uploadFiles/2025/07/03/2749bdb2-2220-4315-acd1-1783a45eaac6.mp4",
        "state": "done"
    }
}

Status values:

| Status Value | Meaning |
|--------------|---------|
| started | The task has been created and is currently being processed |
| processing | The video is being composited |
| done | Successfully completed; the results can be downloaded |
| failed | The synthesis has failed. You can retry or check the error message. |
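
One way to work with the status response is sketched below: build the query URL, then extract the `state` and `resultUrl` fields from the JSON body. The function names are hypothetical, and the code assumes the response always follows the shape of the success example, with both fields nested under `data`.

```python
import json
from urllib.parse import urlencode

QUERY_URL = "https://api.navtalk.ai/query_status"
TERMINAL_STATES = {"done", "failed"}  # states after which polling can stop


def build_status_url(license_key, task_id):
    """Return the Step 2 URL with the license and task_id as query parameters."""
    return QUERY_URL + "?" + urlencode({"license": license_key, "task_id": task_id})


def interpret_status(response_text):
    """Return (state, result_url); result_url is None unless the state is 'done'."""
    payload = json.loads(response_text)
    data = payload.get("data") or {}
    state = data.get("state")
    result_url = data.get("resultUrl") if state == "done" else None
    return state, result_url


def is_finished(state):
    """True once the task has either completed or failed."""
    return state in TERMINAL_STATES
```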

Notice

  • Processing Time: 5-20 seconds (depends on audio length and server load)

  • Recommended Polling Interval: 5-10 seconds

  • Use HTTPS links for secure transmission

  • Generated results auto-delete after 24 hours
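
The polling guidance above could be applied with a loop along these lines. This is a minimal sketch: `check_state` is a hypothetical callable that performs the status request and returns a `(state, result_url)` pair, the 5-second interval follows the recommendation above, and the 120-second overall timeout is an assumed safety margin rather than a documented limit.

```python
import time


def poll_until_finished(check_state, interval=5.0, timeout=120.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll check_state() every `interval` seconds until the task ends.

    Returns the result URL when the state is 'done', raises RuntimeError
    on 'failed', and raises TimeoutError if another wait would pass the
    assumed overall timeout.
    """
    deadline = clock() + timeout
    while True:
        state, result_url = check_state()
        if state == "done":
            return result_url
        if state == "failed":
            raise RuntimeError("generation task failed")
        if clock() + interval > deadline:
            raise TimeoutError("task did not finish within %.0f seconds" % timeout)
        sleep(interval)
```

Because results are deleted after 24 hours, the returned URL should be downloaded soon after the loop finishes rather than stored for later use.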

Full documentation: Digital Human API Docs
