Overview
The NanoGPT API provides advanced video generation capabilities using state-of-the-art models. This guide covers how to use our video generation endpoints.

API Authentication
Supported authentication methods:

- API Key header: `x-api-key: <your-api-key>`
- Bearer token: `Authorization: Bearer <your-api-key>`
- Cookie session: for web clients (automatic)
Header Examples
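Either authentication style can be sent as an HTTP header. The request lines below are illustrative; the placeholder values are not real keys.

```http
POST /api/generate-video HTTP/1.1
x-api-key: <your-api-key>
Content-Type: application/json
```

```http
POST /api/generate-video HTTP/1.1
Authorization: Bearer <your-api-key>
Content-Type: application/json
```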
Making a Video Generation Request
Endpoint
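All generation requests go to a single endpoint:

```
POST /api/generate-video
```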
Request Headers
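A typical request carries the API key and a JSON content type:

```http
x-api-key: <your-api-key>
Content-Type: application/json
```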
Request Body
Basic Text-to-Video Request
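A minimal text-to-video body uses only the model and prompt; the field names come from the parameter tables below, and the prompt value is illustrative.

```json
{
  "model": "veo3-video",
  "prompt": "A serene mountain lake at sunrise, slow aerial dolly shot",
  "resolution": "720p"
}
```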
Image-to-Video Request
Image-conditioned models accept either imageDataUrl (base64) or imageUrl (a public HTTPS link). The platform always uses the explicit field you send before falling back to any library attachments.
Uploads sent via the API must be 4 MB or smaller. For larger assets, host them externally and provide an imageUrl.
Base64 input
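A sketch of a base64 request body; the data URL is a truncated placeholder and must stay within the 4 MB upload limit.

```json
{
  "model": "kling-v21-pro",
  "prompt": "The subject slowly turns toward the camera",
  "imageDataUrl": "data:image/png;base64,iVBORw0KGgo..."
}
```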
Public URL input
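The same request with an externally hosted image; the URL must be directly fetchable.

```json
{
  "model": "kling-v21-pro",
  "prompt": "The subject slowly turns toward the camera",
  "imageUrl": "https://example.com/images/portrait.png"
}
```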
Additional Request Parameters
These parameters are available across models. Only send the fields your chosen model supports.

Core Parameters
| Parameter | Type | Description |
|---|---|---|
| conversationUUID | string | Link a generation to a conversation |
| resolution | string | Output resolution (480p, 720p, 1080p) |
| imageAttachmentId | string | Reference to a library-stored image |
| videoUrl | string | Public HTTPS link to a source video (extend/edit/upscale) |
| videoDataUrl | string | Base64-encoded data URL for a source video (max 4 MB) |
| videoAttachmentId | string | Reference to a library-stored video |
| video | string | Alternate video field accepted by select models (for example, wan-wavespeed-25-extend) |
| referenceImages | array | Multiple reference images for reference-to-video mode |
| referenceVideos | array | Multiple reference videos |
Prefer videoUrl (camelCase) for source videos. Only send video if the model explicitly requires it.
Audio Parameters (lipsync/avatar models)
| Parameter | Type | Description |
|---|---|---|
| audioDataUrl | string | Base64-encoded audio |
| audioDuration | number | Duration of provided audio in seconds |
| voiceId | string | Voice selection for lipsync models |
Model-Specific Parameters
| Parameter | Type | Description |
|---|---|---|
| pro / pro_mode | boolean | Enable pro/higher-quality mode |
| generateAudio | boolean | Generate audio with video (Veo 3) |
| mode | string | text-to-video, image-to-video, reference-to-video, video-edit |
| orientation | string | Portrait or landscape (Sora 2) |
| size | string | Output dimensions (Pixverse, Wan models) |
| animation | boolean | Enable animation (Longstories) |
| language | string | Output language (Longstories) |
| characters | array | Character definitions (Longstories) |
| style | string | Style preset |
Model-Specific Notes
Veo Models
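A representative Veo request; generateAudio applies to Veo 3, and the prompt is illustrative.

```json
{
  "model": "veo3-video",
  "prompt": "A city street in the rain, neon reflections",
  "generateAudio": true
}
```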
Kling Models
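A representative Kling request using reference-to-video mode (supported by kling-video-o1, kling-video-o1-standard); the reference URLs are placeholders.

```json
{
  "model": "kling-video-o1",
  "prompt": "A paper boat drifting down a stream",
  "mode": "reference-to-video",
  "referenceImages": ["https://example.com/ref1.png", "https://example.com/ref2.png"]
}
```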
Hunyuan Models
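A representative Hunyuan request with pro mode enabled:

```json
{
  "model": "hunyuan-video",
  "prompt": "Macro shot of dew on a spider web",
  "pro": true
}
```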
Wan Image-to-Video
Accepts base64 via imageDataUrl or a public URL via imageUrl.
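For example, using a public URL:

```json
{
  "model": "wan-video-image-to-video",
  "prompt": "Gentle camera push-in",
  "imageUrl": "https://example.com/frame.jpg"
}
```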
Seedance Models
Accepts base64 via imageDataUrl or a public URL via imageUrl. Ensure URLs are directly fetchable by the provider.
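A representative Seedance request; the image URL is a placeholder.

```json
{
  "model": "seedance-video",
  "prompt": "Waves crashing on a rocky shore",
  "imageUrl": "https://example.com/shore.jpg",
  "resolution": "1080p"
}
```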
Supported Models
Text-to-Video Models
| Model | Duration | Notes |
|---|---|---|
| veo2-video | 5-8s | Also supports image-to-video |
| veo3-video | 5-8s | Audio generation supported |
| veo3-1-video | Variable | T2V and I2V modes |
| veo3-fast-video | Variable | Fast generation |
| sora-2 | Variable | Pro mode, multiple resolutions |
| kling-video | 5-10s | Basic text-to-video |
| kling-video-v2 | 5-10s | Enhanced quality |
| kling-video-o1 | 5-10s | Multi-mode support |
| kling-video-o1-standard | 5-10s | Standard variant |
| kling-v25-turbo-pro | 5-10s | Turbo pro |
| kling-v25-turbo-std | 5-10s | Turbo standard |
| kling-v26-pro | 5-10s | Latest pro |
| minimax-video | 6s | Fixed duration |
| minimax-hailuo-02 | Variable | Per-second pricing |
| minimax-hailuo-02-pro | Variable | Premium variant |
| minimax-hailuo-23-standard | Variable | Hailuo 2.3 |
| minimax-hailuo-23-pro | Variable | Hailuo 2.3 pro |
| hunyuan-video | 5s | Pro mode available |
| hunyuan-video-15 | Variable | Resolution-based |
| wan-video-22 | 5s | 14B full model |
| wan-video-22-5b | 5s | 5B lite model |
| wan-video-22-turbo | 5s | Simplified |
| wan-wavespeed-25 | Variable | Wan 2.5 |
| wan-wavespeed-26 | Variable | Wan 2.6 |
| wan-wavespeed-22-plus | Variable | Plus variant |
| seedance-video | Variable | Resolution-duration pricing |
| seedance-lite-video | Variable | Lite version |
| pixverse-v45 | Variable | V4.5 |
| pixverse-v5 | Variable | V5 |
| pixverse-v55 | Variable | V5.5 with effects |
| pixverse-v55-effects | Variable | Effects variant |
| lightricks-ltx-2-fast | Variable | Fast generation |
| lightricks-ltx-2-pro | Variable | Pro quality |
| vidu-video | Variable | Vidu Q1 |
| runwayml-gen4-aleph | Variable | Runway Gen4 |
| veed-fabric-1.0 | Variable | Veed fabric |
| midjourney-video | Variable | Returns 4 videos |
Image-to-Video Only Models
| Model | Notes |
|---|---|
| kling-v21-standard | Standard quality |
| kling-v21-pro | Pro quality |
| kling-v21-master | Master quality, requires prompt |
| hunyuan-video-image-to-video | Image input required |
| wan-video-image-to-video | Image input required |
Avatar/Lipsync Models
| Model | Input | Notes |
|---|---|---|
| kling-v2-avatar-standard | Audio + Image | Standard avatar |
| kling-v2-avatar-pro | Audio + Image | Pro avatar |
| kling-lipsync-t2v | Text | Text-to-video lipsync |
| kling-lipsync-a2v | Audio | Audio-to-video lipsync |
| latentsync | Audio + Video | Audio-video sync |
| bytedance-avatar-omni-human-1.5 | Audio + Image | Bytedance avatar |
| bytedance-waver-1.0 | Audio + Image | Waver model |
Utility Models
| Model | Purpose |
|---|---|
| video-upscaler | Upscale video resolution |
| bytedance-seedance-upscaler | Alternative upscaler |
| seedvr2-video-upscaler | SeedVR upscaler |
| wan-wavespeed-video-edit | Video editing |
| wan-wavespeed-22-animate | Video animation |
| wan-wavespeed-s2v | Speech-to-video |
| magicapi-video-face-swap | Face swap |
Extension Models
Extend models run through POST /api/generate-video with prompt plus a source video input (videoUrl, videoDataUrl, or videoAttachmentId). The task-based /api/generate-video/extend endpoint is only for Midjourney extensions.
Max source video length: 120 seconds.
| Model | Purpose |
|---|---|
| wan-wavespeed-25-extend | Extend Wan videos |
| wan-wavespeed-22-spicy-extend | Spicy extension |
| veo3-1-extend | Extend VEO videos |
| veo3-1-fast-extend | Fast VEO extension |
| bytedance-seedance-v1.5-pro-extend | Extend Seedance videos |
Longstories Models (Scripted Video)
| Model | Notes |
|---|---|
| longstories-movie | Movie generation with voice narration |
| longstories-pixel-art | Pixel art style |
Response Format
Initial Response (202 Accepted)
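A representative initial response; field names match the list below, and the values are illustrative.

```json
{
  "runId": "run_abc123",
  "status": "pending",
  "model": "veo3-video",
  "cost": 0.40,
  "paymentSource": "USD",
  "remainingBalance": 12.35,
  "prechargeLabel": "veo3-video"
}
```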
Response Fields
- runId: Unique identifier for polling status
- status: Always "pending" for initial response
- model: The model used for generation
- cost: Estimated/pre-charged cost
- paymentSource: "USD" or "XNO"
- remainingBalance: Account balance after deduction
- prechargeLabel: Provider label for the precharge
Cost Information
Both the initial generation response and the status response include cost:

- Initial response: estimated/pre-charged cost
- Status response: final cost (when status is COMPLETED)

Cost is calculated in one of several ways depending on the model:

- Fixed: flat rate per generation
- Per-Second: rate x duration
- Resolution-Based: rates per resolution tier
- Duration-Based: step pricing (5s vs 10s)
- Mode-Based: different rates for T2V vs I2V
Polling for Status
After receiving a runId, poll the status endpoint until completion.
Status Endpoint
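```
GET /api/generate-video/status?runId=<runId>&model=<model>
```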
model is optional but recommended for faster lookups.
Polling Example
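A minimal polling sketch in Python using only the standard library. The base URL is an assumption (substitute your deployment's); the endpoint path, query parameters, and status values come from this guide.

```python
import json
import time
import urllib.request

BASE_URL = "https://nano-gpt.com"  # assumption: substitute your base URL
API_KEY = "<your-api-key>"

TERMINAL_STATES = ("COMPLETED", "FAILED", "CANCELED")

def fetch_status(run_id, model):
    """GET the status endpoint once and return the parsed JSON body."""
    url = f"{BASE_URL}/api/generate-video/status?runId={run_id}&model={model}"
    req = urllib.request.Request(url, headers={"x-api-key": API_KEY})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def poll_until_done(run_id, model, fetch=fetch_status, interval=5, timeout=600):
    """Poll until the run reaches a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch(run_id, model)
        if status.get("status") in TERMINAL_STATES:
            return status
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish within {timeout}s")
```

The `fetch` parameter exists so the polling loop can be exercised without network access; in production, call `poll_until_done(run_id, model)` directly.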
Status Response States
In Progress
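```json
{
  "runId": "run_abc123",
  "status": "IN_PROGRESS",
  "model": "veo3-video"
}
```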
Completed
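A sketch of a completed response; the name of the output field (shown here as videoUrl) and the exact shape are assumptions, but cost carries the final charge per the Cost Information section.

```json
{
  "runId": "run_abc123",
  "status": "COMPLETED",
  "model": "veo3-video",
  "videoUrl": "https://example.com/outputs/run_abc123.mp4",
  "cost": 0.40
}
```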
Failed
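A sketch of a failed response; the error field name is an assumption.

```json
{
  "runId": "run_abc123",
  "status": "FAILED",
  "model": "veo3-video",
  "error": "Generation failed"
}
```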
Status Values
- IN_QUEUE: Request is queued
- IN_PROGRESS: Video is being generated
- COMPLETED: Video ready for download
- FAILED: Generation failed
- CANCELED: Request was canceled
Additional Endpoints
GET /api/generate-video/recover
Recover recent video generation runs for a user.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | No | Filter by model |
| limit | number | No | Max results (default 10, max 50) |
| conversationUUID | string | No | Filter by conversation |
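For example:

```
GET /api/generate-video/recover?model=veo3-video&limit=5
```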
POST /api/generate-video/extend
Extend a Midjourney video using a task-based flow.
Rate Limit: 20 requests/minute
Required Fields: taskId (from the original Midjourney runId), index (0-3)
Notes:
- This endpoint does not accept video, videoUrl, videoDataUrl, or videoAttachmentId.
- Use POST /api/generate-video with an extend model for source-video extension.
GET /api/generate-video/content
Proxy content retrieval for Sora 2 videos.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| runId | string | Yes | The run ID |
| model | string | Yes | Must be sora-2 |
| variant | string | No | video, thumbnail, or spritesheet |
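For example:

```
GET /api/generate-video/content?runId=<runId>&model=sora-2&variant=video
```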
Complete Examples
The submit + poll flow works the same regardless of how you supply the image: image-conditioned models accept either imageDataUrl (base64) or a public imageUrl, and the platform prefers whichever field you send before checking library attachments.
Example 1: Text-to-Video with cURL
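A sketch of a text-to-video submission. The base URL is an assumption; export NANOGPT_API_KEY before running, otherwise the request is skipped.

```shell
API_KEY="${NANOGPT_API_KEY:-}"
BASE_URL="${NANOGPT_BASE_URL:-https://nano-gpt.com}"  # assumption: substitute yours
PAYLOAD='{
  "model": "veo3-video",
  "prompt": "A time-lapse of clouds rolling over a mountain ridge",
  "resolution": "720p"
}'
# Submit only when a key is configured; the response contains the runId to poll.
if [ -n "$API_KEY" ]; then
  curl -s -X POST "$BASE_URL/api/generate-video" \
    -H "x-api-key: $API_KEY" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
fi
```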
Example 2: Image-to-Video with cURL
Base64 input
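A sketch using imageDataUrl; the data URL here is a truncated placeholder (a real one, e.g. from `base64` over a local file, must stay within the 4 MB limit). The base URL is an assumption.

```shell
API_KEY="${NANOGPT_API_KEY:-}"
BASE_URL="${NANOGPT_BASE_URL:-https://nano-gpt.com}"  # assumption
IMAGE_DATA_URL="data:image/png;base64,iVBORw0KGgo..."  # truncated placeholder
PAYLOAD=$(printf '{"model":"kling-v21-pro","prompt":"The subject slowly turns toward the camera","imageDataUrl":"%s"}' "$IMAGE_DATA_URL")
if [ -n "$API_KEY" ]; then
  curl -s -X POST "$BASE_URL/api/generate-video" \
    -H "x-api-key: $API_KEY" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
fi
```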
Public URL input
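The same flow with a hosted image via imageUrl; the image URL is a placeholder and the base URL is an assumption.

```shell
API_KEY="${NANOGPT_API_KEY:-}"
BASE_URL="${NANOGPT_BASE_URL:-https://nano-gpt.com}"  # assumption
PAYLOAD='{
  "model": "kling-v21-pro",
  "prompt": "The subject slowly turns toward the camera",
  "imageUrl": "https://example.com/images/portrait.png"
}'
if [ -n "$API_KEY" ]; then
  curl -s -X POST "$BASE_URL/api/generate-video" \
    -H "x-api-key: $API_KEY" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
fi
```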
Example 3: Image-to-Video with JavaScript
Using a public image URL directly
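A sketch for Node 18+ (built-in fetch). The base URL is an assumption; the payload-building helper is hypothetical but uses the documented field names. Swap imageUrl for imageDataUrl to send base64 instead.

```javascript
const BASE_URL = process.env.NANOGPT_BASE_URL || "https://nano-gpt.com"; // assumption
const API_KEY = process.env.NANOGPT_API_KEY || "";

// Hypothetical helper: builds an image-to-video body from a public image URL.
function buildImageToVideoPayload(model, prompt, imageUrl) {
  return { model, prompt, imageUrl };
}

// Submit a generation request; the response JSON contains the runId to poll.
async function generateVideo(payload) {
  const res = await fetch(`${BASE_URL}/api/generate-video`, {
    method: "POST",
    headers: { "x-api-key": API_KEY, "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}
```

Usage: `await generateVideo(buildImageToVideoPayload("kling-v21-pro", "slow pan", "https://example.com/a.png"))`.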
Example 4: Image-to-Video with Python
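A standard-library sketch; the base URL is an assumption and the helper names are hypothetical. The submit step runs only when a real key is set.

```python
import json
import urllib.request

BASE_URL = "https://nano-gpt.com"  # assumption: substitute your base URL
API_KEY = "<your-api-key>"

def post_json(path, payload):
    """POST a JSON body with the API key header and return the parsed response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode(),
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def build_image_to_video_payload(model, prompt, image_url):
    """Hypothetical helper using the documented imageUrl field."""
    return {"model": model, "prompt": prompt, "imageUrl": image_url}

if __name__ == "__main__" and API_KEY != "<your-api-key>":
    run = post_json("/api/generate-video", build_image_to_video_payload(
        "wan-video-image-to-video", "Gentle camera push-in",
        "https://example.com/frame.jpg"))
    print(run["runId"])  # poll this with the status endpoint
```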
Example 5: Batch Processing
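A batch sketch that separates submission from polling so it can be combined with any HTTP client. The `submit` and `fetch_status` callables are injected; the terminal status values come from this guide.

```python
import time

TERMINAL_STATES = ("COMPLETED", "FAILED", "CANCELED")

def submit_batch(prompts, model, submit):
    """Submit one request per prompt; `submit` posts a body and returns the response."""
    return [submit({"model": model, "prompt": p})["runId"] for p in prompts]

def await_batch(run_ids, fetch_status, interval=5, timeout=600):
    """Poll every pending run until all reach a terminal state or time runs out."""
    done = {}
    pending = set(run_ids)
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        for run_id in list(pending):
            status = fetch_status(run_id)
            if status.get("status") in TERMINAL_STATES:
                done[run_id] = status
                pending.discard(run_id)
        if pending:
            time.sleep(interval)
    return done
```

Stay under the 50 requests/minute limit on POST /api/generate-video when sizing batches.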
Error Handling
Error Response Format
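A sketch of an error body; the exact field names are assumptions, but the code values are listed below.

```json
{
  "error": "Insufficient balance",
  "code": "INSUFFICIENT_BALANCE"
}
```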
Common error codes include CONTENT_POLICY_VIOLATION, PRO_REQUIRED, INSUFFICIENT_BALANCE, and RATE_LIMITED.
Error Handling Best Practices
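A retry sketch with exponential backoff, useful for RATE_LIMITED and transient network failures. The `TransientError` class and helper are hypothetical; content-policy and balance errors should not be retried.

```python
import time

class TransientError(Exception):
    """Raised for retryable failures such as RATE_LIMITED or network errors."""

def with_retries(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on TransientError with exponential backoff: 1s, 2s, 4s, ..."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

The injectable `sleep` makes the backoff schedule easy to verify without waiting.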
Rate Limits
| Endpoint | Limit |
|---|---|
POST /api/generate-video | 50 requests/minute per IP |
GET /api/generate-video/status | No explicit limit (cached) |
POST /api/generate-video/extend | 20 requests/minute per IP |
GET /api/generate-video/recover | 20 requests/minute per IP |
Additional Notes
Pro Mode
- sora-2: Pro mode required for 1792x1024 resolution
- hunyuan-video: Pro mode available
- Various Kling models: Pro variants
Audio Generation
veo3-video: Set generateAudio: true to include audio.
Reference-to-Video
Supported by kling-video-o1, kling-video-o1-standard, and wan-wavespeed-26.
Use referenceImages with image URLs or data URLs.
Automatic Refunds
Refunds are automatically issued when:

- The provider returns a failure
- A content policy violation is detected (before processing)
- Submission fails before acknowledgement
Best Practices
1. Choose the Right Model
   - Use text-to-video for creative generation
   - Use image-to-video for animating existing content
   - Consider cost vs quality tradeoffs

2. Optimize Prompts
   - Be specific and descriptive
   - Include motion and camera directions
   - Avoid content policy violations

3. Handle Async Operations
   - Implement proper polling with delays
   - Set reasonable timeouts (5-10 minutes)
   - Show progress to users

4. Error Recovery
   - Implement retry logic for transient failures
   - Handle rate limits with exponential backoff
   - Provide clear error messages to users

5. Cost Management
   - Check balance before submitting
   - Estimate costs before generation
   - Use shorter durations for testing