Video Generation

Overview

EnConvo brings AI video generation to your desktop. Create videos from text prompts or reference images using cutting-edge models like OpenAI Sora, Google Veo, Kling, Hailuo, and Wan — all without leaving your workflow.

Supported Models

Model	Provider	Text-to-Video	Image-to-Video	Strengths
Sora-2	OpenAI	Yes	Yes	Cinematic quality, realistic motion
Sora-2 Pro	OpenAI	Yes	Yes	Higher quality, more detail
Veo3	Google (via fal.ai)	Yes	Yes	Photorealistic, natural motion
Veo3 Fast	Google (via fal.ai)	Yes	Yes	Faster generation, good quality
Kling Video v2.5 Turbo Pro	Kuaishou (via fal.ai)	Yes	Yes	Fast, high quality, good value
Hailuo 02 Standard	MiniMax (via fal.ai)	Yes	Yes	Versatile, affordable
Hailuo 02 Pro	MiniMax (via fal.ai)	Yes	Yes	Premium quality
Wan 25 Preview	Alibaba (via fal.ai)	Yes	Yes	Chinese-developed, versatile
xAI video features	xAI	Model-dependent	Yes	xAI media workflows when available for your account

Getting Started

Choose a Provider

Open the Text to Video or Image to Video command settings. Select your Video Generation Provider:

Enconvo Cloud Plan — use multiple models with your Enconvo points
OpenAI — use your own OpenAI API key for Sora models
Fal.ai — use your own fal.ai API key for Kling, Hailuo, Veo, and Wan
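As a quick reference, the relationship between your own API keys and the models they cover can be written as a lookup. This is an illustrative sketch built only from the Supported Models table; the names `MODELS_BY_KEY` and `key_for_model` are hypothetical and are not part of EnConvo. The Enconvo Cloud Plan is left out because it uses points rather than a per-provider key.

```python
# Illustrative lookup built from the Supported Models table above.
# These names are hypothetical; EnConvo does not expose them as an API.
MODELS_BY_KEY = {
    "OpenAI": ["Sora-2", "Sora-2 Pro"],
    "Fal.ai": [
        "Veo3",
        "Veo3 Fast",
        "Kling Video v2.5 Turbo Pro",
        "Hailuo 02 Standard",
        "Hailuo 02 Pro",
        "Wan 25 Preview",
    ],
}


def key_for_model(model: str) -> str:
    """Return which of your own API keys covers the given model."""
    for provider, models in MODELS_BY_KEY.items():
        if model in models:
            return provider
    raise KeyError(f"{model!r} is not listed under OpenAI or Fal.ai")


print(key_for_model("Hailuo 02 Pro"))
```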

Select a Model

Choose the model that fits your needs. Kling Video v2.5 Turbo Pro is a good default: it is fast, produces high-quality output, and is among the lowest-cost models on the Enconvo Cloud Plan.

Configure Settings

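Set resolution and duration where the selected model supports them. These options are currently available for Sora models only; other models use the provider's defaults.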

Generate

Enter a text prompt or provide a reference image, and let the AI create your video.

Text to Video

Generate videos from text descriptions alone.

How to Use

Open the Text to Video command from SmartBar or the command list
Enter a descriptive prompt in English
Wait for the video to generate (typically 30 seconds to a few minutes)
View, save, or share the resulting video

Writing Effective Prompts

Write prompts in English for best results across all models.

Basic structure:

[Subject] + [Action] + [Setting] + [Style/Mood] + [Camera Movement]

Example prompts:

A golden retriever running through a field of wildflowers at sunset,
slow motion, cinematic lighting, warm tones

A futuristic city skyline at night with flying cars and neon lights,
drone shot sweeping across the buildings, cyberpunk aesthetic

A cup of coffee being poured in slow motion, close-up shot,
steam rising, soft morning light through a window
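If you write many prompts, the basic structure above can be assembled from its parts. The following is a minimal sketch, assuming a hypothetical helper named `build_prompt`; it only joins strings in the order of the template and does not call EnConvo or any model.

```python
def build_prompt(subject: str, action: str, setting: str,
                 style: str = "", camera: str = "") -> str:
    """Join the prompt parts in the order used by the template above."""
    # Subject, action, and setting read naturally as one clause;
    # style and camera notes follow as comma-separated modifiers.
    parts = [f"{subject} {action} {setting}", style, camera]
    return ", ".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    subject="A golden retriever",
    action="running through a field of wildflowers",
    setting="at sunset",
    style="cinematic lighting, warm tones",
    camera="slow motion tracking shot",
)
print(prompt)
```

Keeping the parts separate makes it easy to change one element, such as the camera movement, while leaving the rest of the prompt untouched.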

Prompt Tips

Be specific about motion

Instead of “a bird”, write “a hummingbird hovering in front of a red flower, wings beating rapidly”. Motion descriptions help the model create more dynamic videos.

Describe camera work

Include camera directions: “tracking shot”, “slow zoom in”, “drone aerial view”, “close-up”, “pan left to right”. This gives the video a professional, intentional feel.

Set the mood

Include lighting and atmosphere: “golden hour sunlight”, “moody overcast”, “dramatic shadows”, “soft diffused light”. These details significantly impact the final result.

Keep it focused

One clear subject and action per video produces better results than complex multi-character scenes. Start simple and iterate.

Image to Video

Animate a still image into a video.

How to Use

Open the Image to Video command
Provide one or more reference images (drag and drop, file picker, or URL)
Add a text prompt describing the desired motion and style
Generate and preview the result

Best Practices for Reference Images

Aspect	Recommendation
Resolution	High resolution produces better results
Subject	Clear, well-defined subjects animate better
Composition	Center the main subject for predictable animation
Background	Simpler backgrounds allow more focus on subject motion

Local Reference Images

For xAI video workflows and other providers that require public media URLs, EnConvo can upload a local reference image before sending the generation request. You can choose an image from your Mac instead of manually uploading it to a hosting service first.

Use clear, high-resolution local images for image-to-video. If generation fails, try a smaller PNG or JPEG and confirm the file is fully downloaded from iCloud.
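Before retrying a failed generation, it can help to confirm that the local file really is a readable PNG or JPEG. The sketch below checks the leading bytes of the file against the standard PNG and JPEG signatures. The function names are hypothetical, the check is independent of EnConvo, and it will not detect every partially synced iCloud file.

```python
from pathlib import Path
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"   # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"       # JPEG start-of-image marker


def sniff_image_type(data: bytes) -> Optional[str]:
    """Identify PNG or JPEG from the leading bytes; return None otherwise."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def check_reference_image(path: str) -> str:
    """Return 'png' or 'jpeg' if the file looks usable, else raise ValueError."""
    file = Path(path).expanduser()
    if not file.is_file():
        raise ValueError(f"{file} does not exist or is not a regular file")
    if file.stat().st_size == 0:
        raise ValueError(f"{file} is empty")
    with file.open("rb") as handle:
        kind = sniff_image_type(handle.read(8))
    if kind is None:
        raise ValueError(f"{file} is not a PNG or JPEG")
    return kind
```

For example, `check_reference_image("~/Pictures/product.png")` (a hypothetical path) returns `"png"` for a valid PNG and raises `ValueError` with a short reason otherwise.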

Model Configuration

Resolution (Sora Models)

Resolution	Aspect	Best For
720 x 1280	Portrait	Social media stories, TikTok
1280 x 720	Landscape	YouTube, presentations
1024 x 1792	Tall Portrait	Mobile wallpapers, vertical video
1792 x 1024	Wide Landscape	Cinematic, ultra-wide content

Duration (Sora Models)

Duration	Use Case
4 seconds	Quick clips, social media, previews
8 seconds	Standard clips, most use cases (default)
12 seconds	Longer scenes, storytelling

Duration and resolution options are currently available for Sora models. Other models (Kling, Hailuo, Veo, Wan) use their default output settings managed by the provider.
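The Sora options in the two tables above can be expressed as a small validation step. This is a sketch under stated assumptions: the function names and the returned dictionary keys are hypothetical, and the code treats all four resolutions as valid for both Sora models, which the tables do not confirm.

```python
from math import gcd

# Values copied from the Sora resolution and duration tables above.
SORA_RESOLUTIONS = [(720, 1280), (1280, 720), (1024, 1792), (1792, 1024)]
SORA_DURATIONS = [4, 8, 12]
DEFAULT_DURATION = 8  # marked as the default in the duration table


def aspect_ratio(width: int, height: int) -> str:
    """Reduce width:height to lowest terms."""
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def validate_sora_settings(width: int, height: int,
                           seconds: int = DEFAULT_DURATION) -> dict:
    """Raise ValueError unless the combination appears in the tables."""
    if (width, height) not in SORA_RESOLUTIONS:
        raise ValueError(f"unsupported resolution {width} x {height}")
    if seconds not in SORA_DURATIONS:
        raise ValueError(f"unsupported duration {seconds} seconds")
    return {
        "size": f"{width}x{height}",
        "seconds": seconds,
        "aspect": aspect_ratio(width, height),
    }


for w, h in SORA_RESOLUTIONS:
    print(f"{w} x {h} -> {aspect_ratio(w, h)}")
```

The loop shows that the two standard sizes reduce to 9:16 and 16:9, while the taller and wider sizes reduce to 4:7 and 7:4.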

Cost and Points

Enconvo Cloud Plan Pricing

Model	Points per Video
Kling Video v2.5 Turbo Pro	25,000
Hailuo 02 Standard	22,500
Wan 25 Preview	25,000
Veo3 Fast	37,500
Hailuo 02 Pro	40,000
Sora-2	50,000
Veo3	100,000
Sora-2 Pro	150,000

For the best value, start with Hailuo 02 Standard (22,500 points) or Kling Video v2.5 Turbo Pro (25,000 points). They are among the lowest-cost models and still produce good-quality output.
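To see how far a points balance goes, divide it by the per-video cost of each model. The sketch below uses the figures from the pricing table; the 300,000-point budget and the function name are illustrative assumptions, and actual pricing may change.

```python
# Points per video, copied from the Enconvo Cloud Plan pricing table above.
POINTS_PER_VIDEO = {
    "Hailuo 02 Standard": 22_500,
    "Kling Video v2.5 Turbo Pro": 25_000,
    "Wan 25 Preview": 25_000,
    "Veo3 Fast": 37_500,
    "Hailuo 02 Pro": 40_000,
    "Sora-2": 50_000,
    "Veo3": 100_000,
    "Sora-2 Pro": 150_000,
}


def videos_within_budget(points: int) -> dict:
    """Return how many whole videos each model allows for a points balance."""
    return {model: points // cost for model, cost in POINTS_PER_VIDEO.items()}


budget = 300_000  # hypothetical points balance
for model, count in videos_within_budget(budget).items():
    print(f"{model}: {count} video(s)")
```

With this example budget, the cheapest model allows 13 videos and the most expensive allows 2, which is why iterating on a low-cost model before a final high-quality render saves points.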

Using Your Own API Keys

OpenAI (Sora): Uses your OpenAI API balance. Pricing varies by resolution and duration.
Fal.ai: Uses your fal.ai account balance. Supports Kling, Hailuo, Veo, and Wan models.
xAI: Uses your xAI account and model access when xAI video features are available.

Use Cases

Presentations and Demos

Generate visual demonstrations, concept animations, or background videos for presentations. Landscape resolution (1280x720) works best for slides.

Creative Projects

Explore concept art in motion, music video ideas, or short film scenes. Use Sora-2 Pro or Veo3 for the highest visual quality.

Product Visualization

Animate product photos into dynamic showcases. Use Image to Video with a product photo and describe the camera movement and environment.

Prototyping and Storyboarding

Quickly visualize scenes for video production planning. Generate multiple variations to explore different creative directions before committing to full production.
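One way to plan variations is to hold the scene constant and vary only the camera and lighting terms from the prompt tips. The sketch below generates every combination; the base scene and the helper name `prompt_variations` are illustrative assumptions, and the code only builds prompt strings.

```python
from itertools import product


def prompt_variations(base: str, cameras: list, lighting: list) -> list:
    """Combine one base scene with every camera and lighting pairing."""
    return [f"{base}, {camera}, {light}"
            for camera, light in product(cameras, lighting)]


variations = prompt_variations(
    base="A ceramic mug on a wooden table, steam rising",
    cameras=["slow zoom in", "tracking shot"],
    lighting=["golden hour sunlight", "soft diffused light"],
)
for text in variations:
    print(text)
```

Two camera options and two lighting options give four prompts. Each generated video is charged separately, so four variations on Kling Video v2.5 Turbo Pro would cost 100,000 points on the Enconvo Cloud Plan.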

Workflow Integration

Video generation integrates with other EnConvo features:

AI Chat: Ask an AI agent to generate a video as part of a conversation
Workflows: Chain video generation with other commands for automated content pipelines
File Management: Generated videos are saved to your specified output directory with organized naming

Limitations

AI video generation is a rapidly evolving technology. Current limitations include:

Generation times range from 30 seconds to several minutes depending on the model
Complex multi-character scenes may have inconsistencies
Text rendering in videos is often inaccurate
Very specific motion sequences may not match your exact vision
All prompts should be in English for best results

Troubleshooting

Video generation fails

Check your API key or Enconvo Cloud Plan balance
Ensure your prompt is in English
Try a simpler prompt — overly complex descriptions can cause issues
Switch to a different model to see if the issue is model-specific
Check the console logs for error details

Video quality is poor

Use a more detailed, descriptive prompt
Try a higher-quality model (Sora-2 Pro, Veo3, Hailuo 02 Pro)
For Image to Video, ensure the reference image is high resolution
Avoid requesting too many elements in a single scene

Generation takes too long

Switch to a faster model (Kling Video v2.5 Turbo Pro, Veo3 Fast, Hailuo 02 Standard)
Reduce the video duration if using Sora models
Simplify the prompt, since scenes with fewer elements may generate faster
Note that generation times depend on the provider’s server load

Related Features

Image Generation

Generate still images from text

AI Chat

Generate videos through AI conversation

SmartBar

Quick access to video generation

Workflows

Automate video generation pipelines
