Overview

EnConvo brings AI video generation to your desktop. Create videos from text prompts or reference images with models such as OpenAI Sora, Google Veo, Kling, Hailuo, and Wan, all without leaving your workflow.

Supported Models

| Model | Provider | Text-to-Video | Image-to-Video | Strengths |
|---|---|---|---|---|
| Sora-2 | OpenAI | Yes | Yes | Cinematic quality, realistic motion |
| Sora-2 Pro | OpenAI | Yes | Yes | Higher quality, more detail |
| Veo3 | Google (via fal.ai) | Yes | Yes | Photorealistic, natural motion |
| Veo3 Fast | Google (via fal.ai) | Yes | Yes | Faster generation, good quality |
| Kling Video v2.5 Turbo Pro | Kuaishou (via fal.ai) | Yes | Yes | Fast, high quality, good value |
| Hailuo 02 Standard | MiniMax (via fal.ai) | Yes | Yes | Versatile, affordable |
| Hailuo 02 Pro | MiniMax (via fal.ai) | Yes | Yes | Premium quality |
| Wan 25 Preview | Alibaba (via fal.ai) | Yes | Yes | Chinese-developed, versatile |

Getting Started

1. Choose a Provider

Open the Text to Video or Image to Video command settings. Select your Video Generation Provider:
  • Enconvo Cloud Plan: use multiple models with your Enconvo points
  • OpenAI: use your own OpenAI API key for Sora models
  • Fal.ai: use your own fal.ai API key for Kling, Hailuo, Veo, and Wan

2. Select a Model

Choose the model that fits your needs. Kling Video v2.5 Turbo Pro offers the best balance of speed, quality, and cost.

3. Configure Settings

Set resolution and duration options based on the selected model.

4. Generate

Enter a text prompt or provide a reference image, and let the AI create your video.

Text to Video

Generate videos from text descriptions alone.

How to Use

  1. Open the Text to Video command from SmartBar or the command list
  2. Enter a descriptive prompt in English
  3. Wait for the video to generate (typically 30 seconds to a few minutes)
  4. View, save, or share the resulting video

Writing Effective Prompts

Write prompts in English for best results across all models.
Basic structure:

[Subject] + [Action] + [Setting] + [Style/Mood] + [Camera Movement]

Example prompts:

A golden retriever running through a field of wildflowers at sunset,
slow motion, cinematic lighting, warm tones

A futuristic city skyline at night with flying cars and neon lights,
drone shot sweeping across the buildings, cyberpunk aesthetic

A cup of coffee being poured in slow motion, close-up shot,
steam rising, soft morning light through a window
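The basic structure above can be sketched as a small helper that assembles a prompt from its parts. This is only an illustration: `build_prompt` and its parameter names are hypothetical and not part of EnConvo. The resulting string is what you would paste into the Text to Video command.

```python
def build_prompt(subject: str, action: str, setting: str = "",
                 style: str = "", camera: str = "") -> str:
    """Assemble a prompt as Subject + Action + Setting, then Style/Mood and Camera.

    Hypothetical helper for illustration; EnConvo itself only needs the final text.
    """
    # Subject, action, and setting read naturally as one phrase.
    scene = " ".join(p.strip() for p in (subject, action, setting) if p.strip())
    # Style/mood and camera movement are appended as comma-separated details.
    extras = [p.strip() for p in (style, camera) if p.strip()]
    return ", ".join([scene, *extras])


prompt = build_prompt(
    subject="A golden retriever",
    action="running through a field of wildflowers",
    setting="at sunset",
    style="cinematic lighting, warm tones",
    camera="slow motion",
)
print(prompt)
```

Keeping the parts separate makes it easy to vary one element at a time (for example, only the camera movement) while iterating on a prompt.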

Prompt Tips

  • Be specific about the subject and its motion: instead of “a bird”, write “a hummingbird hovering in front of a red flower, wings beating rapidly”. Motion descriptions help the model create more dynamic videos.
  • Include camera directions: “tracking shot”, “slow zoom in”, “drone aerial view”, “close-up”, “pan left to right”. This gives the video a professional, intentional feel.
  • Include lighting and atmosphere: “golden hour sunlight”, “moody overcast”, “dramatic shadows”, “soft diffused light”. These details significantly affect the final result.
  • Keep it simple: one clear subject and action per video produces better results than complex multi-character scenes. Start simple and iterate.

Image to Video

Animate a still image into a video.

How to Use

  1. Open the Image to Video command
  2. Provide one or more reference images (drag and drop, file picker, or URL)
  3. Add a text prompt describing the desired motion and style
  4. Generate and preview the result

Best Practices for Reference Images

| Aspect | Recommendation |
|---|---|
| Resolution | High resolution produces better results |
| Subject | Clear, well-defined subjects animate better |
| Composition | Center the main subject for predictable animation |
| Background | Simpler backgrounds allow more focus on subject motion |

Model Configuration

Resolution (Sora Models)

| Resolution | Aspect | Best For |
|---|---|---|
| 720 x 1280 | Portrait | Social media stories, TikTok |
| 1280 x 720 | Landscape | YouTube, presentations |
| 1024 x 1792 | Tall Portrait | Mobile wallpapers, vertical video |
| 1792 x 1024 | Wide Landscape | Cinematic, ultra-wide content |

Duration (Sora Models)

| Duration | Use Case |
|---|---|
| 4 seconds | Quick clips, social media, previews |
| 8 seconds | Standard clips, most use cases (default) |
| 12 seconds | Longer scenes, storytelling |

Duration and resolution options are currently available for Sora models. Other models (Kling, Hailuo, Veo, Wan) use their default output settings managed by the provider.
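The Sora options listed above can be captured in a small lookup that rejects unsupported combinations. This is a minimal sketch: the dictionary keys, `sora_settings`, and the returned field names are hypothetical and do not reflect EnConvo's internal configuration format. Only the resolution and duration values come from the tables.

```python
# Values taken from the Sora resolution and duration tables in this page.
# The key names are illustrative labels, not EnConvo setting identifiers.
SORA_RESOLUTIONS = {
    "portrait": (720, 1280),
    "landscape": (1280, 720),
    "tall_portrait": (1024, 1792),
    "wide_landscape": (1792, 1024),
}
SORA_DURATIONS = (4, 8, 12)   # seconds
DEFAULT_DURATION = 8          # documented default


def sora_settings(aspect: str, duration: int = DEFAULT_DURATION) -> dict:
    """Return width, height, and duration for a Sora request, or raise ValueError."""
    if aspect not in SORA_RESOLUTIONS:
        raise ValueError(f"unknown aspect {aspect!r}; choose from {sorted(SORA_RESOLUTIONS)}")
    if duration not in SORA_DURATIONS:
        raise ValueError(f"unsupported duration {duration}; choose from {SORA_DURATIONS}")
    width, height = SORA_RESOLUTIONS[aspect]
    return {"width": width, "height": height, "duration_seconds": duration}


print(sora_settings("portrait", 4))   # short vertical clip
print(sora_settings("landscape"))     # falls back to the 8-second default
```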

Cost and Points

Enconvo Cloud Plan Pricing

| Model | Points per Video |
|---|---|
| Kling Video v2.5 Turbo Pro | 25,000 |
| Hailuo 02 Standard | 22,500 |
| Wan 25 Preview | 25,000 |
| Veo3 Fast | 37,500 |
| Hailuo 02 Pro | 40,000 |
| Sora-2 | 50,000 |
| Veo3 | 100,000 |
| Sora-2 Pro | 150,000 |

For the best value, start with Hailuo 02 Standard or Kling Video v2.5 Turbo Pro. They are among the lowest-cost models in points and still deliver excellent quality.
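To estimate how far a points balance goes, the pricing table can be turned into a simple calculation. The sketch below uses only the per-video point costs listed above. The function names and the 100,000-point balance are hypothetical examples, and actual pricing may change.

```python
# Points per video, copied from the Enconvo Cloud Plan pricing table.
POINTS_PER_VIDEO = {
    "Kling Video v2.5 Turbo Pro": 25_000,
    "Hailuo 02 Standard": 22_500,
    "Wan 25 Preview": 25_000,
    "Veo3 Fast": 37_500,
    "Hailuo 02 Pro": 40_000,
    "Sora-2": 50_000,
    "Veo3": 100_000,
    "Sora-2 Pro": 150_000,
}


def videos_for_budget(points: int) -> dict:
    """How many whole videos each model allows for a given points balance."""
    return {model: points // cost for model, cost in POINTS_PER_VIDEO.items()}


def plan_cost(plan: dict) -> int:
    """Total points for a plan given as {model name: number of videos}."""
    return sum(POINTS_PER_VIDEO[model] * count for model, count in plan.items())


budget = 100_000  # example balance, not a real plan size
for model, count in videos_for_budget(budget).items():
    print(f"{model}: {count} video(s)")

print(plan_cost({"Sora-2": 2, "Veo3 Fast": 1}))
```

With the example balance, Hailuo 02 Standard allows four videos (90,000 points) while Sora-2 Pro allows none, which illustrates why the lower-cost models are a sensible starting point.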

Using Your Own API Keys

  • OpenAI (Sora): Uses your OpenAI API balance. Pricing varies by resolution and duration.
  • Fal.ai: Uses your fal.ai account balance. Supports Kling, Hailuo, Veo, and Wan models.
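The relationship between models and providers can be summarized as a lookup. This is an illustrative sketch based on the provider list and the pricing table in this page. The names `OWN_KEY_PROVIDER` and `provider_options` are hypothetical, and it assumes every model in the pricing table is available through the Enconvo Cloud Plan.

```python
# Which bring-your-own-key provider serves each model, per this page.
OWN_KEY_PROVIDER = {
    "Sora-2": "OpenAI",
    "Sora-2 Pro": "OpenAI",
    "Veo3": "Fal.ai",
    "Veo3 Fast": "Fal.ai",
    "Kling Video v2.5 Turbo Pro": "Fal.ai",
    "Hailuo 02 Standard": "Fal.ai",
    "Hailuo 02 Pro": "Fal.ai",
    "Wan 25 Preview": "Fal.ai",
}


def provider_options(model: str) -> list:
    """Providers that can run a model: the Cloud Plan, or your own API key."""
    if model not in OWN_KEY_PROVIDER:
        raise ValueError(f"unknown model: {model!r}")
    # Assumption: all listed models are also offered on the Enconvo Cloud Plan.
    return ["Enconvo Cloud Plan", OWN_KEY_PROVIDER[model]]


print(provider_options("Sora-2"))
print(provider_options("Veo3 Fast"))
```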

Use Cases

  • Social media: Create short, engaging video clips for Instagram Reels, TikTok, or YouTube Shorts. With Sora models, use portrait resolution (720 x 1280) and a 4- or 8-second duration for the best fit.
  • Presentations: Generate visual demonstrations, concept animations, or background videos. Landscape resolution (1280 x 720) works best for slides.
  • Creative exploration: Explore concept art in motion, music video ideas, or short film scenes. Use Sora-2 Pro or Veo3 for the highest visual quality.
  • Product showcases: Animate product photos into dynamic showcases. Use Image to Video with a product photo and describe the camera movement and environment.
  • Pre-production: Quickly visualize scenes for video production planning. Generate multiple variations to explore different creative directions before committing to full production.

Workflow Integration

Video generation integrates with other EnConvo features:
  • AI Chat: Ask an AI agent to generate a video as part of a conversation
  • Workflows: Chain video generation with other commands for automated content pipelines
  • File Management: Generated videos are saved to your specified output directory with organized naming

Limitations

AI video generation is a rapidly evolving technology. Current limitations include:
  • Generation times range from 30 seconds to several minutes depending on the model
  • Complex multi-character scenes may have inconsistencies
  • Text rendering in videos is often inaccurate
  • Very specific motion sequences may not match your exact vision
  • All prompts should be in English for best results

Troubleshooting

Video generation fails

  1. Check your API key or Enconvo Cloud Plan balance
  2. Ensure your prompt is in English
  3. Try a simpler prompt; overly complex descriptions can cause issues
  4. Switch to a different model to see if the issue is model-specific
  5. Check the console logs for error details

Video quality is poor

  1. Use a more detailed, descriptive prompt
  2. Try a higher-quality model (Sora-2 Pro, Veo3, Hailuo 02 Pro)
  3. For Image to Video, ensure the reference image is high resolution
  4. Avoid requesting too many elements in a single scene

Generation is slow

  1. Switch to a faster model (Kling Video v2.5 Turbo Pro, Veo3 Fast, Hailuo 02 Standard)
  2. Reduce the video duration if using Sora models
  3. Simplify the prompt; fewer elements mean faster generation
  4. Note that generation times depend on the provider’s server load

Related Features

  • Image Generation: Generate still images from text
  • AI Chat: Generate videos through AI conversation
  • SmartBar: Quick access to video generation
  • Workflows: Automate video generation pipelines