Documentation Index
Fetch the complete documentation index at: https://docs.enconvo.ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Groq provides extremely fast inference using custom LPU hardware. Perfect for real-time applications requiring instant responses.Supported Models
| Model | Description | Speed |
|---|---|---|
| Gemma2 9B IT | Google’s model | ~700 tok/s |
| Llama 3.1 70B | Meta’s large model | ~300 tok/s |
| Llama 3.1 8B | Meta’s small model | ~750 tok/s |
| Mixtral 8x7B | Mistral MoE | ~500 tok/s |
Setup
Get API Key
- Go to Groq Console
- Sign in or create an account
- Navigate to API Keys
- Create a new API key
Configure in EnConvo
- Open Settings → AI Provider
- Select Groq AI
- Go to Credentials module
- Enter your API key
Configuration
| Setting | Description | Default |
|---|---|---|
| Credentials | API key configuration | Required |
| Model Name | Model to use | Gemma2 9B IT |
| Temperature | Creativity (0-2) | Medium (1) |
Validate and Use
Validate credentials
Click Validate in the Groq credential settings. If validation fails, confirm your API key is active and your Groq account has quota available.
Pick for speed
Groq is best when low latency matters. Start with a small or medium model for quick interactive chat.
Reasoning Effort
For reasoning-capable models (GPT-OSS):| Level | Description |
|---|---|
| Low | Fast reasoning |
| Medium | Balanced |
| High | Thorough |
Pricing
Groq offers generous free tiers. Check Groq for current pricing.Groq’s free tier is great for trying ultra-fast inference!
Why Groq?
Speed
Fastest inference available - 300-750+ tokens/second
Free Tier
Generous free usage for development
Quality Models
Access to Llama, Gemma, Mixtral
Low Latency
Near-instant responses
Best Practices
When to use Groq
When to use Groq
- Real-time chat applications
- Quick iterations during development
- Time-sensitive tasks
Model Selection
Model Selection
- Llama 3.1 70B: Best quality
- Llama 3.1 8B: Fastest
- Gemma2 9B: Good balance
Troubleshooting
Rate limits
Rate limits
- Free tier has rate limits
- Wait and retry, or upgrade