Overview
LM Studio is a desktop application for running large language models locally on your Mac. It provides a user-friendly GUI for discovering, downloading, and running GGUF models, with no API key required. Like Ollama, it keeps all data on your machine for complete privacy.
Supported Models
LM Studio supports any model in GGUF format. Popular choices include:
| Model | Size | Best For |
|---|---|---|
| Llama 3.1 8B | ~5 GB | General purpose |
| Llama 3.1 70B | ~40 GB | Complex tasks |
| Mistral 7B | ~4 GB | Fast responses |
| Phi-3 Mini | ~2 GB | Lightweight tasks |
| CodeLlama 7B | ~4 GB | Programming |
| Qwen 2.5 7B | ~5 GB | Multilingual |
Setup
Install LM Studio
- Download from lmstudio.ai
- Install the application on your Mac
- Launch LM Studio
Download a Model
- In LM Studio, go to the Discover tab
- Search for a model (e.g., “Llama 3.1”)
- Click Download on your preferred quantization (Q4_K_M recommended for balance)
- Wait for the download to complete
Start the Local Server
- Go to the Developer tab in LM Studio
- Select your downloaded model
- Click Start Server
- Note the server address (default: http://localhost:1234)
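Once the server is started, you can verify it is reachable from outside LM Studio. A minimal sketch using only the standard library, assuming the server exposes LM Studio's OpenAI-compatible API at the default address:

```python
import json
import urllib.request

BASE = "http://localhost:1234"  # default LM Studio server address

def models_url(endpoint: str = BASE) -> str:
    # The OpenAI-compatible API lists loaded models at /v1/models
    return endpoint.rstrip("/") + "/v1/models"

def list_models(endpoint: str = BASE) -> dict:
    """Fetch the list of currently loaded models from a running server."""
    with urllib.request.urlopen(models_url(endpoint)) as resp:
        return json.load(resp)
```

If the call fails with a connection error, the server is not running or the port differs from your configuration.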
Configure in EnConvo
- Open Settings → AI Provider
- Select LM Studio
- Go to Credentials module
- Set the endpoint to http://localhost:1234 (or your custom port)
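Under the hood, EnConvo talks to this endpoint the way any OpenAI-compatible client would. A hedged sketch of the kind of request body that gets POSTed to `<endpoint>/v1/chat/completions` (field names follow the OpenAI chat-completions schema; the model name here is a placeholder, not something EnConvo requires):

```python
import json

def chat_payload(prompt: str, model: str = "local-model",
                 temperature: float = 1.0) -> str:
    """Build an OpenAI-style chat-completion request body as JSON."""
    body = {
        "model": model,  # placeholder; LM Studio serves whichever model is loaded
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return json.dumps(body)
```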
Configuration
| Setting | Description | Default |
|---|---|---|
| Endpoint | Local server address | http://localhost:1234 |
| Model Name | Currently loaded model | Auto-detected |
| Temperature | Sampling randomness (0–2); higher is more creative | 1 (medium) |
System Requirements
| RAM | Recommended Models |
|---|---|
| 8 GB | 7B models (Q4 quantization) |
| 16 GB | Larger 7B/13B models |
| 32 GB | 30B models |
| 64 GB+ | 70B models |
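The RAM tiers above follow from simple arithmetic: a Q4 quantization stores roughly 4 bits (0.5 bytes) per weight, plus some runtime overhead for the KV cache and buffers. A rough back-of-envelope sketch (the 20% overhead factor is an assumption for illustration, not an LM Studio figure):

```python
def approx_model_ram_gb(params_billion: float, bits_per_weight: float = 4.0,
                        overhead: float = 1.2) -> float:
    """Rough memory footprint: weights at the quantized bit-width,
    scaled by an assumed ~20% overhead for KV cache and buffers."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return round(weight_bytes * overhead / 1e9, 1)
```

For example, a 7B model at Q4 works out to roughly 4 GB and a 70B model to roughly 42 GB, consistent with the sizes in the tables above.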
Apple Silicon Macs (M1/M2/M3/M4) with Metal acceleration provide significantly better performance than Intel Macs for local LLM inference. On Apple Silicon the GPU shares unified memory with the CPU, so total RAM is the key limit on model size.
LM Studio vs Ollama
| Feature | LM Studio | Ollama |
|---|---|---|
| Interface | GUI application | Command-line |
| Model format | GGUF | GGUF (auto-managed) |
| Model discovery | In-app browser | ollama pull command |
| Server control | Manual start/stop | Auto-starts on install |
| Configuration | Visual settings | Config files |
Privacy Benefits
Complete Privacy
All data stays on your Mac
Offline Access
Works without internet after model download
No Usage Limits
Run unlimited queries locally
No Cost
Free to use, no API fees
Troubleshooting
Connection refused
- Ensure LM Studio’s local server is running (check the Developer tab)
- Verify the port matches your EnConvo configuration (default: 1234)
- Check that no other application is using the same port
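The port checks above can be scripted. A minimal standard-library sketch that tests whether anything is listening on the configured port:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

`port_open("localhost", 1234)` returning False means the LM Studio server is not running on that port; returning True while EnConvo still fails suggests a different application holds the port.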
Slow responses
- Use a smaller model or a more aggressive (lower bit-width) quantization, such as Q4_K_M or Q4_K_S
- Close other memory-intensive applications
- Check that Metal GPU acceleration is enabled in LM Studio settings
Model fails to load
- Ensure you have enough RAM for the model size
- Try a smaller quantization variant
- Re-download the model in case the file is corrupted
- Restart LM Studio and try again
No models in dropdown
- Make sure a model is loaded and the server is started in LM Studio
- Refresh the model list in EnConvo settings
- Check the endpoint URL is correct