Overview

Ollama lets you run large language models locally on your Mac, making it ideal for privacy-focused work and offline use.

Supported Models

Any model available in Ollama:
| Model | Size | Best For |
|---|---|---|
| Llama 3.1 70B | 40GB | Complex tasks |
| Llama 3.1 8B | 4.7GB | General purpose |
| Mistral 7B | 4.1GB | Fast responses |
| CodeLlama | 4.7GB | Programming |
| Phi-3 | 2.2GB | Lightweight |
| Gemma 2 | 5.4GB | Balanced |
| Qwen 2 | Various | Multilingual |

Setup

1. Install Ollama

   Download from ollama.ai or:

   brew install ollama

2. Pull a Model

   ollama pull llama3.1

3. Verify Installation

   ollama list

4. Configure in EnConvo

   1. Open Settings → AI Provider
   2. Select Ollama
   3. Go to the Credentials module
   4. Set endpoint: http://localhost:11434

5. Select Model

   Choose from your installed models.
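Once the steps above are done, you can confirm that EnConvo's endpoint setting points at a live Ollama server by querying Ollama's documented REST API. A minimal sketch using only the standard library (the endpoint and `/api/tags` route are Ollama's defaults; no third-party packages assumed):

```python
import json
import urllib.request
import urllib.error

OLLAMA_URL = "http://localhost:11434"  # Ollama's default endpoint

def list_installed_models(base_url: str = OLLAMA_URL):
    """Return installed model names from Ollama's /api/tags endpoint,
    or None if the server isn't reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

if __name__ == "__main__":
    models = list_installed_models()
    if models is None:
        print("Ollama is not reachable; run `ollama serve` first.")
    else:
        print("Installed models:", models)
```

The model names this returns are the same ones `ollama list` prints, and the same ones EnConvo's model picker should show.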

Configuration

| Setting | Description | Default |
|---|---|---|
| Credentials | Endpoint configuration | localhost:11434 |
| Model Name | Installed model | llama2:latest |
| Temperature | Creativity (0-2) | Medium (1) |
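Under the hood, per-request settings like Temperature map onto the `options` object of Ollama's `/api/generate` request body. A sketch of the payload (the model name and prompt are illustrative; the field names follow Ollama's API):

```python
import json

# Ollama's /api/generate accepts sampling options per request;
# the Temperature setting above maps to "temperature" (0-2).
payload = {
    "model": "llama3.1",          # any model shown by `ollama list`
    "prompt": "Explain RAG in one sentence.",
    "stream": False,
    "options": {"temperature": 1.0},  # 1 is the "Medium" default
}
print(json.dumps(payload, indent=2))
```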

Reasoning Mode

Enable thinking for compatible models:
| Option | Description |
|---|---|
| Disabled | Standard responses |
| Thinking | Enable reasoning |
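At the API level, recent Ollama releases accept a `think` flag in chat requests for reasoning-capable models; the "Thinking" option above plausibly corresponds to it. A hedged sketch (assumes a recent Ollama version and a thinking-capable model such as `deepseek-r1`):

```python
import json

# "think": True asks a reasoning-capable model to emit its
# reasoning separately (assumption: recent Ollama /api/chat).
payload = {
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "What is 17 * 24?"}],
    "stream": False,
    "think": True,  # the "Thinking" option; omit for standard responses
}
print(json.dumps(payload, indent=2))
```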
Other popular models to pull:

ollama pull llama3.1        # General purpose
ollama pull codellama       # Coding
ollama pull mistral         # Fast
ollama pull phi3            # Lightweight
ollama pull gemma2          # Google's model
ollama pull qwen2           # Multilingual

System Requirements

| RAM | Recommended Models |
|---|---|
| 8GB | 7B models (Llama 3.1 8B, Mistral 7B) |
| 16GB | Larger 7B models, some 13B |
| 32GB | 13B-30B models |
| 64GB+ | 70B models |
Apple Silicon Macs with Metal acceleration provide excellent local LLM performance.
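The sizes in the tables above roughly follow a rule of thumb: a 4-bit-quantized model needs about 0.6 GB per billion parameters, plus headroom for the context window. A sketch of that estimate (the 0.6 GB/B factor is an approximation, not an exact formula):

```python
def approx_download_gb(params_billions: float) -> float:
    """Rough size of a 4-bit-quantized model in GB.

    Rule of thumb: ~0.6 GB per billion parameters; actual files
    vary with quantization scheme and tokenizer overhead.
    """
    return round(params_billions * 0.6, 1)

print(approx_download_gb(8))   # 4.8 (table lists 4.7GB for Llama 3.1 8B)
print(approx_download_gb(70))  # 42.0 (table lists 40GB for the 70B model)
```

Pick a model whose estimated size leaves a few GB of RAM free for the OS and other apps.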

Privacy Benefits

  • Complete Privacy: data never leaves your Mac
  • Offline Access: works without internet
  • No Usage Limits: unlimited local queries
  • Full Control: choose exactly which models to run

Troubleshooting

Connection issues:

  • Ensure Ollama is running: ollama serve
  • Check that port 11434 is available
  • Verify the endpoint in settings

Slow responses:

  • Use smaller models
  • Close memory-intensive apps
  • Consider quantized models

Out-of-memory errors:

  • Use a smaller model
  • Reduce the context length
  • Restart Ollama
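For the connection checks above, a quick way to see whether anything is listening on Ollama's default port, using only the standard library:

```python
import socket

def ollama_port_open(host: str = "localhost", port: int = 11434) -> bool:
    """Return True if something is listening on Ollama's default port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    if ollama_port_open():
        print("Port 11434 is open; Ollama appears to be running.")
    else:
        print("Nothing on port 11434; start it with `ollama serve`.")
```

If the port is open but EnConvo still can't connect, double-check that the endpoint in Settings matches http://localhost:11434 exactly.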