# Providers

Providers are YAML files in `.aura/config/providers/`. Each file configures a connection to an LLM backend.
## Provider Types

| Type | Protocol | Capabilities |
|---|---|---|
| `ollama` | Native Ollama API | Chat, embedding, thinking, vision |
| `llamacpp` | OpenAI-compatible | Chat, reranking, thinking, vision; also used for whisper (STT) and kokoro (TTS) |
| `openrouter` | OpenRouter API | Chat, embedding (cloud models, token auth) |
| `openai` | OpenAI Responses API | Chat, embedding, transcription, synthesis |
| `anthropic` | Native Anthropic Messages API | Chat, thinking, vision, tools |
| `google` | Native Gemini API | Chat, thinking, vision, tools, embedding |
| `copilot` | GitHub Copilot (dual-protocol) | Chat (GPT via Responses API, Claude via Messages API) |
| `codex` | OpenAI Plus (Responses API) | Chat (ChatGPT Plus/Pro subscription) |
All providers implement a 4-method core interface (`Chat`, `Models`, `Model`, `Estimate`). Optional capabilities (`Embed`, `Rerank`, `Transcribe`, `Synthesize`, `ModelLoader`) use opt-in interfaces — providers implement only what they support, and callers discover capabilities via `providers.As[T](provider)`.
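The opt-in interface pattern can be sketched in a few lines of Go. Everything here except the name `As[T]` is illustrative — the interface shapes and provider stubs are hypothetical stand-ins, not Aura's actual types:

```go
package main

import "fmt"

// Provider is the core interface (method set abbreviated; the real
// one also has Models, Model, and Estimate).
type Provider interface {
	Chat(prompt string) (string, error)
}

// Embedder is an opt-in capability — only some providers implement it.
type Embedder interface {
	Embed(text string) ([]float32, error)
}

// As reports whether p also implements the optional interface T,
// mirroring what providers.As[T](provider) does conceptually.
func As[T any](p Provider) (T, bool) {
	t, ok := p.(T)
	return t, ok
}

// ollamaProvider supports chat and embedding (hypothetical stub).
type ollamaProvider struct{}

func (ollamaProvider) Chat(prompt string) (string, error)   { return "reply to " + prompt, nil }
func (ollamaProvider) Embed(text string) ([]float32, error) { return []float32{0.1, 0.2}, nil }

// codexProvider is chat-only (hypothetical stub).
type codexProvider struct{}

func (codexProvider) Chat(prompt string) (string, error) { return "reply", nil }

func main() {
	for _, p := range []Provider{ollamaProvider{}, codexProvider{}} {
		if _, ok := As[Embedder](p); ok {
			fmt.Printf("%T can embed\n", p)
		} else {
			fmt.Printf("%T is chat-only\n", p)
		}
	}
}
```

Because capabilities are separate interfaces rather than flags, a caller that needs embedding simply asks for `Embedder` and gets a typed handle or a clean "not supported" answer.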
## YAML Schema

```yaml
provider_name:
  # API endpoint URL. Required for most providers, optional for anthropic.
  url: http://host.docker.internal:11434

  # Provider type — determines which API protocol to use (required).
  # Values: ollama, llamacpp, openrouter, openai, anthropic, google, copilot, codex.
  type: ollama

  # Authentication token.
  # Falls back to AURA_PROVIDERS_{NAME}_TOKEN environment variable.
  # token: ""

  # How long models stay loaded in VRAM (Ollama only).
  # Uses Go duration syntax: "5m", "1h", "30s".
  # keep_alive: 15m

  # HTTP response header timeout — how long to wait for the server to start responding.
  # Does not affect streaming responses. Default: 5m.
  # timeout: 5m

  # Model visibility filter for `aura models` and `/model`.
  # Does NOT affect --model flag or feature agents.
  models:
    include: [] # Glob patterns — empty means all (e.g., ["llama*", "qwen*"])
    exclude: [] # Glob patterns — applied after include (e.g., ["*embed*"])

  # Declared capabilities for this provider. Empty = all capabilities assumed.
  # Values: chat, embed, rerank, transcribe, synthesize.
  # Use to restrict a provider to specific roles (e.g., embed-only provider).
  # capabilities: []

  # Retry configuration for transient Chat() failures (rate limits, 5xx, network errors).
  # Only applies to native providers (ollama, llamacpp) — Fantasy-based providers
  # (anthropic, openai, google, openrouter, copilot, codex) have built-in retry.
  # Disabled by default (max_attempts: 0).
  retry:
    max_attempts: 0 # 0 = disabled. Number of retry attempts before giving up.
    base_delay: 1s  # Initial delay between retries (exponential backoff).
    max_delay: 30s  # Maximum delay between retries.
```
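The include/exclude semantics (empty include means all; exclude is applied after include) can be sketched with Go's standard `path.Match` globbing. This is an illustration of the documented rules, not Aura's actual matcher:

```go
package main

import (
	"fmt"
	"path"
)

// visible applies the models filter: an empty include list means all
// models pass; exclude patterns are applied after include.
func visible(model string, include, exclude []string) bool {
	matched := len(include) == 0
	for _, pat := range include {
		if ok, _ := path.Match(pat, model); ok {
			matched = true
			break
		}
	}
	if !matched {
		return false
	}
	for _, pat := range exclude {
		if ok, _ := path.Match(pat, model); ok {
			return false
		}
	}
	return true
}

func main() {
	include := []string{"llama*", "qwen*"}
	exclude := []string{"*embed*"}
	fmt.Println(visible("llama3:8b", include, exclude))       // included by "llama*"
	fmt.Println(visible("qwen3-embedding", include, exclude)) // included, then excluded by "*embed*"
	fmt.Println(visible("mistral", include, exclude))         // not matched by any include pattern
}
```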
## Token Resolution

Authentication tokens are resolved in order:

1. `token` field in the provider YAML — supports `$VAR` and `${VAR}` env var expansion
2. `AURA_PROVIDERS_{NAME}_TOKEN` environment variable (automatic fallback when `token` is empty)
3. Value from an env file loaded via `--env-file`
This means you can reference existing environment variables directly:
```yaml
openrouter:
  type: openrouter
  url: https://openrouter.ai/api/v1
  token: ${OPENROUTER_API_KEY}

anthropic:
  type: anthropic
  token: ${ANTHROPIC_API_KEY}
```
## Ollama Example

```yaml
my_ollama:
  url: http://host.docker.internal:11434
  type: ollama
  keep_alive: 15m
  models:
    exclude: ["*embed*"]
```
## OpenAI-Compatible Example

The `openai` type works with any service that implements the OpenAI Responses API (`/v1/responses`):

```yaml
openai:
  url: https://api.openai.com/v1
  type: openai
  # token from AURA_PROVIDERS_OPENAI_TOKEN env var
  timeout: 5m

# groq:
#   url: https://api.groq.com/openai/v1
#   type: openai

# deepseek:
#   url: https://api.deepseek.com/v1
#   type: openai
```
## Anthropic Example

The `anthropic` type uses the native Anthropic Messages API with full streaming, tool use, thinking, and vision support:

```yaml
anthropic:
  type: anthropic
  # token from AURA_PROVIDERS_ANTHROPIC_TOKEN env var
  timeout: 5m
```
URL is optional — defaults to `https://api.anthropic.com`. The token is sent in the `X-Api-Key` header (handled by the SDK).
## Google Gemini Example

The `google` type uses the native Gemini API with streaming, tool use, thinking (with signature preservation), vision, and embeddings:

```yaml
google:
  type: google
  # token from AURA_PROVIDERS_GOOGLE_TOKEN env var
  timeout: 5m
```
URL is optional — defaults to Google's Gemini API endpoint. The token is passed as the `?key=` query parameter (handled by the SDK).
## GitHub Copilot Example

The `copilot` type routes requests to either the OpenAI Responses API (GPT models) or the Anthropic Messages API (Claude models) through your GitHub Copilot subscription:

```yaml
copilot:
  type: copilot
  timeout: 5m
```
Authenticate via `aura login copilot` (device code flow) or set `AURA_PROVIDERS_COPILOT_TOKEN` to a GitHub OAuth token (`ghu_...`).
## OpenAI Plus (Codex) Example

The `codex` type accesses ChatGPT Plus/Pro models via the Codex endpoint:

```yaml
codex:
  type: codex
  timeout: 5m
```
Authenticate via `aura login codex` (device code flow) or set `AURA_PROVIDERS_CODEX_TOKEN` to an OpenAI refresh token (`rt_...`).
## Catwalk Registry

Model capabilities (context length, vision, thinking, thinking levels) are enriched at startup using Catwalk metadata. Aura ships with embedded data compiled into the binary, so it works offline out of the box. On startup, it also attempts a live fetch from the Catwalk API and caches the result to disk. If the fetch fails, the disk cache is tried, then the embedded data.
Enrichment only fills gaps — it never overwrites capabilities already reported by the provider API. It currently applies to the `anthropic`, `openai`, and `google` providers. The Anthropic and OpenAI listing APIs return only model IDs (no capabilities), so those models get full enrichment. Google returns context length, thinking, and tools from its API, so only vision is filled from Catwalk. The remaining providers (Ollama, OpenRouter, LlamaCPP, Copilot, Codex) build all capabilities inline from their rich API responses.
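The gap-filling rule can be sketched as a merge that only replaces zero-valued fields. The struct fields and merge logic below are illustrative stand-ins, not Aura's actual schema:

```go
package main

import "fmt"

// ModelCaps is a simplified stand-in for per-model capability data.
type ModelCaps struct {
	ContextLength int
	Vision        bool
	Thinking      bool
}

// enrich fills only the gaps: fields the provider API left at their
// zero value are taken from Catwalk metadata; values the API did
// report are never overwritten.
func enrich(fromAPI, fromCatwalk ModelCaps) ModelCaps {
	out := fromAPI
	if out.ContextLength == 0 {
		out.ContextLength = fromCatwalk.ContextLength
	}
	if !out.Vision {
		out.Vision = fromCatwalk.Vision
	}
	if !out.Thinking {
		out.Thinking = fromCatwalk.Thinking
	}
	return out
}

func main() {
	// Google-style case: the API reports context length and thinking,
	// so only vision is filled in from Catwalk.
	api := ModelCaps{ContextLength: 1_000_000, Thinking: true}
	catwalk := ModelCaps{ContextLength: 2_000_000, Vision: true, Thinking: true}
	fmt.Printf("%+v\n", enrich(api, catwalk))
}
```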
You can add as many providers as you need — create new YAML files in `providers/`. Any provider type can be used with custom URLs (e.g., a remote Ollama instance, a self-hosted OpenAI-compatible server).

To see the default providers that ship with `aura init`, inspect `.aura/config/providers/` after scaffolding.