Token-Aware
A compact AST map of your repo and auto-compressed history replace raw file dumps — coherent across long sessions, easy on your budget.
Features
A compact AST map of your repo and auto-compressed history replace raw file dumps — coherent across long sessions, easy on your budget.
OpenAI, Anthropic, Mistral, Ollama, LM Studio or any OpenAI-compatible endpoint — local models included. Your keys, your machine.
Past solutions are embedded in one local vector database and recalled across every project and session.
Runs your check script, reads the failures, and iterates until it passes — end to end, no babysitting.
Quickstart
No config files. No setup wizard. Pick your method, run a command, and Umbra is reading your project.
Read the full documentationnpm install -g umbra-agent umbra iwr -useb https://umbra.expert/install.ps1 | iex umbra curl -fsSL https://umbra.expert/install.sh | sh umbra The brief
Seven answers, no marketing voice. For everything else — the docs.
Umbra is an AI coding agent for your terminal. A background daemon keeps your project context loaded while the CLI/TUI gives you an interactive session — it builds a compact map of your codebase, talks to the model you’ve configured, and can edit files, run commands, and fix failing checks on its own.
Windows, macOS, and Linux — all three are first-class, no feature gaps. Install via npm/pnpm, or the one-line PowerShell script on Windows and the shell script on macOS/Linux.
No proxying — every request goes straight from your machine to the provider you configure, using your own API key stored locally in ~/.umbra/. Only the compact context built for that request (repo map snippets, relevant code, your message) is sent, never your whole codebase. Point it at Ollama or LM Studio for fully local, offline use.
Yes. Connect a local OpenAI-compatible server — Ollama or LM Studio — via the /providers menu and Umbra runs entirely offline. Persistent memory uses local embeddings either way, so search and recall keep working without a connection.
Most tools dump entire files into the prompt and let history grow forever, burning through your token budget fast. Umbra instead builds a compact AST map of your repo, compresses tool output and session history automatically, and recalls past solutions from a local vector database — so it stays coherent on large codebases at a fraction of the cost. The --exec loop builds on that: point it at a check script and it edits, runs, reads the failure, and retries until it’s green.
Yes — Umbra itself is free and open source (MIT). Bring your own API key for providers like OpenAI or Anthropic, or start with zero setup using OpenCode Zen’s free models, or run fully local models via Ollama/LM Studio at no cost.
Update in place with umbra --update — it checks for a new version and installs it without re-running the installer. To uninstall, remove the global package with npm uninstall -g umbra-agent (or pnpm remove -g umbra-agent); local memory and config live in ~/.umbra/ and stay until you delete that folder yourself.