Building Luna

Phased Autonomy for a Self-Directed AI Agent

Luna is an AI agent that runs on my personal infrastructure, maintains memory across sessions, and is gradually taking on more of her own work. I built her to answer a practical question: what does it actually take to give an AI agent real autonomy safely?

The answer, it turns out, is infrastructure. Not better prompts. Luna gains capabilities one phase at a time, and each phase only activates after the previous one proves stable in production. Approval queues gate what she can do. Budget ceilings limit how much she can spend. A hardware security key (Trezor) provides cryptographic proof that I, specifically, authorized an action. If something breaks, we step back down. The system is designed so that stepping back is always possible.
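As a rough illustration of how these gates fit together, here is a minimal Python sketch. It is not Luna's actual code: the `PhaseGate` class, its method names, and the phase names are all hypothetical, and the Trezor signature check is stood in for by a plain boolean rather than a real hardware call.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PhaseGate:
    """Hypothetical gate: phases unlock in order and spending is capped."""
    phases: List[str]
    budget_ceiling: float
    active: int = 0      # index of the highest phase currently active
    spent: float = 0.0   # total cost of actions approved so far
    pending: List[Tuple[str, float]] = field(default_factory=list)

    def promote(self, previous_phase_stable: bool) -> bool:
        """Activate the next phase only if the current one proved stable."""
        if previous_phase_stable and self.active < len(self.phases) - 1:
            self.active += 1
            return True
        return False

    def step_back(self) -> None:
        """Drop to the previous phase. This is always allowed."""
        self.active = max(0, self.active - 1)

    def request(self, action: str, phase: str, cost: float) -> str:
        """Queue an action for approval, or reject it outright."""
        if phase not in self.phases:
            return "rejected: unknown phase"
        if self.phases.index(phase) > self.active:
            return "rejected: phase not active"
        # Count queued costs too, so pending actions cannot jointly overspend.
        committed = self.spent + sum(c for _, c in self.pending)
        if committed + cost > self.budget_ceiling:
            return "rejected: over budget"
        self.pending.append((action, cost))
        return "queued for approval"

    def approve(self, operator_signature_valid: bool) -> Optional[str]:
        """Release the oldest queued action once the operator has signed off.

        The boolean is a placeholder for verifying a hardware-key signature.
        """
        if not operator_signature_valid or not self.pending:
            return None
        action, cost = self.pending.pop(0)
        self.spent += cost
        return action


gate = PhaseGate(phases=["read_only", "write_files", "spend_money"],
                 budget_ceiling=20.0)
print(gate.request("buy domain", "spend_money", 12.0))  # rejected: phase not active
```

The point of the sketch is that every check lives outside the model: a request for a phase that is not yet active is refused regardless of how the request is worded, and `step_back` has no preconditions.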

This post is a short overview. The whitepapers below cover the full architecture, threat model, and results from a red-team audit.

Whitepapers

"Phased Autonomy: Infrastructure for Self-Directed AI Agents" covers the complete architecture, safety model, and adversarial audit results.

"Cryptographic Operator Authentication for AI Agents" formalizes the hardware-backed authentication protocol and threat model.