Research

Papers, preprints, and technical reports.

We publish formal research on the problems we have to solve to ship sovereign AI systems in production: evaluation methodology under compliance constraints, agent architectures with persistent memory, and the systems work that makes both possible.

2026-04-29
Capability-Scoped Runtimes for Desktop Agents: Risk-Gated Execution, Durable Tasks, and Reversibility-Aware Containment
Desktop agents are advancing on planning and perception while remaining fragile on execution. We argue this is a runtime problem, not a planner problem, and present pan-agent, a managed desktop-agent runtime that composes three load-bearing mechanisms: a pre-execution, risk-gated classifier that intercepts dangerous tool calls; a durable taskrunner that survives crashes and zombie processes; and a reversibility-aware containment layer that records each side effect with a typed receipt and wires it to per-tool reversers backed by capability-probed filesystem snapshots. The contribution is framed around a taxonomy that classifies every action as local-reversible, runtime-compensable, or externally-irreversible, and a pre-registered four-experiment evaluation plan covering OSWorld, OS-Harm, RedTeamCUA, and a long-horizon crash generator. The implementation is open source: approximately 32 kLOC of Go, MIT-licensed.
Euraika Labs Research Group
