koan

7-Layer Self-Hosted AI Operating System — sovereign, local-first intelligence with full operator ownership of model weights, memory, and context.

ACTIVE R&D ∃⧜π LOCAL-FIRST

koan is a sovereign, local-first artificial intelligence operating system designed around a 7-layer cognitive architecture. Where most AI tools lock users into cloud dependencies and opaque inference stacks, koan is built to run entirely on a single workstation — giving the operator full ownership of model weights, memory, and context.

The name is intentional: a koan has no external answer. The system is designed to force the operator into direct relationship with their own intelligence infrastructure — no intermediary, no subscription, no dependency on external availability.

The architecture draws directly on published cognitive-agent research, unifying the following work into a coherent OS abstraction that has no direct peer in the open-source landscape:

  • CoALA — Cognitive Architectures for Language Agents
  • MemGPT — long-term memory management for agents
  • Voyager — open-ended embodied agent frameworks
  • Reflexion — verbal reinforcement learning for agents
  • 184 verified arXiv papers spanning speculative decoding, PEFT, multi-agent frameworks, and LLM-in-finance

The inference layer runs on Ollama with llama.cpp as the GGUF-native backend, supporting quantized models optimized for consumer GPU hardware.
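
As a rough illustration of what local inference through this stack looks like, the sketch below sends a single prompt to an Ollama server over its REST API. It assumes Ollama is listening on its default local address; the helper names `build_request` and `generate` are hypothetical and not part of koan.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> dict:
    """Build a non-streaming payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str, url: str = OLLAMA_URL,
             timeout: float = 120.0) -> str:
    """POST the prompt to the local server and return the completion text."""
    body = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # With "stream": False the server replies with one JSON object
        # whose "response" field holds the generated text.
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling `generate("<model-tag>", "your prompt")` with any model tag already pulled into Ollama returns the completion as a string. No request leaves the workstation, which is the property the local-first design depends on.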

Runtime: Ollama + llama.cpp
Format: GGUF-native
Quantization: 4-bit, AWQ, QLoRA, BitNet b1.58
Hardware: RTX 4070 · Ryzen 9 5950X
OS: Ubuntu 24.04
Node: research-zen-WST
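
To see why quantization matters on this hardware, a back-of-the-envelope estimate helps. The sketch below assumes the desktop RTX 4070's 12 GB of VRAM and a flat 2 GB allowance for KV cache and runtime overhead; both figures, and the function names, are illustrative assumptions. Real quantization formats also store scales and metadata, so effective bits per weight run somewhat higher than the nominal value.

```python
import math

VRAM_GB = 12.0       # assumption: desktop RTX 4070
OVERHEAD_GB = 2.0    # assumption: rough allowance for KV cache and runtime


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes): params * bits / 8."""
    return params_billion * bits_per_weight / 8.0


def fits(params_billion: float, bits_per_weight: float,
         overhead_gb: float = OVERHEAD_GB, vram_gb: float = VRAM_GB) -> bool:
    """True if the weights plus the overhead allowance fit in VRAM."""
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


# Ternary weights {-1, 0, +1} carry log2(3) bits each, hence "b1.58".
TERNARY_BITS = math.log2(3)

for params, bits in [(7, 16), (7, 4), (13, 4), (70, 4), (7, TERNARY_BITS)]:
    size = weight_gb(params, bits)
    print(f"{params}B @ {bits:.2f} bits: {size:.2f} GB, fits={fits(params, bits)}")
```

Under these assumptions a 7B model at 16 bits (14 GB of weights) does not fit, while the same model at 4 bits (3.5 GB) leaves ample headroom.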

koan's 7-layer cognitive architecture wires together local inference, structured memory, autonomous agent loops, and a deliberate fine-tuning pipeline under one operator-controlled roof:

  • Layer 1 · Inference — local model execution and routing
  • Layer 2 · Memory — structured persistent context management
  • Layer 3 · Perception — input processing and context framing
  • Layer 4 · Reasoning — multi-step deliberation and planning
  • Layer 5 · Action — tool use and external system interface
  • Layer 6 · Learning — fine-tuning pipeline and feedback loops
  • Layer 7 · Metacognition — system self-evaluation and RMCC integration

  • DONE · Infrastructure layer operational on research-zen-WST
  • DONE · Ollama + llama.cpp inference stack configured
  • DONE · 184-paper research corpus catalogued and indexed
  • WIP · Memory layer — persistent context architecture
  • WIP · Agent loop harness integration with Tome
  • NEXT · Fine-tuning pipeline (QLoRA on RTX 4070)
  • NEXT · Metacognition layer — RMCC self-evaluation integration
  • SPEC · Full 7-layer OS abstraction — long-range target
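
The layer stack and roadmap above can also be held as plain data, which makes it easy to query what is done and what is pending. This is only a sketch: the `Layer` and `Status` types and the `by_status` helper are hypothetical names chosen for illustration, not koan's actual code.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    DONE = "DONE"
    WIP = "WIP"
    NEXT = "NEXT"
    SPEC = "SPEC"


@dataclass(frozen=True)
class Layer:
    number: int
    name: str
    role: str


# The seven layers, copied from the list above.
LAYERS = [
    Layer(1, "Inference", "local model execution and routing"),
    Layer(2, "Memory", "structured persistent context management"),
    Layer(3, "Perception", "input processing and context framing"),
    Layer(4, "Reasoning", "multi-step deliberation and planning"),
    Layer(5, "Action", "tool use and external system interface"),
    Layer(6, "Learning", "fine-tuning pipeline and feedback loops"),
    Layer(7, "Metacognition", "system self-evaluation and RMCC integration"),
]

# Roadmap items paired with their status tags.
ROADMAP = [
    (Status.DONE, "Infrastructure layer operational on research-zen-WST"),
    (Status.DONE, "Ollama + llama.cpp inference stack configured"),
    (Status.DONE, "184-paper research corpus catalogued and indexed"),
    (Status.WIP, "Memory layer: persistent context architecture"),
    (Status.WIP, "Agent loop harness integration with Tome"),
    (Status.NEXT, "Fine-tuning pipeline (QLoRA on RTX 4070)"),
    (Status.NEXT, "Metacognition layer: RMCC self-evaluation integration"),
    (Status.SPEC, "Full 7-layer OS abstraction, long-range target"),
]


def by_status(status: Status) -> list[str]:
    """Return the roadmap items carrying the given status tag."""
    return [item for tag, item in ROADMAP if tag is status]


for status in Status:
    print(f"{status.value}: {len(by_status(status))} item(s)")
```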