Overview

koan is a sovereign, local-first artificial intelligence operating system designed around a 7-layer cognitive architecture. Where most AI tools lock users into cloud dependencies and opaque inference stacks, koan is built to run entirely on a single workstation — giving the operator full ownership of model weights, memory, and context.

The name is intentional: a koan has no external answer. The system is designed to force the operator into direct relationship with their own intelligence infrastructure — no intermediary, no subscription, no dependency on external availability.

Research Foundation

The architecture draws directly on published cognitive-agent research and unifies it into a coherent OS abstraction that has no direct peer in the open-source landscape:

CoALA — Cognitive Architectures for Language Agents
MemGPT — long-term memory management for agents
Voyager — open-ended embodied agent frameworks
Reflexion — verbal reinforcement learning for agents
184 verified arXiv papers spanning speculative decoding, PEFT, multi-agent frameworks, and LLM-in-finance

Inference Stack

The inference layer runs on Ollama with llama.cpp as the GGUF-native backend, supporting quantized models optimized for consumer GPU hardware.
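As a rough illustration of what talking to this layer could look like, the sketch below sends a single prompt to a locally running Ollama server over its HTTP API. It assumes Ollama is listening on its default address (localhost:11434). The helper names build_generate_request and generate are hypothetical and not part of koan.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint; change it if the server is bound elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, Ollama replies with one JSON object whose
    # "response" field holds the full completion.
    return body["response"]
```

Calling generate with the tag of any model already pulled into Ollama returns the completion as a string. Nothing leaves the workstation, which is the property the local-first design depends on.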

Runtime

Runtime: Ollama + llama.cpp
Format: GGUF-native
Quantization: 4-bit, AWQ, QLoRA, BitNet b1.58
Hardware: RTX 4070 · Ryzen 9 5950X
OS: Ubuntu 24.04
Node: research-zen-WST
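To make the case for quantization on this hardware concrete, the sketch below estimates how much memory model weights occupy at different bit widths. It assumes the 12 GB desktop RTX 4070 and uses a 7-billion-parameter model purely as an illustrative size. It counts weights only, so real GGUF files (which carry scaling metadata) and the KV cache will need more.

```python
def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


VRAM_GIB = 12.0   # assumption: desktop RTX 4070
N_PARAMS = 7e9    # illustrative model size, not a koan requirement

# Nominal bits per weight for each scheme; actual formats vary slightly.
for label, bits in [("FP16", 16), ("4-bit", 4), ("BitNet b1.58", 1.58)]:
    size = weight_footprint_gib(N_PARAMS, bits)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{label:>12}: {size:5.2f} GiB of weights, {verdict} in {VRAM_GIB:.0f} GiB")
```

Under these assumptions the unquantized FP16 weights alone come to roughly 13 GiB, which exceeds the card's memory, while the 4-bit version needs a little over 3 GiB and leaves headroom for context.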

7-Layer Architecture

koan's 7-layer cognitive architecture wires together local inference, structured memory, autonomous agent loops, and a deliberate fine-tuning pipeline under one operator-controlled roof:

Layer 1 · Inference — local model execution and routing
Layer 2 · Memory — structured persistent context management
Layer 3 · Perception — input processing and context framing
Layer 4 · Reasoning — multi-step deliberation and planning
Layer 5 · Action — tool use and external system interface
Layer 6 · Learning — fine-tuning pipeline and feedback loops
Layer 7 · Metacognition — system self-evaluation and RMCC integration
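One way to picture the layering in code is a small ordered registry. This is a hypothetical sketch for illustration and does not reflect koan's actual module layout. The names Layer, ROLES and describe_stack are assumptions; only the layer numbers and descriptions come from the list above.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The seven layers, numbered as in the architecture list."""
    INFERENCE = 1
    MEMORY = 2
    PERCEPTION = 3
    REASONING = 4
    ACTION = 5
    LEARNING = 6
    METACOGNITION = 7


# Role of each layer, copied from the architecture list.
ROLES = {
    Layer.INFERENCE: "local model execution and routing",
    Layer.MEMORY: "structured persistent context management",
    Layer.PERCEPTION: "input processing and context framing",
    Layer.REASONING: "multi-step deliberation and planning",
    Layer.ACTION: "tool use and external system interface",
    Layer.LEARNING: "fine-tuning pipeline and feedback loops",
    Layer.METACOGNITION: "system self-evaluation and RMCC integration",
}


def describe_stack() -> list[str]:
    """Return one summary line per layer, ordered from Layer 1 to Layer 7."""
    return [
        f"Layer {layer.value} · {layer.name.title()}: {ROLES[layer]}"
        for layer in Layer
    ]
```

Using an IntEnum keeps the layers comparable by number, so a check such as Layer.MEMORY < Layer.REASONING can express that one layer sits below another.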

Status

DONE · Infrastructure layer operational on research-zen-WST
DONE · Ollama + llama.cpp inference stack configured
DONE · 184-paper research corpus catalogued and indexed
WIP · Memory layer — persistent context architecture
WIP · Agent loop harness integration with Tome
NEXT · Fine-tuning pipeline (QLoRA on RTX 4070)
NEXT · Metacognition layer — RMCC self-evaluation integration
SPEC · Full 7-layer OS abstraction — long-range target