← back to projects
FORGE

Metacognitive Prompt Engineering Harness — structured development, versioning, and evaluation for prompts that need to work reliably, not just once.

ACTIVE · v0.3 METACOGNITIVE

FORGE is ZPAK's metacognitive prompt engineering harness — the craft layer of the intelligence stack. It provides structured development, versioning, and systematic evaluation for prompts that need to work reliably across sessions, models, and contexts — not just once in a demo.

Most prompt engineering is ad hoc: write something, see if it works, adjust, repeat. FORGE imposes engineering discipline on that process: every prompt has a version, an intent, test cases, and a documented evaluation methodology.

  • Structured prompt development — intent, context, constraints, output spec
  • Version control — prompt history with diff-level change tracking
  • Systematic evaluation — test cases, metrics, and regression detection
  • Cross-model portability — prompts tested against multiple inference backends
  • Metacognitive framing — prompts designed to reason about their own output quality
  • Tome integration — FORGE prompts feed directly into the Tome SKILL tier

A metacognitive prompt doesn't just ask for output — it asks the model to reason about whether the output meets the stated criteria, flag uncertainty, and self-correct before delivery. Combined with the RMCC five-pass quality gate, FORGE prompts produce output that is defensible, not just plausible.

This is what separates production-grade prompt engineering from hobbyist prompt hacking.

  • DONEv0.3 active — core development and versioning harness
  • DONETome SKILL tier integration established
  • DONEMERIDIAN-compliant prompt templates library
  • WIPSystematic evaluation framework — test case library
  • WIPCross-model portability testing (Claude / Ollama / DeepInfra)
  • NEXTv0.4 — automated regression detection
  • NEXTFORGE as a ZPAK client deliverable (prompt library handoff)