Three Enforcement Layers
Layer 1 — Hook
Intercept direct edits
settings.json
Blocks any direct modification of SKILL.md and forces the change through the SOP flow.
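The check such a hook performs could look roughly like the sketch below. It assumes the hook host passes the pending tool call as JSON with a `tool_name` field and a `tool_input.file_path` field. That payload shape, the `EDIT_TOOLS` set, and the name `should_block` are illustrative assumptions, not taken from the actual settings.json.

```python
from pathlib import PurePosixPath

# Hypothetical: names of tools that modify files directly.
EDIT_TOOLS = {"Edit", "Write", "MultiEdit"}
PROTECTED_NAME = "SKILL.md"


def should_block(payload: dict) -> bool:
    """Return True when the tool call is a direct edit to a SKILL.md file."""
    if payload.get("tool_name") not in EDIT_TOOLS:
        return False
    file_path = payload.get("tool_input", {}).get("file_path", "")
    # Compare only the final path component, so any directory depth matches.
    return PurePosixPath(file_path).name == PROTECTED_NAME
```

A hook script built on this would read the payload from stdin and signal a block when `should_block` returns True. The exact signalling convention depends on the hook host.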
Layer 2 — Runner
State machine + model routing
create-agent-runner.py
A state machine manages the 12-step flow, enforces model routing, and does not allow steps to be skipped.
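A minimal sketch of a no-skip state machine with model routing, to make the idea concrete. The step labels follow the numbering used in this document. The `Runner` class, the `routing` table, and both exception choices are assumptions for illustration and may not match create-agent-runner.py.

```python
# Step labels as numbered in this document.
STEPS = ["1", "1.5", "1.6", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"]


class StepOrderError(Exception):
    """Raised when a step is submitted or advanced out of order."""


class Runner:
    def __init__(self, routing: dict[str, str]):
        self.routing = routing   # step -> required model (hypothetical table)
        self.index = 0           # position of the currently open step
        self.submitted = False   # whether the open step has accepted output

    @property
    def current(self) -> str:
        return STEPS[self.index]

    def submit(self, step: str, model: str) -> None:
        """Accept output only for the open step and only from the routed model."""
        if step != self.current:
            raise StepOrderError(f"expected step {self.current}, got {step}")
        required = self.routing.get(step)
        if required is not None and model != required:
            raise ValueError(f"step {step} requires model {required}, got {model}")
        self.submitted = True

    def next(self) -> str:
        """Advance one step; refuses if the open step has no output yet."""
        if not self.submitted:
            raise StepOrderError(f"step {self.current} has no submitted output")
        if self.index == len(STEPS) - 1:
            raise StepOrderError("already at the final step")
        self.index += 1
        self.submitted = False
        return self.current
```

Because `next` only ever moves the index forward by one and `submit` only accepts the open step, there is no call sequence that skips a step.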
Layer 3 — Validator
O/G purity + reverse verification
check_ogsm_v2.py
Validates O_PURITY (4/4) and G_PURITY (3/3), and checks reverse-chain integrity.
12-Step Process Flow
Setup
1
Agent vs Skill decision
Determine whether the output is a standalone Agent or a Skill. This decides which branch the rest of the flow takes.
1.5
Define Leader
Define the Team Leader's full O/G/S/M structure, which serves as the alignment anchor for all sub-agents.
1.6
O Alignment
Verify that every sub-agent's O aligns with the Team Leader's O, with no conflicts and no drift.
Define: Core Definitions
2
Define O: Write the Objective
Write the Objective as emotional direction only: no timelines, no tools, no metrics. O expresses only why the work is done and must not mix in G or S content.
3
Define G: Write the Goal
Write the Goal as a measurable state change. It must include concrete numbers and a timeframe, and it must be verifiable by M.
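The three Goal requirements (a number, a timeframe, an M that can verify it) could be scored with a heuristic like the one below. This is only a sketch: both regular expressions and the `g_purity` function are assumptions, and check_ogsm_v2.py may implement its G_PURITY checks quite differently.

```python
import re

# Heuristic patterns; both are illustrative assumptions.
NUMBER = re.compile(r"\d")
TIMEFRAME = re.compile(
    r"\b(day|week|month|quarter|year|sprint)s?\b|\bby\s+\d{4}\b|\bQ[1-4]\b",
    re.IGNORECASE,
)


def g_purity(goal: str, linked_measures: list[str]) -> int:
    """Score a Goal out of 3: has a number, has a timeframe, has a linked M."""
    checks = [
        bool(NUMBER.search(goal)),
        bool(TIMEFRAME.search(goal)),
        len(linked_measures) > 0,
    ]
    return sum(checks)
```

A Goal such as "Cut onboarding time from 10 days to 3 days within one quarter" with one linked Measure satisfies all three checks, while "Make users happier" with no Measure satisfies none.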
4
Define S: Write the Strategy
Write the Strategy as method + rationale + model routing. Each S must state why it was chosen and what resources it commits.
5
Define M: Write the Measure
Write Measures that verify whether each S's resource commitment was honored. Every S must have at least one corresponding M.
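The rule that every S needs at least one M is a simple coverage check. The sketch below assumes strategies are identified by IDs and that each Measure records the single strategy it verifies. This data shape and the function name are hypothetical.

```python
def uncovered_strategies(strategies: list[str], measures: dict[str, str]) -> list[str]:
    """Return the IDs of strategies that no measure verifies.

    `measures` maps a measure ID to the strategy ID it verifies.
    """
    covered = set(measures.values())
    return [s for s in strategies if s not in covered]
```

An empty result means the rule holds. Any returned ID names a Strategy that still needs a Measure.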
6
Anti-patterns
Write at least 5 NOT/INSTEAD anti-patterns, each stating explicitly what not to do and what to do instead.
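A gate for this step could verify the count and that both halves of each entry are present. The sketch assumes each anti-pattern has already been parsed into a `(not_text, instead_text)` pair. The on-disk format is not specified here, so the parsing step is left out.

```python
MIN_ANTI_PATTERNS = 5


def check_anti_patterns(entries: list[tuple[str, str]]) -> list[str]:
    """Return a list of problems; an empty list means the gate passes."""
    problems = []
    if len(entries) < MIN_ANTI_PATTERNS:
        problems.append(
            f"need at least {MIN_ANTI_PATTERNS} anti-patterns, found {len(entries)}"
        )
    for i, (not_text, instead_text) in enumerate(entries, start=1):
        if not not_text.strip():
            problems.append(f"entry {i}: missing NOT statement")
        if not instead_text.strip():
            problems.append(f"entry {i}: missing INSTEAD statement")
    return problems
```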
7
Tier 1 Summary (≤150 words)
Write a Tier 1 summary of at most 150 words with 5 required fields, so a reader can quickly grasp what the agent is for.
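The length and field requirements could be checked as sketched below. The names of the five required fields are not listed here, so the function takes them as a parameter. It also counts whitespace-separated words, which is an assumption about how the 150 limit is measured.

```python
MAX_WORDS = 150


def check_tier1(summary: dict[str, str], required_fields: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the summary passes."""
    problems = [
        f"missing field: {name}"
        for name in required_fields
        if not summary.get(name, "").strip()
    ]
    # Count words across all field values, not per field.
    word_count = sum(len(value.split()) for value in summary.values())
    if word_count > MAX_WORDS:
        problems.append(f"summary has {word_count} words, limit is {MAX_WORDS}")
    return problems
```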
Validate
8
Reverse Verify: Reverse-Chain Verification
Run the reverse chain check M → S → G → O to confirm chain integrity. Every level must trace back to the level above it.
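The reverse walk can be sketched as a chain of lookups, starting from each Measure and climbing to an Objective. The three mapping dictionaries and the function name are assumptions about how the links might be stored, not the validator's actual data model.

```python
def broken_links(
    m_to_s: dict[str, str],
    s_to_g: dict[str, str],
    g_to_o: dict[str, str],
    objectives: set[str],
) -> list[str]:
    """Walk M → S → G → O and report every measure whose chain breaks."""
    problems = []
    for m, s in m_to_s.items():
        g = s_to_g.get(s)
        if g is None:
            problems.append(f"{m}: strategy {s} traces to no goal")
            continue
        o = g_to_o.get(g)
        if o is None or o not in objectives:
            problems.append(f"{m}: goal {g} traces to no objective")
    return problems
```

An empty list means every Measure reaches an Objective. Each message names the level where the chain stops.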
Package: Bundle the Output
9
Save Package: Save the Three Files
Save the full package: SKILL.md (body), CLAUDE.md (rule injection), README (documentation).
10
Dashboard: Generate the Visualization Page
Auto-generate an HTML dashboard that visualizes O/G/S/M and the agent structure.
Evaluate: Blind Evaluation + Refinement Loop
11
3-Model Eval: Blind Evaluation
Three models from three families (Sonnet, Gemini, Codex) evaluate independently and blind, with no knowledge of each other's scores. Take the median score.
↻
median < 8.0 → enter the Step 12 revision loop
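The scoring rule for one evaluation round can be sketched with the standard library's `statistics.median`. The `evaluate` function and the dictionary of scores keyed by evaluator name are illustrative assumptions. The 8.0 threshold comes from the flow described here.

```python
from statistics import median

PASS_THRESHOLD = 8.0


def evaluate(scores: dict[str, float]) -> tuple[float, bool]:
    """Return (median score, passed) for one blind-evaluation round."""
    if len(scores) != 3:
        raise ValueError("expected exactly three evaluator scores")
    mid = median(scores.values())
    return mid, mid >= PASS_THRESHOLD
```

Taking the median of three means a single unusually high or low evaluator cannot decide the outcome on its own.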
12
Loop: Revision and Refinement Loop
Apply the blind-evaluation feedback → Opus refines O/G/S/M → return to Step 11 for re-evaluation, until median ≥ 8.0 or the score plateaus.
↻
PASS → SOP complete; the agent is formally written to SKILL.md
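The evaluate-and-refine cycle could be driven by a loop like the sketch below. How a plateau is detected is not defined here, so the sketch assumes it means the median improved by less than `min_gain` over the previous round. That parameter, the `max_rounds` safety cap, and the two callbacks are all hypothetical.

```python
from typing import Callable


def refine_until_pass(
    run_eval: Callable[[], float],   # returns the median for one Step 11 round
    refine: Callable[[], None],      # applies feedback and refines O/G/S/M
    threshold: float = 8.0,
    min_gain: float = 0.1,           # assumption: smaller gains count as a plateau
    max_rounds: int = 10,            # assumption: safety cap on evaluation rounds
) -> tuple[str, float]:
    """Return (outcome, last median), outcome in {"pass", "plateau", "max_rounds"}."""
    previous = None
    for round_no in range(1, max_rounds + 1):
        score = run_eval()
        if score >= threshold:
            return "pass", score
        if previous is not None and score - previous < min_gain:
            return "plateau", score
        if round_no == max_rounds:
            return "max_rounds", score
        refine()
        previous = score
    raise ValueError("max_rounds must be at least 1")
```

With medians of 6.0, 7.0, and 8.2 across successive rounds, the loop refines twice and then returns a pass.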
Runner Commands (CLI)
create-agent-runner.py
State machine CLI — manages step progression, model routing, and gate enforcement
# Initialize a new agent creation run
python3 tools/create-agent-runner.py init --name "team" --fleet "standalone"
# Advance to the next step (gate must pass first)
python3 tools/create-agent-runner.py next
# Submit output for a specific step
python3 tools/create-agent-runner.py submit <step> --file output.md --model <model>
# Check current status and gate results
python3 tools/create-agent-runner.py status
# Resume an interrupted run from last checkpoint
python3 tools/create-agent-runner.py resume
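For orientation, the command surface above could be declared with `argparse` as in the sketch below. It is reconstructed only from the five commands shown. Which options are required, and everything about how the real create-agent-runner.py parses its arguments, is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the five runner subcommands shown above."""
    parser = argparse.ArgumentParser(prog="create-agent-runner.py")
    sub = parser.add_subparsers(dest="command", required=True)

    init = sub.add_parser("init", help="initialize a new agent creation run")
    init.add_argument("--name", required=True)
    init.add_argument("--fleet", required=True)

    sub.add_parser("next", help="advance to the next step")

    submit = sub.add_parser("submit", help="submit output for a specific step")
    submit.add_argument("step")
    submit.add_argument("--file", required=True)
    submit.add_argument("--model", required=True)

    sub.add_parser("status", help="show current status and gate results")
    sub.add_parser("resume", help="resume from the last checkpoint")
    return parser
```

Calling `build_parser().parse_args(["submit", "2", "--file", "output.md", "--model", "opus"])` yields a namespace whose `command` is `"submit"` and whose `step` is `"2"`.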