/create-agent — 12-Step Enforced SOP

Three Enforcement Layers

Layer 1 — Hook

Intercept Direct Edits

settings.json

Intercepts any direct modification to SKILL.md and forces the change through the SOP flow.
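The interception rule can be sketched as a small guard function. This is a minimal sketch, not the project's actual hook: it assumes the hook receives a JSON payload shaped like {"tool_input": {"file_path": "..."}}, and both that payload shape and the name should_block are assumptions made for illustration.

```python
from pathlib import PurePath

PROTECTED_NAME = "SKILL.md"  # the file the SOP guards, per the text above


def should_block(payload: dict) -> bool:
    """Return True when the (assumed) hook payload targets a SKILL.md file."""
    # Assumed payload shape: {"tool_input": {"file_path": "..."}}
    file_path = payload.get("tool_input", {}).get("file_path", "")
    # Compare only the final path component so nested skill folders match too.
    return PurePath(file_path).name == PROTECTED_NAME


def block_message(payload: dict) -> str:
    """Build the message a hook could print when it rejects an edit."""
    target = payload.get("tool_input", {}).get("file_path", "")
    return f"Direct edit to {target} is blocked; run /create-agent instead."
```

A hook script wired into settings.json could call should_block on the parsed payload and exit with a non-zero status to reject the edit; which status code signals rejection depends on the hook system in use.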

Layer 2 — Runner

State Machine + Model Routing

create-agent-runner.py

A state machine manages the 12-step flow, enforces model routing, and does not allow steps to be skipped.
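A minimal sketch of such a state machine, assuming each step is tied to exactly one model and a step only advances once its gate has passed. The step names, the Run class, and its methods are hypothetical; the model assigned to each step follows the flow described in this document, but this is not the code of create-agent-runner.py.

```python
from dataclasses import dataclass, field

# (step name, required model) in SOP order. Step names are illustrative.
ROUTING = [
    ("agent_vs_skill", "Gemini"),
    ("define_leader", "Opus"),
    ("o_alignment", "Opus"),
    ("define_o", "Opus"),
    ("define_g", "Opus"),
    ("define_s", "Gemini"),
    ("define_m", "Codex"),
    ("anti_patterns", "Codex"),
    ("tier1_summary", "Gemini"),
    ("reverse_verify", "Script"),
    ("save_package", "Codex"),
    ("dashboard", "Script"),
    ("three_model_eval", "Multi"),
    ("loop", "Opus"),
]


@dataclass
class Run:
    index: int = 0                              # position of the current step
    passed: set = field(default_factory=set)    # steps whose gate has passed

    @property
    def current(self) -> str:
        return ROUTING[self.index][0]

    def submit(self, step: str, model: str, gate_ok: bool) -> None:
        """Accept output only for the current step and its routed model."""
        expected_step, expected_model = ROUTING[self.index]
        if step != expected_step:
            raise ValueError(f"cannot submit {step!r}; current step is {expected_step!r}")
        if model != expected_model:
            raise ValueError(f"{step!r} must be produced by {expected_model}, not {model}")
        if gate_ok:
            self.passed.add(step)

    def next(self) -> str:
        """Advance one step; refuse if the current gate has not passed."""
        if self.current not in self.passed:
            raise RuntimeError(f"gate for {self.current!r} has not passed")
        if self.index + 1 >= len(ROUTING):
            raise RuntimeError("already at the final step")
        self.index += 1
        return self.current
```

Keeping the routing in a single ordered table means "no skipping" and "enforced model routing" are both checked against the same source of truth.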

Layer 3 — Validator

O/G Purity + Reverse Verification

check_ogsm_v2.py

Validates O_PURITY 4/4 and G_PURITY 3/3, and checks reverse chain integrity.

Full 12-Step Process Flow

Setup: Initial Setup

Agent vs Skill Decision

Gemini

Determine whether the output is a standalone Agent or a Skill. The decision sets which branch the rest of the flow follows.

Gate Condition

Output contains "Agent" or "Skill"

Validator

— (manual check)

1.5

Define Leader: Define the Team Leader

Opus

Define the Team Leader's full O/G/S/M structure, which becomes the alignment baseline for all sub-agents.

Gate Condition

O / G / S / M sections all present

Enforcement

ogsm-framework is force-loaded

1.6

O Alignment: O Alignment Verification

Opus

Verify that every sub-agent's O aligns with the Team Leader's O, with no conflicts and no drift.

Gate Condition

Alignment table complete, no conflicts

Validator

— (manual review)

Define: Core Definitions

Define O: Write the Objective

Opus

Write the Objective: emotional direction only, with no timelines, no tools, and no metrics. O expresses only why the work is done and must not mix in G or S content.

Gate Condition

O_PURITY 4/4 PASS

Validator

check_ogsm_v2.py O_PURITY

Define G: Write the Goal

Opus

Write the Goal: a quantifiable state change. It must include concrete numbers and a timeframe, and it must be verifiable by M.

Gate Condition

G_PURITY 3/3 PASS

Validator

check_ogsm_v2.py G_PURITY
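One way to read the 3/3 score is as three independent checks: the Goal contains a number, names a timeframe, and is referenced by at least one Measure. The sketch below assumes that reading. The function names, the timeframe vocabulary, and the idea of passing in the set of goal ids that Measures cover are all hypothetical and are not taken from check_ogsm_v2.py.

```python
import re

NUMBER = re.compile(r"\d")
# Hypothetical timeframe vocabulary; the real validator's rules are not shown here.
TIMEFRAME = re.compile(
    r"\b(day|week|month|quarter|year)s?\b|\bby\s+\d{4}\b|\bQ[1-4]\b",
    re.IGNORECASE,
)


def g_purity(goal_id: str, goal_text: str, measured_goal_ids: set) -> dict:
    """Return the three assumed purity checks for one Goal as booleans."""
    return {
        "has_number": bool(NUMBER.search(goal_text)),
        "has_timeframe": bool(TIMEFRAME.search(goal_text)),
        "m_verifiable": goal_id in measured_goal_ids,
    }


def g_purity_score(checks: dict) -> str:
    """Format the result the way the gate reports it, e.g. '3/3'."""
    return f"{sum(checks.values())}/{len(checks)}"
```

Returning the individual checks rather than a single boolean makes it possible to tell the author which of the three conditions failed.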

Define S: Write the Strategy

Gemini

Write the Strategy: method + rationale + model routing. Each S must state its why and its resource commitment.

Gate Condition

Includes why + resources + model routing

Validator

check_ogsm_v2.py

Define M: Write the Measure

Codex

Write the Measure: verify that each S's resource commitment was actually honored. Every S must have at least one corresponding M.

Gate Condition

Every S has a corresponding M

Validator

check_ogsm_v2.py
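The coverage rule (every S has at least one M) reduces to a set comparison. This is a minimal sketch under the assumption that each Measure declares which Strategy ids it verifies; the function name and the data shape are hypothetical.

```python
def uncovered_strategies(strategies: list, measures: dict) -> list:
    """Return the Strategy ids that no Measure verifies.

    `measures` maps a Measure id to the list of Strategy ids it covers
    (an assumed data shape, chosen for illustration).
    """
    covered = {s for targets in measures.values() for s in targets}
    # Preserve the original Strategy order so the report is stable.
    return [s for s in strategies if s not in covered]
```

The gate passes when the returned list is empty, for example uncovered_strategies(["S1", "S2"], {"M1": ["S1"], "M2": ["S2"]}).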

Anti-patterns: Anti-pattern List

Codex

Write ≥5 NOT/INSTEAD anti-patterns that state explicitly what not to do and what to do instead.

Gate Condition

≥5 anti-patterns

Validator

count check (≥5)
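A sketch of the count check, assuming each anti-pattern sits on a single line that contains NOT followed later by INSTEAD in upper case. That line format is an assumption; the document only states the NOT/INSTEAD pairing and the minimum of five.

```python
import re

# Assumed line format: "... NOT ... INSTEAD ..." on a single line.
ANTI_PATTERN = re.compile(r"\bNOT\b.+\bINSTEAD\b")


def count_anti_patterns(text: str) -> int:
    """Count lines that contain a NOT/INSTEAD pair."""
    return sum(1 for line in text.splitlines() if ANTI_PATTERN.search(line))


def anti_pattern_gate(text: str, minimum: int = 5) -> bool:
    """Pass when the text holds at least `minimum` anti-patterns."""
    return count_anti_patterns(text) >= minimum
```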

Tier 1 Summary: 150-Word Summary

Gemini

Write a Tier 1 summary of ≤150 words with 5 required fields, so a reader can quickly understand what the agent is for.

Gate Condition

≤150 words + all 5 fields present

Validator

word count check
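The gate combines a length limit with a field-presence check. The sketch below assumes whitespace-separated words and fields written as "Name:" labels; the document does not list the five field names, so they are passed in by the caller and any names used with this function are hypothetical.

```python
def tier1_gate(summary: str, required_fields: list, max_words: int = 150) -> bool:
    """Pass when the summary is short enough and names every required field."""
    word_count = len(summary.split())
    # Assumed convention: each field appears as a "Name:" label in the text.
    has_all_fields = all(f"{name}:" in summary for name in required_fields)
    return word_count <= max_words and has_all_fields
```

If the limit is meant as 150 Chinese characters rather than 150 words, len(summary) would replace the split-based count.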

Validate

Reverse Verify: Reverse Chain Verification

Script

Run the reverse chain check for M → S → G → O integrity. Every level must trace back to the level above it.

Gate Condition

script exit 0

Validator

reverse_verify_ogsm.py
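The reverse chain check can be sketched as walking parent links from each Measure up to the Objective. This sketch assumes ids are prefixed with their layer letter (M1, S1, G1, O) and that each item records a single parent; both conventions, and the function names, are hypothetical and not taken from reverse_verify_ogsm.py.

```python
LAYERS = ["M", "S", "G", "O"]  # order of the reverse chain


def traces_to_o(measure_id: str, parents: dict) -> bool:
    """Return True when the Measure links M -> S -> G -> O without a gap."""
    node = measure_id
    for layer, next_layer in zip(LAYERS, LAYERS[1:]):
        if not node.startswith(layer):
            return False
        parent = parents.get(node)
        # A missing link, or a link that skips a layer, breaks the chain.
        if parent is None or not parent.startswith(next_layer):
            return False
        node = parent
    return True


def reverse_verify_exit_code(measure_ids: list, parents: dict) -> int:
    """Return 0 when every Measure traces to O, else 1 (mirrors 'script exit 0')."""
    return 0 if all(traces_to_o(m, parents) for m in measure_ids) else 1
```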

Package: Packaged Output

Save Package: Save the Three-File Set

Codex

Save the full package: SKILL.md (body), CLAUDE.md (rule injection), and README (documentation).

Gate Condition

All 3 files exist

Validator

file existence check
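A minimal sketch of the file existence check using pathlib. The document names the third file only as README, so the README.md filename below is an assumption, as is the function name.

```python
from pathlib import Path

# README.md is an assumed filename; the document only says "README".
REQUIRED_FILES = ("SKILL.md", "CLAUDE.md", "README.md")


def missing_files(package_dir, required=REQUIRED_FILES) -> list:
    """Return the required package files that are absent from package_dir."""
    root = Path(package_dir)
    return [name for name in required if not (root / name).is_file()]
```

The gate passes when missing_files returns an empty list.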

Dashboard: Generate the Visualization Page

Script

Auto-generate an HTML dashboard that visualizes O/G/S/M and the agent structure.

Gate Condition

script exit 0

Validator

skill-to-dashboard.py

Evaluate: Blind Evaluation + Optimization Loop

3-Model Eval: Blind Evaluation

Multi

Three models evaluate independently and blind: Sonnet + Gemini + Codex (three model families). No evaluator sees another's score, and the median is taken.

Gate Condition

median ≥ 8.0

Evaluators

Sonnet · Gemini · Codex (3 independent evals)

↻ median < 8.0 → enter the Step 12 correction loop
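The gate itself is a median over three scores compared against 8.0. A minimal sketch, with a hypothetical function name and score dictionary:

```python
from statistics import median

PASS_THRESHOLD = 8.0  # gate value stated in this step


def eval_gate(scores: dict) -> tuple:
    """Return (median score, passed) for exactly three independent evaluations."""
    if len(scores) != 3:
        raise ValueError("expected one score from each of the three evaluators")
    mid = median(scores.values())
    return mid, mid >= PASS_THRESHOLD
```

With three evaluators the median is simply the middle score, so one unusually high or low evaluation cannot decide the gate on its own.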

Loop: Correction and Optimization Loop

Opus

Apply the blind-eval feedback → Opus refines O/G/S/M → return to Step 11 for re-evaluation, until median ≥ 8.0 or a plateau is reached.

Exit Gate

median ≥ 8.0 or plateau determined

Loop Back

FAIL → Step 11 re-eval

↻ PASS → SOP complete, the agent is formally written to SKILL.md
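The exit decision can be sketched as a function over the history of median scores. The document does not define how a plateau is judged, so the rule used here (no gain of at least min_gain over the last patience rounds) is an assumption, as are the parameter names and default values.

```python
def loop_decision(medians: list, threshold: float = 8.0,
                  patience: int = 2, min_gain: float = 0.1) -> str:
    """Return 'pass', 'plateau', or 'continue' from the median history.

    Assumed plateau rule: the best of the last `patience` rounds did not
    beat the best earlier round by at least `min_gain`.
    """
    if not medians:
        raise ValueError("at least one evaluation round is required")
    if medians[-1] >= threshold:
        return "pass"
    if len(medians) > patience:
        best_before = max(medians[:-patience])
        if max(medians[-patience:]) < best_before + min_gain:
            return "plateau"
    return "continue"
```

Both "pass" and "plateau" end the loop; only "continue" sends the run back for another evaluation round.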

CLI Runner Commands

create-agent-runner.py

State machine CLI — manages step progression, model routing, and gate enforcement

# Initialize a new agent creation run
python3 tools/create-agent-runner.py init --name "team" --fleet "standalone"

# Advance to the next step (gate must pass first)
python3 tools/create-agent-runner.py next

# Submit output for a specific step
python3 tools/create-agent-runner.py submit <step> --file output.md --model <model>

# Check current status and gate results
python3 tools/create-agent-runner.py status

# Resume an interrupted run from last checkpoint
python3 tools/create-agent-runner.py resume