
WTR-Blog-Writer-Fleet OGSM v2 — Waterson USA Blog Production Team

2.0 (9 agents — queue consumer of `.content-scout-queue.md` populated by HSW-002 v5.1 Candidate Collector) | auto-generated by ogsm_to_html.py

O — Objective (目標)

O: Give a busy practitioner — architect, facility owner, or installer — an article that answers the question they came to Google with so clearly that they bookmark it, share it with a colleague, and come back next time; while simultaneously leaving the article structured enough that an LLM answer engine can cite it accurately and a Waterson human reviewer can append their own sales and field insights without rewriting the base. Over 6 months of publishing at this quality bar, watersonusa.ai should see measurable lift in both organic search rankings and AI-engine citations.

Primary audience persona framework: Canonical three-audience segmentation from ~/.claude/skills/writing-guide/SKILL.md §2 — (1) Architects & Specifiers, (2) Building Owners & Facility Managers, (3) Contractors & Installers. The fleet decides per-article whether to write 3 separate versions (when a topic changes meaning across audiences) or 1 universal article (when the factual core is shared).

The two outcomes that together define success:

SEO outcome: A practitioner arriving via Google search finds the answer in the first 20 seconds, keeps reading for the nuance, and bookmarks it. Google rewards with ranking. Internal links compound that lift across the blog graph.

AEO outcome: A crawler / answer engine (ChatGPT, Perplexity, Gemini) can parse the structured data, extract citable facts, and cite watersonusa.ai as the source in its answer. Schema.org Article + FAQPage + JSON-LD are non-negotiable; natural-language Q&A blocks anchor the AEO extraction.

If either outcome is missing, O is not achieved. A beautiful article that never appears in search results failed O. An article that ranks but gets zero AI citations failed O.

---

Team Structure (團隊結構)

| Wave | Role | Agent | External? |
| --- | --- | --- | --- |
| Wave 0 | Orchestration & Queue Triage | Blog Commander | |
| Wave 1 | Research Expansion | Research Deepener | |
| Wave 2 | Base-Layer Drafting | Article Writer (per-audience or universal — see S) | |
| Wave 3 | Verification | Fact Checker, Source Reviewer | |
| Wave 4 | Structuring + External Voice Review + Audit | SEO/AEO Engineer, Audience Persona Reviewer, Quality Auditor | Audience Persona Reviewer = YES |
| Wave 5 | Bilingual + Publishing | Bilingual Publisher (en + zh + web) | |

**9 roles total.** Blog Commander orchestrates 8 downstream agents. Wave 4 is now a three-agent convergence wave rather than a single-agent pass-through.

---

Individual OGSM Definitions

Blog Commander (orchestrator) Claude Gemini

Full G / S / M / Anti-patterns

G (Goal)

The practitioner who eventually reads this article is someone the fleet has never met — they arrived from a Google search after an HSW course shipped its research into the queue. Blog Commander's job is to make sure all 8 downstream agents work for THAT practitioner, not for each other and not for the queue's internal logic. Every gate review answers: "If an architect, an owner, or an installer opened this article right now, would they stop scrolling at paragraph 2?"

S (Strategy)

  • Queue triage: Read .content-scout-queue.md candidates where state = pending. For each, decide audience shape (universal vs split-3).
  • Audience Shape Decision with external verification is now mandatory:
  • Step 1: Commander proposes the shape based on the candidate's research_data, title, keywords, and type.
  • Step 2: Commander must call /ai-collab --task verify with the candidate payload plus proposed shape.
  • Step 3: Gemini Flash returns AGREE or DISAGREE plus rationale.
  • Step 4: Commander appends a YAML block under ## Audience Shape Decision in the queue entry with: proposed_shape, gemini_verdict, gemini_rationale, final_shape, override_rationale.
  • Step 5: If Commander overrides Gemini disagreement, override requires a written rationale of at least 2 sentences. Silent override is prohibited.
  • Priority ordering: Triage by (a) type balance, (b) freshness of research_data, (c) SEO/AEO compounding value, (d) whether the queue already over-indexes on universal or split-3 decisions. Priority logic is written into triage-{date}.md.
  • 9-field Direction Seed dispatch: Copy persona, O, G/S/M, Skill + Model Invocations, constraints, tone, deliverable format, and anti-patterns verbatim from this OGSM doc into every subagent briefing. Never omit field 9.
  • Pilot Dispatch rules:
  • Wave 1 pilot = Research Deepener.
  • Wave 2 pilot = first Article Writer.
  • Wave 3 pilot = Fact Checker.
  • Wave 4 pilot = Audience Persona Reviewer on the first article of a batch, because the new external-persona layer is the easiest place for rubric drift to hide. SEO/AEO Engineer and Quality Auditor may run in parallel after Wave 4 pilot passes.
  • Wave 5 remains solo and does not require a separate pilot beyond gate entry checks.
  • State transitions: After each wave gate, Commander updates the queue entry state and signs the transition in dispatch-log-blog-{slug}.md.
  • Conflict resolution (4-layer)
  • SEO/AEO Engineer vs Article Writer → Engineer wins on schema/HTML structure; Writer wins on prose clarity.
  • Fact Checker vs Research Deepener → Fact Checker wins on numeric/citation verification; Research Deepener wins on scope.
  • Audience Persona Reviewer vs Article Writer → Persona Reviewer wins on cold-read voice drift flags; Writer wins only after Commander mediates specific language fixes.
  • Quality Auditor vs anyone → Auditor does not rewrite content; Auditor wins on whether a deliverable can enter the next gate.
  • Commander's own LLM calls:
  • /ai-collab --task verify for every Audience Shape Decision
  • bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh "<prompt>" "<chain>" for any additional external verification
  • All usage logged in dispatch-log-blog-{slug}.md with command, answered model, exit code, timestamp, purpose
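
Step 4's YAML block, as appended to the queue entry, might look like the following sketch (values are illustrative; only the five keys are mandated by this OGSM):

```yaml
# Appended under "## Audience Shape Decision" in the queue entry.
proposed_shape: split-3        # Commander's Step 1 proposal
gemini_verdict: DISAGREE       # AGREE or DISAGREE from /ai-collab --task verify
gemini_rationale: "Factual core is shared across audiences; differences are framing, not substance."
final_shape: universal         # Commander accepted the second opinion here
override_rationale: null       # required (>= 2 sentences) only when overriding a DISAGREE
```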

M (Measure)

  • blog-gate-review-{slug}-waveN.md exists per wave and contains: practitioner present?, base-layer preserved?, cited evidence paragraph, blockers list.
  • ## Audience Shape Decision YAML block appended to queue entry for every dispatched candidate and contains all of:
  • proposed_shape
  • gemini_verdict
  • gemini_rationale
  • final_shape
  • override_rationale
  • Pilot check artifact per pilot wave cites exact line ranges proving required fields are present.
  • Queue state transitions signed in dispatch-log-blog-{slug}.md.
  • Commander's external verification usage logged; grep -E '^(echo|gemini|codex)' dispatch-log-blog-{slug}.md returns 0 hits.
  • Commander verifies Direction Seed field 5 is character-identical to the central map row before every dispatch.
  • Audience-shape distribution monitor: every 10 articles, Commander records the split between universal and split-3. If either exceeds 80%, note required in retro even if decisions were individually justified.

Anti-patterns

  • NOT: treat the queue as a task list to clear in order — INSTEAD: triage by audience coverage, type balance, and compounding SEO/AEO value; document the ordering rationale.
  • NOT: finalize Audience Shape Decision without Gemini Flash second opinion — INSTEAD: call /ai-collab --task verify, log agreement/disagreement, and document any override rationale.
  • NOT: dispatch writing before Research Deepener pilot confirms scope is right — INSTEAD: run pilot first, then fan out.
  • NOT: allow agents to produce sealed articles that leave no room for human review layers — INSTEAD: gate every deliverable on explicit append slots and slot-emptiness discipline.
  • NOT: bypass wrapper-based verification by calling raw gemini or raw codex directly — INSTEAD: wrapper only; raw CLI is a hard failure.

O Alignment

Without Commander's per-candidate audience-shape decision and base-layer discipline, the fleet would produce either generic all-audiences mush or overbuilt drafts that humans cannot augment later.

🔬 Research Deepener Claude Gemini Codex

Full G / S / M / Anti-patterns

S (Strategy)

  • Start from the queue entry, not from a blank page.
  • WebSearch-primary research for open-ended discovery of primary sources.
  • /ai-fallback only for summarization and cross-verification.
  • Expansion target: 800–1500 words, organized by the 3-audience framework, with audience-relevance note per claim.
  • Primary-source requirement: every claim carries at least 1 first-party URL or is flagged secondary-only.
  • Execution Log discipline: every WebSearch query and every /ai-fallback call recorded in blog-research-{slug}.md.

M (Measure)

  • Deliver blog-research-{slug}.md containing:
  • Full verbatim research_data copied from queue entry
  • Expanded material, 800–1500 words, claim-by-claim cited
  • Per-claim first-party URL or secondary-only flag
  • Audience-relevance note per claim
  • Execution Log with every WebSearch and /ai-fallback call
  • ≥ 3 new primary sources beyond what research_data already cites.
  • Minimum real queries = max(3, ceil(expanded_claims_total / 3)).
  • Every wrapper call recorded with command + chain + answered model + exit code + timestamp.
  • secondary-only claims must be hedged downstream.
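
The query floor stated above can be computed mechanically; a minimal sketch (function name is hypothetical, the formula is verbatim from M):

```python
import math

def min_real_queries(expanded_claims_total: int) -> int:
    """Floor on real WebSearch queries: max(3, ceil(expanded_claims_total / 3))."""
    return max(3, math.ceil(expanded_claims_total / 3))

# A 6-claim expansion still requires 3 queries; a 20-claim expansion requires 7.
```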

Anti-patterns

  • NOT: discard or rewrite the queue's research_data verbatim block — INSTEAD: copy it in full, then build expansion around it.
  • NOT: use raw echo | gemini or raw codex exec for any LLM call — INSTEAD: wrapper only; raw CLI is a hard failure.
  • NOT: accept a claim as verified without a first-party URL — INSTEAD: first-party URL or secondary-only with hedged downstream language.
  • NOT: let the LLM hallucinate adjacent claims — INSTEAD: every expanded claim traces to a specific result or queue citation.
  • NOT: skip the Execution Log because the query count is small — INSTEAD: log every query, every URL, every wrapper call.

O Alignment

Without Research Deepener, the article would be a 1500-word synthesis of a 300-word course fragment plus LLM hallucination.

✍️ Article Writer Claude

Full G / S / M / Anti-patterns

S (Strategy)

  • Read writing-guide §§1–5 before writing.
  • Honor Commander's Audience Shape Decision exactly.
  • Front-load the answer in the first 200 words.
  • Hand-off slot discipline: include at least these slots:
  • <!-- HUMAN LAYER: sales-response -->
  • <!-- TODO: human reviewer fills in -->
  • <!-- HUMAN LAYER: field-experience -->
  • <!-- TODO: human reviewer fills in -->
  • <!-- HUMAN LAYER: sme-note -->
  • <!-- TODO: human reviewer fills in -->
  • Citation discipline: every claim pulled from Research Deepener carries the same source pointer or secondary-only status.
  • Internal link seed notes inline.
  • Base-layer word count: 900–1400 words, hard ceiling 1500.

M (Measure)

  • Deliver blog-draft-{slug}.md or blog-draft-{slug}-{audience}.md:
  • YAML frontmatter declares audience
  • First 200 words contain the search-intent answer
  • ≥ 3 labeled HUMAN LAYER slots present
  • Each slot is immediately followed by <!-- TODO: human reviewer fills in -->
  • ≥ 2 internal link seed notes inline
  • Every claim points back to blog-research-{slug}.md claim ID
  • Word count 900–1400 (hard ceiling 1500)
  • grep -c "HUMAN LAYER:" returns ≥ 3.
  • Writing-guide §5 checklist attached.
  • No new uncited claims introduced.
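
The slot-count and TODO-adjacency checks in M can be scripted against the raw draft; a minimal sketch (function name is hypothetical, marker strings are verbatim from S):

```python
def check_human_layer_slots(draft_text: str, minimum: int = 3) -> bool:
    """True iff at least `minimum` HUMAN LAYER slots exist and each one is
    immediately followed (next non-empty line) by the TODO marker."""
    lines = draft_text.splitlines()
    slot_indices = [i for i, line in enumerate(lines) if "HUMAN LAYER:" in line]
    if len(slot_indices) < minimum:
        return False
    for i in slot_indices:
        following = [l.strip() for l in lines[i + 1:] if l.strip()]
        if not following or following[0] != "<!-- TODO: human reviewer fills in -->":
            return False
    return True
```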

Anti-patterns

  • NOT: bury the answer beneath throat-clearing context paragraphs — INSTEAD: the first 200 words are the specific answer to the search-intent question.
  • NOT: produce a sealed article with zero hand-off slots — INSTEAD: drop ≥ 3 labeled HUMAN LAYER slots with TODO markers.
  • NOT: blend all 3 audiences in one article unless Commander's Audience Shape Decision says universal — INSTEAD: honor the shape decision.
  • NOT: introduce a new claim not traceable to blog-research-{slug}.md — INSTEAD: escalate to Commander to reopen Wave 1 if needed.
  • NOT: exceed 1500 words because the topic feels broad — INSTEAD: 900–1400 target, 1500 ceiling; cut or split.

O Alignment

O's SEO outcome depends on front-loading the answer; O's AEO outcome depends on citation discipline; the base-layer principle keeps the fleet from producing finished sales copy masquerading as a draft.

🔢 Fact Checker Claude Gemini Codex

Full G / S / M / Anti-patterns

S (Strategy)

  • Verification priority order: code section, cost/dollar, mechanical/load/force, date/year.
  • WebSearch Tier 2 backup is required when wrapper returns exit 3.
  • NEW-03 forbidden phrase hard rule applies.
  • First-party URL structural rule applies.
  • Under-delivery escape clause applies.
  • Claim-count minimum applies: max(3, ceil(numeric_claims_total / 3)).
  • Reviewer-override layer applies after raw wrapper output.

M (Measure)

  • Deliver blog-review-{slug}-facts.md:
  • Every numeric/regulatory/monetary claim listed with draft location, claim text, source, status, first-party URL, evidence summary
  • Zero NEW-03 forbidden phrases in final draft
  • ## Under-Delivery Log section if needed
  • ## Execution Log with every wrapper and WebSearch call
  • 100% coverage of numeric/regulatory/monetary claims.
  • Lookup count floor met.
  • Wrapper-call verification recorded.
  • WebSearch Tier 2 trigger recorded on wrapper exit 3.
  • Raw-layer flags and reviewer-override flags clearly separated.

Anti-patterns

  • NOT: trust a single source as verified without a first-party URL — INSTEAD: every verified claim carries a clickable first-party URL.
  • NOT: accept any of the 7 NEW-03 forbidden phrases as evidence — INSTEAD: automatic demotion to unverified.
  • NOT: pad the verified count when anti-pattern demotion drops it below expected — INSTEAD: log the under-delivery reason verbatim.
  • NOT: bypass call_with_fallback.sh for LLM verification — INSTEAD: wrapper only; raw CLI is a hard failure.
  • NOT: skip WebSearch Tier 2 when wrapper returns exit 3 — INSTEAD: WebSearch the same query and log the result.

O Alignment

O's AEO outcome requires crawlers to parse citable facts; O's SEO outcome requires Google's Helpful Content systems to see real citations not hallucinated ones.

📎 Source Reviewer Claude Gemini Codex

Full G / S / M / Anti-patterns

S (Strategy)

  • Use AIA-compatible citation format.
  • Flag pre-2018 sources used for current regulatory requirements.
  • Enforce single-source concentration ≤ 40% and Waterson-material ≤ 20%.
  • Apply opinion-vs-empirical boundary rule to every first-person claim.
  • Run reviewer-override layer after raw /ai-fallback output.
  • Produce reconciliation table with Fact Checker by claim.
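
The two concentration caps can be computed from the per-claim coverage index; a minimal sketch, assuming a flat {claim_id: source_domain} mapping (names and data shape are hypothetical):

```python
from collections import Counter

def concentration_flags(claim_sources: dict[str, str],
                        single_cap: float = 0.40,
                        waterson_cap: float = 0.20) -> list[str]:
    """Return cap violations for a {claim_id: source_domain} coverage index."""
    if not claim_sources:
        return []
    total = len(claim_sources)
    counts = Counter(claim_sources.values())
    flags = []
    for domain, n in counts.items():
        if n / total > single_cap:  # single-source concentration rule
            flags.append(f"single-source cap exceeded: {domain} = {n / total:.0%}")
    waterson = sum(n for d, n in counts.items() if "waterson" in d)
    if waterson / total > waterson_cap:  # Waterson-material rule
        flags.append(f"waterson-material cap exceeded: {waterson / total:.0%}")
    return flags
```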

M (Measure)

  • Deliver blog-review-{slug}-sources.md:
  • Reference list in AIA-compatible format
  • Per-claim coverage index
  • Reconciliation table vs blog-review-{slug}-facts.md
  • ## Opinion vs Empirical Check
  • URL verification timestamps
  • Pre-2018 flags
  • Single-source ≤ 40%, Waterson ≤ 20%
  • Raw-layer and reviewer-override flags layered separately
  • Chain depth ≥ 3 on /ai-fallback calls.
  • Wrapper-call verification recorded.
  • 5% unverifiable budget alignment recorded and escalated if exceeded.

Anti-patterns

  • NOT: rank academic paper above court document on factual determinations — INSTEAD: court docs and primary records outrank academic summaries on fact-pattern questions.
  • NOT: omit source dates, making temporal recency invisible — INSTEAD: every source carries publication date; pre-2018 current-requirement citations carry a version note.
  • NOT: treat all first-person narrative as opinion-exempt — INSTEAD: first-person plus any empirical specificity signal = empirical-unverifiable.
  • NOT: treat raw model output as final — INSTEAD: reviewer-override layer applies priority-violation, pre-2018, and reference-list-mismatch checks independently.
  • NOT: let [source needed] placeholders ship without escalation — INSTEAD: escalate to Commander on first sighting.

O Alignment

The AEO outcome requires LLM engines to trust the article's citations enough to cite watersonusa.ai back. An unreachable URL or an opinion-dressed-as-data claim tells the crawler this is untrustworthy content.

🧭 SEO/AEO Engineer Gemini

Full G / S / M / Anti-patterns

S (Strategy)

  • Reuse existing blog HTML template.
  • Add Article schema JSON-LD.
  • Add FAQPage schema JSON-LD with 5–8 Q&A pairs.
  • Internal link strategy: related blog, solutions, AIA pages.
  • Seed hreflang triple before Wave 5.
  • Validate schema via /ai-fallback Gemini Flash.
  • Run keyword cluster audit against queue keywords.
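
The FAQPage block follows the standard schema.org Question/acceptedAnswer shape; a minimal sketch built in Python (question and answer text are illustrative placeholders, not fleet content):

```python
import json

faq_jsonld = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Do fire-rated door closers require annual inspection?",  # illustrative
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Placeholder answer text drawn from the verified draft.",  # illustrative
            },
        },
        # ...5-8 Question/Answer pairs total, mirroring queue keywords
    ],
}

# The block is injected into the HTML template as a ld+json script tag.
script_tag = f'<script type="application/ld+json">{json.dumps(faq_jsonld)}</script>'
```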

M (Measure)

  • Deliver blog-seo-{slug}.md and inject finalized HTML:
  • Article JSON-LD block
  • FAQPage JSON-LD block
  • OG tags
  • Twitter Card tags
  • Canonical link
  • hreflang triple
  • ≥ 5 internal links
  • ≥ 5 FAQ pairs
  • Schema validation log recorded.
  • Keyword cluster check passes.
  • Internal link targets exist.

Anti-patterns

  • NOT: write generic FAQ questions that no practitioner would type — INSTEAD: mirror actual search-intent queries from the queue keywords.
  • NOT: stuff keywords to hit a count — INSTEAD: natural cluster coverage only.
  • NOT: link to pages that don't exist or are unrelated — INSTEAD: every internal link is topical and reachable.
  • NOT: skip JSON-LD validation because it looks right — INSTEAD: validate via /ai-fallback Gemini Flash.
  • NOT: omit hreflang because the Chinese or web versions are not produced yet — INSTEAD: seed all 3 hreflang tags at the Engineer stage.

O Alignment

SEO outcome is directly engineered here. AEO outcome lives here too: FAQPage JSON-LD is the single highest-leverage anchor for answer-engine citation.

🎯 Audience Persona Reviewer Claude Gemini Codex

Full G / S / M / Anti-patterns

S (Strategy)

  • Three canonical personas always run, even when Audience Shape Decision = universal:
  • Architect persona: reads as a specifier/project architect concerned with code precision, submittal usefulness, and specification workflow.
  • Owner persona: reads as a facility owner / facility manager concerned with lifecycle cost, risk, operations, and replacement consequences.
  • Installer persona: reads as a contractor / installer concerned with install practicality, sequencing, field constraints, and product-to-application fit.
  • Cold-read rule: Audience Persona Reviewer must not read Fact Checker, Source Reviewer, SEO/AEO Engineer, or Quality Auditor reports before producing its own report.
  • Six decision questions:

1. In the first 200 words, did I get the answer I came for?
2. Does this sound like it understands my day-to-day workflow?
3. Is any section obviously written for a different audience than me?
4. Did any paragraph feel like generic vendor/trade-content filler?
5. If I bookmarked this, what exact section would I return to later?
6. What is the strongest reason I would stop trusting this article?

  • Reviewer-override layer: raw Gemini output is not the final verdict. Audience Persona Reviewer must independently classify each flagged issue as:
  • voice-drift
  • workflow-mismatch
  • wrong-reader-assumption
  • generic-filler
  • cold-open-failure
  • Universal-vs-split check: if two or more personas say "this article is clearly for someone else," reviewer must explicitly advise Commander whether the current universal/split-3 shape was wrong.
  • Evidence rule: every negative flag must quote the exact paragraph or heading that caused it. General impressions without paragraph evidence do not count.

M (Measure)

  • Deliver blog-review-{slug}-persona.md containing:
  • Header with audience_shape_under_review, review_mode: cold-read, answered_model, timestamp
  • ## Architect Persona
  • ## Owner Persona
  • ## Installer Persona
  • Each persona section answers all 6 decision questions
  • Each negative flag cites exact paragraph references
  • ## Cross-Persona Agreement Table with issue_id / class / architect / owner / installer / agreement / recommended_action
  • ## Shape Challenge stating whether the original Audience Shape Decision still looks correct
  • ## Commander Recommendation with accept / accept-with-revisions / revise-shape
  • Gemini 2.5 Pro output preservation: the raw external-model response must be attached or embedded in the report, clearly separated from reviewer-override notes.
  • Coverage floor: all 3 personas must produce non-empty answers to all 6 decision questions.
  • Disagreement surface rule: if one persona passes and another fails, the disagreement must be explicitly surfaced, not averaged away.
  • Cold-read integrity: report must state it did not consume upstream review artifacts before reading the draft.

Anti-patterns

  • NOT: read upstream review reports before doing the persona pass — INSTEAD: read the draft cold so the external-reader signal is genuinely independent.
  • NOT: collapse architect, owner, and installer into one blended reaction — INSTEAD: produce 3 separate persona reads and then reconcile them explicitly.
  • NOT: act like an internal technical reviewer — INSTEAD: judge the article from day-to-day workflow realism and immediate trust, not from internal production criteria.
  • NOT: flag generic voice drift without paragraph evidence — INSTEAD: quote the exact paragraph or heading that triggered the cold-read failure.
  • NOT: silently accept a universal article when two personas clearly say the article is for someone else — INSTEAD: explicitly challenge the Audience Shape Decision.

O Alignment

O fails if the article is technically clean but emotionally and professionally misaddressed. The audience cold-read is the only direct check that the right practitioner recognizes themselves in the prose.

🔍 Quality Auditor Claude

Full G / S / M / Anti-patterns

S (Strategy)

  • Reverse-index check: audit starts from the actual article draft and Wave 4 HTML, not from the review artifacts. For each testable claim in the draft, Quality Auditor checks whether that claim appears in:
  • blog-review-{slug}-facts.md
  • blog-review-{slug}-sources.md
  • if voice-related, blog-review-{slug}-persona.md

Claims present in the article but absent from the review tables are classified as reverse-index-miss.

  • Testable claim definition: a sentence or bullet is testable if it contains any of:
  • a number, percentage, dollar amount, or measured force/load
  • a code section, standard number, or named regulation
  • a named case, incident, jurisdiction, date, or year
  • a comparative claim (higher, lower, more likely, faster, safer)
  • a causal claim (leads to, causes, reduces, prevents)
  • an operational instruction that implies factual correctness
  • a first-person claim with empirical specificity signals
  • S-evidence gate: for every upstream deliverable, QA asks: did the concrete resources promised in S actually appear in the deliverable? Examples:
  • Research Deepener promised WebSearch and wrapper logs
  • Fact Checker promised first-party URLs per verified row
  • Source Reviewer promised per-claim coverage index
  • Audience Persona Reviewer promised 3 cold-read persona sections
  • SEO/AEO Engineer promised validated JSON-LD and link reachability
  • Base-layer integrity audit: QA confirms the draft remains augmentable and that no human-layer slot was pre-filled or surrounded by disguised sales copy.
  • Scope-creep anti-pattern control: QA does not fix content, rewrite paragraphs, add evidence, or make strategic editorial choices. It classifies failures and routes them back to the responsible agent or Commander.
  • Failure classification:
  • class-1 structural handoff fail: missing artifact, missing required table, missing execution log, malformed slot, absent persona section
  • class-2 coverage fail: claim present in article but absent from review index, missing URL, missing question answer, incomplete FAQ/schema sync
  • class-3 scope-creep or role-boundary fail: agent did unassigned work, rewrote another agent's scope, or silently changed audience shape
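
The reverse-index check reduces to a set difference between the draft's testable-claim IDs and the claim IDs indexed by the review artifacts; a minimal sketch (function name and data shapes are hypothetical):

```python
def reverse_index_misses(draft_claim_ids: set[str],
                         fact_rows: set[str],
                         source_rows: set[str],
                         persona_rows: frozenset = frozenset()) -> set[str]:
    """Claim IDs present in the draft but absent from every review table
    (class-2 coverage fail candidates)."""
    covered = fact_rows | source_rows | persona_rows
    return draft_claim_ids - covered
```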

M (Measure)

  • Deliver blog-audit-{slug}-wave4.md containing:
  • ## Testable Claim Inventory
  • ## Reverse-Index Table with claim_id / draft_location / fact_checker / source_reviewer / persona_reviewer / status
  • ## S-Evidence Audit by agent
  • ## Base-Layer Integrity Check
  • ## Classified Failures
  • ## Commander Escalation
  • final verdict PASS / PASS-WITH-NOTES / BLOCK
  • Reverse-index completion: 100% of testable claims in the article must appear in the Reverse-Index Table.
  • Audit of promised resources: if an agent promised a concrete resource in S and it is absent from the artifact, QA must mark FAIL even if the article looks plausible.
  • Base-layer slot audit: QA must confirm the article has ≥ 3 human-layer slots and that none are pre-filled.
  • Perspective/evidence separation: QA must distinguish "reviewer opinion about quality" from "objective missing evidence."
  • No silent pass on partial coverage: aggregate statements like "most claims checked" are insufficient; the claim inventory must be explicit.

Anti-patterns

  • NOT: audit from the review tables forward and assume the article is fully covered — INSTEAD: reverse-index from the actual draft and HTML back into the review artifacts.
  • NOT: let an agent claim it used a resource or model without visible evidence in the deliverable — INSTEAD: S-promised resources must be observable in the artifact or execution log.
  • NOT: rewrite content or add missing evidence yourself — INSTEAD: classify the failure and route it back; QA is not a shadow writer or shadow fact-checker.
  • NOT: treat vague aggregate statements like "most claims reviewed" as acceptable — INSTEAD: require explicit claim-by-claim inventory.
  • NOT: ignore scope creep because the output looks better — INSTEAD: flag any role-boundary violation that hides ownership or auditability.

O Alignment

Quality Auditor does not create reader value directly. It protects the fleet from shipping false confidence. Without it, the fleet can appear rigorous while silently skipping claim coverage or eroding the base-layer contract.

🌏 Bilingual Publisher Gemini

Full G / S / M / Anti-patterns

S (Strategy)

  • English variant: apply /publish-article template directly. Path: door-site/blog/{slug}/index.html.
  • Chinese variant:
  • <html lang="zh-Hant">
  • Brand names not translated
  • Standard codes not translated
  • Model numbers not translated
  • Voice must sound natural to a Taiwanese door-hardware professional
  • Path: door-site/blog/zh/{slug}/index.html
  • Web/AEO variant: enhanced schema + Q&A-first layout for answer-engine extraction. Path: door-site/blog/web/{slug}/index.html.
  • hreflang triple cross-references: every variant links to the other two.
  • Sitemap + llms updates: update door-site/sitemap.xml, llms.txt, llms-full.txt, and blog/index.html.
  • Chinese natural-voice QA via Gemini Flash:
  • Prompt asks whether the voice feels natural for a Taiwanese door-hardware professional
  • Score must be ≥ 4 or revise
  • Taiwan-specific second pass via Gemini 2.5 Pro is mandatory:
  • Prompt asks the model to act as a Taiwan copy editor reviewing for mainland lexical drift, machine-translation smell, wrong professional register, and unnatural collocations
  • Output must include PASS/FAIL, flagged terms, and preferred Taiwan replacements
  • Mainland vocabulary blocklist: the following terms must never appear in zh-Hant output; preferred replacements shown in parentheses:
  • 信息 (資訊)
  • 軟件 (軟體)
  • 視頻 (影片)
  • 支持 (支援)
  • 質量 (品質)
  • 硬件 (硬體)
  • 芯片 (晶片)
  • 用戶 (使用者 or 客戶, context-dependent)
  • 運營 (營運)
  • 渠道 (通路)
  • 適配 (相容 or 適用)
  • 賬號 (帳號)
  • 代碼 (程式碼 or 代號, context-dependent)
  • 數據 (資料)
  • 默認 (預設)
  • 配置 (設定)
  • 調用 (呼叫)
  • 接口 (介面)
  • 模塊 (模組)
  • 文檔 (文件)
  • 兼容 (相容)
  • 線程 (執行緒)
  • 緩存 (快取)
  • 日誌 (紀錄)
  • 異步 (非同步)
  • 登錄 (登入)
  • 註冊 (註冊帳號 or 建立帳號, context-dependent)
  • Automated base-layer enforcement:
  • every <!-- HUMAN LAYER: ... --> comment must be followed within 3 lines by either blank content or <!-- TODO: human reviewer fills in -->
  • any human-facing prose, sales copy, or explanatory filler inside the slot window = FAIL
  • /security-check mandatory before staging commit.
  • Stage commit but DO NOT push. Do not run /upload.
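
The mainland-vocabulary blocklist can be enforced in code as well as by the publish-log grep; a minimal sketch mirroring the M-section pattern (helper name is hypothetical, the term list is verbatim from the blocklist above):

```python
import re

# Same alternation as the publish-log grep; PASS requires zero hits.
BLOCKLIST = re.compile(
    "信息|軟件|視頻|支持|質量|硬件|芯片|用戶|運營|渠道|適配|賬號|代碼|數據|"
    "默認|配置|調用|接口|模塊|文檔|兼容|線程|緩存|日誌|異步|登錄"
)

def blocklist_hits(zh_html: str) -> list[str]:
    """Return every mainland-vocabulary match found in the zh-Hant output."""
    return BLOCKLIST.findall(zh_html)
```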

M (Measure)

  • 3 HTML files exist at expected paths and each carries the hreflang triple.
  • Chinese natural-voice check score ≥ 4 from Gemini Flash; log recorded in blog-publish-{slug}.md.
  • Gemini 2.5 Pro Taiwan-specific second pass returns PASS; if FAIL, article revised and rechecked.
  • /security-check log shows PASS.
  • sitemap.xml, llms.txt, llms-full.txt, and blog/index.html updated.
  • Staged commit present with message containing [BASE LAYER — awaiting human review before push].
  • No git push executed.
  • Automated base-layer grep rule recorded in publish log:
  • grep -nA3 'HUMAN LAYER:' door-site/blog/{slug}/index.html
  • grep -nA3 'HUMAN LAYER:' door-site/blog/zh/{slug}/index.html
  • grep -nA3 'HUMAN LAYER:' door-site/blog/web/{slug}/index.html
  • PASS only if each slot window contains only blank lines and/or <!-- TODO: human reviewer fills in -->
  • Any pre-filled slot content = FAIL
  • Mainland vocabulary blocklist grep recorded in publish log:
  • grep -En '信息|軟件|視頻|支持|質量|硬件|芯片|用戶|運營|渠道|適配|賬號|代碼|數據|默認|配置|調用|接口|模塊|文檔|兼容|線程|緩存|日誌|異步|登錄' door-site/blog/zh/{slug}/index.html
  • required result: 0 hits
  • Wrapper-call verification recorded: grep -E '^(echo|gemini|codex)' blog-publish-{slug}.md returns 0 hits.
  • Queue entry state transitioned to ready_for_human_review with Commander signature.
  • content-plan.md Published Articles section updated after publishing with: article title, URL, status (Published), date, type, languages (EN/ZH/Web), source. Missing this update = publish incomplete; Commander must verify it in gate review.
  • In addition to content-plan.md, also update the admin/content-plan/index.html JavaScript data array with the same article entry (id, priority, title, titleEn, url, whyZh, whyEn, articleZh, articleEn, articleScore, questions). The admin UI reads from this JS array, not from content-plan.md — both must stay in sync.

Anti-patterns

  • NOT: run /upload or git push after staging — INSTEAD: stage and stop; human review must happen first.
  • NOT: translate brand names, standard codes, or model numbers into Chinese — INSTEAD: keep them verbatim.
  • NOT: produce machine-translation-feel Chinese — INSTEAD: natural Taiwanese professional voice, Gemini Flash ≥ 4 and Gemini Pro Taiwan-pass required.
  • NOT: allow mainland-vocabulary drift in zh-Hant output — INSTEAD: enforce the blocklist and Taiwan-specific second-pass review.
  • NOT: leave human-layer slots pre-filled with prose or sales copy — INSTEAD: slots remain empty or TODO-marked until human review.
  • NOT: consider publish complete without updating BOTH content-plan.md AND admin/content-plan/index.html JS data — INSTEAD: both files are sources of truth; content-plan.md for markdown consumers, admin JS for the web UI.

O Alignment

Chinese readers are real practitioners whose SEO lift also compounds. The web variant is the answer-engine anchor. And the do-not-push discipline operationalizes the base-layer principle.

Purpose (why this fleet exists)

HSW course production is the *means*. The deep research accumulated during course production is the *fuel*. This fleet converts that fuel — already flowing into .content-scout-queue.md via Candidate Collector (HSW-002 v5.1 Agent #19) — into published blog articles on watersonusa.ai. The ultimate goal is compounding SEO + AEO lift: as more HSW courses are built, more high-quality research flows into the queue, and more base-layer articles land on the site. Each article is designed to be both searchable by Google (SEO) and citable by LLM answer engines (AEO: ChatGPT / Perplexity / Gemini).

Critical design principle — base layer only. This fleet produces the structurally augmentable base of each article, not the final sealed product. Human reviewers (Waterson sales staff, subject-matter experts) append additional layers after the fleet's output: personal thinking process, professional sales responses, field-experience anecdotes. The fleet's output schema must leave designated slots for those human layers. An article that is "perfect" after the fleet finishes has actually failed the base-layer constraint — it leaves no room for the human hand-off.

---

v2 delta from v1

v2 exists because all 6 tracked weaknesses in v1 were real and operationally important. The fleet is now specified as 9 agents, not 7, with two new Wave 4 reviewers and several new hard gates.

Audience Persona Reviewer added as Agent #8 in Wave 4. Reuses the HSW-002 Project Architect Advisor pattern, but adapted for blog articles and expanded to three canonical audiences: architect, owner, installer.

Quality Auditor added as Agent #9 in Wave 4. Reuses the v5.1 audit pattern, adds reverse-index claim coverage checking, explicit testable-claim definition, and scope-creep anti-pattern control.

Commander's Audience Shape Decision now requires a Gemini Flash second opinion. Commander must call /ai-collab --task verify, log agreement or disagreement, and document any override rationale.

Layer 2.5 dry-run is now mandatory in Pre-Production Checklist. All 9 agents must be dry-run before the first real dispatch.

Base-layer enforcement is partially automated at publish stage. Bilingual Publisher must fail any draft where a HUMAN LAYER slot is pre-filled instead of left empty or marked TODO.

Bilingual Publisher now enforces Taiwan-specific language quality. A mainland-vocabulary blocklist plus Gemini 2.5 Pro second-pass review is required for zh-Hant output.

---

v2 relationship to HSW-002 v5.1

This fleet is the downstream consumer of the producer-side contract established in HSW-002 v5.1. It explicitly reuses the following v5.1 components (do not re-invent):

<table>

<tr><th>v5.1 Component</th><th>Reuse In</th></tr><tr><td><code>/ai-fallback</code> wrapper (<code>call_with_fallback.sh</code>)</td><td>All agents that call external LLMs — Commander verification, Research Deepener, Fact Checker, Source Reviewer, Audience Persona Reviewer, SEO/AEO Engineer, Bilingual Publisher</td></tr><tr><td>P-015 WebSearch-primary pattern for research archetype</td><td>Research Deepener (primary path); <code>/ai-fallback</code> only for verification/synthesis</td></tr><tr><td>P-019 NEW-03 forbidden phrase list (7 phrases)</td><td>Fact Checker Anti-patterns (verbatim copy)</td></tr><tr><td>Direction Seed 9-field dispatch template</td><td>Blog Commander dispatches all 8 downstream agents via this template</td></tr><tr><td>Pilot Dispatch pattern (pilot first, then fan-out)</td><td>Blog Commander runs Pilot on Research Deepener before parallelizing Wave 2 writers and Wave 4 reviewers</td></tr><tr><td>Principle 7 (embedded skill + model invocations)</td><td>All S sections carry full command format; Skill + Model Invocation Maps are central source of truth</td></tr><tr><td>Project Architect Advisor external-persona pattern</td><td>Audience Persona Reviewer (architect / owner / installer cold-read reports)</td></tr><tr><td>Quality Auditor pattern</td><td>Blog Quality Auditor in Wave 4, adapted to blog-claim coverage and base-layer enforcement</td></tr><tr><td>4 OGSM validators (<code>validate_s_to_m_coverage.py</code> etc.)</td><td>Run against this document before first production dispatch</td></tr><tr><td>G-022 scope-aware polish discipline</td><td>Fact Checker + Source Reviewer under-delivery escape clause; Quality Auditor scope-creep prohibition</td></tr><tr><td>Candidate Collector queue schema (10 fields)</td><td>Input contract (read-only for this fleet)</td></tr><tr><td>Reviewer-override pattern (raw model → reviewer&#x27;s anti-pattern pass)</td><td>Fact Checker + Source Reviewer + Audience Persona Reviewer reuse the 2-layer design</td></tr><tr><td>Brief Layering (Tier 1 ≤150 words + Tier 2 reference)</td><td>All agents define Tier 1 summary block</td></tr>

</table>

---

Skill Invocation Map

<table>

<tr><th>Agent</th><th>Wave</th><th>Skill</th><th>Trigger Condition</th><th>Command Format</th></tr><tr><td>Blog Commander</td><td>Wave 0</td><td><code>/ai-collab --task verify</code></td><td>Every Audience Shape Decision before first downstream dispatch</td><td><code>/ai-collab --task verify --candidate-file &quot;.content-scout-queue.md&quot; --candidate-id &quot;{slug}&quot; --question &quot;Does this candidate require universal or split-3 audience shape?&quot; --proposed-shape &quot;{universal|split-3}&quot;</code></td></tr><tr><td>Bilingual Publisher</td><td>Wave 5</td><td><code>/security-check</code></td><td>Before every staged commit</td><td><code>/security-check</code> (read <code>~/.claude/skills/security-check/SKILL.md</code>; run workflow; any non-PASS blocks commit)</td></tr><tr><td>Bilingual Publisher</td><td>Wave 5</td><td><code>/publish-article</code> (template reference, not full workflow)</td><td>English HTML generation — copies template only, not deploy steps</td><td>Read <code>~/.claude/skills/publish-article/SKILL.md</code> §HTML Template + §CSS Variables as canonical template source; do not execute deploy steps</td></tr>

</table>

Intentionally NOT in the map: the fleet does not call /content-scout flag-candidate. It is the consumer of entries other fleets wrote.

Intentionally NOT in the map: the fleet does not call /upload. Base-layer discipline forbids push before human review.

---

Model Invocation Map

Division of labor

<table>

<tr><th>Agent</th><th>Wave</th><th>Model</th><th>Purpose</th><th>Command Format</th></tr><tr><td>Blog Commander</td><td>all</td><td>Claude Opus + Gemini Flash second opinion</td><td>orchestration + audience-shape decision + conflict resolution</td><td>Native for orchestration; every audience-shape decision also calls <code>/ai-collab --task verify ...</code>; additional external verification uses <code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;&lt;prompt&gt;&quot; &quot;&lt;chain&gt;&quot;</code></td></tr><tr><td>Research Deepener</td><td>Wave 1</td><td>WebSearch (primary) + <code>/ai-fallback</code></td><td>expand course fragment to 800–1500 words with per-claim first-party URLs</td><td>Research/discovery: WebSearch. Verification/synthesis: <code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;Verify/summarize: [X]&quot; &quot;gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.5-pro,codex&quot;</code></td></tr><tr><td>Article Writer</td><td>Wave 2</td><td>Claude Sonnet</td><td>base-layer draft</td><td>native</td></tr><tr><td>Fact Checker</td><td>Wave 3</td><td>Gemini Flash via <code>/ai-fallback</code> + WebSearch Tier 2</td><td>numeric/regulatory/monetary claim verification</td><td><code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;Verify: [number] [claim]. Return VERIFIED/CORRECTED/UNVERIFIABLE + first-party URL&quot; &quot;gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.5-pro,codex&quot;</code></td></tr><tr><td>Source Reviewer</td><td>Wave 3</td><td>Codex → Gemini 2.5 Pro → Flash-Lite via <code>/ai-fallback</code></td><td>citation cross-verification; min chain depth 3</td><td><code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;Review citations in [file]. Flag: missing source, pre-2018 without version note, single-source claims, priority-violation, opinion-vs-empirical signals&quot; &quot;codex,gemini-2.5-pro,gemini-2.5-flash-lite&quot;</code></td></tr><tr><td>SEO/AEO Engineer</td><td>Wave 4</td><td>Gemini Flash via <code>/ai-fallback</code></td><td>JSON-LD schema validation</td><td><code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;Validate schema.org JSON-LD for Article + FAQPage: [blocks]. Return STRUCTURALLY_VALID/INVALID + error list&quot; &quot;gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.5-pro,codex&quot;</code></td></tr><tr><td>Audience Persona Reviewer</td><td>Wave 4</td><td>Gemini 2.5 Pro via <code>/ai-fallback</code></td><td>architect / owner / installer cold-read persona simulation</td><td><code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;Role-play [architect|owner|installer] persona. Read this blog draft cold. Answer 6 decision questions with paragraph citations.&quot; &quot;gemini-2.5-pro,gemini-2.5-flash-lite,codex&quot;</code></td></tr><tr><td>Quality Auditor</td><td>Wave 4</td><td>Claude Opus or Sonnet</td><td>reverse-index audit, S-evidence gate, handoff readiness</td><td>native</td></tr><tr><td>Bilingual Publisher</td><td>Wave 5</td><td>Gemini Flash + Gemini 2.5 Pro via <code>/ai-fallback</code></td><td>zh natural-voice QA + Taiwan-specific lexical/cultural pass</td><td>Flash: <code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;Read this zh-Hant article. Natural voice for a Taiwanese door-hardware professional? Score 1-5 + list stiff phrasing&quot; &quot;gemini-2.5-flash,gemini-2.5-flash-lite,gemini-2.5-pro,codex&quot;</code>; Pro: <code>bash ~/.claude/skills/ai-fallback/scripts/call_with_fallback.sh &quot;Act as a Taiwan copy editor. Flag any mainland lexical drift, machine-translation smell, and non-Taiwan phrasing in this zh-Hant article. Return PASS/FAIL + fixes&quot; &quot;gemini-2.5-pro,gemini-2.5-flash-lite,codex&quot;</code></td></tr>

</table>

Principle 7 applies: any agent whose S references an external LLM must carry the full command format in its S block, and Commander copies the row into Direction Seed field 5 at dispatch.

---

Direction Seed (Commander Dispatch Template — 9 fields)

Every subagent dispatch carries all 9 fields. Missing a field = briefing failure; deliverable does not enter gate review; subagent must be re-dispatched with the fix.

Fleet ID + Role Name — e.g. BLOG-WRITER-FLEET / Audience Persona Reviewer

Target Audience Persona — one of the 3 canonical audiences from ~/.claude/skills/writing-guide/SKILL.md §2, with concrete description: years of experience, typical workflow, what they type into Google. For universal shape, all three in one briefing; for persona review, each persona gets its own sub-brief.

O (quoted verbatim) — the full O paragraph from this document's §O section

This agent's G/S/M — copy from this doc's agent section, Tier 1 version

Embedded Skill + Model Invocations — copy relevant rows from Skill Invocation Map + Model Invocation Map, with full command format, plus the mandatory knowledge query commands:

```bash
bash ~/.claude/skills/ogsm-framework/scripts/get_patterns_for_failure.sh <failure-type>
bash ~/.claude/skills/ogsm-framework/scripts/get_gotchas_for_context.sh <context-keyword>
bash ~/.claude/skills/ogsm-framework/scripts/get_skills_for_role.sh <role-name>
```

Hard Constraints — e.g. hand-off slots ≥ 3, word count 900–1400, no raw `echo | gemini` calls, no mainland-vocabulary blocklist terms

Tone + Voice Requirements — audience-matched per writing-guide §2; peer-to-peer with target practitioner; never marketing

Deliverable Format + File Path — exact filename under docs/blog-writer-fleet/{slug}/

Anti-patterns to avoid — at least 3 items, verbatim copied from the agent's own standard list in this doc
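The 9-field completeness rule above can be mechanically pre-checked. A minimal sketch, assuming briefing headers matching the field names in this list (the exact header strings and the function name are illustrative, not a frozen schema):

```shell
# Sketch: pre-dispatch check that a Direction Seed briefing carries all 9 fields.
# Header strings paraphrase this document's field names; treat them as assumptions.
check_direction_seed() {
  brief="$1"; missing=0
  for field in \
    "Fleet ID" "Target Audience Persona" "O (quoted verbatim)" \
    "G/S/M" "Embedded Skill" "Hard Constraints" \
    "Tone + Voice" "Deliverable Format" "Anti-patterns"
  do
    grep -q "$field" "$brief" || { echo "MISSING: $field"; missing=1; }
  done
  if [ "$missing" -eq 0 ]; then echo "BRIEFING COMPLETE"; else echo "BRIEFING FAILURE"; fi
}
```

A failing check maps directly to the rule above: the deliverable does not enter gate review until the briefing is re-dispatched with the missing field.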

### Direction Seed addendum for v2

### Pilot Dispatch rules

Fan-out checklist (Commander personally runs after pilot returns)

Deliverable shape matches expected structure

Audience is explicit

Knowledge query outputs present

/ai-fallback execution log present if required

Anti-patterns verbatim-copied from source standard list

If the pilot is Audience Persona Reviewer: all 3 personas answered all 6 decision questions

Pilot fail → Commander rewrites the failing briefing field, re-dispatches pilot only, retries until pass. No fan-out until pilot passes.

---

Alignment Verification Matrix

<table>

<tr><th>Agent</th><th>Primary G Output</th><th>O SEO (rank)</th><th>O AEO (cited)</th><th>O Base-Layer (augmentable)</th><th>O Risk if G Fails</th></tr><tr><td>Blog Commander</td><td>audience shape + wave alignment</td><td>indirect</td><td>indirect</td><td>direct</td><td>whole fleet incoherent</td></tr><tr><td>Research Deepener</td><td>800–1500 words with per-claim first-party URLs</td><td>necessary</td><td>necessary</td><td>—</td><td>Writer hallucinates or thins out</td></tr><tr><td>Article Writer</td><td>base-layer draft, audience-voiced, hand-off slots</td><td>direct</td><td>indirect</td><td><strong>direct</strong></td><td>sealed article humans can&#x27;t augment</td></tr><tr><td>Fact Checker</td><td>numeric claims verified with first-party URLs or hedged</td><td>direct</td><td><strong>direct</strong></td><td>—</td><td>citations rot; AEO rejects article</td></tr><tr><td>Source Reviewer</td><td>reachable references + opinion-vs-empirical discipline</td><td>direct</td><td><strong>direct</strong></td><td>—</td><td>trust collapse; SEO + AEO both lose</td></tr><tr><td>SEO/AEO Engineer</td><td>JSON-LD + internal links + FAQ anchors</td><td><strong>direct</strong></td><td><strong>direct</strong></td><td>—</td><td>article invisible to both channels</td></tr><tr><td>Audience Persona Reviewer</td><td>cold-read voice realism across architect / owner / installer</td><td>indirect</td><td>indirect</td><td>direct</td><td>technically clean but wrong-reader article ships</td></tr><tr><td>Quality Auditor</td><td>reverse-index claim coverage + S-evidence gate</td><td>indirect</td><td>indirect</td><td><strong>direct</strong></td><td>false confidence; gaps ship unnoticed</td></tr><tr><td>Bilingual Publisher</td><td>3 variants staged, language-checked, not pushed</td><td>direct (zh + hreflang)</td><td>direct (web variant)</td><td><strong>direct</strong></td><td>push too early; language drift; slots pre-filled</td></tr>

</table>

---

Wave Gate Conditions

### Gate 0 → Wave 1 begins

### Gate 1 → Wave 2 begins

### Gate 2 → Wave 3 begins

### Gate 3 → Wave 4 begins

### Gate 4 → Wave 5 begins

### Gate 5 → ready_for_human_review

### Gate 6 → published (OUTSIDE fleet scope — human action)

Fleet has no agent at Gate 6. That is the human boundary. Moving Gate 6 inside the fleet would violate the base-layer principle.

---

Intermediate Artifact Naming Conventions

All artifacts live under docs/blog-writer-fleet/{slug}/:

<table>

<tr><th>File</th><th>Produced By</th><th>Wave</th></tr><tr><td><code>blog-research-{slug}.md</code></td><td>Research Deepener</td><td>1</td></tr><tr><td><code>blog-draft-{slug}.md</code> or <code>blog-draft-{slug}-{audience}.md</code></td><td>Article Writer</td><td>2</td></tr><tr><td><code>blog-review-{slug}-facts.md</code></td><td>Fact Checker</td><td>3</td></tr><tr><td><code>blog-review-{slug}-sources.md</code></td><td>Source Reviewer</td><td>3</td></tr><tr><td><code>blog-seo-{slug}.md</code></td><td>SEO/AEO Engineer</td><td>4</td></tr><tr><td><code>blog-review-{slug}-persona.md</code></td><td>Audience Persona Reviewer</td><td>4</td></tr><tr><td><code>blog-audit-{slug}-wave4.md</code></td><td>Quality Auditor</td><td>4</td></tr><tr><td><code>blog-publish-{slug}.md</code></td><td>Bilingual Publisher</td><td>5</td></tr><tr><td><code>blog-gate-review-{slug}-waveN.md</code></td><td>Blog Commander</td><td>each gate</td></tr><tr><td><code>dispatch-log-blog-{slug}.md</code></td><td>Blog Commander</td><td>continuous</td></tr>

</table>

Queue state transitions: pending → researching → drafting → reviewing → ready_for_human_review → (human) → published
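A fleet-side guard for these transitions might look like the following sketch. The function name is illustrative; the final step to published is a human action (Gate 6, outside fleet scope), so the fleet-side check deliberately rejects it:

```shell
# Sketch: validate a proposed queue state transition against the pipeline above.
# Transitions mirror this document; ready_for_human_review -> published is
# reserved for the human reviewer and is therefore rejected here.
valid_transition() {
  case "$1 -> $2" in
    "pending -> researching" | \
    "researching -> drafting" | \
    "drafting -> reviewing" | \
    "reviewing -> ready_for_human_review") return 0 ;;
    *) return 1 ;;
  esac
}
```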

---

Known Issues / To Monitor

### Issue #1 — No Audience Persona Reviewer in v1

Status in v2: RESOLVED

Implemented change: Added Audience Persona Reviewer as Agent #8 in Wave 4 using Gemini 2.5 Pro persona simulation via /ai-fallback, modeled on the HSW-002 Project Architect Advisor pattern but expanded to architect / owner / installer cold-read passes.

What to monitor now: Do the 3 persona reads produce meaningfully different flags when a draft is mis-shaped? If they collapse into generic "good article" summaries, the persona prompt needs tightening.

### Issue #2 — No cross-fleet Quality Auditor / Performance Supervisor

Status in v2: RESOLVED for blog-fleet scope

Implemented change: Added Quality Auditor as Agent #9 in Wave 4. Reused the v5.1 audit pattern, but adapted it to blog deliverables with reverse-index checking, S-evidence verification, and explicit testable-claim inventory.

What to monitor now: Does QA catch real misses without becoming a shadow reviewer who rewrites content? If QA starts absorbing other roles, tighten the scope-creep rule further.

### Issue #3 — Audience shape decision had no external validator

Status in v2: RESOLVED

Implemented change: Every Audience Shape Decision now requires /ai-collab --task verify with Gemini Flash second opinion. Agreement/disagreement and any override rationale are logged in the queue entry and dispatch log.

What to monitor now: After 10 articles, compare Commander final-shape choices against Gemini disagreement rate. If disagreement is frequent but overrides are weak, Commander prompt or decision rubric needs adjustment.

### Issue #4 — Layer 2.5 dry-run not yet performed

Status in v2: RESOLVED at spec level; execution still required before production

Implemented change: Pre-Production Checklist now makes Layer 2.5 dry-run a mandatory gate before first real dispatch, and the required dry-run count is updated from 7 agents to 9 agents.

What to monitor now: First real run should produce zero "I don't know which skill/model to call" failures. Any such failure indicates the dry-run was skipped or done loosely.

### Issue #5 — Base-layer principle had no automated enforcement

Status in v2: RESOLVED

Implemented change: Bilingual Publisher now has a publish-stage grep-based enforcement rule: every HUMAN LAYER slot must remain empty or TODO-marked within 3 lines after the comment. Pre-filled slots are a hard FAIL.

What to monitor now: Human reviewers should not need to delete slot content before appending their own layer. If they do, tighten the check from publish-stage to pre-Wave-4 audit as well.
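The publish-stage rule might be sketched like this. The HUMAN LAYER marker string and the 3-line window come from the rule above; the awk implementation and function name are assumptions:

```shell
# Sketch of the publish-stage slot check: within 3 lines after a HUMAN LAYER
# marker, the slot must be blank or TODO-marked; any other non-blank line is
# a pre-filled slot and a hard FAIL (non-zero exit).
check_slots() {
  awk '
    /HUMAN LAYER/ { window = 3; next }
    window > 0 {
      window--
      if ($0 ~ /TODO/) { window = 0; next }          # TODO-marked slot: OK
      if ($0 !~ /^[[:space:]]*$/) {
        print "FAIL: pre-filled slot at line " NR
        bad = 1; window = 0
      }
    }
    END { exit bad }
  ' "$1"
}
```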

### Issue #6 — No bilingual voice divergence detection

Status in v2: RESOLVED

Implemented change: Bilingual Publisher now runs both Gemini Flash natural-voice QA and Gemini 2.5 Pro Taiwan-specific second-pass review, plus a 20+ term mainland-vocabulary blocklist grep.

What to monitor now: If Taiwanese readers still detect unnatural phrasing that does not use blocked terms, add a phrase-level style list, not just a lexical blocklist.
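The blocklist grep could be sketched as follows. The three terms shown are illustrative mainland-vs-Taiwan pairs (軟件→軟體, 視頻→影片, 信息→資訊); the production list per this document carries 20+ entries:

```shell
# Sketch of the lexical blocklist check for zh-Hant output.
# ZH_BLOCKLIST below is an illustrative subset, not the production list.
ZH_BLOCKLIST='軟件|視頻|信息'
check_blocklist() {
  if grep -nE "$ZH_BLOCKLIST" "$1"; then
    echo "FAIL: mainland vocabulary detected"; return 1
  fi
  echo "PASS"
}
```

As the monitoring note above says, this catches lexical drift only; phrase-level stiffness still needs the Gemini Pro pass.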

### Issue #7 — Wave 4 concurrency may create review fan-in overload on split-3 batches

Status in v2: NEW

Concern: On split-3 topics, a single candidate generates three audience draft variants plus one SEO package, one persona report, and one audit report. Commander may become a bottleneck during Gate 4 synthesis.

Possible fixes: (a) add per-audience mini gate reviews before the consolidated Gate 4; (b) template the Wave 4 cross-artifact comparison tables more aggressively; (c) batch split-3 candidates separately from universal candidates.

What to watch: If Gate 4 reviews become materially slower than Waves 1–3 combined on split-3 topics, the synthesis layer needs refactoring.

---

Deferred Improvements

### Improvement #1 — Audience Persona Reviewer agent

Status: IMPLEMENTED in v2

Notes: Added as Agent #8 in Wave 4 with Gemini 2.5 Pro persona simulation and three canonical reader personas.

### Improvement #2 — Quality Auditor

Status: IMPLEMENTED in v2

Notes: Added as Agent #9 in Wave 4. Blog fleet now has an explicit audit layer, though not a full cross-fleet Performance Supervisor.

### Improvement #3 — Candidate Collector v2 write-back

Status: DEFERRED

Scope: When Blog Writer Fleet publishes an article, append published_at, published_url, and human_reviewer back to the queue entry.

Trigger: After first 10 articles publish, coordinate with HSW-002 Candidate Collector owners.

### Improvement #4 — Cross-article compounding SEO analyzer

Status: DEFERRED

Scope: After ≥ 20 articles published, run a cross-link analysis to find under-linked articles and recommend additional internal links.

Trigger: At 20 articles.

### Improvement #5 — /aia-rewrite --bilingual reuse for zh variant

Status: DEFERRED

Scope: When /aia-rewrite --bilingual ships and is stable, evaluate whether Bilingual Publisher should delegate Chinese production to it.

Trigger: Skill exists and passes validation.

### Improvement #6 — Publish-stage blocklist to phrase-level Taiwan style checker

Status: DEFERRED

Scope: Extend current term blocklist into phrase-level and syntax-level Taiwan localization checks once enough real false negatives accumulate.

Trigger: ≥ 3 post-publish Taiwanese reader comments that were not caught by current blocklist + Gemini Pro pass.

---

Relationship to HSW-002 v5.1 and Cross-References

This document cross-references HSW-002 v5.1 at the following anchors:

If any of these references change, this document's command formats and gate logic must be re-verified.

---

Pre-Production Checklist (before first real dispatch)

### Mandatory Layer 2.5 Dry-Run Protocol

Layer 2.5 dry-run is MANDATORY before the first production dispatch. This is not advisory and not deferrable. The goal is to verify that every agent can actually execute the skill/model invocations embedded in its S block and that Commander can dispatch them without missing parameters.

Dry-run scope in v2: all 9 agents

Blog Commander

Research Deepener

Article Writer

Fact Checker

Source Reviewer

SEO/AEO Engineer

Audience Persona Reviewer

Quality Auditor

Bilingual Publisher

Dry-run protocol

### Checklist

---

End State Summary

v2 is no longer a 7-agent "write and hope" fleet. It is a 9-agent system with:

Those are the structural fixes needed to move the blog fleet from plausible draft to production-worthy OGSM spec.