claude (blind_chess) 1674695eef docs: AI Phase 1 shipped — context, decisions, handoff

- CLAUDE.md: phase line moved to "Phase 1 deployed"; key files lists
  the new bot module, game-end extraction, and selfplay harness.
- DECISIONS.md: new "Phase 1 implementation outcomes" subsection records
  the CasualBrain-engine reversal, the FEN-vanilla-only invariant, why
  blind keeps heuristic, and the bot-slot token randomization. The
  earlier "Stockfish deferred" entry is partially superseded.
- .claude/handoffs/: handoff document for the next session.

2026-04-28 15:20:24 -04:00


Handoff: AI Phase 1 (Casual bot) shipped

Session Metadata

Created: 2026-04-28 ~19:15 UTC
Project: /home/claude/bin/blind_chess
Branch: feat/ai-player-phase-1-casual (16 commits ahead of main; pending merge as final step of this handoff)
Repo: git.sethpc.xyz/Seth/blind_chess
Live URL: https://chess.sethpc.xyz (Phase 1 deployed and verified)

Handoff Chain

Continues from: 2026-04-28-170713-ai-player-spec.md — AI player spec written and approved.
Supersedes: None.

Current State Summary

Phase 1 of the AI player feature (Casual bot) is deployed and live. Playing vs a Casual bot is now an option from the landing page, alongside the existing "play with a friend" flow.

This session executed docs/superpowers/plans/2026-04-28-ai-player-phase-1-casual.md via subagent-driven development: 13 tasks, each dispatched to a fresh subagent and put through a two-stage review (spec compliance and code quality). Several tasks surfaced real plan bugs, which the subagents fixed inline. The most consequential reversal came during Task 11 (self-play harness): the hand-rolled scoring algorithm in CasualBrain lost to a random-move baseline 7-7 in 100-game self-play, far below the spec's ≥80% acceptance bar. The fix was to have vanilla-mode CasualBrain delegate to js-chess-engine (level 2, randomness=30); blind mode kept the heuristic. Casual now wins 96-97% of vanilla games against Random, playing either color.

Architecture Overview (what's deployed)

packages/server/src/bot/ — new module:
- brain.ts — Brain interface, BrainInput/BrainAction/CandidateMove/AttemptHistoryEntry types. BrainInput.fen set ONLY in vanilla mode (preserves view-filter invariant).
- candidates.ts — legalCandidates(game, color). Vanilla: chess.js .moves({verbose: true}). Blind: geometricMoves over own pieces + promotion expansion.
- casual-brain.ts — CasualBrain implements Brain. Vanilla: delegates to js-chess-engine at level 2; blind: heuristic scoring (capture proxy / development / center / advance). Promotion default: queen. Draw response based on own material count.
- driver.ts — BotDriver per-game orchestrator. Mutex via decideInFlight, retry cap of 5, dispatches via handleCommit/announce, on game end calls brain.dispose?.().
- index.ts — public re-exports.
packages/server/src/game-end.ts — extracted from ws.ts: endGame/finalizeIfEnded. Both ws.ts and bot/driver.ts use it.
packages/server/src/games.ts — bot driver registry (attachBotDriver, getBotDriver, disposeBotDriver). createGame accepts optional vsAi: { brain } and fills the bot's slot with a synthetic player slot (random token, no socket). pruneFinished cleans the registry.
packages/server/src/state.ts — Game gains optional aiOpponent?: { color; brain } (informational) and required lastBroadcastIdx: { w: number; b: number } (per-color watermark for slice broadcasting).
packages/server/src/ws.ts — refactored: pokeBot(game) helper called after every state-mutating handler; broadcastSinceLast(game) replaces the old broadcastNewAnnouncements (slices game.announcements from each color's watermark). Handlers are async; router uses void casts to discard handler Promises.
packages/server/src/server.ts — POST /api/games handles vsAi: { brain: 'casual' }: instantiates CasualBrain + BotDriver, attaches to registry. vsAi.brain === 'recon' returns 503 (Phase 2 not implemented). joinUrl: null for AI games.
packages/shared/src/protocol.ts — CreateGameRequest.vsAi, CreateGameResponse.joinUrl: string | null, aiOpponent on joined and update server messages.
packages/server/src/validation.ts — Zod schema for vsAi.
Client (packages/client/) — landing page split into two sections (friend / vs computer). In-game UI shows a "Casual bot" badge in the topbar; turn label says "Casual bot is moving…" when bot's turn. The "Opponent disconnected" banner is suppressed for AI games.
scripts/selfplay.ts — operator CLI. pnpm selfplay --white casual --black random --games 100 --mode vanilla. Reports W/B/D/MaxPly/Err and end-reason histogram. Supports --transcripts for per-game logs.
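The per-color watermark behind broadcastSinceLast can be pictured with a small sketch. This is not the code in ws.ts: GameLike and sliceSinceLast are hypothetical names, announcements are reduced to plain strings, and any per-color view filtering is left out. It only illustrates slicing game.announcements from each color's lastBroadcastIdx and then advancing that index.

```typescript
type Color = "w" | "b";

// Hypothetical stand-in for the real Game type: only the two fields discussed above.
interface GameLike {
  announcements: string[];
  lastBroadcastIdx: { w: number; b: number };
}

// Returns the announcements each color has not been sent yet,
// then moves that color's watermark to the end of the log.
function sliceSinceLast(game: GameLike): Record<Color, string[]> {
  const colors: Color[] = ["w", "b"];
  const out: Record<Color, string[]> = { w: [], b: [] };
  for (const color of colors) {
    out[color] = game.announcements.slice(game.lastBroadcastIdx[color]);
    game.lastBroadcastIdx[color] = game.announcements.length;
  }
  return out;
}
```

Because each color keeps its own index, a call made after both the human's move and the bot's reply have been appended hands the human both announcements in a single slice.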

Phase 1 Acceptance — Met

Check	Result
100 Casual self-play vanilla games complete	✅ Err=0 across all runs
Median ply 20-200 in self-play	✅ avgPly~52 (engine vs random), ~116 (Casual vs Casual)
Casual ≥80% vs Random, both colors	✅ 97% as W, 96% as B
All unit + integration tests pass	✅ 75/75 (21 shared + 54 server)
Live smoke checklist	✅ /api/health, AI game creation, recon→503, no journald errors
Branch merged + deployed	⏳ Pending merge (final step of this session)

Critical Files

File	Status	Notes
`docs/superpowers/specs/2026-04-28-ai-player-design.md`	Unchanged	Original spec; still the source of truth for Phase 2.
`docs/superpowers/plans/2026-04-28-ai-player-phase-1-casual.md`	Unchanged	Phase 1 plan; can be archived or marked "executed" if useful.
`CLAUDE.md`	✅ Updated	"Current State" reflects Phase 1 deployed; "Key files" lists new bot module.
`DECISIONS.md`	✅ Updated	New "Phase 1 implementation outcomes" section; the previous "Stockfish deferred" entry is now struck through (partially superseded: `js-chess-engine` is used instead).
`packages/server/src/bot/`	✅ New	Brain, BotDriver, CasualBrain, candidates, index.
`packages/server/src/game-end.ts`	✅ New	Extracted endGame/finalizeIfEnded.
`scripts/selfplay.ts`	✅ New	Self-play harness. Run via `pnpm selfplay`.
`.secrets.baseline`	✅ Refreshed	The previous baseline was stale (~6087 lines → 8196 after refresh). pnpm-lock.yaml integrity hashes for js-chess-engine were tripping the secret-detection hook.

Decisions Made (highlights — full list in DECISIONS.md)

CasualBrain reversal: vanilla mode now delegates to js-chess-engine at level 2. Hand-rolled scorer lost to random — empirically broken. Engine swap brought it to 96-97% vs random.
BrainInput.fen is vanilla-only: blind mode omits the FEN to preserve the view-filter invariant. The engine cannot smuggle opponent positions past the security boundary.
Blind mode keeps the heuristic: a chess engine isn't useful when the bot only sees its own pieces. That gap is what Phase 2 (Recon) addresses with belief-state-from-announcements.
Bot-slot tokens are randomized: not a fixed placeholder. Closes a hijack vector caught in code review.
endGame/finalizeIfEnded extracted to game-end.ts: both ws and driver need to set finished state; duplication risk eliminated.
pokeBot → broadcastSinceLast order is load-bearing: the bot's response (move + announcements) must be in game.announcements before broadcasting, so the human sees the bot's reply in the same WS message they receive after their own move.
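The vanilla-only FEN rule is easiest to see as a constructor that simply never attaches the field in blind mode. The sketch below is an illustration under assumptions: buildBrainInput, the BoardView shape, and every BrainInput field other than fen are hypothetical and may not match brain.ts.

```typescript
type Mode = "vanilla" | "blind";

// Hypothetical view shape: square -> piece, own pieces only.
interface BoardView {
  ownPieces: Record<string, string>;
}

interface BrainInput {
  mode: Mode;
  view: BoardView;
  newAnnouncements: string[];
  fen?: string; // present only in vanilla mode
}

// Assumed helper: the full-board FEN is copied into the input only for vanilla games,
// so a blind-mode brain has no field through which opponent positions could leak.
function buildBrainInput(
  mode: Mode,
  view: BoardView,
  newAnnouncements: string[],
  fullFen: string,
): BrainInput {
  const input: BrainInput = { mode, view, newAnnouncements };
  if (mode === "vanilla") {
    input.fen = fullFen;
  }
  return input;
}
```

Making the field optional rather than passing an empty string means a brain that tries to use the FEN in blind mode has to handle its absence explicitly.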

Immediate Next Steps

Merge feat/ai-player-phase-1-casual to main (final step of this handoff).

git checkout main
git merge --ff-only feat/ai-player-phase-1-casual || git merge --no-ff feat/ai-player-phase-1-casual
git push origin main

Soak Phase 1 for a few days of real play before starting Phase 2. Watch for:
- Bot-driver errors in journald (journalctl -u blind-chess | grep "bot driver error").
- Mid-game crashes or stuck games.
- User feedback on Casual's strength (too weak / too strong / fine).
When ready, write the Phase 2 plan (docs/superpowers/plans/2026-04-28-ai-player-phase-2-recon.md) against the existing spec. Phase 2 reuses the Brain and BotDriver infrastructure unchanged; the new pieces are OllamaClient, ollama-endpoints (preflight + failover), prompt, parse, and ReconBrain, plus the aiInfo protocol field, the 'ai_unavailable' end reason, and the post-game reasoning reveal UI.

Blockers / Open Questions

Casual at level 2 may be too strong for some users. It beats Random 96-97% of the time, which clears the ≥80% acceptance bar, but a careful human is still supposed to be able to beat Casual. If users report that Casual is unbeatable, drop to level 1; if they report that it is trivial, raise to level 3. (Change the default in CasualOpts at packages/server/src/bot/casual-brain.ts:33.)
Blind-mode self-play games are very short (avgPly=16, all ending in resignation). The heuristic exhausts its retry cap (5) when the bot keeps picking moves that can't legally proceed in blind mode. This works as designed, but note that blind Casual is much weaker than vanilla Casual. Consider raising the retry cap or improving the heuristic if blind Casual feels broken in real play.
js-chess-engine declares engines: { node: '>=24' } but works on Node 22.22.2. The engines field is advisory by default. If a future Node update breaks it, pin to v1.x of the package (npm i js-chess-engine@^1.0.0), which has an older API but is compatible.
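The short blind-mode games follow directly from how a retry cap behaves. The sketch below is a simplified, synchronous model of that loop, not the code in driver.ts: runDecisionCycle here takes hypothetical decide and dispatch callbacks, the field names on CandidateMove and AttemptHistoryEntry are guesses, and it assumes that exhausting the cap ends in a resignation, which is how the self-play results above read.

```typescript
interface CandidateMove {
  from: string;
  to: string;
}

interface AttemptHistoryEntry {
  move: CandidateMove;
  reason: string;
}

interface DispatchResult {
  ok: boolean;
  reason?: string;
}

const RETRY_CAP = 5;

// Ask the brain for a move, try to dispatch it, and give up after RETRY_CAP rejections.
function runDecisionCycle(
  decide: (history: AttemptHistoryEntry[]) => CandidateMove,
  dispatch: (move: CandidateMove) => DispatchResult,
): { outcome: "moved" | "resigned"; attempts: number } {
  const history: AttemptHistoryEntry[] = [];
  for (let attempt = 1; attempt <= RETRY_CAP; attempt++) {
    const move = decide(history);
    const result = dispatch(move);
    if (result.ok) {
      return { outcome: "moved", attempts: attempt };
    }
    // Record the rejection so the next decision can avoid repeating it.
    history.push({ move, reason: result.reason ?? "rejected" });
  }
  return { outcome: "resigned", attempts: RETRY_CAP };
}
```

Under this model a brain that ignores the history, or whose candidate list is dominated by moves that cannot proceed, reaches the cap quickly, so raising the cap and improving the heuristic attack the same symptom from different sides.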

Deferred Items (Phase 2 work)

All from the original AI spec, untouched:

ReconBrain (gemma4:26b chat agent on steel141 RTX 3090 Ti, pve197 V100 fallback).
Mid-game GPU failover, preflight, AI-unavailable end state.
Persistent chat history per game; post-game reasoning reveal UI.
aiInfo protocol field (model + GPU + host).
Acceptance bar: Recon wins ≥60% over 50 Recon-vs-Casual self-play games.

Important Context for Future Sessions

The bot's BoardView is the only egress to the engine, in vanilla mode. This invariant is preserved structurally: the FEN is set in BrainInput only when mode === 'vanilla'. Phase 2 ReconBrain will not need this field at all (it gets the view + announcements only — same input shape as a human player who can't see the FEN of the actual game).
Casual and Recon brains are both architecturally instances of Brain. Phase 2 just adds another Brain implementation against the same BotDriver. The driver's mutex / retry / dispatch / dispose lifecycle does NOT need changes.
Watermark advance only on successful dispatch (in BotDriver.runDecisionCycle). On retry, the brain still sees the FSM's rejection announcement in newAnnouncements. This matters for ReconBrain (Phase 2) which uses announcements as evidence; CasualBrain ignores them.
scripts/selfplay.ts is the canonical evaluation tool. Phase 2 will extend it to support --white recon --black casual etc. The harness sets game.aiOpponent = undefined; game.status = 'active' after createGame returns — that's how it transitions out of "waiting" without a hello.
The pre-commit hook is detect-secrets-hook --baseline .secrets.baseline in /home/claude/.config/git/hooks/pre-commit. If you add a new dep and pnpm-lock.yaml hashes get flagged, run detect-secrets scan > .secrets.baseline to refresh.
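The watermark rule for the brain (advance only on successful dispatch) can be sketched as follows. The names DriverState, brainWatermark, newAnnouncementsFor, and recordAttempt are hypothetical and announcements are plain strings here; the point is only that a rejected attempt appends its announcement without moving the watermark, so the next input still contains it.

```typescript
// Hypothetical driver-side state: the shared log plus the index of the
// first announcement the brain has not yet successfully acted on.
interface DriverState {
  announcements: string[];
  brainWatermark: number;
}

// What the brain would receive as newAnnouncements on its next decision.
function newAnnouncementsFor(state: DriverState): string[] {
  return state.announcements.slice(state.brainWatermark);
}

// Append the announcement produced by an attempt. Only an accepted attempt
// advances the watermark; a rejection stays visible on the retry.
function recordAttempt(state: DriverState, accepted: boolean, announcement: string): void {
  state.announcements.push(announcement);
  if (accepted) {
    state.brainWatermark = state.announcements.length;
  }
}
```

For a brain that treats announcements as evidence, this means a retry sees both the opponent's last move and the rejection of its own previous attempt, rather than an empty list.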

Files Modified / Added This Session

File	Change
(new) `packages/server/src/bot/{brain,candidates,casual-brain,driver,index}.ts`	The bot module (~600 LoC).
(new) `packages/server/src/game-end.ts`	Extracted from ws.ts.
(new) `packages/server/test/unit/bot/{candidates,casual-brain,driver}.test.ts`	27 unit tests.
(new) `packages/server/test/integration/ai-game-casual.test.ts`	5 integration tests.
(new) `scripts/selfplay.ts`	Operator CLI.
(new) `docs/superpowers/plans/2026-04-28-ai-player-phase-1-casual.md`	The plan.
`packages/server/src/state.ts`, `games.ts`, `validation.ts`, `server.ts`, `ws.ts`	Wired up.
`packages/shared/src/protocol.ts`	Added `vsAi`, `aiOpponent`, nullable `joinUrl`.
`packages/client/src/lib/Landing.svelte`, `Game.svelte`, `stores/game.svelte.ts`	UI.
`package.json`, `pnpm-lock.yaml`, `packages/server/package.json`	Added `js-chess-engine`, `tsx`.
`CLAUDE.md`, `DECISIONS.md`	Context updates.
`.secrets.baseline`	Refreshed.

Environment State

CT 690 / blind-chess.service: running. systemctl is-active returns active. Uptime measured from the deploy-restart at 2026-04-28 ~19:14 UTC.
Active processes: none session-relevant. The deploy was a normal restart of the systemd unit.
Environment variables: none added/changed.
Secrets: none added; .secrets.baseline was refreshed to a clean state (the old one had ~4500 lines of stale per-file entries).

Live URL: https://chess.sethpc.xyz — Phase 1 live.
Repo: https://git.sethpc.xyz/Seth/blind_chess — feat/ai-player-phase-1-casual branch (pending merge to main).
Spec: docs/superpowers/specs/2026-04-28-ai-player-design.md.
Plan: docs/superpowers/plans/2026-04-28-ai-player-phase-1-casual.md.
Decisions: DECISIONS.md "AI / computer player" section + new "Phase 1 implementation outcomes" subsection.
Project identity: CLAUDE.md.
Prior handoffs: 2026-04-28-170713-ai-player-spec.md, 2026-04-28-152000-mvp-deployed.md, 2026-04-28-104344-spec-approved-ready-for-plan.md, 2026-04-28-kickoff.md.

Security Reminder: This handoff describes Phase 1 deployment; no credentials, secrets, or sensitive endpoints are exposed in the handoff or the deployed code. The bot uses no external services in Phase 1 (Phase 2 will add Ollama endpoints).
