docs: log AI-player spec approval, update context, add handoff

Updates CLAUDE.md "Current State" + "Key files" to point at the new spec.
Adds DECISIONS.md "AI / computer player" section (11 settled decisions).
Strikes through the prior "Client-side AI / hint generation — out of scope"
row with a "partially superseded" note: the reversal applies only to the
human-vs-AI path. Adds 7 new Deferred/Rejected rows for AI-feature scope.

Handoff at .claude/handoffs/2026-04-28-170713-ai-player-spec.md captures
session state for the next pickup (writing-plans → Phase 1 implementation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
claude (blind_chess)
2026-04-28 13:12:04 -04:00
parent 288693fcd6
commit 729199097e
3 changed files with 217 additions and 3 deletions
@@ -0,0 +1,187 @@
# Handoff: AI/computer player spec written and approved
## Session Metadata
- Created: 2026-04-28 ~17:07 UTC
- Project: /home/claude/bin/blind_chess
- Branch: `main`
- Repo: `git.sethpc.xyz/Seth/blind_chess`
- Recent commits: `288693f docs(spec): add AI/computer player design spec` (this session) on top of `a878dee fix(client): wrap connect/disconnect in untrack() to break effect loop` (prior work).
- Live URL: **https://chess.sethpc.xyz** (MVP, unaffected by this session).
## Handoff Chain
- **Continues from**: [2026-04-28-152000-mvp-deployed.md](./2026-04-28-152000-mvp-deployed.md) — MVP deployed and live.
- **Supersedes**: None.
## Current State Summary
Seth invoked the workflow `handoff -> spec ai/computer player`, then closed the session with `approved write spec -> update context -> create handoff -> git commit -> close session`. In this session: ran the brainstorming skill end-to-end with him for the AI/computer player feature, presented six design sections section-by-section with approval gates, wrote the full spec to `docs/superpowers/specs/2026-04-28-ai-player-design.md`, self-reviewed it, committed and pushed to gitea, and updated `CLAUDE.md` + `DECISIONS.md` to reflect the new spec.
**Implementation has not started.** The next session can directly invoke `superpowers:writing-plans` against the new spec to produce a step-by-step implementation plan.
## Architecture Overview (the spec)
Two AI bots, phased delivery:
- **Phase 1 — Casual bot.** Algorithmic, in-process, ~200 LoC of TypeScript. Plays legal moves with simple heuristics (capture-bias, development bonus, anti-shuffling penalty). Always available; no external dependencies. Plays badly but quickly. Single-week scope.
- **Phase 2 — gemma4 recon bot.** Multi-turn chat agent backed by `gemma4:26b` running on the homelab Ollama service (steel141 RTX 3090 Ti primary, pve197 V100 fallback). Maintains a private per-game chat history that **persists across turns** as the bot's belief memory. Reasoning hidden from human during play, revealed in collapsible post-game panel. Multi-week scope; depends on prompt engineering iteration.
Both bots play through the **same view filter and finite-state machine that humans use**. The architectural invariant from `CLAUDE.md` ("the view filter is the only egress for board state") applies to bots — a bot consumes only `buildView(game, botColor)` plus moderator announcements. **No oracle access.** The Recon bot is honestly playing blind chess.
Modules to be added under `packages/server/src/bot/`:
- `brain.ts` — `Brain` interface, `BrainInput`, `BrainAction` types.
- `driver.ts` — `BotDriver` class (per-game orchestration, mutex, retry cap).
- `casual-brain.ts` — `CasualBrain` class.
- `recon-brain.ts` — `ReconBrain` class (Phase 2).
- `ollama-client.ts` — `OllamaClient` interface + production HTTP impl (Phase 2).
- `ollama-endpoints.ts` — endpoint priority list, preflight, mid-game failover (Phase 2).
- `prompt.ts` — system prompt template, per-turn user message builder (Phase 2).
- `parse.ts` — extract JSON from Gemma's response (Phase 2).
- `candidates.ts` — legal candidate computation (vanilla vs blind paths).
Tests under `packages/server/test/unit/bot/` and `packages/server/test/integration/`. Self-play harness at `scripts/selfplay.ts` (operator tool, NOT in CI).
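As a rough illustration of the `Brain` contract listed above, the sketch below assumes shapes for `BrainInput` and `BrainAction` that are not given in this handoff. Every field name, the `firstCandidate` helper, and the `FirstCandidateBrain` class are hypothetical; the spec's real definitions take precedence.

```typescript
// Hypothetical shapes: the spec's actual BrainInput/BrainAction may differ.
type Color = 'white' | 'black';

interface BrainInput {
  color: Color;
  view: unknown;            // stand-in for the filtered view from buildView(game, botColor)
  announcements: string[];  // moderator announcements visible to this color
  candidates: string[];     // legal candidate moves, precomputed by the server
}

type BrainAction =
  | { kind: 'move'; move: string }
  | { kind: 'resign' };

interface Brain {
  chooseAction(input: BrainInput): Promise<BrainAction>;
}

// Pure helper: take the first candidate, or resign when there is none.
function firstCandidate(candidates: readonly string[]): BrainAction {
  return candidates.length > 0
    ? { kind: 'move', move: candidates[0] }
    : { kind: 'resign' };
}

// Trivial brain, useful only for exercising a driver in tests.
class FirstCandidateBrain implements Brain {
  async chooseAction(input: BrainInput): Promise<BrainAction> {
    return firstCandidate(input.candidates);
  }
}
```

Note that the input carries only the filtered view, announcements, and candidates, so a brain written against this shape has no route to the unfiltered game state.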
Protocol additions:
- `CreateGameRequest.vsAi?: { brain: 'casual' | 'recon' }`
- `EndReason` adds `'ai_unavailable'`
- `joined` and `update` server messages add optional `aiInfo: { model, gpu, host }`
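The three protocol additions could be typed roughly as follows. This is a sketch under assumptions: the `string` types on `aiInfo`, the placeholder members of `ExistingEndReason`, and the `describeAi` helper are all hypothetical, since this handoff names only the new fields.

```typescript
type BrainKind = 'casual' | 'recon';

interface CreateGameRequest {
  // ...existing request fields omitted; they are not listed in this handoff...
  vsAi?: { brain: BrainKind };
}

// Field types are assumed to be strings; the handoff gives only the names.
interface AiInfo {
  model: string;
  gpu: string;
  host: string;
}

// Hypothetical placeholders for the reasons that already exist.
type ExistingEndReason = 'checkmate' | 'resignation';
type EndReason = ExistingEndReason | 'ai_unavailable';

// Hypothetical helper: formats the badge text shown under the AI's slot.
function describeAi(info: AiInfo): string {
  return `${info.model} · ${info.gpu}`;
}

// A request is human-vs-AI exactly when vsAi is present.
function isVsAi(req: CreateGameRequest): boolean {
  return req.vsAi !== undefined;
}
```

Keeping `vsAi` and `aiInfo` optional means existing human-vs-human clients and messages stay valid without changes.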
UI additions on the existing client: two-section landing layout, AI badge under opponent slot, "AI is thinking..." indicator (with first-move "starting up" variant), moderator-panel-area UI-system messages for game-start GPU info + failover, collapsible post-game reasoning reveal.
Acceptance bars:
- **Phase 1 done:** 100 Casual self-play games complete; Casual beats random-mover ≥80%.
- **Phase 2 done:** Recon wins ≥60% over 50 Recon-vs-Casual games; ≤8s/move on 3090 Ti (≤10s on V100); manual inspection of 10 reasoning logs shows Gemma using announcements as evidence.
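The win-rate bars reduce to simple integer arithmetic. The sketch below is one way a self-play harness might classify a Recon-vs-Casual run, using the 60% bar above and the 40% decision trigger recorded in `DECISIONS.md`. The function name and result labels are hypothetical, and treating exactly 40% as a design signal is an assumption, since the source leaves that boundary unspecified.

```typescript
type BarResult = 'pass' | 'iterate_prompt' | 'revisit_design';

// Integer comparison avoids floating-point edge cases at the thresholds.
function classifyReconRun(wins: number, games: number): BarResult {
  if (games <= 0) throw new Error('games must be positive');
  if (wins * 100 >= 60 * games) return 'pass';           // at least 60% wins
  if (wins * 100 > 40 * games) return 'iterate_prompt';  // above 40%, below 60%
  return 'revisit_design';                               // 40% or lower (assumed boundary)
}
```

Over the 50-game run named in the bar, 30 wins is the smallest passing count.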
## Critical Files (added or to be added)
| File | Status | Purpose |
|------|--------|---------|
| `docs/superpowers/specs/2026-04-28-ai-player-design.md` | ✅ Written and committed | Full design spec — read this first when implementing. |
| `CLAUDE.md` | ✅ Updated | "Current State" notes spec is approved; "Key files" links to the new spec. |
| `DECISIONS.md` | ✅ Updated | New "AI / computer player" section logs design decisions; the "Deferred / Rejected" section now marks the prior "Client-side AI / hint generation" rejection as partially superseded. |
| `packages/server/src/bot/` | ⏳ Not yet created | Where the new modules will live. |
| `packages/shared/src/protocol.ts` | ⏳ Not yet modified | Will add `vsAi`, `aiInfo`, `'ai_unavailable'` per the spec. |
| `scripts/selfplay.ts` | ⏳ Not yet created | Operator tool for running AI-vs-AI evaluation games. |
## Tasks Finished
- Read prior handoff (MVP deployed) and original design spec.
- Read `~/bin/gemma4-research/README.md`, `SYNTHESIS.md`, `CORPUS_ollama_variants.md` for Gemma 4 implementation guidance.
- Brainstormed the feature with Seth in 6 sections (architecture, components, data flow, error handling, testing, UX), each with approval gate.
- Pivoted on Seth's input: AI runs on steel141 3090 Ti (not pve197 V100), pve197 V100 as fallback; bot reasoning persistent across turns (multi-turn chat agent, not stateless oracle); no mid-game flap-back but one-way GPU failover allowed.
- Wrote full spec at `docs/superpowers/specs/2026-04-28-ai-player-design.md` (674 lines, 3 appendices).
- Self-reviewed the spec (no placeholders, retry/timeout/acceptance numbers consistent, scope clear, fixed the `peer-status` ambiguity).
- Committed and pushed: `288693f docs(spec): add AI/computer player design spec`.
- Updated `CLAUDE.md` (Current State, Key files) and `DECISIONS.md` (new "AI / computer player" section + amended Deferred/Rejected).
- Wrote this handoff.
## Files Modified / Added
| File | Changes |
|------|---------|
| (new) `docs/superpowers/specs/2026-04-28-ai-player-design.md` | 674-line design spec |
| `CLAUDE.md` | "Current State" updated; "Key files" links new spec; "Start Here" lists both specs |
| `DECISIONS.md` | New "AI / computer player (designed 2026-04-28, not yet implemented)" section with 11 entries; Deferred/Rejected amended to supersede prior "Client-side AI / hint generation" rejection (partial); 7 new deferred/rejected rows for AI-feature scope |
| (new) `.claude/handoffs/2026-04-28-170713-ai-player-spec.md` | This handoff |
| (new, ignored) `.backup/CLAUDE.md.<ts>`, `.backup/DECISIONS.md.<ts>` | Pre-edit backups per global safety rule |
## Decisions Made
All in `DECISIONS.md` "AI / computer player" section. Highlights:
- Two-phase delivery (Casual first, Recon second).
- In-process virtual players, not external WS clients. Bots use same view filter as humans.
- Recon is a stateful chat agent with persistent per-game memory; reasoning hidden during play, revealed post-game.
- Endpoint priority steel141 → pve197; mid-game one-way failover; preflight blocks game creation if both down.
- GPU surfaced to user via persistent badge + game-start UI message.
- `gemma4:26b` chosen (not 31B — 5× slower for marginal gain; not e4b — too small).
- Per-move latency caps: 30s normal, 90s first-move (covers cold-start).
- Recon "done" bar: ≥60% wins over 50 Recon-vs-Casual self-play games.
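The endpoint priority, one-way failover, and latency caps above can be sketched as a small state holder. The two URLs are the ones listed under Related Resources in this handoff; the names `preflight`, `GameEndpoint`, and `moveTimeoutMs` are hypothetical, and real health probing (an HTTP call to each host) is left out.

```typescript
interface Endpoint {
  name: string;
  url: string;
}

const ENDPOINTS: readonly Endpoint[] = [
  { name: 'steel141', url: 'http://192.168.0.141:11434' }, // 3090 Ti, primary
  { name: 'pve197', url: 'http://192.168.0.179:11434' },   // V100, fallback
];

// Latency caps from the decisions list: 90s for the first move, 30s afterwards.
function moveTimeoutMs(isFirstMove: boolean): number {
  return isFirstMove ? 90_000 : 30_000;
}

// Preflight: index of the first healthy endpoint in priority order,
// or null when every endpoint is down (game creation should be blocked).
function preflight(healthy: readonly boolean[]): number | null {
  const index = healthy.findIndex((ok) => ok);
  return index === -1 ? null : index;
}

// Tracks the endpoint for one game. The index only moves forward,
// so a game that failed over never flaps back to an earlier endpoint.
class GameEndpoint {
  private index: number;
  private readonly endpoints: readonly Endpoint[];

  constructor(endpoints: readonly Endpoint[], startIndex: number) {
    this.endpoints = endpoints;
    this.index = startIndex;
  }

  current(): Endpoint {
    return this.endpoints[this.index];
  }

  // Returns the next endpoint, or null when no fallback remains.
  failOver(): Endpoint | null {
    if (this.index + 1 >= this.endpoints.length) return null;
    this.index += 1;
    return this.endpoints[this.index];
  }
}
```

With two endpoints, a game that starts on steel141 can fail over once to pve197; a second failure returns `null`, which a caller could map to the `'ai_unavailable'` end reason.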
## Immediate Next Steps
1. **Run the writing-plans skill against the new spec.** The brainstorming skill's terminal state is invoking writing-plans; we skipped that to close the session, so the next session should pick it up. Command: `superpowers:writing-plans` against `docs/superpowers/specs/2026-04-28-ai-player-design.md`. The plan should split clearly into Phase 1 (Casual) and Phase 2 (Recon) work streams; Phase 1 is single-week scope, Phase 2 multi-week.
2. **Implement Phase 1** per the plan. Order from the spec's Appendix C: scaffold `packages/server/src/bot/`, write the Brain interface, implement `CasualBrain` + tests, implement `BotDriver` + tests with `StubBrain`, wire up `legalCandidates` computation, add protocol changes (`vsAi`, bot registry), wire `POST /api/games`, wire `ws.ts` observer, build the client landing page two-section layout + thinking indicator, write integration tests, write self-play harness for Casual-vs-Casual.
3. **Deploy Phase 1 to CT 690**, run live smoke checklist for Casual.
4. **Implement Phase 2** per the plan. Order: `OllamaClient` interface + HTTP impl, endpoint preflight + failover, prompt template, JSON parser, `ReconBrain` + tests (mocked Ollama), protocol additions for `aiInfo`, `POST /api/games` Recon path with preflight + warmup, driver retry/fallback wiring, client GPU badge + system messages + post-game reasoning reveal, integration tests, self-play harness for Recon-vs-Casual, prompt iteration until 60% bar met.
5. **Deploy Phase 2 to CT 690**, run live smoke checklist for Recon (warm, cold, failover, both-down).
## Blockers / Open Questions
- **Recon's actual playing strength is the central research-y unknown.** LLMs play vanilla chess poorly, but Gemma's task here is different — it's reasoning under uncertainty, picking from a pre-computed legal candidate list, not computing tactical depth. The 60% Recon-vs-Casual bar is a guess; we'll learn the real number from `scripts/selfplay.ts`. Spec's "Decision triggers" section (under Acceptance criteria) describes how to react if the bar is missed.
- **mort-3090-scheduler GPU contention.** The scheduler is supposed to yield to other GPU users, but verifying this under Recon load is unmeasured. Plan: monitor steel141 GPU utilization during early Recon games; if mort jobs interfere, add explicit coordination.
- **Cold-start UX on first Recon move.** A 30-60s wait is long. The "AI is starting up..." copy mitigates but doesn't eliminate. If users complain, escalation path is in the spec's Risks #2.
- **Chat history grows unboundedly.** 32K context covers ~128 turns; longer games would overflow. If seen in practice, add per-turn compaction (summarize older turns into running "what I've inferred" summary). Not MVP unless triggered.
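The "~128 turns" figure implies an average of about 256 tokens per turn (32768 / 128). A quick budget helper makes it easy to see how that estimate shifts. The function and its `reservedTokens` parameter (for example, room for a system prompt) are hypothetical, and the 2048-token reservation in the usage note is only an example value.

```typescript
// Average per-turn token cost implied by "32K context covers ~128 turns".
const impliedTokensPerTurn = 32768 / 128; // 256

// How many turns fit in the context window for a given per-turn cost.
// reservedTokens is an assumed allowance for fixed overhead such as a system prompt.
function turnsThatFit(numCtx: number, tokensPerTurn: number, reservedTokens = 0): number {
  if (tokensPerTurn <= 0) throw new Error('tokensPerTurn must be positive');
  return Math.floor((numCtx - reservedTokens) / tokensPerTurn);
}
```

For instance, `turnsThatFit(32768, 256, 2048)` gives 120, so any fixed prompt overhead shortens the game length the context can hold.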
## Deferred Items
See `DECISIONS.md` "Deferred / Rejected" — specifically the new AI-feature rows: difficulty slider, Stockfish for vanilla AI, live token streaming, GPU flap-back, public AI vs AI spectator games, context compaction, bot rating/personalities. None block Phase 1 or Phase 2.
## Important Context
- **The spec assumes `gemma4:26b` is on both steel141 and pve197.** Verified via `~/bin/CLAUDE.md` Ollama inventory at the time of writing. If either host's model inventory drifts, the preflight will fall through to the other host or fail.
- **steel141 `OLLAMA_KEEP_ALIVE=30m`** — first call after >30 min idle pays a 30-60s reload cost. Spec's first-move 90s timeout exists specifically to absorb this. Reference: `~/bin/CLAUDE.md` "Ollama models" section.
- **The `gemma4:26b` `think: false` gotcha.** Per `~/bin/gemma4-research/GOTCHAS.md`, setting `think: false` silently breaks 26B in multi-turn tool-calling loops. Spec explicitly says "do **not** set `think: false`" for this reason. Implementation must respect this.
- **The `format: "json"` gotcha.** Per `~/bin/gemma4-research/SYNTHESIS.md`, `format: "json"` causes infinite loops on nested schemas. Spec says use client-side regex JSON extraction instead. Implementation must respect this.
- **Bot has no `PlayerToken`, no WS connection, no grace timer.** This is new architectural ground. Spec's Architecture section "Key principle 5" makes this explicit, but it's a subtle point that an implementer might miss when wiring up `peer-status` for the bot's slot.
- **The reasoning is the ONLY persistent state for Recon.** No SQLite, no disk. Server restart drops Recon's chat history with the rest of the game state, consistent with current MVP behavior. If we add SQLite later (deferred), the chat history would be a natural thing to persist alongside game state.
- **Self-play harness needs an in-process bot adapter that bypasses the WS layer.** It's documented in spec section 5.5 but not deeply specified. The cleanest implementation is to instantiate `BotDriver` directly against a Game and let it use the in-process commit handler — same path the production code uses.
- **The DECISIONS.md row "Client-side AI / hint generation" was previously written as fully rejected.** This session partially reversed it (the entry is now strikethrough + a "partially superseded" note). The hint-generation-in-human-vs-human path remains rejected; only the human-vs-AI path was unblocked.
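For the client-side JSON extraction mentioned above, a minimal sketch might look like this. The reply shape `{ "move": "..." }` and the name `parseMoveReply` are assumptions; the spec's actual response schema is not reproduced in this handoff.

```typescript
// Hypothetical reply shape; the spec may require more fields.
interface MoveReply {
  move: string;
}

// Pulls the span from the first "{" to the last "}" out of free-form model text
// and validates it. Returns null instead of throwing so the caller can retry.
function parseMoveReply(text: string): MoveReply | null {
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) return null;
  try {
    const parsed: unknown = JSON.parse(match[0]);
    if (
      typeof parsed === 'object' &&
      parsed !== null &&
      typeof (parsed as { move?: unknown }).move === 'string'
    ) {
      return { move: (parsed as { move: string }).move };
    }
    return null;
  } catch {
    return null;
  }
}
```

The greedy match is deliberately simple: a reply containing two separate JSON objects fails to parse and yields `null`, which the driver can treat like any other unusable response.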
## Assumptions Made
- Seth's "approved write spec -> update context -> create handoff -> git commit -> close session" shorthand was a workflow chain (the next four steps after spec-approved). Did not invoke writing-plans (would have been the brainstorming skill's terminal state).
- Two CLAUDE.md paragraphs (Current State + Key files) needed updating; the rest of CLAUDE.md is unaffected. Did not touch project identity or operations sections.
- DECISIONS.md should organize the AI design entries as their own section ("AI / computer player") rather than mixing into "Architecture" / "Implementation" — those existing sections are about the deployed MVP, not future-but-approved work.
- The "Deferred / Rejected" row for "Client-side AI / hint generation" should be partially struck through, not deleted. Deleting it would lose the historical record of the change of mind.
- Backup-before-edit applies to source-controlled files too (per global rule). Created `.backup/CLAUDE.md.<ts>` and `.backup/DECISIONS.md.<ts>`. The `.backup/` directory is gitignored (verified in `.gitignore`).
## Potential Gotchas
- **`.backup/` IS gitignored** (verified at the top of `.gitignore` — first non-comment line is `.backup/`). Future sessions can keep using it freely.
- **`docs/superpowers/specs/`** has TWO specs now. Future readers of `CLAUDE.md` "Start Here" should read both. The MVP spec is the deployed reality; the AI spec is approved-but-not-built work.
- **The strikethrough Markdown** (`~~text~~`) in DECISIONS.md "Deferred / Rejected" for the partially-superseded row may render unexpectedly in some viewers. The intent is "this was rejected, now partially reversed" — if the rendering is confusing in practice, switch to plain text with an explicit "PARTIALLY SUPERSEDED" prefix.
- **Spec says retry cap is 5** for the driver (rejecting `wont_help`/`illegal_move` moves). If Recon repeatedly proposes illegal moves on a hard position, the driver will resign the bot at attempt 6. This is a safety belt, not the expected path — if it fires regularly during testing, the prompt template needs work, not the cap.
- **Spec acceptance bar says "≤8s/move on 3090 Ti, ≤10s on V100"** with cold-start excluded. "Cold-start excluded" means we measure post-warmup; the first move's latency is reported separately. If cold-start latency itself becomes a problem (sustained complaints from users), spec Risks #2 has the escalation path.
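The retry cap described above can be pictured as a bounded loop. This sketch reads "resign at attempt 6" as "resign once five proposals have been rejected", which is an interpretation rather than the spec's wording. It is synchronous for brevity, whereas a real brain call would be asynchronous, and `driveTurn` and its parameters are hypothetical names.

```typescript
const RETRY_CAP = 5;

type TurnOutcome =
  | { kind: 'moved'; move: string; attempts: number }
  | { kind: 'resigned'; attempts: number };

// Asks for a move up to RETRY_CAP times; resigns the bot if none is accepted.
function driveTurn(
  propose: (attempt: number) => string,
  isAccepted: (move: string) => boolean,
): TurnOutcome {
  for (let attempt = 1; attempt <= RETRY_CAP; attempt++) {
    const move = propose(attempt);
    if (isAccepted(move)) {
      return { kind: 'moved', move, attempts: attempt };
    }
  }
  // Safety belt: five rejected proposals in a row ends the game for the bot.
  return { kind: 'resigned', attempts: RETRY_CAP };
}
```

Recording `attempts` on every outcome gives testing a cheap signal for how often the cap is approached, which is the cue that the prompt template needs work.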
## Environment State
### Tools/Services Used
- `Write` / `Edit` / `Read` / `Bash` for the spec, context, handoff.
- `git` (commit + push) for the spec commit.
- No SSH, no Ollama calls, no client/server changes — purely documentation work this session.
### Active Processes
- `blind-chess.service` on CT 690 (192.168.0.245). **Unaffected by this session.** Live URL still serves the MVP at https://chess.sethpc.xyz.
### Environment Variables
- None changed this session.
## Related Resources
- Live URL: https://chess.sethpc.xyz (MVP, unaffected)
- Repo: https://git.sethpc.xyz/Seth/blind_chess
- New spec: `docs/superpowers/specs/2026-04-28-ai-player-design.md`
- MVP spec: `docs/superpowers/specs/2026-04-28-blind-chess-design.md`
- Decisions: `DECISIONS.md` (new "AI / computer player" section)
- Project identity: `CLAUDE.md` (updated)
- Original brief: `IDEA.md`
- Prior handoffs: `2026-04-28-152000-mvp-deployed.md`, `2026-04-28-104344-spec-approved-ready-for-plan.md`, `2026-04-28-kickoff.md`
- Gemma 4 implementation guidance:
  - `~/bin/gemma4-research/README.md` (index)
  - `~/bin/gemma4-research/SYNTHESIS.md` (must-read for the implementation)
  - `~/bin/gemma4-research/GOTCHAS.md` (`think: false` + `format: "json"` warnings)
  - `~/bin/gemma4-research/CORPUS_ollama_variants.md` (model selection, VRAM)
  - `~/bin/gemma4-research/docs/reference/gpu-bakeoff-2026-04-20.md` (3090 Ti vs V100 throughput)
  - `~/bin/gemma4-research/docs/reference/mort-bakeoff-2026-04-18.md` (`<think>` token serialization behavior)
- Ollama endpoints (per `~/bin/CLAUDE.md`):
  - steel141: `http://192.168.0.141:11434` (3090 Ti, primary)
  - pve197 CT 105: `http://192.168.0.179:11434` (V100, fallback)
---
**Security Reminder**: This handoff describes design only; no credentials, deploy targets, or live state changed.
+6 -2
@@ -6,7 +6,9 @@
 **Read the latest handoff first:** `.claude/handoffs/` (most recent file).
-Then check `DECISIONS.md` for settled choices, and `docs/superpowers/specs/2026-04-28-blind-chess-design.md` for the full design spec.
+Then check `DECISIONS.md` for settled choices, and the design specs:
+- `docs/superpowers/specs/2026-04-28-blind-chess-design.md` — original MVP spec (data model, protocol, FSM, testing).
+- `docs/superpowers/specs/2026-04-28-ai-player-design.md` — AI/computer player spec (Casual + gemma4 recon bots, two-phase plan).
 ## Project Identity
@@ -16,18 +18,20 @@ The system's most distinctive property: highlighting in blind mode reveals **zer
 ## Current State
-- **Phase:** MVP **deployed and live** at https://chess.sethpc.xyz (2026-04-28).
+- **Phase:** MVP **deployed and live** at https://chess.sethpc.xyz (2026-04-28). **AI/computer player feature spec written and approved** (2026-04-28); implementation pending.
 - **Repo:** `git.sethpc.xyz/Seth/blind_chess`.
 - **Stack:** Node 22 + TypeScript, Fastify + `ws`, Svelte 5 + Vite, `chess.js`. pnpm workspace with `packages/{server,client,shared}`.
 - **Deploy:** LXC **CT 690 on node-241** at 192.168.0.245, behind Caddy CT 600. Systemd unit `blind-chess.service`, port 3000. In-memory state only.
 - **Tests:** 43 passing — 21 in shared (geometric helper), 22 in server (FSM + view + 4 real-WS integration).
 - **Known gaps (deferred):** drag-and-drop input (click-to-move only), full integration coverage of every endgame path, mobile-specific polish, observability beyond `/api/health`.
+- **AI player (designed, not built):** Two-phase plan in `docs/superpowers/specs/2026-04-28-ai-player-design.md`. Phase 1 = Casual bot (algorithmic, ~200 LoC). Phase 2 = gemma4 recon bot (`gemma4:26b` chat agent on steel141 RTX 3090 Ti primary, pve197 V100 fallback). Bots play through the same view filter and FSM as humans — no oracle access.
 ## Key files
 - `IDEA.md` — original project brief (Seth's words)
 - `DECISIONS.md` — locked architectural and gameplay decisions
 - `docs/superpowers/specs/2026-04-28-blind-chess-design.md` — full design spec (data model, protocol, FSM, testing)
+- `docs/superpowers/specs/2026-04-28-ai-player-design.md` — AI/computer player spec (Casual + gemma4 recon, two-phase plan, endpoint priority list, acceptance bars)
 - `packages/shared/src/geometric.ts` — the zero-leak helper. The signature is the proof.
 - `packages/server/src/view.ts` — `buildView`, the security boundary.
 - `packages/server/src/commit.ts` — touch-move FSM (the spec's hierarchy decision table).
+24 -1
@@ -50,6 +50,22 @@ Format: `YYYY-MM-DD: <decision> — <why>`
 - 2026-04-28: **WS path through Caddy** — `wss://chess.sethpc.xyz/ws?game=<id>` works without explicit `transport ws` config. Caddy's reverse_proxy handles upgrade transparently.
 - 2026-04-28: **Public DNS** — relies on existing `*.sethpc.xyz` wildcard pointing at the WAN IP; no Pi-hole entry was needed. Caddy host-routes `chess.sethpc.xyz` to 192.168.0.245:3000.
+## AI / computer player (designed 2026-04-28, not yet implemented)
+Spec: `docs/superpowers/specs/2026-04-28-ai-player-design.md`. All decisions below are settled at spec-approval time; revisit if implementation surfaces something the spec didn't anticipate.
+- 2026-04-28: **Two AI bots, phased delivery** — `CasualBrain` (Phase 1, algorithmic, in-process) ships first; `ReconBrain` (Phase 2, `gemma4:26b` chat agent) ships second. Phased to keep research uncertainty (Recon's actual playing strength) from blocking shipping anything. Rejected: combined launch, single difficulty-dial UX, throwaway Casual-as-stub.
+- 2026-04-28: **Bots use the same view filter as humans** — `BotDriver` calls `buildView(game, botColor)`; bot input is filtered `BoardView` + `Announcement[]`. No oracle access. Preserves the architectural invariant: the view filter is the only egress for board state, even for in-process bots. Rejected: "easy mode" oracle access for Casual to keep it simple.
+- 2026-04-28: **In-process virtual players, not external WS clients** — `BotDriver` lives in the existing Fastify server, dispatches actions through the same `commit` handler humans use. One process, no new deploy targets. Rejected: external bot processes (more operational surface, no real benefit), hybrid Casual-in-process / Recon-external (asymmetric for no gain).
+- 2026-04-28: **Recon bot is a stateful chat agent, not stateless** — per-game chat history persists across turns as the bot's private memory. Each turn appends user (new view + announcements + candidates) + assistant (reasoning + move). Reasoning is hidden from the human during play, revealed in collapsible post-game panel. Rejected: stateless one-shot move-picker (loses belief-tracking across turns), revealing reasoning during play (would leak strategic intent).
+- 2026-04-28: **Endpoint priority: steel141 RTX 3090 Ti primary, pve197 V100 fallback** — preflight on game creation; mid-game failover allowed once (one-way). Rationale: 3090 Ti benchmarks at ~134 tok/s on `gemma4:26b`; V100 estimated ~80 tok/s. Both have the model present. Rejected: no failover (worse UX), bidirectional flap (more complexity, no real benefit).
+- 2026-04-28: **GPU shown to user** — persistent badge under AI's slot reads `"gemma4:26b · RTX 3090 Ti"` (or V100 / failed-over variant). Game-start moderator-panel UI message explicitly names the model + host. Rationale: chess.sethpc.xyz is a personal homelab site; surfacing the hardware is brand-appropriate and gives honest feedback when fallback engages. Rejected: hiding the GPU (would be opaque on slow V100 fallback).
+- 2026-04-28: **`gemma4:26b` model choice** — sweet spot per gemma4-research: ~134 tok/s decode on 3090 Ti (4.7× faster than 31B), MoE 3.8B active, vision-capable (not used here). Rejected: 31B (5× slower, marginal strength gain not worth latency), e4b (too small for this task).
+- 2026-04-28: **Per-move latency budget: 30s normal, 90s first-move** — first-move headroom covers cold-start (steel141 keep_alive=30m policy, ~30-60s reload after idle). Beyond 90s, treat as endpoint failure → failover. Rejected: tighter cap (false-positives on cold start), looser cap (UX death).
+- 2026-04-28: **Recon "done" bar: ≥60% wins over 50 Recon-vs-Casual self-play games** — concrete, measurable acceptance bound. If Recon misses 60% but holds >40%, prompt-engineering rabbit hole; if <40%, design signal (try 31B or feed textual board representation). Self-play harness lives in `scripts/selfplay.ts`, not in CI. Rejected: subjective "feels okay" bar (would let weak Recon ship), bar against humans (untestable at scale).
+- 2026-04-28: **Reasoning hidden during play, revealed post-game** — Gemma's chat history is private during the game; on game end, the chat history is copied to `Game.aiThoughtsLog` and the post-game screen shows a collapsible "View gemma4's reasoning" section. Rejected: live streaming "thinking tokens" to user (leaks strategy), permanent hiding (loses showcase value of the project).
+- 2026-04-28: **`vsAi` field added to `CreateGameRequest`; `aiInfo` field added to `joined`/`update` server messages; `'ai_unavailable'` added to `EndReason`** — minimal protocol surface for the feature. AI metadata is NOT in `ModeratorText` enum (kept clean). UI-system messages for game-start info and failover events are style-distinct from `Announcement` entries.
 ## Deferred / Rejected
 <!-- Decisions NOT to do something are just as valuable -- prevents re-proposing rejected ideas -->
@@ -65,4 +81,11 @@ Format: `YYYY-MM-DD: <decision> — <why>`
 - 2026-04-28: **Move log / PGN export, post-game replay** — deferred. Announcements are persisted in-game (so the moderator-panel scrollback works); export and replay are post-MVP.
 - 2026-04-28: **Public lobby / matchmaking / ratings** — out of scope. This is a private-link game, not a chess site.
 - 2026-04-28: **Pre-deploy "server restarting" warning to active players** — stretch goal, not MVP. Mitigation for now: deploy during low-usage windows.
-- 2026-04-28: **Client-side AI / hint generation** — explicitly out of scope. Human vs. human only.
+- 2026-04-28: ~~**Client-side AI / hint generation** — explicitly out of scope. Human vs. human only.~~ **Partially superseded 2026-04-28** by AI-player spec. Reversal applies *only* to the human-vs-AI path; client-side AI / hint generation in human-vs-human games remains rejected.
+- 2026-04-28: **Difficulty slider for AI** — rejected. Two named buttons (Casual, Recon) only. No continuum; the two bots are architecturally different, not tuneable strengths of the same engine.
+- 2026-04-28: **Stockfish for vanilla-mode AI strength** — deferred. Vanilla is a side-effect, not a feature target. Revisit if users explicitly ask for strong vanilla AI.
+- 2026-04-28: **Live token streaming during Gemma's thinking** — rejected for MVP. Static "AI is thinking..." indicator only. Streaming would leak strategic intent and adds protocol complexity.
+- 2026-04-28: **Mid-game GPU flap-back** — rejected. Once failed over to V100, stays there for the rest of the game even if steel141 recovers. Simpler, more predictable, and chat-history is mid-flight.
+- 2026-04-28: **AI vs AI public spectate-able games** — rejected for MVP. Self-play harness is CLI-only (`scripts/selfplay.ts`).
+- 2026-04-28: **Per-turn context compaction** — deferred. Spec uses `num_ctx: 32768` which covers ~128 turns; longer games would overflow but are rare in casual play. Add running-summary compaction if seen in practice.
+- 2026-04-28: **Bot rating / Elo / personalities** — out of scope. Two named buttons, no scoreboard.