Mortdecai

Files

Seth 9d789d2524 Three-tier constraint model, mode-aware eval, boundary examples, playtest tooling

Eval harness:
- Mode-aware scoring: sudo=strict (exact match), pray/god=soft (category match,
  in-character, appropriate intensity)
- New metrics: cmd_category_match, appropriate_intensity, scoring_mode breakdown
- Eval defaults to steel141 (192.168.0.141); the prod GPU is reserved for serving
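
The mode-aware scoring described above could look roughly like the sketch below. It is not the harness's actual code: the `Prediction` and `Expected` types, the `max_intensity` field, and the rule that a soft pass requires all three checks are assumptions made for illustration. Only the mode names and the metric names `cmd_category_match`, `appropriate_intensity`, and `scoring_mode` come from the commit message.

```python
from dataclasses import dataclass

# Modes named in the commit message; the grouping into sets is an assumption.
STRICT_MODES = {"sudo"}
SOFT_MODES = {"pray", "god"}


@dataclass
class Prediction:
    """Hypothetical shape of one model output under evaluation."""
    mode: str
    command: str
    category: str
    in_character: bool
    intensity: int


@dataclass
class Expected:
    """Hypothetical shape of the reference for one example."""
    command: str
    category: str
    max_intensity: int  # assumed upper bound for an "appropriate" response


def score(pred: Prediction, exp: Expected) -> dict:
    """Score strictly for sudo, softly for pray/god."""
    if pred.mode in STRICT_MODES:
        # Strict: the command must match exactly (ignoring outer whitespace).
        return {
            "scoring_mode": "strict",
            "passed": pred.command.strip() == exp.command.strip(),
        }
    if pred.mode in SOFT_MODES:
        # Soft: same command category, in character, intensity within bounds.
        cmd_category_match = pred.category == exp.category
        appropriate_intensity = pred.intensity <= exp.max_intensity
        return {
            "scoring_mode": "soft",
            "cmd_category_match": cmd_category_match,
            "appropriate_intensity": appropriate_intensity,
            "passed": cmd_category_match
            and pred.in_character
            and appropriate_intensity,
        }
    raise ValueError(f"unknown mode: {pred.mode!r}")
```

Under these assumptions, a pray response that issues a different command from the reference can still pass, provided it stays in the same category, in character, and within the intensity bound.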

Dataset (213 examples):
- Added 31 boundary/adversarial examples (safety edges, abstention, near-boundary)
- Updated pray example reasoning: character-driven logic, not prescriptive outputs
- Tagged pray examples with scoring_mode=soft
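
As a rough illustration of the tagging step, the sketch below marks pray examples with `scoring_mode=soft`. The `mode` field name and the list-of-dicts layout are assumptions; the real layout is defined by schema.json, which is not shown here.

```python
def tag_scoring_mode(examples: list[dict]) -> list[dict]:
    """Return copies of the examples with pray examples tagged as soft.

    Assumes each example is a dict with a hypothetical "mode" key.
    An existing scoring_mode value is left untouched.
    """
    tagged = []
    for example in examples:
        example = dict(example)  # copy so the input list is not mutated
        if example.get("mode") == "pray":
            example.setdefault("scoring_mode", "soft")
        tagged.append(example)
    return tagged
```

Examples in other modes pass through unchanged, matching the commit's statement that only pray examples were tagged.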

Playtest tooling:
- whitelist.sh: add/remove/list across all 3 servers
- FRIENDS_INVITE.md + Discord version: playtester recruitment docs
- Server addresses and implementation details for both training servers

PLAN.md:
- Three-tier constraint model documented (sudo/pray/god_system)
- Success criteria split by scoring mode
- All session decisions logged

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-18 15:57:01 -04:00

| Name | Last commit | Date |
| --- | --- | --- |
| processed | Three-tier constraint model, mode-aware eval, boundary examples, playtest tooling | 2026-03-18 15:57:01 -04:00 |
| raw | Phase 2: eval harness, 182 examples, live bake-off, playtest infrastructure | 2026-03-18 13:38:12 -04:00 |
| schema.json | Initial project scaffold: dataset schema, 31 seed training examples, Mineflayer bot framework, and 7-phase roadmap | 2026-03-18 01:51:28 -04:00 |
| validate_dataset.py | Initial project scaffold: dataset schema, 31 seed training examples, Mineflayer bot framework, and 7-phase roadmap | 2026-03-18 01:51:28 -04:00 |