Commit Graph

5 Commits

Author SHA1 Message Date
Mortdecai 924f16b9da 22-tool architecture: log.query, user.ask, journal system deployed
New tools implemented and deployed to dev gateway:
- log.query: focused event queries (chat/deaths/joins/actions), replaces 200-line dump
- user.ask: risk-scaled clarifying questions, async with tellraw
- journal.read/write: per-player files, cross-mode (God+Sudo share)

All wired into langgraph_gateway.py _execute_tool and model-driven tool loop.
Tool schemas updated (22 total). Deployed to CT 644 dev server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 21:04:01 -04:00
Mortdecai 9c2c9a2310 1200+ distilled gold examples, journal system, redstone mastery, safety awareness
Distilled Training Data (1,203 examples):
- 341 initial gold (plugins, enchantments, builds, effects, god, errors)
- 165 buildings + pipeline (100 structures built on dev, 65 request→query→act)
- 24 safety-aware (worldborder, safe tp, intentional harm, gamemode checks)
- 17 advanced logic (decanonized items, redstone gates, iterative builds)
- 12 redstone mastery (NOT/OR/AND/XOR/RS-latch/T-flip-flop/comparator/clock)
- 7 circuit verification and diagnosis
- 1 compact comparator gates
- 10 redstone methodology (build→test→save→recall→learn from mistakes)
- 8 player journal usage
- 29 creative+uncommon+pipeline+god with full tool chains

Player Journal System:
- agent/tools/player_journal.py — per-player text files (1-10 lines)
- journal.read + journal.write tool schemas added
- Cross-contaminated: God and Sudo share same journal per player
- Includes sentiment, relationship, builds, preferences, skill level

Redstone Engineering:
- agent/prompts/redstone_rules.md — baked-in wall torch, dedicated lead, repeater rules
- Learned from 4 iterations of 8-switch circuit: wall_torch on back face, not top
- T-junction bypass prevention: dedicated lead wire between merge and NOT block
- RCON limitation: can build circuits but cannot test them (lever toggle doesn't propagate)

Training Data Cleaning:
- 466 @s→@p fixes, 10 template commands removed
- 12 outdated refusals replaced with correct plugin commands
- Data de-duped across all sources

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 20:50:52 -04:00
Mortdecai f5118505b1 0.5.0 bake-off results, knowledge lookup tools, training progress chart
Bake-off (0.5.0 vs 0.4.0):
- Overall: 46.8% vs 45.2% (+1.6%), 0 errors vs 2
- Enchantments: +47% (20% → 67%)
- EssentialsX: +60% (0% → 60%)
- Effects: +25% (0% → 25%)
- Regressions: fill_build -67%, world -20%

Knowledge Lookup Tools (4 new):
- plugin.docs_lookup: WorldGuard, WorldEdit, CoreProtect, EssentialsX, LuckPerms docs
- minecraft.changelog_lookup: version history from Minecraft Wiki
- paper.docs_lookup: Paper server-specific documentation
- Wired into gateway model-driven tool loop and exploration self-play

Exploration Self-Play:
- General (vanilla MC) and plugins focus modes
- Wiki-grounded: model researches before acting, validates through RCON
- 2,243 exploration examples generated, 150 kept after quality filtering

Training Progress Chart:
- SVG chart showing training examples and inverse loss across versions
- Added to MODEL_CARD.md for Gitea display

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:28:09 -04:00
Mortdecai da8f557219 GPU scheduler, 14-tool architecture, plugin deployment, event dispatcher
GPU Scheduler (gpu.sethpc.xyz):
- Live dashboard with 4 GPUs, training monitor, loss sparklines
- Preset-based job scheduler with 3 triggers (time, finish_training, cost)
- Model selection per GPU, pipeline configuration
- Tool self-play and training pipeline types
- Behind Google OAuth, live-refresh without page reload

Tool Architecture (14 tools):
- 3 new tools: world.nearby_entities, memory.read, memory.write
- 7 script.* tools: write, validate, execute, read, list, delete, schedule
- ScriptManager: full mcfunction datapack CRUD with RCON validation
- Training data: 1,430 tool examples (up from 1,159)

Plugin Deployment (paper-ai-25567):
- WorldGuard 7.0.12, CoreProtect CE 23.1, EssentialsX 2.21.2, Vault 1.7.3
- Fresh greenfield world reset
- 104 RCON-validated plugin training examples

Event Dispatcher:
- Watches server log for deaths, joins, advancements, PvP kills
- Configurable trigger probability and cooldowns per event type
- Deployed to dev server, fires god_system prompts on events
- 21 event-response training examples

Training Infrastructure:
- train_lora.py: --save-steps 50, --resume from checkpoint
- run_training.sh: stops Ollama, activates conda, restarts after
- Passwordless sudo for ollama services on steel141
- Dev server added to MCSManager with autoStart

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:14:45 -04:00
Seth ee764cd22a Tool-calling training: 1,159 multi-turn examples with error correction
Tool schemas (agent/tools/tool_schemas.py):
- rcon.execute: execute commands, get success/error results
- minecraft.wiki_lookup: look up syntax and item info
- world.player_info: player health, position, inventory
- world.server_state: time, weather, online players
- 10 RCON error patterns with corrections
- 12 common error scenarios for training

Training data generator (training/scripts/generate_tool_training.py):
- Converts seed dataset to multi-turn tool conversations
- Error correction: model tries wrong command → gets error → self-corrects
- Wiki/player/server lookups for uncertainty scenarios
- Qwen3 native tool-calling format with <tool_call> tags

1,159 examples: 1043 success, 79 error correction, 24 error scenarios,
13 tool lookups. Ready for v4 training.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 18:49:08 -04:00