0.6.0 training session: Oracle Bot, RL combat, Mind's Eye, multilingual pipeline

Major changes from this session: Training: - 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL - 7,256 merged training examples (up from 3,183) - New training data: failure modes (85), midloop messaging (27), prompt injection defense (29), personality (32), gold from quarantine bank (232), new tool examples (30), claude's own experience (10) - All training data RCON-validated at 100% pass rate - Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56% Oracle Bot (Mind's Eye): - Invisible spectator bot (mineflayer) streams world state via WebSocket - HTML5 Canvas frontend at mind.mortdec.ai - Real-time tool trace visualization with expandable entries - Streaming model tokens during inference - Gateway integration: fire-and-forget POST /trace on every tool call Reinforcement Learning: - Gymnasium environment wrapping mineflayer bot (minecraft_env.py) - PPO training via Stable Baselines3 (10K param policy network) - Behavioral cloning pretraining (97.5% accuracy on expert policy) - Infinite training loop with auto-restart and checkpoint resume - Bot learns combat, survival, navigation from raw experience Bot Army: - 8-soldier marching formation with autonomous combat - Combat bots using mineflayer-pvp, pathfinder, armor-manager - Multilingual prayer bots via translategemma:27b (18 languages) - Frame-based AI architecture: LLM planner + reactive micro-scripts Infrastructure: - Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser) - Billing gateway now tracks all LAN traffic (LAN auto-auth) - Gateway fallback for empty god-mode responses - Updated mortdec.ai landing page Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 20:22:50 -04:00
parent baab24f8b1
commit 5b28002001
44 changed files with 20873 additions and 4352 deletions
@@ -129,8 +129,32 @@ def load_tool_dataset(path: str) -> list:
    return examples


+def load_merged_dataset(path: str) -> list:
+    """Load pre-merged training data (from merge_datasets.py).
+
+    Handles both formats:
+    - 'conversations': list of role/content dicts (already formatted)
+    - 'text': pre-formatted Qwen3 text
+    """
+    examples = []
+    with open(path) as f:
+        for line in f:
+            if not line.strip():
+                continue
+            ex = json.loads(line)
+            if "conversations" in ex or "text" in ex:
+                examples.append(ex)
+    return examples
+
+
 def load_dataset(seed_path: str, tool_path: str = None) -> list:
    """Load and merge all training datasets."""
+    # If pointing at the merged v06 file, use direct loader
+    if "merged_training" in seed_path:
+        examples = load_merged_dataset(seed_path)
+        print(f"  Merged examples: {len(examples)}")
+        return examples
+
    examples = load_seed_dataset(seed_path)
    print(f"  Seed examples:  {len(examples)}")