0.6.0 training session: Oracle Bot, RL combat, Mind's Eye, multilingual pipeline
Major changes from this session: Training: - 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL - 7,256 merged training examples (up from 3,183) - New training data: failure modes (85), midloop messaging (27), prompt injection defense (29), personality (32), gold from quarantine bank (232), new tool examples (30), claude's own experience (10) - All training data RCON-validated at 100% pass rate - Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56% Oracle Bot (Mind's Eye): - Invisible spectator bot (mineflayer) streams world state via WebSocket - HTML5 Canvas frontend at mind.mortdec.ai - Real-time tool trace visualization with expandable entries - Streaming model tokens during inference - Gateway integration: fire-and-forget POST /trace on every tool call Reinforcement Learning: - Gymnasium environment wrapping mineflayer bot (minecraft_env.py) - PPO training via Stable Baselines3 (10K param policy network) - Behavioral cloning pretraining (97.5% accuracy on expert policy) - Infinite training loop with auto-restart and checkpoint resume - Bot learns combat, survival, navigation from raw experience Bot Army: - 8-soldier marching formation with autonomous combat - Combat bots using mineflayer-pvp, pathfinder, armor-manager - Multilingual prayer bots via translategemma:27b (18 languages) - Frame-based AI architecture: LLM planner + reactive micro-scripts Infrastructure: - Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser) - Billing gateway now tracks all LAN traffic (LAN auto-auth) - Gateway fallback for empty god-mode responses - Updated mortdec.ai landing page Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -129,8 +129,32 @@ def load_tool_dataset(path: str) -> list:
|
||||
return examples
|
||||
|
||||
|
||||
def load_merged_dataset(path: str) -> list:
|
||||
"""Load pre-merged training data (from merge_datasets.py).
|
||||
|
||||
Handles both formats:
|
||||
- 'conversations': list of role/content dicts (already formatted)
|
||||
- 'text': pre-formatted Qwen3 text
|
||||
"""
|
||||
examples = []
|
||||
with open(path) as f:
|
||||
for line in f:
|
||||
if not line.strip():
|
||||
continue
|
||||
ex = json.loads(line)
|
||||
if "conversations" in ex or "text" in ex:
|
||||
examples.append(ex)
|
||||
return examples
|
||||
|
||||
|
||||
def load_dataset(seed_path: str, tool_path: str = None) -> list:
|
||||
"""Load and merge all training datasets."""
|
||||
# If pointing at the merged v06 file, use direct loader
|
||||
if "merged_training" in seed_path:
|
||||
examples = load_merged_dataset(seed_path)
|
||||
print(f" Merged examples: {len(examples)}")
|
||||
return examples
|
||||
|
||||
examples = load_seed_dataset(seed_path)
|
||||
print(f" Seed examples: {len(examples)}")
|
||||
|
||||
|
||||
Reference in New Issue
Block a user