0.6.0 training session: Oracle Bot, RL combat, Mind's Eye, multilingual pipeline

Major changes from this session:

Training:
- 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL
- 7,256 merged training examples (up from 3,183)
- New training data: failure modes (85), midloop messaging (27),
  prompt injection defense (29), personality (32), gold from quarantine
  bank (232), new tool examples (30), Claude's own experience (10)
- All training data RCON-validated at 100% pass rate
- Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56%

Oracle Bot (Mind's Eye):
- Invisible spectator bot (mineflayer) streams world state via WebSocket
- HTML5 Canvas frontend at mind.mortdec.ai
- Real-time tool trace visualization with expandable entries
- Streaming model tokens during inference
- Gateway integration: fire-and-forget POST /trace on every tool call
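
A fire-and-forget trace post like the one in the last bullet could look roughly like this. It is a minimal sketch, not the gateway's actual code: the commit does not say what language the gateway is written in, and `build_trace_request`, `post_trace`, and the event fields are hypothetical names chosen for illustration. Only the `POST /trace` route comes from the commit message.

```python
import json
import threading
import urllib.request


def build_trace_request(base_url: str, event: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /trace route (helper name is hypothetical)."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/trace",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_trace(base_url: str, event: dict, timeout: float = 1.0) -> threading.Thread:
    """Send the trace on a daemon thread so the tool call never waits on it."""
    request = build_trace_request(base_url, event)

    def _send() -> None:
        try:
            with urllib.request.urlopen(request, timeout=timeout):
                pass  # the response body is ignored on purpose
        except Exception:
            pass  # fire-and-forget: a dead visualizer must not break the gateway

    thread = threading.Thread(target=_send, daemon=True)
    thread.start()
    return thread
```

The caller does not join the returned thread in normal operation; it is returned only so that tests can wait for it.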

Reinforcement Learning:
- Gymnasium environment wrapping mineflayer bot (minecraft_env.py)
- PPO training via Stable Baselines3 (10K param policy network)
- Behavioral cloning pretraining (97.5% accuracy on expert policy)
- Infinite training loop with auto-restart and checkpoint resume
- Bot learns combat, survival, navigation from raw experience
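
The auto-restart and checkpoint-resume behavior can be sketched with the standard library alone. This is an illustration under stated assumptions, not the commit's training script: `train_chunk` stands in for whatever performs the PPO updates, the JSON checkpoint holds only a step counter, and the loop is bounded by `total_steps` and `max_restarts` so it can terminate, whereas the commit describes an infinite loop. All names here are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_step(checkpoint: Path) -> int:
    """Return the last saved step, or 0 when no checkpoint exists yet."""
    if checkpoint.exists():
        return json.loads(checkpoint.read_text())["step"]
    return 0


def save_step(checkpoint: Path, step: int) -> None:
    """Write to a temp file first, then rename, so a crash never leaves a half-written file."""
    tmp = checkpoint.with_suffix(".tmp")
    tmp.write_text(json.dumps({"step": step}))
    tmp.replace(checkpoint)


def run_with_restart(train_chunk, checkpoint: Path, total_steps: int,
                     chunk: int = 100, max_restarts: int = 5) -> int:
    """Train in chunks; after a crash, resume from the last checkpoint. Returns the restart count."""
    restarts = 0
    while True:
        step = load_step(checkpoint)  # resume point after every (re)start
        try:
            while step < total_steps:
                train_chunk(step, chunk)  # placeholder for the real training call
                step += chunk
                save_step(checkpoint, step)
            return restarts
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise
```

Work done in a chunk that crashes is lost and repeated after the restart, so the chunk size trades checkpoint overhead against repeated work.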

Bot Army:
- 8-soldier marching formation with autonomous combat
- Combat bots using mineflayer-pvp, pathfinder, armor-manager
- Multilingual prayer bots via translategemma:27b (18 languages)
- Frame-based AI architecture: LLM planner + reactive micro-scripts

Infrastructure:
- Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser)
- Billing gateway now tracks all LAN traffic (LAN auto-auth)
- Gateway fallback for empty god-mode responses
- Updated mortdec.ai landing page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Seth
2026-03-22 20:22:50 -04:00
parent baab24f8b1
commit 5b28002001
44 changed files with 20873 additions and 4352 deletions
+32 -6
@@ -82,7 +82,7 @@ SOURCES = {
     },
     "audit": {
         "path": "data/processed/filtered_audit.jsonl",
-        "format": "audit",
+        "format": "tool_messages", # Now in messages[] format after filter_audit.py
         "default_ratio": 0.5, # Large set, needs dilution
         "description": "Filtered audit log data",
     },
@@ -137,6 +137,14 @@ RAW_TRAINING_FILES = [
     "data/raw/script_tool_training.jsonl",
     "data/raw/suffocation_training.jsonl",
     "data/raw/worldedit_training.jsonl",
+    # New 0.6.0 training data
+    "data/raw/failure_mode_training.jsonl",
+    "data/raw/midloop_messaging_training.jsonl",
+    "data/raw/prompt_injection_defense_training.jsonl",
+    "data/raw/personality_training.jsonl",
+    "data/raw/gold_from_bank_training.jsonl",
+    "data/raw/new_tool_training.jsonl",
+    "data/raw/distilled_multitool.jsonl",
 ]
 # ── Format converters ─────────────────────────────────────────────────────────
@@ -255,7 +263,9 @@ def _tool_messages_passthrough(record: dict) -> dict:
 def _raw_training_to_conversations(record: dict) -> dict:
-    """Convert raw training files (same as seed format)."""
+    """Convert raw training files — handles both old dict and messages[] format."""
+    if "messages" in record and isinstance(record.get("messages"), list):
+        return _tool_messages_passthrough(record)
     return _seed_to_conversations(record)
@@ -345,6 +355,8 @@ def main():
                         help="Include chat app training exports")
     parser.add_argument("--include-raw", action="store_true", default=True,
                         help="Include raw training files (default: true)")
+    parser.add_argument("--eval-split", type=float, default=0.05,
+                        help="Fraction of data for eval set (default: 0.05 = 5%%)")
     parser.add_argument("--seed", type=int, default=42,
                         help="Random seed for reproducibility")
     args = parser.parse_args()
@@ -431,18 +443,32 @@ def main():
     # Shuffle
     random.shuffle(deduped)
+    # Train/eval split
+    eval_size = max(1, int(len(deduped) * args.eval_split))
+    train_set = deduped[eval_size:]
+    eval_set = deduped[:eval_size]
+    print(f" Train/eval split: {len(train_set)} train, {len(eval_set)} eval ({args.eval_split*100:.0f}%)")
     if args.dry_run:
         print("\n [DRY RUN] No output written.")
         return
-    # Write
+    # Write train set
     args.output.parent.mkdir(parents=True, exist_ok=True)
     with open(args.output, "w") as f:
-        for r in deduped:
+        for r in train_set:
             f.write(json.dumps(r, ensure_ascii=False) + "\n")
-    print(f"\n Wrote {len(deduped)} examples to {args.output}")
-    print(f" File size: {args.output.stat().st_size / 1e6:.1f} MB")
+    # Write eval set
+    eval_path = args.output.with_name(args.output.stem + "_eval.jsonl")
+    with open(eval_path, "w") as f:
+        for r in eval_set:
+            f.write(json.dumps(r, ensure_ascii=False) + "\n")
+    print(f"\n Wrote {len(train_set)} train examples to {args.output}")
+    print(f" Wrote {len(eval_set)} eval examples to {eval_path}")
+    print(f" Train size: {args.output.stat().st_size / 1e6:.1f} MB")
+    print(f" Eval size: {eval_path.stat().st_size / 1e6:.1f} MB")
 if __name__ == "__main__":
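
The train/eval split added in the last hunk can be tried in isolation. The sketch below mirrors the diff's arithmetic and its eval file naming, with two differences that are assumptions of this example: it shuffles a copy with a local `random.Random(seed)` instead of the module-level `random.shuffle`, and it wraps the logic in helpers named `split_train_eval` and `eval_path_for`, which do not exist in the commit.

```python
import random
from pathlib import Path


def split_train_eval(records: list, eval_fraction: float, seed: int = 42) -> tuple[list, list]:
    """Shuffle a copy of records, then carve the eval set off the front."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Same arithmetic as the diff: at least one eval example, even for tiny fractions.
    eval_size = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[eval_size:], shuffled[:eval_size]


def eval_path_for(output: Path) -> Path:
    """Derive the eval file name the way the diff does: <stem>_eval.jsonl next to the output."""
    return output.with_name(output.stem + "_eval.jsonl")


train, evals = split_train_eval(list(range(100)), 0.05)
print(len(train), len(evals))  # 95 5
print(eval_path_for(Path("data/merged.jsonl")))
```

Because of the `max(1, ...)` floor, an eval fraction of 0 still holds out one example, so the eval file is never empty when there is data to split.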