0.6.0 training session: Oracle Bot, RL combat, Mind's Eye, multilingual pipeline
Major changes from this session: Training: - 0.6.0 training running: 9B on steel141 3090 Ti, 27B on rented H100 NVL - 7,256 merged training examples (up from 3,183) - New training data: failure modes (85), midloop messaging (27), prompt injection defense (29), personality (32), gold from quarantine bank (232), new tool examples (30), claude's own experience (10) - All training data RCON-validated at 100% pass rate - Bake-off: gemma3:27b 66%, qwen3.5:27b 61%, translategemma:27b 56% Oracle Bot (Mind's Eye): - Invisible spectator bot (mineflayer) streams world state via WebSocket - HTML5 Canvas frontend at mind.mortdec.ai - Real-time tool trace visualization with expandable entries - Streaming model tokens during inference - Gateway integration: fire-and-forget POST /trace on every tool call Reinforcement Learning: - Gymnasium environment wrapping mineflayer bot (minecraft_env.py) - PPO training via Stable Baselines3 (10K param policy network) - Behavioral cloning pretraining (97.5% accuracy on expert policy) - Infinite training loop with auto-restart and checkpoint resume - Bot learns combat, survival, navigation from raw experience Bot Army: - 8-soldier marching formation with autonomous combat - Combat bots using mineflayer-pvp, pathfinder, armor-manager - Multilingual prayer bots via translategemma:27b (18 languages) - Frame-based AI architecture: LLM planner + reactive micro-scripts Infrastructure: - Fixed mattpc.sethpc.xyz billing gateway (API key + player list parser) - Billing gateway now tracks all LAN traffic (LAN auto-auth) - Gateway fallback for empty god-mode responses - Updated mortdec.ai landing page Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -82,7 +82,7 @@ SOURCES = {
|
||||
},
|
||||
"audit": {
|
||||
"path": "data/processed/filtered_audit.jsonl",
|
||||
"format": "audit",
|
||||
"format": "tool_messages", # Now in messages[] format after filter_audit.py
|
||||
"default_ratio": 0.5, # Large set, needs dilution
|
||||
"description": "Filtered audit log data",
|
||||
},
|
||||
@@ -137,6 +137,14 @@ RAW_TRAINING_FILES = [
|
||||
"data/raw/script_tool_training.jsonl",
|
||||
"data/raw/suffocation_training.jsonl",
|
||||
"data/raw/worldedit_training.jsonl",
|
||||
# New 0.6.0 training data
|
||||
"data/raw/failure_mode_training.jsonl",
|
||||
"data/raw/midloop_messaging_training.jsonl",
|
||||
"data/raw/prompt_injection_defense_training.jsonl",
|
||||
"data/raw/personality_training.jsonl",
|
||||
"data/raw/gold_from_bank_training.jsonl",
|
||||
"data/raw/new_tool_training.jsonl",
|
||||
"data/raw/distilled_multitool.jsonl",
|
||||
]
|
||||
|
||||
# ── Format converters ─────────────────────────────────────────────────────────
|
||||
@@ -255,7 +263,9 @@ def _tool_messages_passthrough(record: dict) -> dict:
|
||||
|
||||
|
||||
def _raw_training_to_conversations(record: dict) -> dict:
|
||||
"""Convert raw training files (same as seed format)."""
|
||||
"""Convert raw training files — handles both old dict and messages[] format."""
|
||||
if "messages" in record and isinstance(record.get("messages"), list):
|
||||
return _tool_messages_passthrough(record)
|
||||
return _seed_to_conversations(record)
|
||||
|
||||
|
||||
@@ -345,6 +355,8 @@ def main():
|
||||
help="Include chat app training exports")
|
||||
parser.add_argument("--include-raw", action="store_true", default=True,
|
||||
help="Include raw training files (default: true)")
|
||||
parser.add_argument("--eval-split", type=float, default=0.05,
|
||||
help="Fraction of data for eval set (default: 0.05 = 5%%)")
|
||||
parser.add_argument("--seed", type=int, default=42,
|
||||
help="Random seed for reproducibility")
|
||||
args = parser.parse_args()
|
||||
@@ -431,18 +443,32 @@ def main():
|
||||
# Shuffle
|
||||
random.shuffle(deduped)
|
||||
|
||||
# Train/eval split
|
||||
eval_size = max(1, int(len(deduped) * args.eval_split))
|
||||
train_set = deduped[eval_size:]
|
||||
eval_set = deduped[:eval_size]
|
||||
print(f" Train/eval split: {len(train_set)} train, {len(eval_set)} eval ({args.eval_split*100:.0f}%)")
|
||||
|
||||
if args.dry_run:
|
||||
print("\n [DRY RUN] No output written.")
|
||||
return
|
||||
|
||||
# Write
|
||||
# Write train set
|
||||
args.output.parent.mkdir(parents=True, exist_ok=True)
|
||||
with open(args.output, "w") as f:
|
||||
for r in deduped:
|
||||
for r in train_set:
|
||||
f.write(json.dumps(r, ensure_ascii=False) + "\n")
|
||||
|
||||
print(f"\n Wrote {len(deduped)} examples to {args.output}")
|
||||
print(f" File size: {args.output.stat().st_size / 1e6:.1f} MB")
|
||||
# Write eval set
|
||||
eval_path = args.output.with_name(args.output.stem + "_eval.jsonl")
|
||||
with open(eval_path, "w") as f:
|
||||
for r in eval_set:
|
||||
f.write(json.dumps(r, ensure_ascii=False) + "\n")
|
||||
|
||||
print(f"\n Wrote {len(train_set)} train examples to {args.output}")
|
||||
print(f" Wrote {len(eval_set)} eval examples to {eval_path}")
|
||||
print(f" Train size: {args.output.stat().st_size / 1e6:.1f} MB")
|
||||
print(f" Eval size: {eval_path.stat().st_size / 1e6:.1f} MB")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
|
||||
Reference in New Issue
Block a user