Files

T

Seth f39809eaca Semver rename: v1-v5 → 0.1.0-0.5.0 across all files

Versioning scheme: semantic versioning (MAJOR.MINOR.PATCH)
- 0.x.0 = pre-release development
- 1.0.0 = first public/monetized release

Renamed everywhere: PLAN.md, training scripts, self-play, overnight script,
status printer, whitelist app, discord bot, all training data references.

Ollama models retagged: mortdecai-v4 → mortdecai:0.4.0
Server configs updated on all three servers.
Self-play restarted with new model name.
Entity targeting + radius-aware kill + distance scale training added.

Seed dataset: 2,503 + tool: 1,159 + self-play: 5,059 = 8,721 total examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-20 21:37:14 -04:00

10 KiB

Raw Blame History

PLAN.md — Mortdecai Project Roadmap

Last updated: 2026-03-20 04:45 UTC Model name: Mortdecai Domain: mortdec.ai Status legend: [ ] planned | [~] in progress | [x] done | [-] cancelled/deferred

Vision

Mortdecai is a fine-tuned 9B parameter language model for Minecraft server operations. It translates natural language to commands, controls an AI God character, self-corrects errors via RCON feedback, and improves through self-play. Runs locally on consumer hardware with zero cloud dependencies at inference time.

Current State

Models

Model	Base	Examples	Loss	Status
0.1.0	Qwen3-8B	233	0.10	Retired (overfit)
0.2.0	Qwen3-8B	361	2.03	Retired
0.3.0	Qwen3-8B	1,308	0.55	Available on steel141
0.4.0	Qwen3.5-9B	3,369	0.20	Deployed on prod

Infrastructure

Component	Location	Details
Training GPU	steel141 RTX 3090 Ti (24GB)	QLoRA via Unsloth 2026.3.8
Prod inference	node-197 RTX 4000 (16GB)	Ollama, mortdecai:0.4.0
MC servers	CT 644 on node-112	paper-ai:25567, shrink:25566, dev:25568, vanilla:25565
Dev data collection	CT 644	Gemini 3.1 Flash Lite (preview), 5 bots
Whitelist app	CT 644:8099	minecraft.mortdec.ai
Caddy proxy	CT 600 on node-241	mortdec.ai, minecraft.mortdec.ai
GPU monitoring	Grafana CT 300 (node-173)	Prometheus + nvidia exporter on steel141
Greenfield map	paper-ai	Downloaded, world swapped, needs MCSManager start
WorldEdit schematics	paper-ai	77 installed in FAWE/schematics/

API Spend

Provider	Spent	Budget	Status
Claude Haiku	$20.01	$20	Exhausted
Gemini (all)	~$0.50	$20	Active on dev (3.1 Flash Lite)

Branding

Font: Rajdhani Bold
Color: #D35400 (Sethian orange)
Domain: mortdec.ai → Gitea repo, minecraft.mortdec.ai → whitelist page
Public repo: https://git.sethpc.xyz/Seth/Mortdecai

Training Data: 2,397 seed + 1,159 tool-calling = 3,556 total

Category	Count
Command syntax reference	107
Crafting recipes & chains	176
Enchantments (mutual exclusions, max levels)	60
Entities/mobs (summon, kill, NBT)	60
Execute chains	45
Multiplayer (selectors, teams, scoreboards)	45
Advanced commands (tellraw, clone, data, ride)	45
WorldEdit	45
Paper server features	55
Cosmetics/XP/effects	42
Gamerules (49) + risk hierarchy (40)	89
Quantity boundaries	32
Dangerous effect caps (levitation, wither, etc.)	12
Fall safety + drops + suffocation	33
Death/environment (drowning, lava, void, mobs, etc.)	26
Revert-aware gamerules + drops	20
Error correction pairs (enchant order, NBT, etc.)	54
Claude-distilled outputs	344
Bot audit interactions	448+
Boundary/safety/prompt injection	95+
Tool-calling (multi-turn with RCON)	1,159

Completed This Session

Model & Training

Mortdecai 0.4.0 trained: Qwen3.5-9B, 3,369 examples, loss 0.20
0.4.0 exported to GGUF Q4_K_M (5.3GB)
0.4.0 deployed to prod (RTX 4000) — paper-ai + shrink-world
Single-call mode enabled on prod
/no_think in all training data to suppress thinking tokens
Qwen3.5-9B base bake-off: 70.1% accuracy (2x Qwen3-8B)
[~] 0.4.0 bake-off running on steel141

Validator & Safety

Error correction: detects RCON errors, asks model to fix, retries
Broadened error patterns: <--[HERE] universal catch
kill @a blocked (players only)
tp minecraft:spawn → safe coordinates
Fire fallback won't trigger on "firework"
Dangerous effect caps: levitation 15s, wither 30s, poison 60s, nausea 30s
Fall protection: detects lethal tp, adds slow_falling unless intentional
Gamerule revert timers: auto-revert after 5-10 min (configurable)
Expanded safe_prefixes: gamerule, particle, playsound, title, scoreboard, team, bossbar, locate, etc.
Validator hit-rate tracking to /var/log/mc_validator_stats.json
Command format examples (RIGHT vs WRONG) in prompt
max_tokens bumped to 600 for command calls
Removed template workflow from sudo prompt

Infrastructure

Ollama updated on steel141 + RTX 4000 (Qwen3.5 support)
GPU monitoring: nvtop + Grafana dashboard on steel141
Whitelist UUID fix: Mojang API lookup, patches all whitelist.json files
mortdec.ai + minecraft.mortdec.ai live with SSL
Public Mortdecai repo on Gitea with README
status command: shows model name, mode, validator stats in-game
Verbose pipeline logging: token counts, speed, elapsed time, think stripping
Greenfield world downloaded and installed on paper-ai
77 WorldEdit schematics installed

Training Data Added

Gamerules (49 examples): all major gamerules with natural language
Risk hierarchy (40): L0 blocked, L1 permanent, L2 temporary, prompt injection
Dangerous effects (12): levitation/wither/poison caps
Fall safety (25): height math, water/slime/hay awareness, intent detection
Suffocation (8): tp into blocks, sand/gravel crushing
Death/environment (26): drowning, lava, void, explosions, mobs, starvation, lightning
Revert-aware gamerules (8): revert_after field for 0.5.0
Drop/height (12): intentional drops, safe tp, slow_falling
Enchantment error correction (7): count-before-bracket, typos, old NBT

Data Collection

API cascade: Haiku ($20) → Gemini ($20) → local
Switched dev to Gemini 3.1 Flash Lite (preview) with 5 bots
Dynamic pricing by Gemini model name
Async prayer/sudo processing (ThreadPoolExecutor, 3 workers)

Branding

Model named Mortdecai
mortdec.ai domain purchased and configured
Rajdhani Bold as official font
Logo variants generated (6 fonts × 2 text versions)
Whitelist page branded with Mortdecai logo

Active TODOs

Immediate

[~] 0.4.0 bake-off running — publish results to Gitea when complete
Fix 0.4.0 Modelfile chat template on RTX 4000 (done, needs verification)
Also fix on steel141's Ollama instance

Short-term (0.5.0 prep)

Shared memory system: per-server JSON, owner-tagged, location/preference/fact types
- Player says "remember this is home" → AI writes location memory
- Other players can reference: "tp me to slingshooter08's home"
- Memory in context for location lookups, tool call for read/write
memory_write field in model output schema
Setblock training data expansion
world.check_block tool for terrain queries before tp
Self-play loop deployment (3-tier: drills, self-critique, adversarial)
Ingest all Gemini 3.1 Flash Lite training data
More error correction from production RCON failures

Model 0.5.0 Training

Train with tool-calling format (rcon.execute, wiki_lookup, world.player_info)
revert_after / revert_commands in output schema
Self-play generated data (200 rounds post-0.4.0)
Memory read/write training examples
Ground-level terrain detection training
Fall damage math in model reasoning (not just validator)
Setblock + block state training
More death mechanics awareness in reasoning

Infrastructure

GPU monitoring for RTX 4000 (second exporter)
Validator hit-rate analysis — remove fixes that fire <1%
Automate training pipeline: ingest → dedup → train → export → deploy
POS receipt for Gemini milestones
Start Greenfield world via MCSManager

Content & Community

Invite more playtesters via minecraft.mortdec.ai
Update mortdec.ai README with 0.4.0 bake-off results
Consider public HuggingFace release
WorldEdit schematic library expansion

Risk Hierarchy

Commands classified by permanence:

Level	Permanence	Examples	Model behavior
0	Irreversible/admin	ban, kick, stop, op, deop, whitelist	Never execute
1	Permanent toggle	gamemode @a, permanent gamerules, difficulty	Execute for self only, refuse for @a
2	Temporary/reversible	gamerules with time limits, brief changes	Allow, schedule auto-revert
3	Transient	time, weather, tick speed, chat settings	Execute freely
4	Generous	full enchanted gear, large material stacks	Execute for worthy requests

Gamerule revert system: Changes auto-revert after 5-10 min unless "permanently" specified. Player notified of countdown.

Dangerous effect caps (hardcoded): Levitation 15s, Wither 30s, Poison 60s, Nausea 30s.

Fall protection: Lethal tp detected → slow_falling added unless intent words present (drop, yeet, throw, kill me).

Key Decisions

Date	Decision	Rationale
03-18	gemma3n:e4b for initial prod	Bake-off winner at 80.6% accuracy
03-18	Qwen3-8B for 0.1.0-0.3.0 training	Best syntax quality, Apache 2.0
03-18	God Soul document	Character framework from Claude's soul
03-19	API cascade for data collection	Haiku→Gemini→local fallback
03-19	Single-call mode	One LLM call for commands + message
03-19	Error correction via RCON	Model tries → error → self-corrects
03-19	3-tier self-play	Drills, self-critique, adversarial
03-20	Qwen3.5-9B for 0.4.0	2x base accuracy, native tool-calling
03-20	Gamerule revert timers	Permanence determines risk level
03-20	Dangerous effect caps	Validator hardcodes max durations
03-20	Fall protection	Health check + intent detection before tp
03-20	Shared player memory (planned)	Owner-tagged, cross-player, AI-managed
03-20	Mortdecai branding	Rajdhani Bold, #D35400, mortdec.ai

Success Criteria

Metric	0.3.0	0.4.0 (target)	0.5.0 (goal)
Command accuracy	~70%	85%+	95%+
Safety compliance	~95%	99%+	99.9%+
Error self-correction	N/A	50%+	80%+
Response latency	5-15s	<5s	<3s
Empty response rate	~10%	<5%	<2%
Think token leakage	Yes	No	No

Updated as the project evolves. Check git history for previous versions.

10 KiB Raw Blame History Unescape Escape

PLAN.md — Mortdecai Project Roadmap

Vision

Current State

Models

Infrastructure

API Spend

Branding

Training Data: 2,397 seed + 1,159 tool-calling = 3,556 total

Completed This Session

Model & Training

Validator & Safety

Infrastructure

Training Data Added

Data Collection

Branding

Active TODOs

Immediate

Short-term (0.5.0 prep)

Model 0.5.0 Training

Infrastructure

Content & Community

Risk Hierarchy

Key Decisions

Success Criteria

10 KiB

Raw Blame History