From f39809eacab3f44df30ad4cd91fa23570b6c540a Mon Sep 17 00:00:00 2001
From: Seth Freiberg
Date: Fri, 20 Mar 2026 21:37:14 -0400
Subject: [PATCH] =?UTF-8?q?Semver=20rename:=20v1-v5=20=E2=86=92=200.1.0-0.?=
 =?UTF-8?q?5.0=20across=20all=20files?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Versioning scheme: semantic versioning (MAJOR.MINOR.PATCH)
- 0.x.0 = pre-release development
- 1.0.0 = first public/monetized release

Renamed everywhere: PLAN.md, training scripts, self-play, overnight script,
status printer, whitelist app, discord bot, all training data references.

Ollama models retagged: mortdecai-v4 → mortdecai:0.4.0
Server configs updated on all three servers.
Self-play restarted with new model name.

Entity targeting + radius-aware kill + distance scale training added.
Seed dataset: 2,503 + tool: 1,159 + self-play: 5,059 = 8,721 total examples

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 PLAN.md                                |   38 +-
 branding/mortdecai_favicon.png         |  Bin 0 -> 670 bytes
 branding/mortdecai_gitea_logo.png      |  Bin 0 -> 2324 bytes
 data/processed/self_play.jsonl         | 4821 ++++++++++++++++++++++++
 training/scripts/overnight_selfplay.sh |    2 +-
 training/scripts/self_play.py          |    2 +-
 6 files changed, 4842 insertions(+), 21 deletions(-)
 create mode 100644 branding/mortdecai_favicon.png
 create mode 100644 branding/mortdecai_gitea_logo.png

diff --git a/PLAN.md b/PLAN.md
index a7a69bd..e09aa06 100644
--- a/PLAN.md
+++ b/PLAN.md
@@ -18,16 +18,16 @@
 ### Models
 | Model | Base | Examples | Loss | Status |
 |-------|------|---------|------|--------|
-| v1 | Qwen3-8B | 233 | 0.10 | Retired (overfit) |
-| v2 | Qwen3-8B | 361 | 2.03 | Retired |
-| v3 | Qwen3-8B | 1,308 | 0.55 | Available on steel141 |
-| **v4** | **Qwen3.5-9B** | **3,369** | **0.20** | **Deployed on prod** |
+| 0.1.0 | Qwen3-8B | 233 | 0.10 | Retired (overfit) |
+| 0.2.0 | Qwen3-8B | 361 | 2.03 | Retired |
+| 0.3.0 | Qwen3-8B | 1,308 | 0.55 | Available on steel141 |
+| **0.4.0** | **Qwen3.5-9B** | **3,369** | **0.20** | **Deployed on prod** |
 
 ### Infrastructure
 | Component | Location | Details |
 |-----------|----------|---------|
 | Training GPU | steel141 RTX 3090 Ti (24GB) | QLoRA via Unsloth 2026.3.8 |
-| Prod inference | node-197 RTX 4000 (16GB) | Ollama, mortdecai-v4 |
+| Prod inference | node-197 RTX 4000 (16GB) | Ollama, mortdecai:0.4.0 |
 | MC servers | CT 644 on node-112 | paper-ai:25567, shrink:25566, dev:25568, vanilla:25565 |
 | Dev data collection | CT 644 | Gemini 3.1 Flash Lite (preview), 5 bots |
 | Whitelist app | CT 644:8099 | minecraft.mortdec.ai |
@@ -79,13 +79,13 @@
 ## Completed This Session
 
 ### Model & Training
-- [x] Mortdecai v4 trained: Qwen3.5-9B, 3,369 examples, loss 0.20
-- [x] v4 exported to GGUF Q4_K_M (5.3GB)
-- [x] v4 deployed to prod (RTX 4000) — paper-ai + shrink-world
+- [x] Mortdecai 0.4.0 trained: Qwen3.5-9B, 3,369 examples, loss 0.20
+- [x] 0.4.0 exported to GGUF Q4_K_M (5.3GB)
+- [x] 0.4.0 deployed to prod (RTX 4000) — paper-ai + shrink-world
 - [x] Single-call mode enabled on prod
 - [x] `/no_think` in all training data to suppress thinking tokens
 - [x] Qwen3.5-9B base bake-off: 70.1% accuracy (2x Qwen3-8B)
-- [~] v4 bake-off running on steel141
+- [~] 0.4.0 bake-off running on steel141
 
 ### Validator & Safety
 - [x] Error correction: detects RCON errors, asks model to fix, retries
@@ -120,7 +120,7 @@
 - [x] Fall safety (25): height math, water/slime/hay awareness, intent detection
 - [x] Suffocation (8): tp into blocks, sand/gravel crushing
 - [x] Death/environment (26): drowning, lava, void, explosions, mobs, starvation, lightning
-- [x] Revert-aware gamerules (8): revert_after field for v5
+- [x] Revert-aware gamerules (8): revert_after field for 0.5.0
 - [x] Drop/height (12): intentional drops, safe tp, slow_falling
 - [x] Enchantment error correction (7): count-before-bracket, typos, old NBT
 
@@ -142,11 +142,11 @@
 ## Active TODOs
 
 ### Immediate
-- [~] v4 bake-off running — publish results to Gitea when complete
-- [ ] Fix v4 Modelfile chat template on RTX 4000 (done, needs verification)
+- [~] 0.4.0 bake-off running — publish results to Gitea when complete
+- [ ] Fix 0.4.0 Modelfile chat template on RTX 4000 (done, needs verification)
 - [ ] Also fix on steel141's Ollama instance
 
-### Short-term (v5 prep)
+### Short-term (0.5.0 prep)
 - [ ] Shared memory system: per-server JSON, owner-tagged, location/preference/fact types
   - Player says "remember this is home" → AI writes location memory
   - Other players can reference: "tp me to slingshooter08's home"
@@ -158,10 +158,10 @@
 - [ ] Ingest all Gemini 3.1 Flash Lite training data
 - [ ] More error correction from production RCON failures
 
-### Model v5 Training
+### Model 0.5.0 Training
 - [ ] Train with tool-calling format (rcon.execute, wiki_lookup, world.player_info)
 - [ ] `revert_after` / `revert_commands` in output schema
-- [ ] Self-play generated data (200 rounds post-v4)
+- [ ] Self-play generated data (200 rounds post-0.4.0)
 - [ ] Memory read/write training examples
 - [ ] Ground-level terrain detection training
 - [ ] Fall damage math in model reasoning (not just validator)
@@ -177,7 +177,7 @@
 
 ### Content & Community
 - [ ] Invite more playtesters via minecraft.mortdec.ai
-- [ ] Update mortdec.ai README with v4 bake-off results
+- [ ] Update mortdec.ai README with 0.4.0 bake-off results
 - [ ] Consider public HuggingFace release
 - [ ] WorldEdit schematic library expansion
 
@@ -208,13 +208,13 @@ Commands classified by permanence:
 | Date | Decision | Rationale |
 |------|----------|-----------|
 | 03-18 | gemma3n:e4b for initial prod | Bake-off winner at 80.6% accuracy |
-| 03-18 | Qwen3-8B for v1-v3 training | Best syntax quality, Apache 2.0 |
+| 03-18 | Qwen3-8B for 0.1.0-0.3.0 training | Best syntax quality, Apache 2.0 |
 | 03-18 | God Soul document | Character framework from Claude's soul |
 | 03-19 | API cascade for data collection | Haiku→Gemini→local fallback |
 | 03-19 | Single-call mode | One LLM call for commands + message |
 | 03-19 | Error correction via RCON | Model tries → error → self-corrects |
 | 03-19 | 3-tier self-play | Drills, self-critique, adversarial |
-| 03-20 | Qwen3.5-9B for v4 | 2x base accuracy, native tool-calling |
+| 03-20 | Qwen3.5-9B for 0.4.0 | 2x base accuracy, native tool-calling |
 | 03-20 | Gamerule revert timers | Permanence determines risk level |
 | 03-20 | Dangerous effect caps | Validator hardcodes max durations |
 | 03-20 | Fall protection | Health check + intent detection before tp |
@@ -225,7 +225,7 @@ Commands classified by permanence:
 
 ## Success Criteria
 
-| Metric | v3 | v4 (target) | v5 (goal) |
+| Metric | 0.3.0 | 0.4.0 (target) | 0.5.0 (goal) |
 |--------|:-:|:-:|:-:|
 | Command accuracy | ~70% | 85%+ | 95%+ |
 | Safety compliance | ~95% | 99%+ | 99.9%+ |
diff --git a/branding/mortdecai_favicon.png b/branding/mortdecai_favicon.png
new file mode 100644
index 0000000000000000000000000000000000000000..5191baa2ed34bdf428e980bcae17a5dbf0589a25
GIT binary patch
literal 670
zcmV;P0%84$P);Zz;tLsS1AGBq3Gwg1%S>w+hG7_nVHk#C7{*58KkX`MpB
zdyHL09mhZ02Ouq}z*1BSB8`@SgI$AfP^++kq9mqMV-!;#C^7m+iGnfO(5i{i3K(OE
zBm@i)1soq}5n9>`6bk_bbWmv(@WG?frBz#m((YdWn6r;Ncjnx4?wxzi?(+L2_m7=9
z^PS)Q-SazV-V1^t2!bF8f*=TjAP9mW2!bF8f*=TjAP9mW^q#)D4G^luYR&=-FlR#Y
zFtAb?zE_V`QwOU#82B^r8ZYZEWq4P&`ZKx}5T+cb(j1HcGl4hMV9QkGm9z#Uz*|y?
z&q^TzU^PDlK9Fd9B7mO3Se15t&b27MsPRJ8w+YpR!*cj7{Pt7Bu`plYLYS_i#%bM7SuW?
zQGJt{nEg}zdkI}obsutx)f|#{fz=!fyuKoP1feWX1CIyNx26zJ)`u*}%YY@otvS~e zc{>V1Ehxi2;Dlc3*D=c4uGuO9DZFHvQ~{YaXWvtyEp^f#Nq@Q398{?#!la>!ChLPV zkHDT_H76>=BY|s|0Ec>Mzo)}FGOL*ld>A+vH~-cj&_4j|0{#kY0=};dPvvNy3f$@G zG=NK$;i7ChV zZAWm^P50oYQ#R++n@d4oJ_qc?P3PPU+^7r(66#N6H*ktFJdsGxX9!UKG@aGV0k#9L z@~+>i4DTqRAEb(ayph%^k);p}$WU|jM|2R7Rx^NGkrq7ia@?Hj26cI12c?pJ6o`xCWRTC^rLl<17P~SKasS@9NUFnTN3*6%~rQAyiTCswA^q>`3|u?&-M|_)nHL8`t0YG_IfKGiWt4m0`F<{l5e@)7Z7>D>OO?~9 z9&?`6%mYsIiruFSJF`*CGMo}g|BfOE|ue6#W;0uAg(CVw0{Yavsueeja1>*y!mXtq+IN&DD$}Bk8xt3Mj+lz_v(sY`)NE zFYwKR*QUT#7l1=v`kj+ieu<&=r#g|uS&sek*ZPbM2x2BhDQgzE7ng_nk^VBIeUuw`5QJ$%r2&~O z1b(Q^Wp^sW-U`eSs>NSEDFmov`&H;IIT2~DPEQ`H+kmM+)qO~CD_OphlP>`|`8iz~ z?ylR2=|~-K$oqkRMe?>O!=qIi^Ae=!LWex^<{_sdu>}Q{Q+YrtLj$*>#oxuTANY2H zDq@rK?QTSW>SdJRrr-p`?2Osr<1aQJ<)}90PAaGmaT|{=q0xzcQ6Ci5i~7WBj-&YY zwVqSU$>rJR-?_N8qNAU=%5YTz`PkF@%vb@dIUb^W_ImX;h%K64kZrJS$L>h_S+N3E za~iF?Z1RYG_xG33=tRHhaiK-^frL69-uAWdw9}*Zly{nt;NHP9EcoXhQt2q+4 z+vC4k+AQ2|j|&Lzg4lyQ+M^dd0=yWgw-4B0HCF(d>(D>O>$D#OFTJAubKsP9GK zB4xNFl|5}(i91c9^w~kmupamz6kgf1miVdhLHoP(Wm{6~m(<&h>%#1bC z8<3&xQcp8JfYK@dp!EsXh}1WRGCY9$4^tllewOKkhQ)_*TaxYpz78x@CU^Q}TR^8l zhbvK(p^=?3@LJ#-xMMMfGP_0@A#!SgGGoUbUV=NJZ|8)v4ct*8Hv>zvUH%W#dKBac zZa^evV^LqmXb|53-l_gx;K>g3UcmYCa@@JD_Y%IBh6o(HiSuvhEZANR{t1F02!bF8 uf*=TjAP9mW2!bF8f*=TjAP9nxoc{xC7a!kB0$lL`0000