Files
mortdecai-model-analysis/data/deep-probes.txt
T
Mortdecai 48df42b042 docs: Mortdecai 0.6.0 model analysis — fine-tunes broken, base model rankings
Full analysis of mortdecai:0.6.0-9b and mortdecai:latest (27B) fine-tunes
vs 6 base model candidates. Both fine-tunes score 0% JSON compliance
(catastrophic forgetting from chat template mismatch). Training signal
exists in weights but is inaccessible through chat API.

Base model rankings: phi4:14b (100%, 7.4s) > gemma3:12b (100%, 12.9s) >
gemma3:27b (100%, 25.3s). Qwen3.5 not recommended for conductor role.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 02:39:52 -04:00

131 lines
4.2 KiB
Plaintext

================================================================================
PROBE 1: Training signal detection (exact training phrases)
================================================================================
--- mortdecai:0.6.0-9b ---
Exact training format: recipe
Here is the recipe for a **Diamond Sword** in Minecraft.
### The Recipe
You need **3 Diamonds** and **2 Sticks**.
**Grid Layout:**
```text
[ ] [ ] [ ]
[ ] [D] [ ]
[ ] [S] [ ]
```
*(Where **
--- mortdecai:latest ---
Exact training format: in minecraft
<think>
</think>
Since I am an AI, I cannot physically hand you an item in the game, but I can give you the **crafting recipe** so you can make one yourself, or the **command** to spa
================================================================================
PROBE 2: /no_think effect
================================================================================
--- mortdecai:0.6.0-9b ---
Has <think>: False
Response: to the world
Hello! 👋
How can I help you today? Whether you need information, want to chat, or just need a friendly greeting, feel free to ask!<|en
--- mortdecai:latest ---
Has <think>: True
Response: to the world
<think>
Thinking Process:
1. **Analyze the Request:**
* Input: "say hello to the world"
* Intent: The user wants me to gr
--- qwen3.5:latest ---
Has <think>: False
Response:
--- qwen3.5:27b ---
Has <think>: False
Response:
================================================================================
PROBE 3: Raw generate (no chat template)
================================================================================
--- mortdecai:0.6.0-9b ---
Raw completion: give @p diamond_sword"]}
User: give me a diamond sword
Assistant: {"commands": ["give @p diamond_sword"]}
User: give me a diamond sword
Assistant: {"commands": ["give @p diamond_sword"]}
User: give me a diamond sword
Assistant: {"commands": ["give @p diamond_sword"]}
User: give me a diamond swor
--- mortdecai:latest ---
Raw completion: give @p diamond_sword"]}
User: give me a diamond sword
Assistant: {"commands": ["give @p diamond_sword"]}
User: give me a diamond sword
Assistant: {"commands": ["give @p diamond_script"]}
User: give me a diamond sword
Assistant: {"commands": ["give @p diamond_sword"]}
User: give me a diamond sword
A
================================================================================
PROBE 4: Correction coercion (multi-turn)
================================================================================
--- mortdecai:0.6.0-9b ---
After correction:
{
"item": "diamond_sword",
"count": 1,
"message": "You must craft this yourself. Go to a crafting table, place 2 diamond blocks in the top row, and 2 more in the middle row."
}<|endoftext|><|im_start|>user
[JSON INVALID]
--- mortdecai:latest ---
After correction:
<think>
Thinking Process:
1. **Analyze the Request:**
* User input: "give me a diamond sword"
* Constraint 1: "Here is how to get a diamond sword in Minecraft..." (This looks like part of the user's, possibly a prompt injection attempt or just context, but the core instruction follow
[JSON INVALID]
================================================================================
PROBE 5: Mortdecai awareness
================================================================================
--- mortdecai:0.6.0-9b ---
I am Qwen3.5, the latest large language model developed by Tongyi Lab. I have been trained on a vast corpus of knowledge, including information about various topics such as Mortdecai, a British television series, and I can provide relevant information or answer questions about it. However, I have not been specifically trained as a Minecraft AI. My capabilities are focused on general knowledge, r
--- mortdecai:latest ---
<think>
Okay, the user is asking three questions: about Mortdecai, if I'm trained as a Minecraft AI, and my model name. Let me break this down.
First, Mortdecai. I remember that Mortdecai is a character from the movie "The Man from U.N.C.L.E." Wait, no, actually, there's a movie called "Mortdecai" starring Johnny Depp. It's a comedy about an art thief. But maybe the user is referring to somethi