Mortdecai Gateway — authenticated Ollama proxy with power metering

- API key auth on all inference endpoints
- Power/cost tracking: GPU TDP × inference time × electricity rate
- Spending cap enforcement
- Web dashboard with live stats
- Docker compose for AMD ROCm (Strix Halo) or NVIDIA
- Auto-setup script with GGUF loading
- Tested against local Ollama

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:26:43 -04:00
commit c5865feb35
7 changed files with 561 additions and 0 deletions
@@ -0,0 +1,78 @@
+# Mortdecai Gateway
+
+Authenticated Ollama proxy with power metering. Deploy on any machine with a GPU to contribute inference compute to the Mortdecai training pipeline.
+
+## Quick Start
+
+```bash
+git clone <repo-url>
+cd mortdecai-gateway
+mkdir -p models
+# Copy the GGUF file into models/
+cp /path/to/mortdecai-v4.gguf models/
+chmod +x setup.sh
+./setup.sh
+```
+
+Dashboard: http://localhost:8434/dashboard
+
+## What It Does
+
+```
+Your GPU → Ollama → Gateway (auth + metering) → Port 8434 → Internet
+```
+
+The gateway sits in front of Ollama and:
+- Authenticates requests via API key
+- Tracks inference time, tokens, energy usage
+- Estimates electricity cost ((GPU TDP + system overhead) × time × rate)
+- Enforces a spending cap
+- Provides a dashboard with live stats
+
+## Configuration
+
+Edit `.env`:
+
+```
+API_KEY=mk_your_secret_key
+GPU_TDP_WATTS=54          # Your GPU's TDP
+SYSTEM_OVERHEAD_WATTS=30  # CPU/RAM draw during inference
+ELECTRICITY_RATE=0.15     # $/kWh
+SPENDING_CAP=10.00        # $ spent before the gateway stops accepting requests
+```
+
+## Endpoints
+
+| Endpoint | Auth | Description |
+|----------|------|-------------|
+| `GET /health` | No | Ollama status + loaded models |
+| `GET /dashboard` | No | Web dashboard with live stats |
+| `GET /stats` | Yes | JSON usage stats |
+| `POST /api/chat` | Yes | Proxied to Ollama |
+| `POST /api/generate` | Yes | Proxied to Ollama |
+| `*` | Yes | Everything else proxied to Ollama |
+
+## Response Metadata
+
+Every proxied response includes a `_gateway` field:
+
+```json
+{
+  "message": { "role": "assistant", "content": "..." },
+  "_gateway": {
+    "duration_seconds": 3.42,
+    "energy_wh": 0.0798,
+    "estimated_cost": 0.000012,
+    "total_cost": 0.0342,
+    "budget_remaining": 9.9658
+  }
+}
+```
+
+## AMD ROCm
+
+The Docker Compose file uses `ollama/ollama:rocm` by default and requires ROCm drivers on the host. For Strix Halo, ensure the BIOS is set to reserved VRAM mode.
+
+## NVIDIA
+
+Edit `docker-compose.yml`: uncomment the `deploy` section and comment out the `devices` section.
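
Note on the metering arithmetic: the README's sample `_gateway` values are consistent with energy being computed from the GPU TDP plus the system overhead, so the estimate can be sketched in Python as below. This is a minimal sketch and not the gateway's actual code. The `PowerConfig` class and the `estimate_request` and `over_budget` helpers are hypothetical names, the rounding precision is inferred from the sample response, and the rule that requests stop once total cost reaches the cap is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PowerConfig:
    """Hypothetical container mirroring the .env keys shown in the README."""
    gpu_tdp_watts: float = 54.0          # GPU_TDP_WATTS
    system_overhead_watts: float = 30.0  # SYSTEM_OVERHEAD_WATTS
    electricity_rate: float = 0.15       # ELECTRICITY_RATE, $/kWh
    spending_cap: float = 10.00          # SPENDING_CAP, $


def estimate_request(cfg: PowerConfig, duration_seconds: float,
                     total_cost_so_far: float) -> dict:
    """Build a _gateway-style metadata dict for one request (illustrative)."""
    # Assumed power draw: GPU TDP plus system overhead for the whole request.
    watts = cfg.gpu_tdp_watts + cfg.system_overhead_watts
    # Watts x seconds gives joules; divide by 3600 to get watt-hours.
    energy_wh = watts * duration_seconds / 3600.0
    # Convert Wh to kWh before applying the $/kWh rate.
    cost = energy_wh / 1000.0 * cfg.electricity_rate
    total = total_cost_so_far + cost
    return {
        "duration_seconds": duration_seconds,
        "energy_wh": round(energy_wh, 4),
        "estimated_cost": round(cost, 6),
        "total_cost": round(total, 4),
        "budget_remaining": round(cfg.spending_cap - total, 4),
    }


def over_budget(cfg: PowerConfig, total_cost: float) -> bool:
    """Assumed cap rule: refuse new requests once the cap is reached."""
    return total_cost >= cfg.spending_cap


if __name__ == "__main__":
    cfg = PowerConfig()
    # 3.42 s is the duration from the README's sample response; the prior
    # running total of 0.034188 is a made-up value chosen for illustration.
    print(estimate_request(cfg, 3.42, 0.034188))
```

With the default `.env` values, a 3.42 second request works out to (54 + 30) W × 3.42 s ÷ 3600 ≈ 0.0798 Wh, and 0.0000798 kWh × $0.15/kWh ≈ $0.000012. Both figures match the sample response.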