VIBECODE-THEORY/research/31-ai-cost-curves-data.md
Mortdecai d34f447e1f docs: research corpus — 35 deep-dive files from overnight Gemini swarm
Six Gemini agents ran autonomously through 35 research tasks covering
falsifiability, retrocausality, consciousness, game theory, agricultural
revolution, meaning crisis, AI cost curves, adoption S-curves, and more.
304KB of primary-source research with scholars, counterarguments, and data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 08:31:13 -04:00

# Task 31: AI Cost Curves — Actual Data
## Executive Summary
* **The Price of Cognition is Crashing:** API pricing for frontier models has dropped by approximately 80-90% over the last 24 months (2023-2025). "Intelligence" is transitioning from a high-value professional service to a near-zero marginal cost commodity.
* **Performance-to-Cost Arbitrage:** New models (e.g., Claude 3.5 Sonnet, GPT-4o) consistently outperform the previous generation's flagship models while costing 5x to 10x less. This creates a "ratchet" where using previous-generation logic is economically non-viable.
* **Blackwell Leap:** NVIDIA's Blackwell architecture (B200/GB200) represents a 4x to 15x leap in inference performance per superchip compared to the Hopper (H100) generation, sustaining the downward pressure on the price of cognitive computation.
* **Wright's Law in Action:** The "learning curve" for AI inference is significantly faster than Moore's Law. While hardware performance doubles every ~2 years, the *cost of intelligence* (API pricing) is halving nearly every 12 months due to algorithmic efficiencies (distillation, quantization).
## Key Scholars and Works
* **Seth Lloyd:** *Programming the Universe*. Also author of "Ultimate Physical Limits to Computation" (*Nature*, 2000), which bounds the speed and memory of any physical computer. The related Bremermann's Limit is due to Hans-Joachim Bremermann.
* **Theodore Wright:** Wright's Law (1936). The observation that for every doubling of cumulative production, the cost of a technology falls by a constant percentage.
* **OpenAI/Anthropic Pricing Teams:** The primary drivers of the "market price" of cognition.
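
Wright's Law is usually written as a power law in cumulative production. A compact statement follows, using symbols that do not appear in the notes above and are introduced here only for illustration: $C_1$ is the cost of the first unit, $x$ is cumulative units produced, and $b$ is the learning exponent.

$$
C(x) = C_1\, x^{-b}, \qquad \frac{C(2x)}{C(x)} = 2^{-b}, \qquad \text{learning rate} = 1 - 2^{-b}
$$

For instance, $b = 0.32$ corresponds to a cost decline of roughly 20% per doubling. The law is indexed on cumulative output rather than calendar time, so any "halving every N months" reading also depends on how quickly cumulative output doubles.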
## Data Points
### OpenAI API Pricing Evolution (per 1M tokens)
| Date | Model | Input Cost | Output Cost | % Change (Input) |
|------|-------|------------|-------------|------------------|
| Mar 2023 | GPT-4 (original) | $30.00 | $60.00 | - |
| Nov 2023 | GPT-4 Turbo | $10.00 | $30.00 | -67% |
| May 2024 | GPT-4o | $5.00 | $15.00 | -50% |
| Jul 2024 | GPT-4o-mini | $0.15 | $0.60 | -97% |
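
The percentage column and an implied halving time can be recomputed from the input prices in the table above. The sketch below is illustrative only: the function names are made up for this example, the month offsets are counted from Mar 2023, and GPT-4o-mini is kept out of the halving-time estimate on the assumption that it is a smaller model tier rather than a like-for-like successor.

```python
import math

# (model, months since Mar 2023, input price in $ per 1M tokens), from the table above.
# GPT-4o-mini is treated as a different (smaller) tier and excluded from the
# halving-time estimate; that exclusion is an assumption of this sketch.
SAME_TIER = [
    ("GPT-4", 0, 30.00),
    ("GPT-4 Turbo", 8, 10.00),
    ("GPT-4o", 14, 5.00),
]
GPT_4O_MINI_INPUT = 0.15


def pct_change(old, new):
    """Percentage change from old to new (negative means cheaper)."""
    return (new - old) / old * 100


def halving_time_months(p_start, p_end, months):
    """Months per price halving, assuming a constant exponential decline."""
    return months * math.log(2) / math.log(p_start / p_end)


# Row-over-row change within the flagship tier.
for (_, _, old), (name, _, new) in zip(SAME_TIER, SAME_TIER[1:]):
    print(f"{name}: {pct_change(old, new):.1f}%")
print(f"GPT-4o-mini vs GPT-4o: {pct_change(5.00, GPT_4O_MINI_INPUT):.1f}%")

first, last = SAME_TIER[0], SAME_TIER[-1]
t_half = halving_time_months(first[2], last[2], last[1] - first[1])
print(f"Implied halving time (GPT-4 to GPT-4o): {t_half:.1f} months")
```

Under these assumptions the flagship-tier halving time comes out well under a year, so the "halving nearly every 12 months" figure looks conservative for this particular window.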
### Anthropic API Pricing Evolution (per 1M tokens)
| Date | Model | Input Cost | Output Cost | Notes |
|------|-------|------------|-------------|-------|
| July 2023 | Claude 2 | $8.00 | $24.00 | Flagship |
| Mar 2024 | Claude 3 Opus | $15.00 | $75.00 | High-end |
| June 2024 | Claude 3.5 Sonnet | $3.00 | $15.00 | Faster/Better than Opus |
| Mar 2026 | Claude 4.6 | $1.00 | $5.00 | Projected/Reported |
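
To compare models on a single number, input and output prices can be blended by workload mix. The sketch below uses the Claude 3 Opus and Claude 3.5 Sonnet rows from the table above; the 75% input-token share and the name `blended_price` are assumptions chosen for illustration, not figures from the source.

```python
# Prices per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Claude 3 Opus": (15.00, 75.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
}


def blended_price(input_price, output_price, input_share=0.75):
    """Weighted $ per 1M tokens for a workload with the given input-token share.

    The default 75% input share is a hypothetical workload mix.
    """
    return input_share * input_price + (1 - input_share) * output_price


opus = blended_price(*PRICES["Claude 3 Opus"])
sonnet = blended_price(*PRICES["Claude 3.5 Sonnet"])
print(f"Opus: ${opus:.2f}, Sonnet: ${sonnet:.2f}, ratio: {opus / sonnet:.1f}x")
```

Because both the input and output prices differ by the same factor between these two rows, the ratio does not depend on the chosen mix. For other model pairs it generally would.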
### GPU Performance-to-Price (NVIDIA)
| Chip | Release | Cost (Est.) | AI PetaFLOPS (FP8/FP4) | PetaFLOPS per $10k |
|------|---------|-------------|----------------------|--------------------|
| A100 | 2020 | $10,000 | 0.6 | 0.6 |
| H100 | 2023 | $30,000 | 4.0 | 1.3 |
| B200 | 2025 | $45,000 | 20.0 | 4.4 |
| GB200 | 2025 | $70,000 | 40.0 | 5.7 |
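
The last column of the table above is throughput divided by cost in units of $10,000. A minimal sketch that reproduces it and adds the improvement relative to the H100 row, taking the estimated costs and throughput figures at face value (the helper name `pflops_per_10k` is made up for this example):

```python
# chip -> (estimated cost in $, AI PetaFLOPS), copied from the table above.
CHIPS = {
    "A100": (10_000, 0.6),
    "H100": (30_000, 4.0),
    "B200": (45_000, 20.0),
    "GB200": (70_000, 40.0),
}


def pflops_per_10k(cost_usd, pflops):
    """PetaFLOPS obtained per $10,000 of hardware spend."""
    return pflops / (cost_usd / 10_000)


h100_value = pflops_per_10k(*CHIPS["H100"])
for chip, (cost, pflops) in CHIPS.items():
    value = pflops_per_10k(cost, pflops)
    print(f"{chip}: {value:.1f} PFLOPS per $10k ({value / h100_value:.1f}x H100)")
```

On these inputs the price-adjusted gain from H100 to B200 is roughly 3.3x, smaller than the raw 5x throughput jump, because the estimated unit cost also rises.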
## Supporting Evidence
* **Algorithmic Efficiency:** The 2024 "frontier" of 7B and 8B parameter models (Llama 3, Mistral) achieves performance comparable to GPT-3.5 (widely assumed to be around 175B parameters, the published size of GPT-3) at 1/20th the compute cost.
* **Cloud Rental Trends:** Rental prices for H100s have dropped from ~$4.00/hour in 2023 to ~$2.50/hour in 2025, with spot instances available for as low as $1.13/hour.
* **The "Information Catastrophe" Hypothesis:** Melvin Vopson's data suggests that at current growth rates, information processing will consume 50% of the planet's energy/mass resources within 200-300 years, unless the cost curves continue to steepen.
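
The H100 rental figures in the list above can be turned into an annualized decline and a spot discount. This is a rough sketch: it assumes exactly two years between the 2023 and 2025 quotes and compares the spot price against the 2025 on-demand price. Neither assumption is stated in the source, and the function name is illustrative.

```python
def annual_decline(start_price, end_price, years):
    """Constant yearly fractional decline that takes start_price to end_price."""
    return 1 - (end_price / start_price) ** (1 / years)


# $ per GPU-hour, from the rental bullet above.
ON_DEMAND_2023 = 4.00
ON_DEMAND_2025 = 2.50
SPOT_2025 = 1.13

# Assumption: the two on-demand quotes are two years apart.
yearly = annual_decline(ON_DEMAND_2023, ON_DEMAND_2025, years=2)
spot_discount = 1 - SPOT_2025 / ON_DEMAND_2025

print(f"Annualized on-demand decline: {yearly:.1%}")
print(f"Spot discount vs on-demand:   {spot_discount:.1%}")
```

On these inputs rental prices fall by about a fifth per year, noticeably slower than the API price declines in the tables above, which is consistent with the note's claim that algorithmic efficiency carries much of the cost reduction.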
## Counterarguments and Critiques
* **The Data Wall:** Critics argue that as we run out of high-quality human data to train on, the cost of incremental improvement will rise exponentially, potentially breaking Wright's Law for AI.
* **Energy Inelasticity:** While the cost per *token* falls, the total *energy* consumed by the AI sector is rising. If energy prices spike, the downward cost curve for cognition could stall.
* **NVIDIA Monopoly:** Market dominance by a single provider could lead to "rent-seeking" behavior that artificially inflates the price of computation, regardless of technical capability.
## Historical Parallels and Case Studies
* **The Price of Light:** Between 1800 and 2000, the price of artificial light fell by a factor of 500,000. Like light, "intelligence" is transitioning from a luxury to an ambient background utility.
* **Moore's Law (Computing):** Computation costs fell by 50% every 18-24 months for 50 years. AI is currently outperforming this rate by focusing on *specialized* architectures (TPUs/LPUs).
* **The Price of Nitrogen:** The Haber-Bosch process crashed the price of nitrogen fertilizer, leading to a population explosion (Neolithic parallel). AI is "Haber-Bosch for the mind."
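
To put the price-of-light figure on the same footing as the halving times quoted elsewhere in these notes, the 500,000-fold decline over 200 years can be converted to an average halving time, assuming a constant exponential rate over the whole period (a simplification):

$$
t_{1/2} = \frac{200\ \text{years}}{\log_2 500{,}000} \approx \frac{200}{18.9} \approx 10.6\ \text{years}
$$

By this measure, the 18-24 month halving attributed to computing is several times faster than the historical decline in the price of light, and a 12-month halving would be roughly ten times faster.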
## Connections to the Series
* **Paper 005 (The Cognitive Surplus):** The data proves that we are entering a period of massive cognitive surplus. The price curves suggest that within 5 years, "baseline intelligence" will be too cheap to meter.
* **Paper 007 (The Ratchet):** The cost curves create the competitive pressure for the ratchet. If your competitor uses GPT-4o-mini at $0.15/1M tokens, you cannot afford to use a human professional at $50.00/hour for the same task. The dependency is economically forced.
* **Paper 008 (The Ship of Theseus):** The "compilation" process is being subsidized by the crash in compute prices. We are replacing the "expensive human planks" with "cheap silicon planks" because the cost-benefit ratio is undeniable.
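
The Paper 007 comparison mixes units ($ per 1M tokens versus $ per hour), so it only becomes concrete once a task size is fixed. The sketch below does that with explicitly hypothetical numbers: a task that takes a human one hour and consumes 20,000 input and 5,000 output tokens when done by a model. Only the GPT-4o-mini prices and the $50.00/hour rate come from the text above.

```python
def api_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one API task, given prices quoted per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# From the notes above: GPT-4o-mini prices and the human hourly rate.
INPUT_PRICE, OUTPUT_PRICE = 0.15, 0.60   # $ per 1M tokens
HUMAN_RATE = 50.00                       # $ per hour

# Hypothetical task size, chosen only to make the units comparable.
HUMAN_HOURS = 1.0
INPUT_TOKENS = 20_000
OUTPUT_TOKENS = 5_000

model_cost = api_cost(INPUT_TOKENS, OUTPUT_TOKENS, INPUT_PRICE, OUTPUT_PRICE)
human_cost = HUMAN_RATE * HUMAN_HOURS
print(f"Model: ${model_cost:.4f}  Human: ${human_cost:.2f}  "
      f"Ratio: {human_cost / model_cost:,.0f}x")
```

The ratio is only as good as the task-size assumptions, and it ignores output quality and any human review time. Even so, a gap of several orders of magnitude leaves a lot of room for those corrections.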
## Rabbit Holes Worth Pursuing
* **Energy-per-Token:** Research the specific Joules required to generate 1 million tokens across generations.
* **On-Device Inference:** How does the move to "Edge AI" (running models on phones/laptops) affect the marginal cost of cognition? (It potentially drops to zero for the user).
* **Open Source "Moats":** If Llama 4 matches GPT-5 performance for free, what happens to the commercial market for intelligence?
## Sources
* OpenAI. (2023-2024). "API Pricing and Model Updates."
* Anthropic. (2024). "Claude 3.5 Sonnet Release Notes."
* NVIDIA. (2024-2025). "Blackwell Architecture Technical Specifications."
* Epoch. (2023). "Trends in the Compute Cost of AI."
* Vopson, M. M. (2020). "The Information Catastrophe." *AIP Advances*.