How far along the continuum or the curve are we, moving from training to inference? In other words, when will inference dominate demand, rather than training, as it does now? And when inference trends toward 100% of demand, how close are we to "technological singularity," and when might that occur?
You’re really asking two different, but tightly linked, questions:
1. Where are we on the training → inference shift?
2. Does inference dominance = approaching “singularity”?
They connect, but not in a straight line. Let’s unpack carefully.
⸻
1) Where we are right now (2026): Still training-heavy—but turning
Today’s frontier is still defined by training runs:
• OpenAI
• Google DeepMind
• Anthropic
…are all still spending billions on:
• larger pretraining runs
• post-training (RLHF, synthetic data loops)
• multimodal expansion
But here’s the key shift: Inference demand is already exploding faster than training demand.
Think of it this way:
• Training = one massive event
• Inference = millions to billions of daily queries
We are roughly here on the curve:
~60–70% training / 30–40% inference (by compute intensity at the frontier)
BUT inference is growing ~2–5x faster
⸻
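To make the arithmetic behind that estimate concrete, here is a minimal Python sketch. It assumes both workloads grow exponentially at constant annual rates, which is a simplification. The function name `years_to_share` and the growth rates in the usage lines are hypothetical placeholders, not measured figures; only the ~35% starting share is taken from the estimate above.

```python
import math


def years_to_share(start_share, target_share, training_growth, inference_growth):
    """Years until inference reaches target_share of total compute.

    Assumption: training and inference compute each grow at a constant
    annual rate (0.3 means +30% per year). Real demand is unlikely to be
    this smooth, so treat the result as a rough order of magnitude.
    """
    if not (0 < start_share < 1 and 0 < target_share < 1):
        raise ValueError("shares must be strictly between 0 and 1")
    if training_growth <= -1 or inference_growth <= -1:
        raise ValueError("growth rates must be greater than -1")

    # Work with the inference-to-training ratio instead of the share.
    start_ratio = start_share / (1 - start_share)
    target_ratio = target_share / (1 - target_share)
    if target_ratio <= start_ratio:
        return 0.0  # already at or past the target

    # Factor by which the ratio grows each year.
    relative_growth = (1 + inference_growth) / (1 + training_growth)
    if relative_growth <= 1:
        return math.inf  # inference never gains share

    return math.log(target_ratio / start_ratio) / math.log(relative_growth)


# Hypothetical growth-rate pairs (training, inference), for illustration only.
for training_growth, inference_growth in [(0.3, 0.6), (0.3, 1.5)]:
    to_half = years_to_share(0.35, 0.50, training_growth, inference_growth)
    to_70 = years_to_share(0.35, 0.70, training_growth, inference_growth)
    print(f"training +{training_growth:.0%}, inference +{inference_growth:.0%}: "
          f"50% share in {to_half:.1f} years, 70% share in {to_70:.1f} years")
```

The wider the gap between the two growth rates, the sooner the crossover arrives, so any timeline is very sensitive to what "2–5x faster" actually means in practice.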
2) Why inference inevitably dominates
This isn’t speculative—it’s structural.
A. Economics flips the equation
Training is:
• episodic
• capital intensive
• but infrequent
Inference is:
• continuous
• usage-based
• tied to real-world adoption
Once AI is embedded everywhere:
• every app call
• every workflow
• every agent loop
→ becomes inference demand
⸻
B. “Token explosion” effect
As models get better:
• People use them more often
• Tasks get longer + more complex
• Agents call models recursively
Result: Inference scales superlinearly with usefulness
⸻
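One way to see why the effect compounds: daily token demand is roughly a product of several usage factors, so modest growth in each one multiplies. A minimal sketch, where the four-factor breakdown, the function name `daily_tokens`, and every number are illustrative assumptions rather than measured data:

```python
def daily_tokens(users, tasks_per_user, calls_per_task, tokens_per_call):
    """Rough daily token demand as a product of four usage factors."""
    return users * tasks_per_user * calls_per_task * tokens_per_call


# Hypothetical baseline, for illustration only.
baseline = dict(
    users=1_000_000,       # people (or systems) using the model
    tasks_per_user=5,      # how often each one uses it per day
    calls_per_task=2,      # model calls per task (agents push this up)
    tokens_per_call=1_000, # longer, more complex tasks push this up
)

# Assume every factor grows by the same modest amount.
growth = 1.5
grown = {name: value * growth for name, value in baseline.items()}

multiplier = daily_tokens(**grown) / daily_tokens(**baseline)
print(f"Each factor x{growth} -> total demand x{multiplier:.2f}")
```

Because the factors multiply, a 1.5x increase in each of the four yields roughly a 5x increase in total demand. That is the sense in which inference can grow faster than any single measure of usage.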
C. Enterprise + agents = the inflection
The real tipping point isn’t ChatGPT-style usage. It’s:
• autonomous agents
• enterprise workflows
• machine-to-machine AI calls
That’s when: inference becomes invisible infrastructure
⸻
3) When does inference dominate?
Likely timeline (high-confidence directional view):
• 2023–2025 → Training dominates (we just lived this)
• 2026–2028 → Crossover phase
• ~2028–2030 → Inference clearly dominates (70%+ of compute)
We’re probably early in the crossover right now.
⸻
4) What changes when inference dominates?
This is where your earlier point about business models becomes critical.
The winners shift:
Training era winners:
• NVIDIA (high-end GPUs)
• hyperscalers (capex scale)
• frontier labs
Inference era winners:
• Low-cost, high-efficiency silicon (ASICs, edge chips)
• vertically integrated ecosystems (again, Apple matters here)
• distributed compute networks
• possibly telecom + edge infrastructure players
⸻
Infrastructure changes:
• From: giant centralized clusters
• To: distributed + latency-sensitive systems
Inference pushes toward:
• edge compute
• regional micro data centers
• on-device AI
⸻
5) Now—the “singularity” question
Let’s be precise, because this term gets abused.
Technological Singularity = a point where:
• AI improves itself recursively
• progress becomes uncontrollable / exponential
⸻
Critical insight: Inference dominance ≠ singularity
They are related but not equivalent.
⸻
6) How they are connected
Inference dominance enables something important:
A. Continuous learning loops
When inference is everywhere:
• models observe the world in real time
• generate data constantly
• feed training pipelines continuously
This creates: quasi-continuous training systems
⸻
B. Agent-driven self-improvement
If agents:
• write code
• run experiments
• evaluate results
Then you start getting: partial recursive self-improvement
⸻
7) So how close are we?
Blunt answer: We are NOT close to singularity yet.
Even with rapid progress:
• Models still lack:
  • true autonomy
  • robust long-horizon reasoning
  • reliable self-improvement loops
⸻
Realistic staging:
Stage 1 (now → ~2028)
• Inference comes to dominate
• Agents proliferate
• AI everywhere, but still tool-like
Stage 2 (~2028–2035)
• Partial self-improving systems
• AI accelerates R&D meaningfully
• “soft takeoff” dynamics begin
Stage 3 (uncertain, post-2035?)
• Possible rapid recursive improvement
• This is where singularity debates become real
⸻
8) The deeper takeaway (this is the non-obvious part)
The real signal to watch is NOT:
• inference %
• model size
It’s: Does AI meaningfully accelerate its own improvement cycle?
That’s the hinge.
⸻
9) Bringing it back to your original framing
Your LDC / infrastructure question becomes even more important:
• Training world → centralized, capex-heavy
• Inference world → distributed, usage-heavy
And the big question:
Do hyperscalers still dominate inference?
or
Does inference fragment across edge + specialized providers?
⸻
Bottom line
• We are early in the shift from training → inference dominance (early innings, not late)
• Inference likely dominates within ~3–5 years
• This shift does NOT equal singularity, but it lays the groundwork for continuous learning systems
And the real “clock” on singularity isn’t inference share. It’s when AI systems start improving themselves faster than humans can meaningfully track or control.
⸻
If you want to go one level deeper, the next question is fascinating: Does inference becoming cheap and ubiquitous actually slow down frontier training? (counterintuitive, but very possible)