Friday, March 27, 2026

Inference Dominating Training -- Demand And The Technological Singularity Question -- March 27, 2026

How far along the curve are we in moving from training to inference? In other words, when will inference, rather than training, dominate compute demand? And as inference trends toward 100% of demand, how close are we to the "technological singularity," and when might that occur?

You’re really asking two different but tightly linked questions:

1. Where are we on the training → inference shift?

2. Does inference dominance = approaching “singularity”?

They connect, but not in a straight line. Let’s unpack carefully. ⸻

 

1) Where we are right now (2026): Still training-heavy—but turning 

Today’s frontier is still defined by training runs: 

• OpenAI 

• Google DeepMind 

 • Anthropic

…are all still spending billions on:

 • larger pretraining runs 

 • post-training (RLHF, synthetic data loops)

 • multimodal expansion 

But here’s the key shift: Inference demand is already exploding faster than training demand. 

Think of it this way: 

 • Training = one massive event 

 • Inference = millions to billions of daily queries

We are roughly here on the curve: ~60–70% training / 30–40% inference (by compute intensity at the frontier)

BUT inference is growing ~2–5x faster ⸻
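To make the arithmetic behind that split concrete, here is a minimal sketch that projects inference's share of compute forward year by year. Every number in it is an assumption chosen for illustration: the 65/35 starting split is the midpoint of the ranges above, the 30% annual growth rate for training is hypothetical, and "2–5x faster" is read as inference's growth rate being that multiple of training's. The function names are made up for this example.

```python
def inference_share_by_year(train_share=0.65, train_rate=0.30, speedup=3.0, years=5):
    """Project inference's share of total compute for year 0..years.

    All defaults are illustrative assumptions, not measurements:
    - train_share: training's share of compute today (midpoint of ~60-70%)
    - train_rate: assumed annual growth rate of training compute
    - speedup: how many times faster inference grows (the text says ~2-5x)
    """
    training = train_share
    inference = 1.0 - train_share
    shares = []
    for _ in range(years + 1):
        shares.append(inference / (training + inference))
        training *= 1.0 + train_rate             # training grows at the base rate
        inference *= 1.0 + speedup * train_rate  # inference grows at a multiple of it
    return shares


def first_year_above(shares, threshold):
    """Return the first year index whose share exceeds threshold, or None."""
    for year, share in enumerate(shares):
        if share > threshold:
            return year
    return None


if __name__ == "__main__":
    for speedup in (2.0, 3.0, 5.0):
        shares = inference_share_by_year(speedup=speedup)
        print(speedup, [round(s, 2) for s in shares],
              "passes 50% in year", first_year_above(shares, 0.5))
```

With the 3x default, this toy model passes 50% in year 2 and 70% in year 4. The result is very sensitive to the assumed growth rates, so treat it as a way to explore the ranges, not as a forecast.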

 2) Why inference inevitably dominates 

This isn’t speculative—it’s structural. 

 A. Economics flips the equation 

Training is: 

 • episodic 

 • capital intensive 

 • but infrequent

Inference is:

• continuous 

 • usage-based 

 • tied to real-world adoption 

Once AI is embedded everywhere:

 • every app call

 • every workflow

 • every agent loop

→ becomes inference demand ⸻

B. “Token explosion” effect

As models get better:

 • People use them more often 

 • Tasks get longer + more complex

 • Agents call models recursively

 Result: Inference scales superlinearly with usefulness ⸻ 
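One way to see why the scaling would be superlinear: if each of the three drivers above grows in proportion to how useful the model is, total token demand grows with their product. The sketch below is a toy model only. The "usefulness" multiplier, the baseline numbers, and the assumption that every driver scales linearly are all hypothetical.

```python
def daily_tokens(usefulness, base_queries=1_000_000,
                 base_tokens_per_task=500, base_agent_calls=1):
    """Toy estimate of daily inference tokens.

    Assumes (hypothetically) that each driver scales linearly with a
    'usefulness' multiplier; the baselines are made-up round numbers.
    """
    queries = base_queries * usefulness                  # people use models more often
    tokens_per_task = base_tokens_per_task * usefulness  # tasks get longer and more complex
    agent_calls = base_agent_calls * usefulness          # agents call models recursively
    return queries * tokens_per_task * agent_calls


if __name__ == "__main__":
    baseline = daily_tokens(1)
    for u in (1, 2, 3):
        print(u, daily_tokens(u) / baseline)
```

Under these assumptions, doubling usefulness multiplies token demand by eight, because three linear effects compound into a cubic one.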

 C. Enterprise + agents = the inflection

The real tipping point isn’t ChatGPT-style usage. It’s:

 • autonomous agents 

 • enterprise workflows 

 • machine-to-machine AI calls

That’s when: inference becomes invisible infrastructure ⸻

3) When does inference dominate? 

Likely timeline (high-confidence directional view): 

 • 2023–2025 → Training dominates (we just lived this) 

 • 2026–2028 → Crossover phase 

 • ~2028–2030 → Inference clearly dominates (70%+ of compute)

We’re probably early in the crossover right now. ⸻

 

4) What changes when inference dominates? 

This is where your earlier point about business models becomes critical. 

The winners shift: 

Training era winners: 

 • NVIDIA (high-end GPUs) 

• hyperscalers (capex scale) 

• frontier labs 

Inference era winners: 

 • Low-cost, high-efficiency silicon (ASICs, edge chips) 

 • vertically integrated ecosystems (again, Apple matters here)

 • distributed compute networks 

• possibly telecom + edge infrastructure players ⸻ 

 

Infrastructure changes: 

 • From: giant centralized clusters 

 • To: distributed + latency-sensitive systems 

Inference pushes toward: 

 • edge compute

 • regional micro data centers 

• on-device AI ⸻ 

 

5) Now—the “singularity” question 

 Let’s be precise, because this term gets abused. 

Technological Singularity = a point where: 

 • AI improves itself recursively

 • progress becomes uncontrollable / exponential ⸻

 

 Critical insight: Inference dominance ≠ singularity

They are related but not equivalent. ⸻

 

 6) How they are connected 

 Inference dominance enables something important: 

A. Continuous learning loops 

When inference is everywhere: 

 • models observe the world in real time 

• generate data constantly 

• feed training pipelines continuously 

This creates: quasi-continuous training systems ⸻ 

B. Agent-driven self-improvement 

If agents: 

 • write code 

• run experiments 

• evaluate results 

Then you start getting: partial recursive self-improvement ⸻ 
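The write / run / evaluate loop above has the same shape as a simple hill-climbing search, sketched below. This is an illustration of the loop's structure only, not a description of how any lab's agents work: `propose`, `evaluate`, and the quadratic "benchmark" are hypothetical stand-ins, and a real system would be changing code or model weights instead of a single number.

```python
import random


def evaluate(candidate):
    """Hypothetical benchmark: the score peaks when candidate equals 3.0."""
    return -(candidate - 3.0) ** 2


def propose(current, rng, step=0.5):
    """Stand-in for 'write code': a small random change to the current best."""
    return current + rng.uniform(-step, step)


def improvement_loop(start=0.0, rounds=200, seed=0):
    """Hill climbing: keep a change only if it measurably scores better."""
    rng = random.Random(seed)
    best, best_score = start, evaluate(start)
    for _ in range(rounds):
        candidate = propose(best, rng)   # write code
        score = evaluate(candidate)      # run the experiment and evaluate the result
        if score > best_score:           # accept only measured improvements
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    best, score = improvement_loop()
    print(best, score)
```

The "partial" in partial self-improvement shows up here as a design choice: the loop improves the candidate, but the evaluator and the proposal rule stay fixed and human-written. A fully recursive system would also be rewriting those.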

7) So how close are we? 

 Blunt answer: We are NOT close to singularity yet

Even with rapid progress: 

• Models still lack: 

 • true autonomy 

• robust long-horizon reasoning 

 • reliable self-improvement loops ⸻ 

Realistic staging: 

Stage 1 (now → ~2028) 

 • Inference overtakes training (the crossover phase)

• Agents proliferate 

• AI everywhere, but still tool-like 

Stage 2 (~2028–2035) 

 • Partial self-improving systems 

• AI accelerates R&D meaningfully 

 • “soft takeoff” dynamics begin

Stage 3 (uncertain, post-2035?)

 • Possible rapid recursive improvement

 • This is where singularity debates become real ⸻

 

 8) The deeper takeaway (this is the non-obvious part) 

The real signal to watch is NOT: 

 • inference % 

 • model size 

 It’s: Does AI meaningfully accelerate its own improvement cycle?

That’s the hinge.

 

 9) Bringing it back to your original framing 

Your LDC / infrastructure question becomes even more important: 

• Training world → centralized, capex-heavy 

 • Inference world → distributed, usage-heavy 

And the big question:

 • Do hyperscalers still dominate inference?

 • Or does inference fragment across edge + specialized providers? ⸻

 

Bottom line 

 • We are early in the shift from training → inference dominance (early innings of the crossover, not late)

• Inference likely dominates within ~3–5 years 

 • This shift does NOT equal singularity, but it lays the groundwork for continuous learning systems

And the real “clock” on singularity isn’t inference share. It’s when AI systems start improving themselves faster than humans can meaningfully track or control. ⸻

 

If you want to go one level deeper, the next question is fascinating: Does inference becoming cheap and ubiquitous actually slow down frontier training? (counterintuitive, but very possible)
