MDW Ramblings: Inference Dominating Training -- Demand And The Technological Singularity Question -- March 27, 2026

How far along the curve are we in moving from training to inference? In other words, when will inference, rather than training, dominate demand? And as inference trends toward 100% of demand, how close are we to the "technological singularity," and when might that occur?

You’re really asking two different but tightly linked questions:

1. Where are we on the training → inference shift?

2. Does inference dominance = approaching “singularity”?

They connect, but not in a straight line. Let’s unpack carefully.

⸻

1) Where we are right now (2026): Still training-heavy—but turning

Today’s frontier is still defined by training runs:

• OpenAI

• Google DeepMind

• Anthropic

...are all still spending billions on:

• larger pretraining runs

• post-training (RLHF, synthetic data loops)

• multimodal expansion

But here’s the key shift: Inference demand is already exploding faster than training demand.

Think of it this way:

• Training = one massive event

• Inference = millions to billions of daily queries

We are roughly here on the curve: ~60–70% training / 30–40% inference (by compute intensity at the frontier), but inference is growing ~2–5x faster.

⸻

2) Why inference inevitably dominates

This isn’t speculative—it’s structural.

A. Economics flips the equation

Training is:

• episodic

• capital intensive

• but infrequent

Inference is:

• continuous

• usage-based

• tied to real-world adoption

Once AI is embedded everywhere:

• every app call

• every workflow

• every agent loop

→ becomes inference demand

⸻
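To make the flip concrete, here is a minimal back-of-the-envelope sketch in Python. It treats training as a one-time cost and inference as a daily running cost, then asks how many days of usage it takes for cumulative inference spend to match the training bill. The function name and every number in the usage example are hypothetical, chosen only for illustration; none of them come from real lab budgets.

```python
def days_to_match_training(training_cost, queries_per_day, cost_per_query):
    """Days of inference needed for cumulative inference cost to equal
    a one-time training cost. All inputs are illustrative assumptions."""
    daily_inference_cost = queries_per_day * cost_per_query
    if daily_inference_cost <= 0:
        raise ValueError("daily inference cost must be positive")
    return training_cost / daily_inference_cost


# Hypothetical inputs: a $100M training run, 1B queries/day, $0.001 per query.
days = days_to_match_training(100e6, 1e9, 0.001)
print(f"Inference spend matches the training run after ~{days:.0f} days")
```

With these made-up inputs the crossover comes after roughly 100 days. The point is only directional: the training cost is paid once, while the inference cost keeps accruing with usage.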

B. “Token explosion” effect

As models get better:

• People use them more often

• Tasks get longer + more complex

• Agents call models recursively

Result: Inference scales superlinearly with usefulness

⸻
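One way to see why the scaling is superlinear: the three effects multiply rather than add. The toy sketch below assumes, purely for illustration, that queries per user, tokens per task, and agent calls per task each grow in proportion to a single "usefulness" index. The function, the index, and the baseline values are all hypothetical, not measurements.

```python
def daily_tokens(usefulness, base_queries=10.0, base_tokens=500.0, base_calls=1.0):
    """Toy model of daily token demand per user.

    Assumes (for illustration only) that each factor scales linearly
    with a hypothetical usefulness index, so demand scales as its cube.
    """
    queries = base_queries * usefulness      # people use the model more often
    tokens = base_tokens * usefulness        # tasks get longer and more complex
    calls = base_calls * usefulness          # agents call models recursively
    return queries * tokens * calls


baseline = daily_tokens(1.0)
doubled = daily_tokens(2.0)
print(f"Doubling usefulness multiplies token demand by {doubled / baseline:.0f}x")
```

Under these assumptions, doubling usefulness multiplies demand by 8, not 2. Real-world exponents are unknown; the sketch only shows how compounding factors produce superlinear growth.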

C. Enterprise + agents = the inflection

The real tipping point isn’t ChatGPT-style usage. It’s:

• autonomous agents

• enterprise workflows

• machine-to-machine AI calls

That’s when: inference becomes invisible infrastructure

⸻

3) When does inference dominate?

Likely timeline (high-confidence directional view):

• 2023–2025 → Training dominates (we just lived this)

• 2026–2028 → Crossover phase

• ~2028–2030 → Inference clearly dominates (70%+ of compute)

We’re probably early in the crossover right now.

⸻
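The timeline can be sanity-checked against the rough shares quoted earlier (about 30–40% inference today, growing ~2–5x faster than training). The sketch below is one possible reading, not a forecast: it assumes both workloads grow at constant annual rates and interprets "N times faster" as N times training's annual percentage growth. The function name, the 50% training growth rate, and that interpretation are all assumptions made here.

```python
import math


def years_to_share(start_share, target_share, train_growth, speedup):
    """Years until inference reaches `target_share` of total compute.

    start_share  : inference share of compute today (e.g. 0.35)
    train_growth : assumed annual multiplier for training compute (e.g. 1.5)
    speedup      : how many times faster inference grows, applied to the
                   percentage growth rate (an interpretive assumption)
    """
    if not (0 < start_share < 1 and 0 < target_share < 1):
        raise ValueError("shares must be strictly between 0 and 1")
    infer_growth = 1 + speedup * (train_growth - 1)
    start_ratio = start_share / (1 - start_share)      # inference : training
    target_ratio = target_share / (1 - target_share)
    if target_ratio <= start_ratio:
        return 0.0                                      # already there
    relative = infer_growth / train_growth
    if relative <= 1:
        return math.inf                                 # inference never catches up
    return math.log(target_ratio / start_ratio) / math.log(relative)


for speedup in (2, 5):
    t50 = years_to_share(0.35, 0.50, 1.5, speedup)
    t70 = years_to_share(0.35, 0.70, 1.5, speedup)
    print(f"{speedup}x faster: 50% share in {t50:.1f} yrs, 70% share in {t70:.1f} yrs")
```

At the slow end of the range (2x) these assumptions put the 50% crossover about two years out and the 70% mark about five years out; at the fast end (5x) both arrive much sooner. The answer is very sensitive to the assumed growth rates, which is why the timeline is best read as directional.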

4) What changes when inference dominates?

This is where your earlier point about business models becomes critical.

The winners shift:

Training era winners:

• NVIDIA (high-end GPUs)

• hyperscalers (capex scale)

• frontier labs

Inference era winners:

• Low-cost, high-efficiency silicon (ASICs, edge chips)

• vertically integrated ecosystems (again, Apple matters here)

• distributed compute networks

• possibly telecom + edge infrastructure players

⸻

Infrastructure changes:

• From: giant centralized clusters

• To: distributed + latency-sensitive systems

Inference pushes toward:

• edge compute

• regional micro data centers

• on-device AI

⸻

5) Now—the “singularity” question

Let’s be precise, because this term gets abused.

Technological Singularity = a point where:

• AI improves itself recursively

• progress becomes uncontrollable / exponential

⸻

Critical insight: Inference dominance ≠ singularity

They are related but not equivalent.

⸻

6) How they are connected

Inference dominance enables something important:

A. Continuous learning loops

When inference is everywhere:

• models observe the world in real time

• generate data constantly

• feed training pipelines continuously

This creates: quasi-continuous training systems

⸻

B. Agent-driven self-improvement

If agents:

• write code

• run experiments

• evaluate results

Then you start getting: partial recursive self-improvement

⸻

7) So how close are we?

Blunt answer: We are NOT close to singularity yet.

Even with rapid progress:

Models still lack:

• true autonomy

• robust long-horizon reasoning

• reliable self-improvement loops

⸻

Realistic staging:

Stage 1 (now → ~2028)

• Inference overtakes training (the crossover phase)

• Agents proliferate

• AI everywhere, but still tool-like

Stage 2 (~2028–2035)

• Partial self-improving systems

• AI accelerates R&D meaningfully

• “soft takeoff” dynamics begin

Stage 3 (uncertain, post-2035?)

• Possible rapid recursive improvement

• This is where singularity debates become real

⸻

8) The deeper takeaway (this is the non-obvious part)

The real signal to watch is NOT:

• inference %

• model size

It’s: Does AI meaningfully accelerate its own improvement cycle?

That’s the hinge.

⸻
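A toy illustration of why the improvement cycle is the hinge: suppose each generation of AI shortens the next development cycle by a fixed factor. The sketch below is a thought experiment only. The function name, the one-year first cycle, and the shrink factors are hypothetical, and nothing here claims real systems behave this way.

```python
def total_time(first_cycle_years, shrink, generations):
    """Total years to complete `generations` improvement cycles when each
    cycle takes `shrink` times as long as the one before it."""
    total = 0.0
    cycle = first_cycle_years
    for _ in range(generations):
        total += cycle
        cycle *= shrink      # shrink = 1.0 means no self-acceleration
    return total


# Compare no acceleration with mild and strong self-acceleration.
for shrink in (1.0, 0.9, 0.5):
    years = total_time(1.0, shrink, 50)
    print(f"shrink={shrink}: 50 generations take {years:.2f} years")
```

With no acceleration (shrink = 1.0), 50 generations take 50 years. With any shrink factor below 1, the total stays under first_cycle / (1 - shrink) no matter how many generations run: under 10 years at 0.9, under 2 years at 0.5. That bounded total is the mathematical shape behind "rapid recursive improvement", and it depends entirely on whether cycles actually shorten.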

9) Bringing it back to your original framing

Your LDC / infrastructure question becomes even more important:

• Training world → centralized, capex-heavy

• Inference world → distributed, usage-heavy

And the big question:

• Do hyperscalers still dominate inference?

• Or does inference fragment across edge + specialized providers?

⸻

Bottom line

• We are early in the shift from training → inference dominance (early innings, not late)

• Inference likely dominates within ~3–5 years

• This shift does NOT equal singularity, but it lays the groundwork for continuous learning systems

And the real “clock” on singularity isn’t inference share. It’s when AI systems start improving themselves faster than humans can meaningfully track or control.

⸻

If you want to go one level deeper, the next question is fascinating: Does inference becoming cheap and ubiquitous actually slow down frontier training? (counterintuitive, but very possible)
