Goldman Sachs Warns: AI Inference Cost Deflation May Suppress CapEx Expansion Expectations
Claire Weston
Goldman's One-Delta desk head Rich Privorotsky argues that falling AI inference costs are cracking the 'sustained compute CapEx growth' narrative. Capital is already rotating out of AI leaders into healthcare, and Microsoft's late-July fiscal-year guidance will be the next critical test.
Why could inference cost deflation crack the AI CapEx story?
The core logic: when firms can get the same AI output for less money, the first instinct is to pause and optimize spending — not double down. This means → as effective output per dollar keeps rising, the total spend needed to hit the same business goals actually falls.
In plain terms = cheaper AI does not automatically mean more hardware purchases — it more likely means companies pocket the savings first.
Privorotsky calls this one of the clearest cases yet of inference cost deflation — the steady drop in the per-unit price of each AI "thinking" step — directly weakening the demand driver behind continued compute-infrastructure expansion.
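The arithmetic behind this argument can be sketched in a few lines of Python. This is only an illustration: the function names and every number below are hypothetical, not figures from Goldman or any company. It shows that spend on a fixed goal falls as output per dollar rises, and that total spend can fall even while usage grows, provided unit cost drops faster.

```python
def required_spend(target_output: float, output_per_dollar: float) -> float:
    """Dollars needed to reach a fixed output target (hypothetical units)."""
    if output_per_dollar <= 0:
        raise ValueError("output_per_dollar must be positive")
    return target_output / output_per_dollar


def spend_change(usage_growth: float, unit_cost_decline: float) -> float:
    """Fractional change in total spend.

    usage_growth: e.g. 0.5 means usage rises 50%.
    unit_cost_decline: e.g. 0.7 means unit cost falls 70%.
    """
    return (1 + usage_growth) * (1 - unit_cost_decline) - 1


# Same business goal, doubled efficiency: the required budget halves.
before = required_spend(1_000_000, output_per_dollar=100)
after = required_spend(1_000_000, output_per_dollar=200)
print(before, after)

# Usage up 50%, unit cost down 70%: total spend still falls.
print(round(spend_change(0.5, 0.7), 2))
```

Spend only rises when usage growth outpaces the cost decline, which is the condition the CapEx growth narrative implicitly depends on.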
What does the Coinbase case show?
Coinbase CEO Brian Armstrong revealed the company routes simple tasks to cheaper local or open-source models and reserves frontier models for complex reasoning. AI spending has been cut nearly in half, yet token usage is still growing exponentially.
This means → the goal is not to suppress usage but to make exponential growth sustainable through routing — spend less, do more.
Armstrong added that this model routing — automatically picking the right model for each task's difficulty — should eventually be handled by AI itself, not by humans. This reflects a broader shift in corporate AI spending from "buy as much as possible" to "spend as smartly as possible."
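A minimal sketch of what difficulty-based routing could look like is shown below. It is not Coinbase's system: the tier names, per-token prices, keyword heuristic, and threshold are all assumptions made up for illustration. In practice the difficulty estimate would more likely come from a learned classifier or, as Armstrong suggests, from an AI model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical USD price, not a real quote


# Hypothetical tiers; names and prices are illustrative assumptions.
LOCAL = ModelTier("local-open-source", 0.0002)
FRONTIER = ModelTier("frontier", 0.01)

# Hypothetical keywords that hint at complex reasoning.
HARD_KEYWORDS = ("prove", "analyze", "multi-step", "plan", "debug")


def estimate_difficulty(prompt: str) -> float:
    """Crude difficulty score in [0, 1] from length and keywords."""
    score = min(len(prompt.split()) / 200, 0.5)
    if any(k in prompt.lower() for k in HARD_KEYWORDS):
        score += 0.5
    return min(score, 1.0)


def route(prompt: str, threshold: float = 0.5) -> ModelTier:
    """Send hard prompts to the frontier tier, everything else to the cheap tier."""
    if estimate_difficulty(prompt) >= threshold:
        return FRONTIER
    return LOCAL


def blended_cost(prompts, tokens_per_prompt: int = 1000) -> float:
    """Total cost of a batch, assuming a fixed token count per prompt."""
    return sum(
        route(p).cost_per_1k_tokens * tokens_per_prompt / 1000 for p in prompts
    )


batch = [
    "Summarize this email",
    "Analyze the failure and plan a multi-step fix",
]
for p in batch:
    print(p, "->", route(p).name)
```

With these made-up prices, a batch that is mostly simple prompts costs far less than sending everything to the frontier tier, which is how spending can fall while token volume keeps growing.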
Where is the money moving?
Goldman's Prime Book data shows a clear U.S.-led de-risking last week. Investors sharply cut tech-sector exposure, consistent with recent large-cap tech underperformance versus the broader market.
Privorotsky's read: hardware names still feel crowded, while some software stocks and hyperscalers — the mega-cloud platforms like AWS, Azure, and Google Cloud — are starting to look oversold.
Meanwhile, healthcare (XLV, XBI) is gaining strength, benefiting from falling oil prices, rising confidence that the Fed is re-anchoring inflation expectations, and a growing willingness to hold assets beyond AI.
Retail and institutions — who is telling the truth?
Goldman's sentiment indicator has broken above +2; the AAII bull-bear spread has surged; the NAAIM survey is back near the top of its historical range — retail sentiment is turning decisively bullish.
Yet the CNN Fear & Greed Index remains in fear territory, driven by persistently elevated put buying and weak market breadth. This means → institutional investors are not following retail into optimism; they are quietly turning cautious.
In plain terms = retail is adding exposure while institutions are trimming — that divergence is itself a risk signal.
Where is the next key checkpoint?
Microsoft may provide the first substantive signal on FY2027 capital expenditure when it issues guidance for the new fiscal year in late July.
This means → the market's pricing of the "sustained AI CapEx growth" narrative then faces a critical reality check: if Microsoft's guidance falls short, the dampening effect of inference cost deflation moves from theory to fact.
Until then, hardware-sector crowding and the retail-institutional sentiment divergence both warrant close tracking.
Content is for reference only, not financial advice.