NVIDIA May Cut Vera CPU Memory Specs to Control Costs

N.R. Finch

Published today · About 7 min read

Aletheia Capital analyst Jeff Pu says Nvidia is considering cutting SOCAMM memory per Vera CPU from 1.5 TB to 768 GB (the 96 GB figure appears to be a per-module size) as AI server memory prices hit "unprecedented" levels. Whether memory costs can be held to ~20% of rack BOM (bill of materials) will determine if this trade-off works.

How bad is the memory price surge?

AI server memory prices have reached what Aletheia Capital calls an "unprecedented" level, pushing up total rack costs.

This means → memory is no longer a minor line item — it has become the single largest cost pressure in AI server builds.

Nvidia's reported response is to cut on the spec side: ship less memory per unit rather than squeeze supplier pricing.

What exactly is Nvidia cutting?

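Vera CPU racks: SOCAMM, a compact module format that carries soldered LPDDR memory, drops from the spec-sheet figure of 1.5 TB per Vera CPU to 768 GB of LPDDR5X per CPU. The 96 GB figure cited in the report cannot be a per-CPU total if the CPU ends up with 768 GB; it reads as a per-module size, since eight 96 GB modules would give exactly 768 GB.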

General-purpose servers: RDIMM and/or MRDIMM (two standard server memory-stick formats) capacity is cut by roughly 50%.

In plain terms = both product lines go on a diet: Vera CPU capacity falls from 1.5 TB to 768 GB, the general line also loses roughly half, and total memory usage shrinks across the board.
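The capacity figures above can be checked with simple arithmetic. The sketch below is illustrative only: the count of eight SOCAMM modules per CPU is an assumption inferred from 768 / 96, not something the report states, and the function names are hypothetical.

```python
# Illustrative sketch: per-CPU SOCAMM capacity from module size and module count.
# MODULES_PER_CPU is an ASSUMPTION inferred from 768 / 96; the article does not state it.
MODULES_PER_CPU = 8


def capacity_gb(module_gb: int, modules: int = MODULES_PER_CPU) -> int:
    """Total LPDDR5X capacity per CPU, in GB."""
    return module_gb * modules


def cut_fraction(before_gb: float, after_gb: float) -> float:
    """Fraction of capacity removed, e.g. 0.5 for a 50% cut."""
    return 1 - after_gb / before_gb


spec_sheet_gb = 1.5 * 1024      # 1.5 TB expressed in GB (binary convention)
reduced_gb = capacity_gb(96)    # eight modules of 96 GB each

print(f"Reduced capacity per CPU: {reduced_gb} GB")
print(f"Cut versus spec sheet: {cut_fraction(spec_sheet_gb, reduced_gb):.0%}")
```

Under that assumption, the Vera CPU cut works out to about half of the spec-sheet capacity, the same proportion quoted for general-purpose servers.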

Why target LPDDR5X specifically?

Jeff Pu estimates that LPDDR5X — low-power double-data-rate 5X, a high-speed, low-power memory type — accounts for about 61% of the memory-and-storage budget in a Vera Rubin 200 rack.

This means → trimming other components barely moves the needle; only cutting LPDDR5X can meaningfully pull down the total bill.

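The target: bring memory costs to ~20% of the 2027 AI rack BOM. Note that the two percentages measure different things: 61% is LPDDR5X's share of the memory-and-storage budget, while 20% is memory's share of the whole rack bill of materials. The report as summarised here does not give memory's current share of rack BOM, so the size of the required reduction cannot be read directly off these figures.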
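To see how the 61% and 20% figures interact, it helps to work through the share arithmetic. The sketch below is a rough illustration, not the analyst's model. It assumes non-memory costs stay fixed, treats the memory-and-storage budget as "memory cost", and uses a starting share of 35% that is purely hypothetical; only the 61% and 20% values come from the article.

```python
# Illustrative sketch of the BOM-share arithmetic.
# ASSUMPTIONS (not from the source): non-memory costs stay fixed, and the
# starting memory share of rack BOM (start_share) is a hypothetical input.

def required_memory_cost(target_share: float, start_share: float) -> float:
    """Memory cost that hits target_share, with the original rack BOM normalised to 1.0."""
    other_cost = 1.0 - start_share
    # Solve m / (other_cost + m) = target_share for m.
    return target_share * other_cost / (1.0 - target_share)


def required_lpddr5x_cut(target_share: float, start_share: float,
                         lpddr5x_fraction: float) -> float:
    """Fractional cut in LPDDR5X spend needed if LPDDR5X alone absorbs the reduction."""
    new_memory_cost = required_memory_cost(target_share, start_share)
    total_memory_cut = 1.0 - new_memory_cost / start_share
    return total_memory_cut / lpddr5x_fraction


cut = required_lpddr5x_cut(
    target_share=0.20,       # from the article
    start_share=0.35,        # HYPOTHETICAL starting share of rack BOM
    lpddr5x_fraction=0.61,   # from the article
)
print(f"LPDDR5X spend would need to fall by about {cut:.0%}")
```

A result above 100% would mean that cutting LPDDR5X alone cannot reach the target under these assumptions, so other memory or storage lines would have to shrink as well.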

How does the shipping timeline shift?

General-purpose servers: Jeff Pu expects modest quarter-over-quarter growth in Q2 and Q3 2026, followed by a 20%–30% QoQ rebound in Q4 2026.

Vera CPU racks: delivery timelines are expected to slip, though the extent of the delay has not been disclosed.

This reflects a deliberate trade-off — Nvidia is prioritising a healthy cost structure before ramping high-end rack volumes.

Can this cost-control logic actually work?

The pivotal question: after deep cuts to memory specs, will AI compute performance take a meaningful hit?

If performance suffers, buyer appetite weakens — and cost control turns into self-inflicted damage.

In plain terms = saving money at the expense of product capability is not cost reduction, it is a downgrade — that line is where the entire strategy succeeds or fails.

Content is for reference only, not financial advice.