NVIDIA May Cut Vera CPU Memory Specs to Control Costs
N.R. Finch
Aletheia Capital analyst Jeff Pu says Nvidia is considering cutting SOCAMM memory per Vera CPU from 1.5 TB to 768 GB, apparently built from 96 GB modules, as AI server memory prices hit "unprecedented" levels. Whether memory costs can be held to ~20% of rack BOM will determine if this trade-off works.
How bad is the memory price surge?
AI server memory prices have reached what Aletheia Capital calls an "unprecedented" level, pushing up total rack costs.
This means → memory is no longer a minor line item — it has become the single largest cost pressure in AI server builds.
Nvidia's response is to cut from the spec side: reduce the amount of memory per unit, rather than squeeze supplier pricing.
What exactly is Nvidia cutting?
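Vera CPU racks: SOCAMM, a compact module format for LPDDR memory, drops from the spec-sheet figure of 1.5 TB per Vera CPU to 768 GB of LPDDR5X per CPU. The 96 GB figure cited alongside it most plausibly refers to the capacity of each SOCAMM module (8 × 96 GB = 768 GB) rather than to the per-CPU total.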
General-purpose servers: RDIMM and/or MRDIMM (two standard server memory-stick formats) capacity is cut by roughly 50%.
In plain terms = both product lines go on a diet — the high end loses absolute capacity, the general line loses half, and total memory usage shrinks across the board.
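The capacity figures for the Vera CPU can be sanity-checked with simple arithmetic. This is a minimal sketch, assuming 1.5 TB means 1,536 GB (binary units) and that 96 GB is a per-module capacity; neither assumption is stated in the report.

```python
# Figures quoted in the report; how they are interpreted here is an assumption.
SPEC_GB = 1.5 * 1024   # 1.5 TB spec-sheet figure, converted to GB (binary TB assumed)
REDUCED_GB = 768       # LPDDR5X per Vera CPU after the cut
MODULE_GB = 96         # assumed to be the capacity of one SOCAMM module

modules_per_cpu = REDUCED_GB / MODULE_GB   # how many 96 GB modules make up 768 GB
reduction = 1 - REDUCED_GB / SPEC_GB       # fraction of per-CPU capacity removed

print(f"modules per CPU: {modules_per_cpu:g}")    # prints "modules per CPU: 8"
print(f"capacity reduction: {reduction:.0%}")     # prints "capacity reduction: 50%"
```

On this reading, the Vera CPU cut is also a halving, which lines up with the roughly 50% reduction quoted for general-purpose servers.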
Why target LPDDR5X specifically?
Jeff Pu estimates that LPDDR5X — low-power double-data-rate 5X, a high-speed, low-power memory type — accounts for about 61% of the memory-and-storage budget in a Vera Rubin 200 rack.
This means → trimming other components barely moves the needle; only cutting LPDDR5X can meaningfully pull down the total bill.
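The target: bring memory costs to ~20% of the 2027 AI rack BOM. Note that Pu's 61% figure is LPDDR5X's share of the memory-and-storage budget, not memory's share of the whole BOM, so the two percentages are not directly comparable; the current BOM share is not disclosed.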
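To see how an LPDDR5X cut feeds through to the BOM share, here is a rough sketch. The function name, the 30% starting share, and the assumption that spend scales linearly with capacity at a constant price per GB are all hypothetical; only the 61% LPDDR5X fraction comes from Pu's estimate, and it is applied here to the memory-and-storage budget as a whole.

```python
def memory_share_after_cut(memory_share, lpddr_fraction, lpddr_cut):
    """Return memory's share of rack BOM after cutting LPDDR5X spend.

    memory_share:   memory-and-storage cost / total BOM before the cut (hypothetical)
    lpddr_fraction: LPDDR5X cost / memory-and-storage cost (0.61 per Pu's estimate)
    lpddr_cut:      fraction of LPDDR5X spend removed (0.5 means halved)
    """
    # Savings expressed as a fraction of the original BOM.
    saved = memory_share * lpddr_fraction * lpddr_cut
    # Both the memory cost and the total BOM shrink by the same amount.
    return (memory_share - saved) / (1 - saved)


# Hypothetical starting share of 30%; the report does not give this number.
share = memory_share_after_cut(memory_share=0.30, lpddr_fraction=0.61, lpddr_cut=0.5)
print(f"memory share of BOM after the cut: {share:.1%}")
```

Because the total BOM shrinks along with the memory bill, the share falls more slowly than the memory spend itself, which is one reason a ~20% target may require deep capacity cuts.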
How does the shipping timeline shift?
General-purpose servers: Jeff Pu expects modest quarter-over-quarter growth in Q2 and Q3 2026, followed by a 20%–30% QoQ rebound in Q4 2026.
Vera CPU racks: delivery timelines are expected to slip, though the extent of the delay has not been disclosed.
This reflects a deliberate trade-off — Nvidia is prioritising a healthy cost structure before ramping high-end rack volumes.
Can this cost-control logic actually work?
The pivotal question: after deep cuts to memory specs, will AI compute performance take a meaningful hit?
If performance suffers, buyer appetite weakens — and cost control turns into self-inflicted damage.
In plain terms = saving money at the expense of product capability is not cost reduction, it is a downgrade — that line is where the entire strategy succeeds or fails.
Content is for reference only, not financial advice.