Memory has grown to nearly two-thirds of AI chip component costs

Posted by intelkishan 2 hours ago

Memory has grown to nearly two-thirds of AI chip component costs (epoch.ai)

149 points | 160 comments

gpm 55 minutes ago|

An interesting implication of this is that AI inference and training have a path to a ~3x hardware cost reduction (and maybe a ~2x total cost reduction) without any technical innovation whatsoever: we just need to wait for DRAM supply to meet demand (either through manufacturing scaling up or through the current rate of manufacturing eventually filling the demand spike).
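
A back-of-the-envelope sketch of that arithmetic, assuming memory really is about two-thirds of component cost (the headline figure) and nothing else changes. The function name and the price-drop factors are hypothetical, picked only for illustration; they are not figures from the article.

```python
def hardware_cost_reduction(memory_share: float, memory_price_drop: float) -> float:
    """Factor by which total component cost falls if only memory gets cheaper.

    memory_share: fraction of component cost that is memory (e.g. 2/3).
    memory_price_drop: factor by which the memory price falls (5 means 5x cheaper).
    """
    if not 0.0 <= memory_share <= 1.0:
        raise ValueError("memory_share must be between 0 and 1")
    if memory_price_drop <= 0:
        raise ValueError("memory_price_drop must be positive")
    # Non-memory cost is unchanged; memory cost shrinks by the given factor.
    new_cost = (1.0 - memory_share) + memory_share / memory_price_drop
    return 1.0 / new_cost


# Assumed price drops (illustrative only).
for drop in (2, 5, 10):
    factor = hardware_cost_reduction(2 / 3, drop)
    print(f"memory {drop}x cheaper -> hardware {factor:.2f}x cheaper")
```

Under these assumptions ~3x is the ceiling, approached only as memory becomes nearly free; a 2x memory price drop gives 1.5x and a 10x drop gives 2.5x.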

radialstub 4 minutes ago||

The memory makers will not expand supply drastically. It is in the nature of their business to keep the market under-supplied, otherwise the following oversupply will kill them. Instead, supply is just rerouted from less profitable segments such as mobile and personal computing.

Waterluvian 36 minutes ago|||

What’s the lifespan/refurbishability of the capex elements like the “GPU” modules or even the DRAM soldered into them?

andrepd 12 minutes ago|||

I wonder if we will see an adoption of alternative floating point formats. IEEE floats are notoriously terrible at lower widths (<= 16 bits). Floating point formats such as posits do much better at 16 or 8 bits. If you could train at 16 bits per value instead of 32, and suffer a much smaller inaccuracy penalty than you would from IEEE32 to IEEE16...

cubefox 17 minutes ago|||

For some reason I still haven't heard any predictions on when new fabs will come online to meet the current demand. This shouldn't be too hard to find, since the building time of fabs is a very predictable process.

The difficult question is more whether foreseeable memory demand will remain at the current level, grow further, or shrink again.

shevy-java 8 minutes ago|||

> a path to a ~3x hardware cost reduction

Really?

How long do we have to wait until that ... cost reduction hits us?

sandworm101 17 minutes ago|||

Supply will not meet demand. What incentive do the handful of DRAM manufacturers have to end the party? This is what happens when legal monopolies finally win control. Don't worry. The patents will expire in a few decades. Our grandkids will see DDR5 get cheap again. The system functions as intended.

fitblipper 3 minutes ago||

I have a fairly simplistic view of the economics involved here. Could you explain why the ability to sell more chips wouldn't be sufficient incentive to increase supply?

eldenring 19 minutes ago||

2-3x is completely dwarfed by the remaining improvements in training, which is still relatively in its infancy.

gpm 9 minutes ago|||

Probably, but at some point we're very likely to run out of significant training improvements and it's not clear that we'll see that point coming from a long way out.

Likewise it's probably dwarfed by improvements in how we make DRAM - continuing the roughly exponential (maybe a bit less recently) scaling of chips - but not necessarily.

The 2x from returning to previous costs is interesting because it's practically guaranteed, and it's on top of everything else. We're just currently "overpaying" (relative to the stable market price) for the manufacture of DRAM because of a sudden increase in demand.

slicktux 2 hours ago||

I bought 96GB of RAM a couple of years ago for ~$250. That same RAM now costs $1200!

jmspring 3 minutes ago||

I just found two 4TB Samsung EVO drives - unused - while organizing my garage.

dawnerd 1 hour ago|||

I’m so mad I didn’t max out my main server when I had the chance. Used enterprise sticks were dirt cheap on eBay.

Forgeties79 19 minutes ago||

Used enterprise HDDs are also jacked up now. It’s absurd lol

dawnerd 7 minutes ago||

Yep, mad about that too. I was about halfway through upgrading my 45 drives server when they started to go up.

adroitboss 1 hour ago|||

I paid $279 for Crucial 96GB DDR5 5600 MHz SO-DIMM RAM on October 22 of last year. Amazon has the same kit going for $1,048.90 right now.

Joel_Mckay 1 hour ago||

Nice, you were lucky. =3

trollbridge 37 minutes ago|||

I bought 192GB of DDR3 a year ago for literally $60 ($5 a stick). It's about $22 a stick now, so more like $350 today. What on earth is _anybody_ doing with DDR3?

jlokier 23 minutes ago|||

Demand for DDR3 is up because people who want DDR5 or DDR4 but can't afford either any more are choosing DDR3 and old DDR3-compatible systems to put it in, instead of what they really want.

manquer 28 minutes ago||||

All memory products use many shared resources in the supply chain, so if there is high demand in one product line, others have to raise prices to compete for the resources or stop making those lines altogether.

That is to say, at least you were able to buy them at $350 today; on the current trajectory there will be no supply at all in a few months.

chinathrow 28 minutes ago|||

Being desperate?

bushbaba 1 hour ago|||

This throws prior assumptions that getting tens of gigs of RAM is cheap out the window. It would likely make super-fast SSDs such as Optane way more valuable.

moregrist 19 minutes ago||

The price of SSDs is similarly depressing.

Forgeties79 20 minutes ago|||

2x16GB for $105 total in April of 2025. $600 for that now. Makes no sense.

shevy-java 6 minutes ago|||

My main computer has 64GB. I bought that one in late 2022 or so.

Looking at the current prices, even of the same RAM, is just insane. Those companies really need to pay us damages here. The whole "free market" notion does not work when you have de-facto monopolies and mega-corporations abusing the average Joe and the average Jane.

IshKebab 53 minutes ago|||

I bought a couple of used computers with 256 GB of DDR4 (total) a year ago. The RAM is worth more than I paid for the whole machines now.

giancarlostoro 52 minutes ago|||

Ramflation

ksec 1 hour ago||

This is one of the things with consumers: they remember buying at the absolute lowest price point, when DRAM makers were bleeding money.

That is not normal pricing. Before the pricing collapse in early 2020, 96GB DDR5 would have cost about $450 to $500. And I will need to restate that the cost of DRAM hasn't really changed much in the past 20 years. Its price just goes up and down in cycles.

So in reality it is more like going from $500 to $1300. But consumers felt it was more like going from $200 to $1300.

Crucial are already selling DRAM made by CXMT. And China is already throwing money at it. I doubt the memory bubble will burst in the next 12-24 months, as in going back to money-losing DRAM pricing, since they will all pivot to HBM or other money-making products. But the bulk of lower-end consumer DDR5 or LPDDR5 will go to Chinese foundries, assuming they have figured out how to do them well. They have improved but are still far behind the industry leaders.

Normally memory makers would push the next DDR standard to market just to push out Chinese competitors, but I am not sure it will work the same this time around. DDR5 has plenty of other uses and demand.

cogman10 20 minutes ago|||

> Its price just goes up and down in cycles.

Historically the price has always trended downward. When I first got into computing $200 could buy you 128 MB (yes M) of ram. Really nice systems had 512 MB.

That's obviously changed over the decades as process shrinks have led to higher memory density. We should generally expect that RAM will get cheaper up until the point where process shrinks stop happening. They've definitely slowed, but they haven't stopped.

DoctorOetker 1 hour ago|||

> Crucial are already selling DRAM made by CXMT.

Crucial was disestablished this year.

voxic11 1 hour ago|||

He probably meant Corsair which is the DRAM brand selling CXMT produced chips.

trollbridge 36 minutes ago|||

Ah, the old decrucialisestablishmentarianism.

DoctorOetker 33 minutes ago||

I found the phrasing weird myself; I quoted Wikipedia.

mchusma 1 hour ago||

Everything I read seems to suggest that RAM capacity is going to grow at 20-25% a year, which just doesn't seem good enough. Even in consumer use cases, phones and laptops would benefit greatly from double the amount of RAM. And then obviously, the AI need is gigantic.

I don't see it going away. I mean, it may not grow as fast as now, but I don't see it going away either. I get why the memory makers do not want to bankrupt themselves, but it feels like there's got to be some way to push that risk off onto model providers and other people in the ecosystem to allow us to grow RAM capacity more like 50% per year.

foota 31 minutes ago||

In theory the new futures markets for chip components would help here, since it would allow DRAM suppliers to insulate themselves from that risk.

DoctorOetker 50 minutes ago|||

According to the recent article, HBM is 3x less efficient area-wise per wafer than LPDDR, but the bandwidth is more than triple.

What if it's in everyone's interest to buy computers at say 1/3rd the rate and switch everything over to HBM?

The discrepancy between compute and memory has been growing for ages; perhaps a painful switch to HBM is exactly what we need?

Would you rather have 3 intermediate computers with low memory bandwidth, or wait a little longer statistically so that we can all enjoy a new computer at 1/3rd the rate but much higher bandwidth than the area ratio?

FuckButtons 40 minutes ago|||

These are fundamentally different points in design space, though: HBM doesn’t have a 10 mW idle draw like LPDDR does.

thfuran 22 minutes ago||||

Not many workloads are RAM bandwidth limited. Power and latency are much more common bottlenecks, and HBM loses on both of those.

pastel8739 1 minute ago||

Isn’t memory bandwidth super relevant for AI?

aurareturn 36 minutes ago|||

Can’t put HBM in smartphones and laptops. The power drain is too great.

minraws 1 hour ago||

I mean the biggest risk is Chinese CXMT benefiting and capturing markets that others are leaving hanging, and then being able to compete and push out the others when costs start to normalize.

As for 20-25% growth not being enough, I think it's not that far off, if we assume data center build out plans hit a wall and slow down significantly, and the AI heat starts to cool off.

20-25% may not be enough in the short term, but if the AI build-out stops within this year, we will have a massive oversupply instead of an undersupply.

blululu 33 minutes ago|||

Looking at the history of the memory industry, the biggest risk is that a firm would overproduce and go bankrupt. Maybe this time is different, but so far no memory chip maker has gone under because their competition increased capacity.

minraws 17 minutes ago||

I might be wrong but your second point can't be true if the first one is true.

Let me explain. Imagine CXMT grows massive and builds a lot of fabs, so much so that it becomes the leader in multiple segments, and then market demand cools off.

Then CXMT, the company that invested massively, has oversupply, so it undercuts every other memory company.

That is, Samsung and SK Hynix are dead, and to protect Micron the US now has a 10000% tariff on the supply of memory.

Imagine. Because that has happened: if you don't play the boom-and-bust game, someone else will, because the market is very large during a boom, and the player scaling up the most generally isn't the one with margins to protect and generally has the ability to undercut others.

Asian memory chip giants were made by undercutting European and American companies. American companies adapted by moving manufacturing to Asia, and European ones got bought for pennies or dissolved.

galangalalgol 48 minutes ago||||

Is there any indication research is being focused on reducing the memory footprint of inference for frontier-class models? Is the low-hanging fruit already gone there?

minraws 31 minutes ago|||

Low-hanging? How low-hanging are we talking? The basic stuff is gone. The big challenges around quantization were largely solved 2 years ago, and we have just been improving from there.

But can massive gains still be made? Definitely.

The entire AI hype is based on the paper "Attention Is All You Need", and attention is basically loading a huge matrix of all the tokens in memory. How well you can optimize this attention layer is basically how most architectures are trying to solve for performance and memory usage.

The only one with significant gains in it is DeepSeek (or so I would like to believe, because others don't make their work open for folks like me, not in big AI labs, to read). Their MLA architecture reduced KV-cache memory requirements by up to 90%; of course, that's a purely architectural change.

With some quantization like TurboQuant from Google you could push it down to ~1/3 of that. So roughly 96-97% memory savings when talking about KV-cache.

But the models are close to being saturated for quantization-based memory optimizations. We will have to see some architectural changes for a significant shift now.
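
To make the compounding explicit, here is a minimal sketch that just multiplies the two figures quoted above (an "up to 90%" architectural reduction, then "~1/3 of that" from quantization). The helper name is hypothetical, and treating the two reductions as independent and fully stackable is an assumption.

```python
def remaining_fraction(*remaining_after_step: float) -> float:
    """Multiply out the fraction of memory left after each reduction step.

    Each argument is the fraction of memory that REMAINS after that step,
    assuming the steps are independent and stack multiplicatively.
    """
    remaining = 1.0
    for fraction in remaining_after_step:
        if not 0.0 < fraction <= 1.0:
            raise ValueError("each fraction must be in (0, 1]")
        remaining *= fraction
    return remaining


mla_remaining = 1.0 - 0.90     # "up to 90%" reduction leaves about 10%
quant_remaining = 1.0 / 3.0    # quantization pushes it to "~1/3 of that"

left = remaining_fraction(mla_remaining, quant_remaining)
print(f"KV-cache left: {left:.1%}, saved: {1 - left:.1%}")
# prints: KV-cache left: 3.3%, saved: 96.7%
```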

aurareturn 33 minutes ago|||

If they manage to make memory more efficient, they’ll just increase the context size and/or model size.

We just haven’t reached the diminishing return of gen AI capabilities yet.

Models will get more useful if you have higher context size or higher param size. Then people will just use the models even more, leading to even more memory demand.

zx8080 1 hour ago|||

What is the risk? Competition is good for consumers.

LPisGood 1 hour ago||

The risk is to the business, not the consumers.

johnvanommen 1 hour ago||

I really don’t want to give anyone ideas, but doesn’t this make the Nvidia 5090 an unbelievably good deal right now?

The VRAM in the 5090 is only made by one country in the world.

The 50xx series is special, because its RAM is so dependent on a single commodity. It’s not like a 4090 or a 3090; their VRAM chips have been around for years.

If there’s a shortage or interruption in GDDR7 VRAM, it seems like every GPU that requires it would explode in value.

I hope I don’t regret posting this because I’d really like to buy one myself…

layer8 1 hour ago||

An unbelievably good deal at $4000 plus?

johnvanommen 1 hour ago||

Possibly the best deal there is

I really need to shut up, or bite the bullet and buy one.

If you graph the tokens per second on the 5090, your jaw will hit the floor at how cheap it is.

gruez 1 hour ago|||

With only 32GB of VRAM, you can only run small/quantized models, in which case what's the point? At $4000, that gets you 20 months of 10x Claude or ChatGPT subscriptions, which provide far better models. You'd need some use case where you can tolerate worse models, and use a steady supply of them. That doesn't match most people's usage patterns.

echoangle 25 minutes ago|||

Or you want to process private data or don’t have reliable connectivity. There are a few more reasons for local models I think.

EnPissant 16 minutes ago|||

Also, electricity isn't free.

Galanwe 21 minutes ago|||

The 5090 is crap for inference, unless you like dummy models; sure, those will run at light speed. All the rage nowadays is MoE with 500B-1T weights.

mattmanser 1 hour ago|||

It's gone up like 300% in cost in the last year.

JacobAsmuth 1 hour ago|||

Which surely is the highest it'll ever be! You're suggesting that the price will go down in the future? Would love to hear more about your thought process!

bcrosby95 57 minutes ago||

Are you saying we're entering a period where tech increases in price instead of decreases? I guess it depends upon time horizon, but your statement isn't very specific.

johnvanommen 1 hour ago||||

I believe MSRP is $2000, right?

EnPissant 1 hour ago|||

There was only a very brief time it was selling for MSRP (last fall for $2000). Even if you use that as the previous data point, it's only a 200% increase.

forrestthewoods 1 hour ago||

If you can buy one!

The RTX 5090 is faster than an H200. It just has less ram (32 GB vs 141 GB), doesn't have NVLink, and technically isn't allowed to be used in a datacenter.

The datacenter GPUs sell at an 80% margin. They're incredibly overpriced. But the laws of supply and demand are undefeated and so here we all are.

alphabeta3r56 1 hour ago||

> The RTX 5090 is faster than an H200. It just has less ram

H200 has HBM and much more 64-bit compute

forrestthewoods 29 minutes ago||

Let me try again.

RTX 5090 has more CUDA cores that run at a higher clock speed. H200 has more RAM and significantly more RAM bandwidth.

Which one is net faster depends on your use case. But you may be very surprised that many workflows are faster on an RTX 5090!

oceansky 2 hours ago||

Awful time for gamers and PC hobbyists not fully into AI.

aunty_helen 1 hour ago||

This is 100% going to kill the home-built PC market. When I started building gaming PCs, the very top card was $750 (NZD). Now they’re 10,000 just for the GPU and another 1,000-2,000 for RAM.

People used to get into gaming PCs as an affordable hobby; now it’s making general aviation look like plan B.

throwatdem12311 5 minutes ago|||

Don’t you worry - Microsoft and Amazon will have you covered with cloud streaming.

Can’t afford a computer because they bought up all the supply? They’ll conveniently sell it back to you with a subscription!

You’ll own nothing and be happy.

johnvanommen 1 hour ago||||

Yes, this will definitely renew interest in Stadia type products.

themafia 1 hour ago||||

It's more likely to kill the AI market. They're overbuilding capacity and most of it is going unused. The upcoming haircut is going to kill a lot of the major players.

They've intentionally crafted an unsustainable business model in an effort to get users in the front door and raise their MAUs. We've seen this story before. We should know precisely where it's headed.

Joel_Mckay 1 hour ago|||

Indeed, Gamers Nexus is doing interviews with PC component manufacturers, and some are hurting bad right now. The PC market is no longer in competition, but rather survival mode. =3

https://www.youtube.com/@GamersNexus/videos

paulmist 43 minutes ago|||

I think it's the opposite. Sure, in the short term hobbyists are getting squeezed, but the amount of capital that they can put into pushing the edge is small compared to the Fortune 500. Sooner or later hobbyists will benefit, especially if the market crashes.

baq 17 minutes ago||

If it crashes after it kills the PC we’ll be left with… nothing? Path matters as much as destination

lacunary 2 hours ago||

also for ones fully into AI

elorant 2 hours ago||

Bought a second-hand Dell server a week ago. The entire rig with a 12-core CPU and 32GB DDR4 ECC RAM cost as much as I'd pay to buy 64 GB of DDR RAM alone. I hope there's an end to this absurdity soon enough, otherwise the pain will affect other markets too. I read the other day that PC case sales have collapsed by more than 40%.

finebalance 1 hour ago||

Poor people are already being priced out of cheap phones due to the rise in RAM-related unit costs. https://www.cnet.com/tech/mobile/smartphone-sales-to-plummet...

lostlogin 45 minutes ago||

It makes me sad for the Neo 2.0. More ram is the only thing stopping me switching to it from a Pro.

Npovview 21 minutes ago|||

I have an alternative take.

If hyperscalers are using more RAM, and that RAM is not available for consumers, it means all the heavy stuff will happen in the cloud. Why would we want both the hyperscalers and consumers to have RAM simultaneously? Consumers would want more RAM to run local models, but then hyperscalers' capacity will be unused.

nik282000 1 hour ago||

I feel like by the time the AI bubble bursts the PC market will be irreparably damaged. Manufacturers who have been making "enterprise" parts aren't going to go back to making consumer parts because there will be no market for them. And with a glut of datacenters not making any money on slop, they are going to be repurposed for SaaS, stuff like OnShape but for every application.

Most users don't seem to care about storing everything they generate in cloud services and this could easily be sold as an alternative to owning "expensive" desktop or laptop hardware.

dawnerd 1 hour ago|||

They’re going to pivot to you renting desktop cloud compute instead of owning anything.

bitwize 1 hour ago||

Enjoy your HP laptop subscription, it's all the computer you're going to get moving forward.

nik282000 56 minutes ago||

It's the reason I just built a new PC despite the insane prices: I'd rather overpay than have reasonable prices but no stock to buy. With any luck I'll get 8-10 years out of this one and by then the PC landscape will be something else entirely.

MattDamonSpace 42 minutes ago|||

“Bubble”

Legend2440 1 hour ago||

I wonder why the hyperscalers aren't vertically integrating more and building their own fabs. Sure, a fab costs a billion dollars, but they're currently spending hundreds of billions of dollars purchasing chips from Nvidia and others.

epistasis 1 hour ago||

I'm not sure if they should vertically integrate; it would probably be a better idea to directly fund the expansion of capacity, much like Apple does when they scale up a new technology for iPhones.

However, that the hyperscalers and AI companies aren't doing this says a lot about their true beliefs about how much future demand AI will have.

AI companies claim they will need a ton of massive expansion, but are unwilling to take on the risk of the capital needed for that expansion.

I'm hearing a lot of sad whining from AI folks about how these chip makers are holding them back, but who actually has the money to finance the expansion easily? Chip makers have been through this game far longer. When Sam Altman went around claiming it was time for $7T of fabs, the AI companies made it clear that they were willing to make ridiculous claims, eliminating their credibility.

What's needed now is for them to funnel a tiny amount of their massive piles of cash into financing fabs directly.

energy123 34 minutes ago||

Oracle is getting sold off because of how much capex they're spending on new data centers in the middle of a high-rate environment. It's not like they're stockpiling cash due to doubting AI.

nicoburns 46 minutes ago|||

Because fabs are about the most complex cutting edge technology out there: the "rocket science" of our day (or one of them). And merely having the money is not sufficient. It would be very easy to blow several billion dollars and end up with nothing to show for it.

Just look at how Intel has struggled to compete in recent years, and they have been in the business for decades.

tjwebbnorfolk 40 minutes ago||

Intel struggled because they bet the company that Moore's law was over back in ~2014, and instead of upgrading their fabs to EUV they sent the money back to shareholders.

They forgot Moore's main lesson: only the paranoid survive. They thought they could coast, and it nearly killed them.

aleph_minus_one 26 minutes ago||

> They forgot Moore's main lesson: only the paranoid survive.

"Only the Paranoid Survive" is rather a quote and book title by Andrew S. Grove.

jacekm 1 hour ago||

A fab takes years to build even when you have the necessary know-how. If you don't, it'll take some additional experimenting before you can compete with the established manufacturers. By the time you can produce a usable chip the shortage might be over.

KronisLV 1 hour ago||

I'm not moving past my DDR4 build (and the 32 GB of DDR4 2133 MHz backup chips I still have around from way back, before I got the current 3200 MHz ones) until the prices go back to being at least partially sane. This also means that CPU manufacturers are not getting my money (since the 5800X is fine for now) and I have no reason to get a new GPU either (though admittedly the B580 isn't perfect).

johnvanommen 1 hour ago|

What if this is the lowest that prices will ever be?

skiing_crawling 1 hour ago||

I recently built a system at insane DDR4 prices ($2000 for 256GB). But that’s only after seeing how DDR5 prices were 3-4x that!

preisschild 1 hour ago||

Yeah, I upgraded all of my systems to DDR5 last year, so now I have to buy DDR5 for memory upgrades.

Joel_Mckay 1 hour ago||

Had to fork over almost $1k for a 64GB DDR5 kit a few weeks back. At least AMD chips' large L3 cache allows folks to get away with lower grade UDIMMs.

Also had to do an Intel build, and there was no way we were going CUDIMM at current prices. =3

I_am_tiberius 57 minutes ago|

It seems to me the max memory you can buy in a laptop has stagnated for the past 3 years or so.

giancarlostoro 51 minutes ago||

I have always felt insulted that most laptops even offer a low 4 GB of RAM. I'd rather take 16 GB of previous-gen memory.

ffaccount2 39 minutes ago||

My several years old laptop has 128GB of RAM, is that not enough? I admit that it's a pretty heavy one.
