Posted by virgildotcodes 19 hours ago
Outsourcing was a great idea for making America, your home, lose. Oh well.
Ternus can’t come fast enough to revamp their corrupt management system and actually innovate again.
In all seriousness this timeline we are on sucks. I hate it. Send me to another multiverse.
Honestly Jassy, Zuck and Tim Apple are prob on the phone with Donnie. If oil companies are “gouging,” what are 85% margins on memory, which threaten the whole bull run, raise compute costs, kill AI, and raise iPhone/computer pricing? The countdown to a DOJ antitrust case is ticking.
To be clear: I understand how markets work; I'm just quoting Donald Trump's tweet from yesterday accusing oil companies of gouging, and I predict government intervention and political pressure regardless of economic realities.
It is a very risky business: overestimate demand by too much and you go bankrupt. And yes, it is hard, especially HBM. Fabs are scaling up, but it is hard to estimate demand in 2029, and it may be better not to overshoot.
They also need to get in line to buy ASML EUV tooling, and ASML has to deal with scaling for their suppliers as well. There are tons of bottlenecks and complexities.
It is a commodity in that there are standards, not that there are many firms that can hit the standards.
This isn't gouging, this is bidding on fixed quantities and bidders having a high willingness to pay. Think of it like an auction.
These decisions play out on the order of trillions of dollars and 3+ year horizons. They're also incredibly sensitive to other geopolitical issues (Taiwan, issues with Chinese tech capability vs export/import controls, etc).
There are a lot of valid discussions to be had about how we got to this state of oligopoly: Taiwan's consistent sponsorship of its semiconductor capabilities and the subsequent concentration of technology (expertise, capacity, etc), the lack of investment/support (and ceding of technical leadership) in Western countries, the various rivalries with China and the implications of it becoming a first-class producer of semiconductors at scale, etc. None of those discussions and none of their potential outcomes can substantively change that we're going to continue in this situation (massive price increases, spotty availability, etc) for at least the next 18-24 months.
Same with SSD. I could pay another $3,000 to Apple for 7TB of SSD (go from 1TB to 8), or I could get the 1TB, use that as a system drive, and then buy a 4xM.2 NVMe PCIe chassis, and put in 4x2TB Samsung 990 drives from Amazon and OWC for $1,100, and have 9TB of usable storage, and for bonus points, the chassis was about 400MB/s faster.
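For anyone who wants the per-TB numbers, here is a rough sketch using only the figures quoted above. It assumes the $1,100 covers the four 2TB drives; whether the chassis is included in that figure is not stated, so treat the external number as a lower bound. The function and variable names are made up for illustration.

```python
def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Price paid per terabyte of added storage."""
    return price_usd / capacity_tb


# Apple upgrade: going from 1TB to 8TB adds 7TB for $3,000 (figure from the comment).
apple_per_tb = cost_per_tb(3000, 7)

# External route: 4 x 2TB drives = 8TB for $1,100.
# Assumption: chassis cost may not be included in this figure.
external_per_tb = cost_per_tb(1100, 4 * 2)

ratio = apple_per_tb / external_per_tb

print(f"Apple upgrade: ${apple_per_tb:.2f} per TB")
print(f"External NVMe: ${external_per_tb:.2f} per TB")
print(f"Apple costs about {ratio:.1f}x as much per TB")
```

Even if the chassis adds a few hundred dollars on top, the gap per terabyte stays large.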
Antitrust =/= gouging. Jacking up prices during a shortage (e.g. electric generators just before a hurricane) might be considered gouging, but it doesn't fall under antitrust. It's just supply and demand.
https://www.micron.com/us-expansion/ny
https://www.micron.com/us-expansion/id
The first Idaho project is starting soon: "Micron has already achieved key construction milestones on its first Idaho fab with DRAM output scheduled to begin in 2027."
Micron executives, who typically offer cautious projections about the boom-bust memory business, said on their earnings call that “tight conditions” will persist beyond 2027. Just three months ago, they had projected tight conditions going only beyond this year.
In an interview Wednesday night, Micron Chief Business Officer Sumit Sadana said the company couldn’t make investments during the memory market’s last downturn, when Micron’s gross profits went negative, in part because certain customers took advantage to pay rock-bottom prices.
“We told a couple of the customers who were being very aggressive with pricing at that time that this is not constructive,” he said, without naming Apple, adding that low prices discouraged capital investments. “A lot of the industry investments got shut down in 2023 because of really poor pricing and really poor margins.”
The iPhone-maker is well known for using its huge memory and storage purchases as leverage to secure the lowest prices, say analysts and former memory company executives.
Maybe instead of antitrust the US could go back to tariffs, the universal cure for high prices.
https://www.asianometry.com/p/the-semiconductor-bust-still-c...
One fix for this problem: Allow US companies to buy memory chips from China. I saw an article about a month ago that, if my memory is correct, said that China is ramping up high-end memory manufacturing.
Fix number two: my country (USA) should cease and desist with the craziness that is data center buildouts for AI.
Clearly ‘BIG MONEY’ always needs a new thing (cloud -> crypto -> AI) and the powerful get what they want.
If the US Congress acted to benefit regular people rather than special interests (both parties are corrupt, disbelieve that if you want to live in a fantasy land) then anti-dumping laws would be passed.
If all companies and individuals paid the real price for tokens, then we collectively would work more efficiently. As is, the filthy rich get even filthier, and regular people will get screwed.
I'm relieved.
I also wouldn’t be surprised if memory providers were intimately involved, as they’ve been caught price fixing in the past: https://en.wikipedia.org/wiki/DRAM_price_fixing_scandal
Alleviating the memory constraint would only really make Nvidia a danger to cloud margins, and their consumer sales are neutered while they focus on the datacenter segment. It feels facetious to insinuate that people would be doing inference on their Macbook Neo or Wintel laptop if they only had a gorbillion gigabytes of memory and a 400W accelerator card plugged into the wall outlet.
There is a pretty large and growing community of us using entirely local models for our agentic flows. From GLM 4.7 Flash on 32GB machines with >60 tok/s to Gemma and Qwen dense and MoE models on 64GB machines all the way up to Deepseek V4 Flash on 128GB machines with 450 tok/s prefill and 25-30 tok/s decode.
I use DS4 on the daily - it’s become my main model.
I know it’s in fashion to talk trash about Apple but their hardware outperforms other options like DGX Spark when it comes to local inference; they've got the unified memory, memory bandwidth and the GPU cores to actually be useful in a way that most other hardware just isn’t.
I also use it in local agent mode if I'm coding directly on the machine, which is nice because you can save sessions and resume them, so for personal projects and training-related stuff it's been great.
Even got an autoresearch loop going where the agent looks at the previous run, adjusts parameters and code if needed, and then hands off training to another script (so full system resources are available for training), ad infinitum - it works really well - what antirez has done with that project is pretty incredible.
GLM 4.7 Flash is a 30B model that was far behind SOTA at launch, and I know that because I pay for z.ai inference and have run the model locally. Qwen and Deepseek V4 Flash have the same issue, and raise the question: are you really going to process a 64k agentic context at 450 tok/s? That's 2+ minutes that you spend waiting for the first token to generate! Of course nobody can sell that as competitive inference, and it only gets worse with larger models. We're talking about non-interactive speeds here.
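The "2+ minutes" figure is just context length divided by prefill throughput. A quick sketch, assuming "64k" means 64,000 tokens (65,536 gives nearly the same answer) and ignoring any prompt caching, which would shorten the wait on later turns. The function name is made up for illustration.

```python
def prefill_seconds(context_tokens: int, prefill_tok_per_s: float) -> float:
    """Time to process the whole prompt before the first output token appears."""
    return context_tokens / prefill_tok_per_s


# Figures from the thread: 64k context, 450 tok/s prefill.
# Assumption: "64k" is taken as 64,000 tokens and nothing is cached.
wait_s = prefill_seconds(64_000, 450)
wait_min = wait_s / 60

print(f"Time to first token: {wait_s:.0f} s ({wait_min:.1f} min)")
```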
If you're satisfied with small local models, more power to you. It puts you in the same barrel as Strix Halo enthusiasts or the guys that bought 2x3090s on Reddit. You are completely ignoring the market if you think that any of those SoCs are unprecedented or unparalleled for inference workloads, though. The free DS4 API is faster at prefill and decode; you could not give away Mac inference at zero cost and compete with what China provides for free. That's how far behind Macs are for local inference, to put things into perspective.
The datacenter builders and the big hosted AI models. The person you're replying to even mentions OpenAI by name.
There are two things that would prevent people from using local models - pricing and regulations. And we're seeing moves from both of those fronts lately.
Hey, Infantino was ahead of the curve! For the same price as an English MBP, you can get an American one and see the Three Lions disappoint against Panama!