
Posted by rafaelcosta 5 days ago

My iPhone 16 Pro Max produces garbage output when running MLX LLMs(journal.rafaelcosta.me)
429 points | 216 comments | page 3
MarginalGainz 4 days ago|
[dead]
builderhq_io 5 days ago||
[dead]
tehwebguy 5 days ago||
[flagged]
lionkor 5 days ago||
[flagged]
bri3d 5 days ago||
> Or, rather, MiniMax is! The good thing about offloading your work to an LLM is that you can blame it for your shortcomings. Time to get my hands dirty and do it myself, typing code on my keyboard, like the ancient Mayan and Aztec programmers probably did.

They noticed a discrepancy, then went back and wrote code to perform the same operations by hand, without the use of an LLM at all in the code production step. The results still diverged unpredictably from the baseline.

Normally, expecting floating-point MAC operations to produce deterministic results on modern hardware is a fool's errand; they usually run asynchronously, so the non-associativity of floating-point addition rears its head and you get some divergence.
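
A minimal sketch of that order-sensitivity, in plain Python doubles. Floating-point addition is commutative (`a + b == b + a`) but not associative, so summing the same values in a different order can round differently. The helper name `sum_in_order` and the sample values are illustrative assumptions, not anything taken from the article.

```python
def sum_in_order(values):
    """Accumulate left to right, the way a simple sequential loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

# Regrouping the same three operands changes the rounded result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

# Reordering a sum with mixed magnitudes: 1.0 is lost when it is added to
# 1e16 first, but survives when the two large terms cancel first.
forward = sum_in_order([1e16, 1.0, -1e16])
reordered = sum_in_order([1e16, -1e16, 1.0])

print(left == right)       # False
print(forward, reordered)  # 0.0 1.0
```

Differences like these are normally tiny relative to the magnitudes involved, which is why a small divergence between runs is expected while an order-of-magnitude one is not.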

But an order-of-magnitude difference, plus Apple's own LLM not working on this device, strongly suggests to me that something is wrong. Whether it's the silicon or the software would demand more investigation, but this is a well-reasoned bug report in my book.

ErroneousBosh 5 days ago||
> Time to get my hands dirty and do it myself, typing code on my keyboard, like the ancient Mayan and Aztec programmers probably did.

https://ia800806.us.archive.org/20/items/TheFeelingOfPower/T...

I should think I'll probably see someone posting this on the front page of HN tomorrow, no doubt. I first read it when it was already enormously old, possibly nearly 30 years old, in the mid 1980s when I was about 11 or 12 and starting high school, and voraciously reading all the Golden Age Sci-Fi I could lay my grubby wee hands on. I still think about it, often.

netsharc 5 days ago|||
I found the article hard to read. I turned on reader mode. I still found it hard to read. Each sentence is very short. My organic CPU spins trying to figure out how each sentence connects to the next. Each sentence feels more like a paragraph, or a tweet, instead of having a flow. I think that's my issue with it.
mr_toad 5 days ago||
If it was written in turgid prose people would be frantically waggling their AI accusatory fingers.
netsharc 5 days ago||
Instead he writes Buzzfeed style: a sentence per paragraph, and then smushes several paragraphs into one.

(The idea being, a paragraph usually introduces a new thought.)

decimalenough 5 days ago|||
My TL;DR is that they tried to run an on-device model to classify expenses, it didn't work even for simple cases ("Kasai Kitchin" -> "unknown"), they went deeeeeep down the rabbit hole to figure out why and concluded that inference on their particular model/phone is borked at the hardware level.

Whether you should do this on device is another story entirely.

jojobas 5 days ago|||
Why shouldn't you? It's your device, it has hardware made specifically for inference.

What's to be gained, other than battery life, by offloading inference to someone else? To be lost, at least, is your data ownership and perhaps money.

dghlsakjg 5 days ago||
> What's to be gained... by offloading inference to someone else?

Access to models that local hardware can't run. The kind of model that an iPhone struggles to run is blown out of the water by most low-end hosted models. It's the same reason that most devs opt for Claude Code, Cursor, Copilot, etc. instead of using local models for coding assistance.

jojobas 5 days ago|||
Claude Code produces stuff orders of magnitude more complicated than classifying expenses. If the task can be run locally on hardware you own anyway, it should be.
selcuka 5 days ago|||
But apparently this model is sufficient for what the OP wants to do. Also apparently it works on iPhone 15 and 17, but not on 16.
wolvoleo 5 days ago|||
I would really not want to upload my expense data to some random cloud server, nope. On-device is really a benefit even if it's not quite as comprehensive. And it's really in line with Apple's privacy focus, so it's very imaginable that many of their customers agree.
the_arun 5 days ago||
[flagged]
ploum 5 days ago||
Well, it seems that, these days, instead of SUM(expense1,expense2) you ask an LLM to "make an app that will compute the total of multiple expenses".

If I read most of the news on this very website, this is "way more efficient" and "it saves time" (and those who don’t do it will lose their job)

Then, when it produces wrong output AND it is obvious enough for you to notice, you blame the hardware.

janalsncm 5 days ago|||
The author is debugging the tensor operations of the on-device model with a simple prompt. They confirmed the discrepancy with other iPhone models.

It’s no different than someone testing a calculator with 2+2. If it gets that wrong, there’s a hardware issue. That doesn’t mean the only purpose of the calculator is to calculate 2+2. It is for debugging.

You could just as uncharitably complain that “these days no one does arithmetic anymore, they use a calculator for 2+2”.

Dylan16807 5 days ago||||
The app-making wasn't being done on the phone.

The LLM that malfunctioned was there to slap categories on things. And something was going wrong in either the hardware or the compiler.

bri3d 5 days ago|||
I mean, Apple's LLM also doesn't work on this device, plus the author compared the outputs from each iterative calculation on this device vs. others, and they diverge from every other Apple device. That's a pretty big sign both that something is different about that device and that the same broken behavior carried across multiple OS versions. Is the hardware or the software "responsible"? Who knows; there's no smoking gun there, but it does seem like something is genuinely wrong.
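
A rough sketch of what that kind of step-by-step comparison can look like: dump the intermediate values from a known-good device and from the suspect one, then walk them in order and report the first step that disagrees beyond a tolerance. The function name `first_divergence`, the step names, the tolerances, and the sample numbers are all hypothetical; the article's actual tooling may look nothing like this.

```python
import math

def first_divergence(reference, candidate, rel_tol=1e-3, abs_tol=1e-6):
    """Return (step, index, expected, got) for the first mismatch, else None.

    Both arguments map step names to flat lists of floats, in execution order.
    """
    for step, expected_values in reference.items():
        got_values = candidate[step]
        if len(expected_values) != len(got_values):
            raise ValueError(f"shape mismatch at step {step!r}")
        for i, (expected, got) in enumerate(zip(expected_values, got_values)):
            if not math.isclose(expected, got, rel_tol=rel_tol, abs_tol=abs_tol):
                return step, i, expected, got
    return None

# Hypothetical dumps: small rounding noise in "attn_0" is tolerated,
# while the 10x error in "mlp_0" is flagged.
reference = {"embed": [0.12, -0.5], "attn_0": [1.25, 0.75], "mlp_0": [2.0, -3.0]}
suspect = {"embed": [0.12, -0.5], "attn_0": [1.2501, 0.75], "mlp_0": [20.0, -3.0]}

print(first_divergence(reference, suspect))  # ('mlp_0', 0, 2.0, 20.0)
```

The tolerance is the design choice that matters: too tight and ordinary floating-point noise looks like a bug, too loose and a real fault hides in it.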

I don't get the snark about LLMs overall in this context; this author uses an LLM to help write their code, but is also clearly competent enough to dig in and determine why things don't work when the LLM fails, and performed an LLM-out-of-the-loop debugging session once they decided it wasn't trustworthy. What else could you do in this situation?

bri3d 5 days ago|||
Somewhere along the line, the tensor math that runs an LLM became divergent from every other Apple device. My guess is that there's some kind of accumulation issue here (remembering that floating-point accumulation is not associative, so the order of operations matters), but it seems genuinely broken in an unexpected way given that Apple's own LLM also doesn't seem to work on this device.
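To make "accumulation issue" concrete, here is one generic way an accumulator can go badly wrong: keeping the running sum in too narrow a format. This is only an illustration of the failure class, not a claim about what actually happens on the device. It emulates a half-precision (IEEE 754 binary16) accumulator using the standard library's `struct` `"e"` format; the helper names `to_half` and `accumulate` are made up for this sketch.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def accumulate(values, round_fn=lambda x: x):
    """Sum left to right, rounding the running total after every add."""
    total = 0.0
    for v in values:
        total = round_fn(total + v)
    return total

ones = [1.0] * 4096

wide = accumulate(ones)             # double-precision accumulator
narrow = accumulate(ones, to_half)  # emulated half-precision accumulator

# Above 2048, binary16 values are spaced 2 apart, so 2048 + 1 rounds back
# to 2048 and the narrow accumulator stops growing.
print(wide, narrow)  # 4096.0 2048.0
```

An error like this scales with the length of the sum, which is how an accumulation bug can produce results that are off by a large factor rather than by a few bits.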
lxgr 5 days ago||
LLMs are applied math, so… both?
ernsheong 5 days ago||
[flagged]
giancarlostoro 5 days ago||
[flagged]
bri3d 5 days ago||
If you read the whole thing, you'd find a debugging journey that both bypassed the LLM and was appropriate for HN, so you might want to do that instead of dismissing the article.
Playboi_Carti 5 days ago|||
It's not about LLMs doing math.
dummydummy1234 5 days ago||
Uhh, that's not the article. The article is about running an ML model on a phone, where the floating-point ops for tensor multiplication seem to be off.
RiceNBananas 5 days ago||
[flagged]
vanviegen 5 days ago||
[flagged]
ohyoutravel 5 days ago|
[flagged]
PlatoIsADisease 5 days ago|
[flagged]
dghlsakjg 5 days ago||
I severely doubt your thesis around iPhones being Veblen goods.

You are claiming that if the price of the iPhone went down, Apple would sell fewer phones?

Correspondingly, you are arguing that if they increased prices they could increase sales?

You are claiming that 100s of millions of people have all made the decision that the price of an iPhone is more than it is worth to them as a device, but is made up for by being seen with one in your hand?

Not all goods that signify status are Veblen goods.

shiroiuma 5 days ago||
>Correspondingly, you are arguing that if they increased prices they could increase sales?

Veblen goods aren't like this. If they were, everything would be priced at infinity. Veblen goods have to take into account the amount of spending money their target customers have, and how much they're willing to spend. Apple products are priced this way. They're not targeted just at people who can afford Rolls-Royce Silver Shadows, they're targeted at regular people who are willing to spend too much money on a phone when they can get an equivalent Android phone for half the price. Those people have limited money, but they're willing to overpay, but only so much.

>You are claiming that if the price of the iPhone went down, apple would sell fewer phones?

Quite likely, yes. If they adopted razor-thin profit margins on iPhones, their phones would be seen as "cheap" and wouldn't have the cachet they have now. More people would start looking at alternatives, and start buying Samsung Galaxies and other flagship Android phones.

dghlsakjg 5 days ago||
> Veblen goods aren't like this.

Increasing demand with increasing prices is the very definition of a Veblen good. I never said anything like pricing them at infinity (an exceptionally stupid way of saying that something is not for sale).

I simply pointed out that there isn’t really any reason to believe that a mass produced easily available phone that holds a massive percentage of the entire global cell phone market would see increased demand from increased prices. It is an extraordinary claim with nothing resembling evidence. The most damning evidence is that the most expensive iPhone, the Pro Max, is outsold 2:1 by the base model for the last three generations, despite being visually distinguishable. (The 17 saw initial sales of Pro Maxes higher than base, but that appears to have corrected. Easily understandable that early adopters are more willing to pay for the best version of new tech)

There is an argument to be made that the Pro Max flirts with Veblen for small parts of the market, or that certain submarkets in poorer countries treat the iPhone that way, but that all looks more like conspicuous consumption. I still don’t believe that Pro Max sales increase if the price increases. A few individuals or submarket will not have the ability to invert a demand curve for an Apple device.

Again, I think that you are confusing conspicuous consumption with a Veblen good. This sentence is the giveaway:

> Those people have limited money, but they're willing to overpay, but only so much.

What you are describing is a normal demand curve. As price rises fewer people are willing to pay. People being unable to pay for something they still want does not make something a Veblen good (that would make insulin a Veblen good). You are describing a steep demand curve, not a reversed one.

Just because you perceive that an equivalent Android can be purchased for half the price does not mean that everyone uses your criteria. I tried switching to a lower-priced Android made by Google. In no way was it equivalent for my purposes, and I still wouldn’t want it. I am happy to pay the price, not because I care about being seen with an iPhone, but because it is the tool that I have determined to best suit my purposes. Many people refuse to believe this, but many people like the Apple ecosystem.

ohyoutravel 5 days ago|||
This is a conclusion that comes with some personal baggage you should identify and consider addressing.
gambiting 5 days ago|||
I mean, I think it's cultural. In the US it seems like everyone has an iPhone; it's almost kinda quirky not to have one. But in some other places, an iPhone costs more than your monthly salary, so having one is definitely a symbol of status. Less so than it used to be, but it still has that.
dghlsakjg 5 days ago|||
iPhones in the US have an estimated ~55% market share, depending on the source. Owning an Android wasn't unusual in the least when I lived there, and appears to be pretty popular.

I don't think it's unusual that a country with high median income and higher average income will tend to gravitate towards more expensive phones. Given that Apple doesn't make a cheap phone, it kind of follows that wealthier countries will buy more iPhones.

Of course the opposite is true as well: in a country where an iPhone is measured in months of salary, they won't sell well, but I'd be willing to bet that Androids in that price tier sell like shit in those countries too.

Is it a status symbol? Arguably. But it also correlates pretty strongly with median income.

ohyoutravel 5 days ago|||
Fair, but that’s a comment on a US-centric website, run by a US-centric company, in a US-centric industry, on a US-centric medium. So if they didn’t mean US, I think the onus is on them to clarify exactly where this applies.
PlatoIsADisease 5 days ago|||
Admittedly, I hate companies that live off their marketing. Nintendo, Disney, Apple. I hate that these companies can weaponize psychology against humans.

Function > Form.

I think it's a Hero Complex, if Jung is correct.

raw_anon_1111 5 days ago|||
Yes, because 60% of US phone buyers buy an iPhone to stand out from the average US phone buyer, and they shouldn’t because it doesn’t run local LLMs well?
shiroiuma 5 days ago||
That's the least of the problems with using an iPhone.
raw_anon_1111 5 days ago||
So exactly what problems do most people have with iPhones that could be solved with Android?
shiroiuma 4 days ago||
One really big one that comes to mind: watching YouTube videos without ads. On an Android, you just install Firefox, then install uBlock Origin inside it, then navigate to youtube.com and enjoy. Or you can install an app like Revanced; you might have to manually load the APK, but it's doable.
raw_anon_1111 4 days ago||
https://adblockpro.com/help/how-to-block-youtube-ads

Safari has supported real web extensions for at least two or three years on iOS.

Or, you know, you could pay for an ad-free experience.

DJBunnies 5 days ago||||
MacBooks and iPhones are good devices though, saying this as a primarily Linux user.

There is no way a company could exist purely on marketing; Apple backs it up with tech.

wolvoleo 5 days ago|||
Some companies definitely do just exist on marketing. Some clothing brands are objectively overpriced crap and pure wealth signalling. Or something like a Juicero.

But I agree Apple doesn't, even though they've gone in a direction I couldn't follow them in.

shiroiuma 5 days ago|||
Not really. They back it up with "good enough tech" that looks pretty and sucks people in with marketing, and then locks them into a closed ecosystem. Admittedly, some of their tech is actually very good (e.g. M-series ARM-based CPUs), but much of it is nothing special, or worse, just copying something else that competitors have been doing for years, presenting it as brand-new, and claiming credit for it.

They did this with the always-on screens for phones. My LGs had this many, many years ago. It was so bad that when Apple finally brought it out and acted like they had invented it, coworkers saw my LG and asked if I had gotten the latest iPhone, and I had to point out that it was a 5-year-old LG.

And then there's other stuff that Apple has which is just plain bad, but they present as new and wonderful, such as the "island" keyboard.

anonymars 5 days ago||||
I'd almost say most companies live or die off their marketing. One could argue that understanding your customer as well as or better than they understand themselves is a strength.

To wit, some people do value form over function. Some people do prefer a safe, curated walled garden.

I am not among them--I say this as someone who cannot stand using most Apple products for more than a minute. But I respect what they offer(ed) and for some people even recommended them. (Now I'm less sure because it seems like everything tech has gone to shit, but I can't tell if that's just "old man yells at cloud" or what)

Ideally there would be enough competition for us all to find what we're looking for. I think anticompetitive behavior is a worse sin.

kulahan 5 days ago|||
All three of these companies are supremely dedicated to the customer experience. It’s a weird thing to be annoyed at. Ninty is the only company really experimenting with gaming hardware. Disney parks are a thesis on hiding the “behind the scenes” stuff perfectly. Apple does its best to make things just kinda work well, and if you’re in their ecosystem fully, it usually does work out.

Not everyone cares for the most capable device on the planet. Sometimes people just want a pretty familiar and easy experience. I haven’t used my phone for anything more than browsing the web and texting in ages. I absolutely don’t care about whatever function you think I’m missing due to Apple, honestly.

As a side note, the fathers of Psychology were absolutely terrible scientists. The entire field almost failed because they took it so far into pseudo-science land. Of course Jung isn’t correct.

jwrallie 5 days ago|||
Can you prove that is still the case with the iPhone SE by showing comparable hardware with similarly long software-update support and a lower price?
B1FF_PSUVM 5 days ago||
> Its a demonstration of wealth. This is called Veblen good

Just the other day I was reminded of the poor little "I am rich" iOS app (a thousand dollar ruby icon that performed diddly squat by design), which Apple deep-sixed from the app store PDQ.

If misery loves company, Veblen goods sure don't.
