Posted by retskrad 2 days ago
Smartphones are an absolute graveyard of fads; remember the 3D screen phones, the phones with projectors, and so forth? They generally go nowhere. I suspect 'AI' on phones will be similar.
Overwhelmingly, what people want out of phones is "like my current phone, but with better battery life and maybe a better camera." Previously 'faster' was also a concern, but modern phones are largely Good Enough.
Technically, I understand the difference in the technology; I just don't know who needs it vs. who gets excited about new features for a brief moment.
I would love to take a photo with my smartphone that doesn't look pixelated, blurry, and/or over-processed. Maybe that's asking too much, considering smartphone sensors can't compete with DSLRs in some situations, but I'm also always baffled by how dark and desaturated some of my smartphone photos turn out.
Now the pendulum has swung and everyone goes for a 'realistic look', but I expect people actually want a rather milder version of the above.
Phone photos look OK on phones, but enlarged, even the top contenders on the current DXOMARK rankings show very clearly how hardware limits bite. It's just not presentable, except maybe in very bright scenes. Now, I wouldn't go bashing phones per se; it's a marvel what they achieve with those tiny plastic lenses and some CPU time. And they are always there. But any low-hanging fruit in phone photography was picked long ago by the whole market; what lies ahead are slow computational improvements, coupled with a very slow increase in the size and thickness of the camera section of phones to capture more light.
Even things like "[my dog name] beach" isn't reliable.
Or take how I use photos as notes. It doesn't reliably recall things like cheese when I take pictures of cheese in various stores to remember what is sold where. Not even the name of the cheese; just "cheese". Ditto spices.
I've found google photos search to be pretty good, and if it can't find something usually the map-mode is enough to pin it down (e.g. go to the beach where you took the photo and it shows you the photos from there)
I did just check, and "dog at beach" generates sub 20% recall for me. I go to the beach weekly with my dog, take lots of photos because I'm a dork, and that first query skips many weeks.
Also, I did add my dog as a known / named pet under the explore tab, which is why I thought the name should work.
I can make it work by picking out the beach via geo, but I think the whole thing illustrates how much better this could be. I'd like to be able to get responses to queries like
* [pet name] at [beach X]
* [pet name] with sand on face
* dead seal, or even just dead animal (pics on beach)
* seaglass (recall is poor there too until I manually added to a photo album)
* dent in car
* [spice name] (I take pics of spices to know which stores offer what)
etc etc. The only way I manage the thousands of photos I have now is by carefully sorting into hundreds of albums, which google also doesn't support well.
Amongst the many many deficiencies of the app (which, tbf, does work extremely well as a read-through cache and seems to back things up very well), it likes to surface spotlights of dead people and pets. Which is not at all what I want proactively surfaced.
It was also hobbled quite hard near the start as they had a scare with searches for "gorilla" accidentally returning pictures of black people so they have probably turned all the safety knobs they have up to 11, even if that impacts recall.
It's not bad, it's really really bad.
The image clean-up feature is utterly useless, and I think it's one of the areas where we can clearly see the difference between a company for whom Generative AI is an afterthought (Apple) and the competition. I paid $1500 for the new iPhone Pro Max, and a great part of the deal was the Apple Intelligence support, but frankly, I might as well switch to Android at this point because I'm really disappointed by Apple's take on AI. I'll probably wait until the official Apple Intelligence is introduced, but tbh I don't think there will be many improvements over this version.
And as for Siri: It's as stupid as it ever was. I ask it to convert something from lbs to kg and it responds "there's no music playing". If anything, its natural language comprehension has degraded.
Currently, there's no context awareness, so I still can't ask Siri "how do I respond to this email?".
Really, the only thing that "works" is the ChatGPT feature that describes an image you send it. Anything Apple-related is bonkers. It's really embarrassing.
As far as the future of compute goes, Meta seems like it has a much more compelling argument with the Orion glasses and their investment in AI.
Yes, but Apple fans always respond by saying "...but Apple has been using ML in much of the OS for years, you just don't see it..."
I really want Apple to be better at this, but Generative AI is too uncontrollable for a control-freak company.
“Siri get me directions to McDonald’s” ‘I found this on the web for you:’ <serp for “Direct In Donald”>
Or it would get directions to a McDonalds in a different state. Or say “I’m sorry, I don’t see a McDonald in your contacts.”
All this AI marketing push has got to be because they think investors are stupid, and they can fool the market into thinking they're doing an AI.
That said, I think the tradeoffs being made right now are probably the right ones. Apple's latest devices have gone to an electrically released sort of adhesive (versus the older pull-strip removable adhesive, which is a big step up from the "glue it in" approach many vendors take), and for a given volume, you get more battery if you can rely on the phone to protect it from damage - which is why almost everything with an internal battery uses some variety of pouch cell. They're quite a bit more fragile than the hard-cased batteries, but you get a lot more battery in the volume than you do with the hard-cased ones.
As long as it's not incredibly irritating to replace the battery, I'm fine optimizing the daily use thing (battery life in a given phone size) over the once-every-few-years thing (replacing the battery).
I suppose "stability improvements in WebKit" doesn't do much for Apple's stock price compared to SiriGPT. But this is feeling like death-by-1000-cuts: I don't think users are deeply committed to Apple UX/etc. I believe Apple's US market dominance is largely due to burnt fingers around Android's unreliability and annoyances between 2010-2020 (e.g. Google and Samsung not playing nice, badly orchestrated version changes). This was never a permanent state of affairs; Android has stabilized significantly and is more harmonized among competing manufacturers, while retaining its advantages on price and ease of development.
Apple seems complacent on the basics, and is overextending into an AI product few users seem to want.
When my daughter was studying Chinese, I could use the live-video translation app and see the lesson text translated to English, and see her hand-written answers also translated to English. I could see this being more broadly useful when travelling, along with live translation of spoken words.
I don't know if LLM-based translation is better than previous translation models.
Getting hardware to enable faster AI processing on phones should be a good thing if it's used for useful tasks, LLM or not.
E.g. if it sees that I always reopen an application 2 seconds after the OS kills it in the background, then maybe it shouldn't be killed.
Or if I wake up 3 minutes before the alarm would go off, and take a trip to the toilet, maybe it shouldn't blow up the speaker while I'm frantically pulling up my underpants, but recognize that I'm already awake, or at least wait with the alarm until I'm around the phone again.
Or automatic backlight shouldn't go crazy when I walk in the night under the streetlamps, it should recognize that lamps are coming and going, and that backlight adjustment every 5 seconds is silly and annoying.
I could go on. IMO there is definitely a place for machine learning/AI in phones (and other places too), especially for quality-of-life thingies. Just nobody is doing them, I guess because these are not as visible as image generation. My credit card has been ready to spend on such developments since at least 2021. One of these days I will have enough of waiting and do it myself, out of spite...
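The backlight complaint above is basically a smoothing-and-hysteresis problem. A minimal sketch, assuming a controller that feeds ambient-light readings through an exponential moving average and only commits a brightness change when the smoothed level drifts well outside a band (all class names, thresholds, and lux values here are invented for illustration, not any real OS API):

```python
class BacklightController:
    """Hypothetical sketch: debounce auto-brightness with EMA + hysteresis."""

    def __init__(self, alpha=0.1, hysteresis=0.25):
        self.alpha = alpha            # EMA smoothing factor (0..1)
        self.hysteresis = hysteresis  # relative drift needed to act
        self.ema = None               # smoothed ambient level
        self.brightness = None        # last level we committed to

    def on_sensor_reading(self, lux):
        """Feed one ambient-light reading; return True if backlight changed."""
        if self.ema is None:
            # First reading: just commit it.
            self.ema = lux
            self.brightness = lux
            return True
        # Smooth out brief spikes (e.g. passing under a streetlamp).
        self.ema = self.alpha * lux + (1 - self.alpha) * self.ema
        # Only adjust once the smoothed level has moved outside the band.
        if abs(self.ema - self.brightness) > self.hysteresis * max(self.brightness, 1.0):
            self.brightness = self.ema
            return True
        return False


ctl = BacklightController()
ctl.on_sensor_reading(100)   # initial commit
ctl.on_sensor_reading(110)   # small flicker: absorbed, no change
```

A real implementation would also map lux to a brightness curve and rate-limit transitions, but the core idea is just that a short-lived spike shouldn't make it through the filter while a sustained change should.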
I'd also love to be able to give commands that traverse multiple apps (e.g. take my google sheet and venmo request everyone the specified amount). Most likely this would happen by teaching an AI tool use and having apps expose an API.
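The shape of that idea is the tool-use pattern: each app exposes a small callable API, and the assistant emits a plan of calls across them. A toy sketch, with stub "apps" and a trivial dispatcher (the app names, tool names, and data are all made up; real systems like Apple's App Intents or Android App Actions look different):

```python
# Stub "apps", each exposing one tool-style function.
def sheets_read(sheet_id):
    # Pretend to read a spreadsheet of {name: amount_owed}.
    return {"alice": 12.50, "bob": 8.25}

def payments_request(person, amount):
    # Pretend to send a payment request; return a receipt string.
    return f"requested ${amount:.2f} from {person}"

# Registry the assistant would plan against.
TOOLS = {"sheets.read": sheets_read, "payments.request": payments_request}

def run_plan(plan):
    """Execute a list of (tool_name, kwargs) steps in order."""
    return [TOOLS[name](**kwargs) for name, kwargs in plan]

# A plan an assistant might emit for "take my sheet and request
# everyone the specified amount": read the sheet, then one request per row.
rows = sheets_read("trip-expenses")
plan = [("payments.request", {"person": p, "amount": a}) for p, a in rows.items()]
receipts = run_plan(plan)
```

The hard parts in practice are the ones this sketch skips: authenticating each app, letting the user confirm destructive steps, and getting the model to emit a valid plan in the first place.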
I'd love to be able to give voice commands for certain things (e.g. flipping through recipes when my hands are wet) and have the phone be able to do the actual thing I want.
I actually think phones are a much better place for AI since they're so difficult to type on that voice could provide a higher-bandwidth interface.
AI could provide more human-oriented directions that focus on key landmarks and decisions rather than every minor turn. For example:
"Hop on 80 West, cross the bridge, take Sir Francis Drake onto 101 South, take the Alexander Avenue exit, don't go through the tunnel, and your destination will be on the right."
At one point, I had ChatGPT working via voice in CarPlay mode (via Shortcuts, I think?), but it seems like Apple later disabled that, for some stupid reason probably.
Otherwise, totally with you. No idea why my phone needs AI. I can just open the ChatGPT app if I want to have a discussion with ChatGPT about something. I'm so tired of apps updating to "Add a new AI assistant!" like why do I need to talk to an LLM in most of the apps I use?