Posted by nnx 5 days ago
The problem is, "The Only Thing Worse Than Computers Making YOU Do Everything... Is When They Do Everything *FOR* You!"
"ad3} and "aP might not be "discoverable" vi commands, but they're fast and precise.
Plus, it's easier to teach a human to think like a computer than to teach a computer to think like a human — just like it's easier to teach a musician to act than to teach an actor how to play an instrument — but I admit, it's not as scalable; you can't teach everyone Fortran or C, so we end up looking for these Pareto Principle shortcuts: Javascript provides 20% of the functionality, and solves 80% of the problems.
But then people find Javascript too hard, so they ask ChatGPT/Bard/Gemini to write it for them. Another 20% solution — 20% of the original 20%, so 4% as featureful — but it solves 64% of the world's problems. (And it's on pace to consume 98% of the world's electricity, but I digress!)
PS: Mobile interfaces don't HAVE to suck for typing; I could FLY on my old Treo! But "modern" UI eschews functionality for "clean" brutalist minimalism. "Why make it easy to position your cursor when we spent all that money developing auto-conflict?" «sigh»
The other great thing about this mode is that it can double as a teaching methodology. If I have a complicated interface that is not very discoverable, it may be hard to sell potential users on the time investment required to learn everything. Why would I want to invest hours into learning non-transferable knowledge when I'm not even sure I want to go with this option versus a competitor? It will be a far better experience if I can first vibe-use the product, and if it's right for me, I'll probably be incentivized to learn its inner workings as I try to do more and more.
> The other great thing about this mode is that it can double as a teaching methodology.
gvim has menus and lists the commands in those menus as shortcuts. That's how I learned that vim has folding, and how to use it.
What we're really seeing is specific applications where conversation makes sense, not a wholesale revolution. Natural language shines for complex, ambiguous tasks but is hilariously inefficient for things like opening doors or adjusting volume.
The real insight here is about choosing the right interface for the job. We don't need philosophical debates about "the future of computing" - we need pragmatic combinations of interfaces that work together seamlessly.
The butter-passing example is spot on, though. The telepathic anticipation between long-married couples is exactly what good software should aspire to. Not more conversation, but less need for it.
Where Julian absolutely nails it is the vision of AI as an augmentation layer rather than replacement. That's the realistic future - not some chat-only dystopia where we're verbally commanding our way through tasks that a simple button press would handle more efficiently.
The tech industry does have these pendulum swings where we overthink basic interaction models. Maybe we could spend less time theorizing about natural language as "the future" and more time just building tools that solve real problems with whatever interface makes the most sense.
The article is useful because it articulates arguments which many of us have intuited but are not necessarily able to explain ourselves.
> That is the type of relationship I want to have with my computer!
He means automation of routine tasks? Took 50 years to reach that in the example.
What if you want to do something new? Will the thought guessing module in your computer even allow that?
If we want an interface that actually lets us work near the speed of thought, it can't be anything that re-arranges options behind our back all the time. Imagine if you went into your kitchen to cook something and the contents of all your drawers and cupboards had been re-arranged without your knowledge! It would be a total nightmare!
We already knew decades ago that spatial interfaces [1] are superior to everything else when it comes to working quickly. You can walk into a familiar room and instinctively turn on a light by reaching for the switch without even looking. With a well-organized kitchen an experienced chef (or even a skilled home cook) can cook a very complicated dish very efficiently when they know where all of the utensils are so that they don't need to go hunting for everything.
Yet today it seems like all software is constantly trying to guess what we want and in the process ends up rearranging everything so that we never feel comfortable using our computers anymore. I REALLY miss using Mac OS 9 (and earlier). At some point I need to set up some vintage Macs to use it again, though its usefulness at browsing the web is rather limited these days (mostly due to protocol changes, but also due to JavaScript). It'd be really nice to have a modern browser running on a vintage Mac, though the limited RAM would be a serious problem.
Even I can make breakfast in my kitchen without looking, because I know where all the needed stuff is :)
On another topic, it doesn't have to look well organized. My home office looks like a bomb exploded in it, but I know exactly where everything is.
> I REALLY miss using Mac OS 9 (and earlier).
I was late to the Mac party, about the Snow Leopard days. I definitely remember that back then OS X applications weren't allowed to steal focus from what I had in the foreground. These days every idiotic splash screen steals my typing.
Natural language is very lossy: forming a thought and conveying that through speech or text is often an exercise in frustration. So where does "we form thoughts at 1,000-3,000 words per minute" come from?
The author clearly had a point about the efficiency of thought vs. natural language, but his thought was lost in a layer of translation. Probably because thoughts don't map cleanly onto words: I may lack some prerequisite knowledge to grasp what the author is saying here, which pokes at the core of the issue: language is imperfect, so the statement "we form thoughts at 1,000-3,000 words per minute" makes no sense to me.
Meta-joking aside, is "we form thoughts at 1,000-3,000 words per minute" an established fact? It's oddly specific.
I also have my doubts about the numbers put forward for reading, listening, and speaking. When I'm reading, I can take in words only about as fast as I can speak them, because I'm essentially speaking the words out in my mind. Is that not how other people read?
This stuff is fascinating.
For me, when I need to think clearly about a specific/novel thing, a monologue helps, but I don't voice out thoughts like "I need a drink right now".
Also I read much faster than I speak, I have to slow down while reading fiction as a result.
Has it even been tried? Is there an iPhone text editing app with fully customizable keyboard that allows for setting up modes/gestures/shortcuts, scriptable if necessary?
> A natural language prompt like “Hey Google, what’s the weather in San Francisco today?” just takes 10x longer than simply tapping the weather app on your homescreen.
That's not entirely fair: the natural language version could just as well be side button + saying "Weather", with the same result. Though you can make the app even more available by simply displaying weather results on the homescreen, with no tap at all.
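For fun, the "10x longer" claim can be sanity-checked with a back-of-the-envelope sketch. All the timings below are illustrative assumptions on my part (speaking rate, wake latency, tap time), not measurements:

```python
# Rough comparison of a spoken weather query vs. tapping the weather app.
# Every number here is an assumed, illustrative value.

voice_query = "Hey Google, what's the weather in San Francisco today?"
words = len(voice_query.split())      # 9 words in the example query
speaking_wpm = 150                    # assumed conversational speaking rate
wake_latency = 1.0                    # assumed assistant wake-up delay, seconds

voice_seconds = words / speaking_wpm * 60 + wake_latency
tap_seconds = 1.0                     # assumed: locate the icon and tap it

print(f"voice: ~{voice_seconds:.1f}s, tap: ~{tap_seconds:.1f}s, "
      f"ratio: ~{voice_seconds / tap_seconds:.1f}x")
```

With these guesses the spoken query comes out several times slower than a tap, even before counting the time to listen to the answer; whether it's 5x or 10x depends entirely on the assumed latencies.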
iPad physical keyboards also have shortcuts.
What did they have in their touch interfaces?
It might be hard to understand now, but Blackberry power users could be much more productive with email/texting than any phone that exists today. But they were special purpose 2-way radio (initially, pager) devices that lacked the flexibility of modern apps with full internet data access.
I don't remember where else they used voice, they had a lot of other interface types they switched between. Tried searching for a clip and found this quote:
> The voice interface had been problematic from the start.
> The original owner was Chinese, so I turned the damn thing off.
So yes, quite realistic :-)