Posted by nnx 6 days ago
We might form fleeting thoughts much faster than we can express them, but if we want to formulate thoughts clearly enough to express them to other people, I think we're close to the ~150 words per minute we can actually speak.
I recently listened to a linguistics podcast (Lingthusiasm, though I don't recall which episode) where they talked about the efficiency of different languages, and how in the end they all come out roughly the same, because it's really the thought processes that limit the amount of information you communicate, not the language production.
And thoughts develop over time. They're often not conceived complete. That has been shown with some clever experiments.
And language production also puts a limit on our communication channel. It is probably optimized to convert communication intent into motor actions. It surely takes its time. That is not a problem for the system, since motor actions are slow. Idk where "Lingthusiasm" gets their ideas from, but there's psycholinguistic literature dating back to the 1920s that is often neglected by linguists.
Natural language isn't best described as data transfer. It's primarily a mechanism for collaboration and negotiation. A speech act isn't transferring data; it's an action with intent. Viewed as such, the key metrics are not speed and loss, but successful coordination.
This is a case where a computer science stance isn't fruitful, and it's best to look through a linguistics lens.
There's a very similar obsession with the idea that things should be visual instead of textual. We tend to end up back at text.
My personal suspicion for both is that the media set a lot of people's expectations. Characters talked loudly to the computer in films like 2001 or Star Trek for dramatic reasons, and movie computers generally have fancy visual interactions.
I'm not sure how it could fit into my 2 modalities of work: (i) alone in complete focus / silence (ii) in the office where there is already too much spoken communication between humans... maybe it's just a matter of getting used to it
I would like to know what this measures exactly.
The reason I often prefer writing to talking is that writing gives me the time to pause and think. In those cases the bottleneck is very clearly my thought process (which, at least consciously, doesn't appear to me as "words").
E.g. say I find the scrollbars somewhere way too thin and invisible and I want thick, high-contrast scrollbars, and nobody thought of implementing that? Ask the AI and it changes your desktop interface to do it immediately.
1. > "What’s the voice equivalent of a thumbs-up or a keyboard shortcut?" Current ASR systems are very narrow: they capture just the transcript. There is no higher level of intelligence, and even the best GPT voice models fail at this. Humans are highly receptive to non-verbal cues. All the uhms, the ahs, even the pauses we take are where the nuance lies.
2. The hardware for voice AI is still not consumer-ready. Interacting with a voice AI still doesn't feel private. I am only able to do voice-based interaction when I'm in my car. Sadly, everywhere else it feels like a privacy breach, since it's acoustically public. I have been thinking about private microphones to enable more AI-based conversations.