Posted by trq_ 10/24/2024

Claude Computer Use – Is Vision the Ultimate API?(www.thariq.io)
113 points | 90 comments | page 2
m3kw9 10/25/2024|
No. Vision in this case is a brute-force way for AI to interact with our current world, because we designed our interfaces for human vision. In the future, AI will create the UI itself, and its control will be low-level, most likely at the model level, since even business logic and UI will be generated live.
PreInternet01 10/24/2024||
Counterpoint: no, it's just more hype.

Doing real-time OCR on 1280x1024 bitmaps has been possible for... the last decade or so? Sure, you can now do it on 4K or 8K bitmaps, but that's just an incremental improvement.

Fact is, full-screen OCR coupled with innovations like "Google" has not led to "ultimate" productivity improvements, and as impressive as OpenAI et al. may appear right now, the impact of these technologies will end up roughly similar.

(Which is to say: the landscape will change, but not in a truly fundamental way. What you're seeing demonstrated right now is, roughly speaking, the next Clippy, which, believe it or not, was hyped to a similar extent around the time it was introduced...)

simonw 10/24/2024||
The way these new LLM vision models work is very different from OCR.

I saw a demo this morning of someone getting Claude to play FreeCiv (admittedly extremely badly): https://twitter.com/greggyb/status/1849198544445432229

Try doing that with Tesseract.

croes 10/24/2024||
I bet Tesseract plays pretty badly too.
KoolKat23 10/24/2024|||
Existing OCR is extremely limited and requires custom narrow development.
acchow 10/24/2024||
OCR is to Computer Use as voice-to-text is to ChatGPT Voice.
freediver 10/25/2024||
And text is the ultimate API to the human brain! ;)

https://www.youtube.com/watch?v=Zctp972y_Eg

cheevly 10/24/2024||
No, language is the ultimate API.
ukuina 10/24/2024|
On the instruction-provision end, sure.
throwaway19972 10/24/2024||
I'd imagine you'd get higher quality leveraging accessibility integrations.