
Posted by swolpers 1 day ago

Computer use in Gemini 3.5 Flash (blog.google)
237 points | 160 comments | page 2
fridder 23 hours ago|
I wonder if it will be better at building TUIs. It has been absolutely abysmal at interacting with them and building them.
chatmasta 23 hours ago||
Claude can build UI but it sucks at testing it and iterating on it. Fable showed some improvements in this regard but alas.
Chu4eeno 22 hours ago||
It seems to do it just fine in desktop applications using Qt, fwiw. It leverages all the standard Qt GUI testing stuff (and if you have the money you can just integrate Squish, which has LLM support now).
IncreasePosts 20 hours ago||
That's my experience too. I've had increased luck encouraging the LLM to structure the code in "functional core, imperative shell" style, and telling it stupid things like "make sure you can test the code you're writing".
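A minimal sketch of what "functional core, imperative shell" can look like for a small text menu, assuming a made-up `MenuState` type and made-up `j`/`k` key bindings (none of this comes from a real project). The state transitions and rendering are pure functions, so they can be tested without a terminal; only `run` touches I/O.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MenuState:
    """Hypothetical immutable UI state: menu items plus the selected index."""
    items: tuple[str, ...]
    selected: int = 0


def update(state: MenuState, key: str) -> MenuState:
    """Functional core: map a key press to a new state, with no I/O."""
    if key == "j":  # move down, clamped to the last item
        return replace(state, selected=min(state.selected + 1, len(state.items) - 1))
    if key == "k":  # move up, clamped to the first item
        return replace(state, selected=max(state.selected - 1, 0))
    return state  # unknown keys leave the state unchanged


def render(state: MenuState) -> list[str]:
    """Functional core: turn the state into lines of text, with no printing."""
    return [
        ("> " if i == state.selected else "  ") + item
        for i, item in enumerate(state.items)
    ]


def run() -> None:
    """Imperative shell: the only place that reads input or prints."""
    state = MenuState(items=("Open", "Save", "Quit"))
    while True:
        print("\n".join(render(state)))
        key = input("j/k to move, q to quit: ").strip()
        if key == "q":
            break
        state = update(state, key)
```

Because `update` and `render` never touch the terminal, a test can drive them with plain values, e.g. `render(update(MenuState(("a", "b")), "j"))`, which is the kind of thing an LLM can check on its own.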
beastman82 1 day ago||
No UI like their competitors' Claude CoWork or Codex. This is vaporware.
knollimar 1 day ago||
Where is 3.5 pro?
squidbeak 22 hours ago|
Google said June, and all its model updates seem to be on Tuesdays, Wednesdays or Thursdays. So unless the release is slipping, either tomorrow or Tuesday.
WarmWash 22 hours ago||
Rumor is now July, although preliminary A/B tests people are getting show promise with whatever they have right now.
villgax 1 day ago||
Will it skip Ads lol
humblyCrazy 1 day ago|
I looked at their demo and it does not
chatmasta 23 hours ago||
A better question might be: will it skip reCAPTCHA?
SXX 20 hours ago||
Only if it's needed to save your grandma and a cat. It will hack a few servers along the way.
vulcan1964 15 hours ago||
Hot take: "computer use" is a dumb term for this concept; almost as if it was named by AI models...

Case in point: "We are already seeing customers drive value with computer use."

Yes... since the early 1980s, most companies and businesses have driven their value with computer use... smdh.

I'm no AI dev, but dare I suggest a better possible name for this: "agentic computer software interaction", which can be shortened to agent_actor.

I swear, the direction we are headed with Big Tech leading the way will surely spell long-term disaster.

ai_fry_ur_brain 19 hours ago||
I have basically unlimited access to every SOTA model and I opt for Gemini 3.5 Flash 9 out of 10 times I use an LLM.

LLMs are mostly useless, but when I do use them it's with Gemini. If they're going to waste my time 95% of the time, I might as well get it over with fast.

zuzululu 1 day ago||
Performance is quite impressive given that it's 3x cheaper than 5.5.
SoMomentary 21 hours ago|
The speed was impressive when I tested it, but unfortunately the accuracy left a lot to be desired. It'd be interesting to do the math on some of my normal workflows to see where the break-even point is between them, assuming the tasks you have can tolerate a couple of failures.
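A rough sketch of that break-even math, under loud assumptions: attempts succeed independently, a failure is noticed at no extra cost, and you simply retry until it works. The function names and all numbers are hypothetical, not measurements of any model.

```python
def expected_cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend to get one success when retrying until it works.

    With independent attempts, the expected number of attempts is
    1 / success_rate (the mean of a geometric distribution).
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def break_even_success_rate(cheap_cost: float, pricey_cost: float,
                            pricey_success_rate: float) -> float:
    """Success rate the cheaper model needs to match the pricier one's cost per success."""
    return pricey_success_rate * cheap_cost / pricey_cost


# Hypothetical inputs: the cheap model costs 1 unit per attempt, the pricier
# one 3 units (a 3x price gap), and the pricier model is assumed to succeed
# 90% of the time.
threshold = break_even_success_rate(cheap_cost=1.0, pricey_cost=3.0, pricey_success_rate=0.9)
print(f"cheap model breaks even at a success rate of about {threshold:.2f}")
```

Under those assumptions a 3x cheaper model only needs about a third of the pricier model's success rate to cost the same per completed task. Time lost to retries and failures that go unnoticed would push the real threshold higher.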
zuzululu 18 hours ago||
We are talking about computer use here.

Gemini 3.5 Flash isn't meant to compete head to head with frontier models on tough problems.

shafiemoji 13 hours ago|
My work requires me to use `agy cli` (Google AI Ultra) for development, and it's been incredibly frustrating. I strongly dislike the Gemini models because they consistently fail to grasp basic instructions. I also can't use the Claude models included in the AI Ultra plan because the agy cli wrapper makes the experience completely unusable. I'd rather use the free plan on OpenCode than deal with this Gemini setup.
perbu 13 hours ago|
They can’t follow instructions at all. They are a year behind Claude.