Posted by jeffmcjunkin 6 hours ago

Google releases Gemma 4 open models (deepmind.google)
874 points | 262 comments | page 3
jwr 6 hours ago|
Really looking forward to testing and benchmarking this on my spam filtering benchmark. gemma-3-27b was a really strong model, surpassed later by gpt-oss:20b (which was also much faster). qwen models always had more variance.
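For anyone wondering what a benchmark like this might measure, here is a toy sketch of a harness that scores any text → "spam"/"ham" classifier on a labeled set. All names are hypothetical, and the keyword-based classify() stub stands in for an actual model call:

```python
def classify(message: str) -> str:
    """Stub classifier; a real benchmark would prompt the model here."""
    return "spam" if "free money" in message.lower() else "ham"

def score(dataset, classifier):
    """Return (accuracy, false_positive_rate) over (message, label) pairs."""
    correct = false_pos = ham_total = 0
    for message, label in dataset:
        pred = classifier(message)
        correct += pred == label
        if label == "ham":
            ham_total += 1
            false_pos += pred == "spam"
    return correct / len(dataset), false_pos / max(ham_total, 1)

dataset = [
    ("Claim your FREE MONEY now!!!", "spam"),
    ("Meeting moved to 3pm", "ham"),
    ("free money inside", "spam"),
    ("Lunch tomorrow?", "ham"),
]
acc, fpr = score(dataset, classify)
print(acc, fpr)  # 1.0 0.0 on this toy set
```

Tracking the false-positive rate separately matters for spam: misfiling one legitimate mail is usually worse than letting one spam through.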
mhitza 5 hours ago||
If you wouldn't mind chatting about your usage, my email is in my profile, and I'd love to share experiences with other HNers using self-hosted models.
jeffbee 6 hours ago||
Does spam filtering really need a better model? My impression is that the whole game is based on having the best and freshest user-contributed labels.
drob518 1 hour ago|||
He said it’s a benchmark.
hrmtst93837 3 hours ago|||
Better models help on the day the spam mutates, before you have fresh labels for the new scam and before spammers can infer from a few test runs which phrasing still slips through. If you need labels for each pivot you're letting them experiment on your users.
jeffbee 3 hours ago||
In my experience the contents of the message are all but totally irrelevant to the classification, and it is the behavior of the mailing peer that gives all the relevant features.
popinman322 2 hours ago||
Does anyone know whether we'll be receiving transcoders for this batch of models? We got them for Gemma 3, but maybe that was a one-off.
fooker 6 hours ago||
What's a realistic way to run this locally or a single expensive remote dev machine (in a vm, not through API calls)?
matja 5 hours ago|
I'm running Gemma 4 with the llama.cpp web UI.

https://unsloth.ai/docs/models/gemma-4 > Gemma 4 GGUFs > "Use this model" > llama.cpp

  llama-server -hf unsloth/gemma-4-31B-it-GGUF:Q8_0

If you already have llama.cpp you might need to update it to support Gemma 4.
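Once llama-server is up, it exposes an OpenAI-compatible HTTP API (default http://localhost:8080), so you can query the model from anything, not just the web UI. A minimal stdlib-only Python sketch; the helper names here are mine, not part of llama.cpp:

```python
import json
import urllib.request

SERVER = "http://localhost:8080"  # llama-server default host/port

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion payload for llama-server."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """POST the prompt to llama-server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{SERVER}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# Example (requires the server to be running):
# print(ask("Explain what a GGUF file is in one sentence."))
```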

hikarudo 3 hours ago||
Also check out DeepMind's "The Gemma 4 Good Hackathon" on Kaggle:

https://www.kaggle.com/competitions/gemma-4-good-hackathon

kuboble 4 hours ago||
I'm really looking forward to trying it out.

Gemma 3 was the first model that I liked enough to use a lot just for daily questions on my 32 GB GPU.

sigbottle 4 hours ago||
There are so many heavy-hitting cracked people like Daniel from Unsloth and Chris Lattner coming out of the woodwork for this with their own custom stuff.

How does the ecosystem work? Have things converged and standardized enough where it's "easy" (lol, with tooling) to swap out parts such as weights to fit your needs? Do you need to autogen new custom kernels to fix said things? Super cool stuff.

bredren 4 hours ago|
Thanks for the notes, for those interested in learning more:

- Lattner tweeted a link to this: https://www.modular.com/blog/day-zero-launch-fastest-perform...

- Unsloth prior post on gemma 3 finetuning: https://unsloth.ai/blog/gemma3

bearjaws 4 hours ago||
The labels on the table read "Gemma 431B IT", which reads as a 431B-parameter model, not Gemma 4 - 31B...
whhone 5 hours ago||
The LiteRT-LM CLI (https://ai.google.dev/edge/litert-lm/cli) provides a way to try the Gemma 4 model.

  # with uvx
  uvx litert-lm run \
    --from-huggingface-repo=litert-community/gemma-4-E2B-it-litert-lm \
    gemma-4-E2B-it.litertlm
stephbook 4 hours ago||
Kind of sad they didn't release stronger versions. $dayjob offers beefy NVIDIA GPUs that are hungry for models but are stuck running Llama, gpt-oss, etc.

Seems like Google and Anthropic (which I consider leaders) would rather keep their secret sauce to themselves – understandable.

wg0 6 hours ago|
Google might not have the best coding models (yet), but they seem to have the most intelligent and knowledgeable models of all; Gemini 3.1 Pro especially is something.

One more thing about Google is that they have everything that others do not:

1. Huge data: audio, video, geospatial. 2. Tons of expertise; "Attention Is All You Need" was born there. 3. Libraries that they wrote. 4. Their own data centers and cloud. 5. Most of all, their own TPU hardware that no one else has.

Therefore once the bubble bursts, the only player standing tall and above all would be Google.

whimblepop 5 hours ago||
I recently canceled my Google One subscription because getting accurate answers out of Gemini for chat is basically impossible afaict. Whether I enable thinking makes no difference: Gemini always answers me super quickly, rarely actually looks something up, and lies to me. It has a really bad unchecked hallucination problem because it prioritizes speed over accuracy and (astonishingly, to me) is way more hesitant to run web searches than ChatGPT or Claude.

Maybe the model is good but the product is so shitty that I can't perceive its virtues while using it. I would characterize it as pretty much unusable (including as the "Google Assistant" on my phone).

It's extremely frustrating every way that I've used it but it seems like Gemini and Gemma get nothing but praise here.

mike_hearn 2 hours ago|||
My wife was amazed to discover that Gemini recommended her a local business that turned out to be in another country, and then, after she checked and corrected it, it recommended a second one that is marked as permanently closed on Google Maps.

ChatGPT got it right the first time. Baffling.

neonstatic 4 hours ago||||
I used Gemma 3 for quite a few things offline and found it to be very helpful. Your experience with Gemini is very similar to mine, though. I hate the way it speaks with this fake-excited, reddit-coded, condescending tone and it is useless for coding.
staticman2 4 hours ago||||
I've found Gemini works better for search when used through a Perplexity subscription. (Though these things can quickly change).
logicchains 5 hours ago|||
Recently I had a pretty basic question about whether there was a Factorio mod for something, so I decided to ask Gemini; it hallucinated not one but two sadly non-existent mods. Even Grok is better at search.
whimblepop 4 hours ago||
Whenever I ask it questions about videogames (even very old ones), the odds that it will lie to me are very high. I only see LLMs get those right when they go look them up online.

The other thing that kills me about Gemini is that the voice recognition is god-awful. All of the chat interfaces I use have transcriptions that include errors (which the bot usually treats unthinkingly as what I actually said, instead of acting as if we may be using a fallible voice transcription), but Gemini's is the worst by far. I often have to start conversations over because of such badly mangled transcriptions.

The accuracy problems are the biggest and most important frustrations, but I also find Gemini insufferably chummy and condescending. It often resorts to ELI5 metaphors when describing things to me where the whole metaphor is based on some tenuous link to some small factoid it thinks it remembers about my life.

The experience people seem to get out of Gemini today seems like a waste of a frontier lab's resources, tbf. If I wanted fast but lower quality I'd go to one of the many smaller providers that aren't frontier labs, because lots of them are great at speed and/or efficiency. (If I wanted an AI companion, Google doesn't seem like the right choice either.)

solarkraft 4 hours ago|||
I agree with the theory and maybe consumers will too. But damn, the actual products are bad.
0xbadcafebee 4 hours ago|||
Tiny AI labs with a fraction of Google's resources still turn out amazing open weights. But besides the logistics, the other aspect is can I use it? Gemini (and some other models) have a habit of dropping conversations altogether if it's "uncomfortable" with your question. Recently I was just asking it about financial implications of the war. It decided my ideas were so crazy that I must be upset, and refused to tell me anything else about finance in that chat. Whereas other models (not abliterated, just normal models) gave me information without argument, moralizing, or gaslighting. I think most people are gonna prefer the non-nerfed models, even if they aren't SOTA, because nobody wants to have an argument with their computer.
mhitza 5 hours ago|||
At the start of last year, Gemma 2 made the fewest mistakes when I was trying out self-hosted LLMs for language translation. And at the time it had a non-open-source license.

Really eager to test this version with all the extra capabilities provided.

chasd00 6 hours ago||
Not sure why you're being downvoted, the other thing Google has is Google. They just have to spend the effort/resources to keep up and wait for everyone else to go bankrupt. At the end of the day I think Google will be the eventual LLM winner. I think this is why Meta isn't really in the race and just releases open weight models, the writing is on the wall. Also, probably why Apple went ahead and signed a deal with Google and not OpenAI or Anthropic.
wg0 6 hours ago|||
I don't know why I am downvoted, but Google has data, expertise, hardware and deep pockets. This whole LLM thing was invented at Google, and the machine-learning ecosystem libraries come from Google. I don't know how people can be so irrational, discounting Google's muscle.

Others have just borrowed data, money and hardware, and they will run out of resources for sure.

faangguyindia 5 hours ago|||
The same can be said for Java, yet Google owns Android.
greenavocado 5 hours ago|||
This remains true so long as advertisers give Google money.
bitpush 5 hours ago||
Why wouldn't advertisers give Google money? Are you noticing any shift in the trend?
WarmWash 5 hours ago|||
The rumor is also that Meta is looking to license Gemini, similar to Apple, as their recent efforts reportedly came up short of expectations.