Posted by jeffmcjunkin 5 hours ago
In ChatGPT right now, you can have an audio and video feed for the AI, and the AI can respond in real time.
Now I wonder if the E2B or the E4B is capable enough for this and fast enough to run on an iPhone. Basically replicating that experience, but with all the computation (STT, LLM, and TTS) done locally on the phone.
I just made this [0] last week, so I know you can run a real-time voice conversation with an AI on an iPhone, but it'd be a totally different experience if it could also process a live camera feed.
The E2B/E4B models also support voice input, which is rare.
Google is the only US-based frontier lab releasing open models. I know they aren't doing it out of the goodness of their hearts.
total duration: 12m41.34930419s
load duration: 549.504864ms
prompt eval count: 25 token(s)
prompt eval duration: 309.002014ms
prompt eval rate: 80.91 tokens/s
eval count: 2174 token(s)
eval duration: 12m36.577002621s
eval rate: 2.87 tokens/s
Prompt: whats a great chicken breast recipe for dinner tonight?
total duration: 37.44872875s
load duration: 145.783625ms
prompt eval count: 25 token(s)
prompt eval duration: 215.114666ms
prompt eval rate: 116.22 tokens/s
eval count: 1989 token(s)
eval duration: 36.614398076s
eval rate: 54.32 tokens/s
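For anyone skimming the stats: the reported eval rates are just eval count divided by eval duration. A quick sanity check in Python on the two runs above (durations converted to seconds):

```python
# Verify the reported "eval rate" figures from the two runs above.
# eval rate (tokens/s) = eval count / eval duration

runs = [
    # (eval tokens, eval duration in seconds, reported tokens/s)
    (2174, 12 * 60 + 36.577002621, 2.87),   # first run: 12m36.577s
    (1989, 36.614398076, 54.32),            # second run: 36.614s
]

for tokens, seconds, reported in runs:
    rate = tokens / seconds
    print(f"{tokens} tokens / {seconds:.1f}s = {rate:.2f} tok/s (reported {reported})")
```

Both check out, and the comparison makes the gap concrete: the second run generates tokens roughly 19x faster than the first.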