GLM 5.2 vs. Opus - Hacker News

Posted by ritzaco 16 hours ago

451 points | 307 comments | page 4

doe88 8 hours ago

To me, one-shot prompting is as relevant as Strava's KOM is for cycling. I'm more interested in a good cycling performance after a 3-hour ride than in a straight-up 30-minute record effort.

stavarotti 9 hours ago

This style of comparison is decent at showing capability, but it doesn't really show me what I truly want: a sounding board and implementer with senior engineer-level execution. When I look back at all the teams that I've been part of, the best outcomes came from white-boarding (sometimes in the metaphorical sense) with one or two people, at times arguing, then finally compromising on a plan. Instead of synthetic benchmarks that try to be objective, I wonder if there's a way to test this, or maybe I'm opining on a way of working that will soon be gone?

CuriouslyC 12 hours ago

You should repeat this experiment but with progressively more detail in the initial prompt. Claude's secret sauce is taking weakly specified prompts and making passable things from them, but as the degrees of freedom in the prompt go down, Claude starts to disobey while other models close in on the intent.

jameswhitford 12 hours ago

That is a great suggestion that I am definitely going to look into, thanks!

Babooz 12 hours ago

Nice comparison, but perhaps a more informative one would be to keep the harness the same and use Claude Code for both models. In your comparison, the differences could be due to many harness design decisions.

thedreammachine 12 hours ago

I was surprised today by how much better GLM-5.2 was than GPT-5.5 at aesthetic/UI work. I'll keep my Claude/Codex setup via Conductor for now, but this model got me to set up OpenCode, download their desktop app and do most of my work there today.

Aozora7 15 hours ago

I used GLM 5.0/5.1/5.2 for some projects, and for me, the area in which they lag behind frontier models the most is user interfaces. They get really close to Opus when it comes to pure algorithms, but when I need something like a web application or a mobile app that looks and works well, they are very noticeably worse than even Sonnet.

xrd 9 hours ago

How are people running this locally? I just checked llama.cpp and it appears unsloth has a version but it hacks a bunch of things to make it work and isn't optimal.

https://github.com/ggml-org/llama.cpp/issues/24730

jeremyjh 9 hours ago

No one is doing that. For a model this size, it would have to be so heavily quantized that it wouldn’t be useful, or you’d need to spend half a million dollars on hardware. People use hosted APIs. Open weight means cloud vendors can host it.
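To make the size-versus-quantization trade-off concrete, here is a rough back-of-the-envelope sketch. The thread does not state the model's parameter count, so `PARAMS_BILLION` below is a purely hypothetical placeholder, and the estimate covers the weights only (no KV cache, activations, or runtime overhead).

```python
def weight_memory_gb(num_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GB (10^9 bytes)."""
    total_bytes = num_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# HYPOTHETICAL parameter count, for illustration only; not taken from the thread.
PARAMS_BILLION = 700.0

if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4), ("2-bit", 2)]:
        gb = weight_memory_gb(PARAMS_BILLION, bits)
        print(f"{label}: ~{gb:.0f} GB for weights alone")
```

Swapping in a real parameter count gives a quick sense of how aggressively a model would need to be quantized to fit a given amount of memory.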

malshe 9 hours ago

Can you recommend any US-based cloud providers?

maybe_pablo 8 hours ago

In HuggingChat (https://huggingface.co/chat) you can test open models for free and even test specific providers.

From there I collected the following US providers currently serving GLM 5.2:

- Together (https://www.together.ai/models)

- Fireworks (https://fireworks.ai/models)

- Featherless (https://featherless.ai/models)

malshe 7 hours ago

That's great. Thank you!

fooster 2 hours ago

Ollama Cloud, Neuralwatt.

bornfreddy 7 hours ago

I know that running this locally is prohibitively expensive (for now), but what kind of cost would I be looking at if I wanted to rent the hardware and run the model by myself?

hmokiguess 10 hours ago

I signed up for GLM 5.2 yesterday to try it out because Anthropic kept throwing 529 Overloaded.

I like it, but the lite plan ate 22% of my 5h reset window in a single session, after 2 prompts on xhigh with GLM 5.2 [1m].

The result was satisfactory and I think the output is decent. I'm happy to use either, and I wish there was a combined subscription plan where I could get both.

w4yai 9 hours ago

I may be biased, since I'm going to give you an affiliate link, but honestly the Synthetic LLM provider is a beast! They provide a perfect GLM5.2, awesome tokens/s, TTFT and price.

Coupled with a local Headroom (https://github.com/headroomlabs-ai/headroom) you'll be able to use a LOT without hitting your 5h window :)

Definitely the best $ value for me considering the reasonable performance of GLM5.2.

They provide a rolling-window quota, so unlike with other providers you're never really out of quota, and you can adjust your usage day to day.
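For readers unfamiliar with the idea, a rolling-window quota can be sketched roughly as below. This is a generic illustration, not a description of how any particular provider implements its limits; the class name, the limit of 100 units, and the 5-hour window are all assumptions made for the example.

```python
from collections import deque


class RollingWindowQuota:
    """Generic rolling-window quota: each spend expires `window_seconds` after it was made."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # (timestamp, cost) pairs, oldest first
        self.used = 0

    def _expire(self, now: float) -> None:
        # Drop spends that have aged out of the window, freeing their quota.
        while self.events and self.events[0][0] <= now - self.window:
            _, cost = self.events.popleft()
            self.used -= cost

    def remaining(self, now: float) -> int:
        self._expire(now)
        return self.limit - self.used

    def try_spend(self, cost: int, now: float) -> bool:
        self._expire(now)
        if self.used + cost > self.limit:
            return False
        self.events.append((now, cost))
        self.used += cost
        return True


if __name__ == "__main__":
    HOUR = 3600.0
    quota = RollingWindowQuota(limit=100, window_seconds=5 * HOUR)  # hypothetical numbers
    print(quota.try_spend(60, now=0.0))       # first spend fits
    print(quota.try_spend(60, now=1 * HOUR))  # would exceed the limit
    print(quota.try_spend(60, now=5 * HOUR))  # the first spend has aged out
```

The contrast with a fixed reset window is that capacity comes back gradually as old usage ages out, rather than all at once at a reset time.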

Check it out if interested: https://synthetic.new/?referral=kwjqga9QYoUgpZV

---

Docs & all models: https://dev.synthetic.new/docs/api/models

hmokiguess 3 hours ago

How do I configure Claude Code / pi with it? Sounds like a good deal!

EDIT: I've RTFM lol, thanks for the links, will give it a shot!

w4yai 3 hours ago

The docs are really helpful: https://dev.synthetic.new/docs/guides/claude-code

Glad you figured it out :) Let me know your thoughts about the quota and GLM5.2. So far I don't think I've come across anything better, $/usefulness-wise.

samsin 9 hours ago

My understanding was that n-shot prompting just referred to the number of examples included in a prompt, not the number of prompts to achieve the desired result.

"Build a 3D platformer game from scratch, in raw WebGL, with no game engine or 3D library" would be a zero-shot prompt.
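As a small illustration of that distinction, the sketch below builds a zero-shot and a few-shot prompt for the same task. The `build_prompt` helper and the `Input:`/`Output:` layout are hypothetical conventions chosen for the example, not any particular API; the "shot" count is simply the number of worked examples included.

```python
from typing import List, Optional, Tuple


def build_prompt(task: str, examples: Optional[List[Tuple[str, str]]] = None) -> str:
    """Build an n-shot prompt, where n is the number of (input, output) examples."""
    parts = []
    for example_input, example_output in examples or []:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The actual task always comes last, with the output left open for the model.
    parts.append(f"Input: {task}\nOutput:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    task = "Translate 'good morning' to French."

    # Zero-shot: the task alone, no worked examples.
    print(build_prompt(task))
    print("-----")

    # Two-shot: the same task preceded by two worked examples.
    examples = [
        ("Translate 'thank you' to French.", "merci"),
        ("Translate 'good night' to French.", "bonne nuit"),
    ]
    print(build_prompt(task, examples))
```

Under this definition, sending several follow-up prompts to refine a result does not change the shot count; only adding examples to the prompt does.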
