Show HN: State of the Art of Coding Models, According to Hacker News Commenters

Posted by yunusabd 16 hours ago

Hello HN,

I was away from my computer for two weeks, and after coming back and reading the latest discussions on HN about coding assistants (models, harnesses), I felt very out of the loop. My normal process would have been to keep reading and figure out the latest and greatest from people's comments, but I wanted to try and automate this process.

Basically the goal is to get a quick overview of which coding models are popular on HN. A next iteration could also scan for harnesses that people use, or info on self-hosting or hardware setups.

I wrote a short intro on the page about the pipeline that collects and analyzes the data, but feel free to ask for more details or check the Google Sheet for more info.

https://hnup.date/hn-sota

124 points | 64 comments

input_sh 4 hours ago

Terrible metric that tells absolutely nothing about what's state-of-the-art. You might as well call this list the most astroturfed models on HN.

julianlam 9 hours ago

Interesting that Gemma 4 didn't crack the top 10.

I've been experimenting with the 26B-A4B model with some surprisingly good results (both in inference speed and code quality: 15 tok/s, flying along!), vs my last few experiments with Devstral 24B. Not sure whether I can fit that 35B Qwen model everybody's so keen on into my 32GB of unified RAM.
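
For a rough sense of whether a 35B model fits in 32GB, here is a back-of-the-envelope sketch of the weight size at a few quantization levels. The bit widths are assumptions picked for illustration, `weight_gib` is a hypothetical helper, and the estimate covers the weights only: KV cache, runtime overhead, and whatever the OS needs all come on top.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed quantization levels, for illustration only.
for bits in (16, 8, 4):
    print(f"35B @ {bits}-bit: {weight_gib(35, bits):.1f} GiB")
```

By this estimate the weights alone are roughly 16 GiB at 4 bits and roughly 33 GiB at 8 bits, so on a 32GB machine only the lower-bit quantizations leave room for context and the rest of the system.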

However I think I may be in the minority of HN commenters exploring models for local inference.

asnelt 2 hours ago

Can you elaborate on your setup? What harness are you using with Gemma 4 on your 32GB machine?

tokkkie 9 hours ago

more users = more complaints. negativity just means popularity.

kimi...?

Frannky 11 hours ago

I am looking for a good alternative to Claude Code + Opus that is not Codex. I tried switching back to Opus 4.6. The attitude of 4.7 is the bigger problem: it is difficult to make it check things before answering, and it assumes it knows better than me and reality. Plus all the latest shenanigans they pulled. Pretty disgusted that I am still using them.

rane 5 hours ago

You can use other models in Claude Code

https://github.com/raine/claude-code-proxy

https://api-docs.deepseek.com/quick_start/agent_integrations...

Frannky 11 hours ago

I forgot to add the tendency of not owning problems and fixing them immediately, but instead deflecting and saying "it shouldn't be done now", "it's not my responsibility", etc. Just terrible.

alxhslm 6 hours ago

100% this! So often it complains that failing unit tests are not its fault.