VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

4gotunameagain 9 hours ago|

What are the implications of local SOTA inference, given the insane datacenter "investing"?

Spending at this scale surely cannot be justified by training alone, especially since models nowadays are improved more and more by fine-tuning than by retraining from scratch.

Will a viable local model crash the US economy?

More importantly, are the LLM companies aware of this, and are they deliberately buying up all the RAM and GPUs in order to delay the inevitable? Probably not, but I wouldn't be surprised if that were the case.
