Posted by ricardbejarano 19 hours ago
It's pretty obvious that this reasoning scaling is a mirage; parameters are all you need. Everything else is mostly just wasting time while hardware gets better.
This isn’t nearly complete.
The quantization level you chose also makes a difference.
The GPU driver also plays an important role.
What was your approach? What software did you use to run the models?
Just ask any Apple user: they don't actually use local models.