Posted by vinhnx 6 days ago
I wonder why they focused specifically on a task that is already solved algorithmically. The paper does not seem to address this, and the references do not include any mentions of non-LLM approaches to the line-breaking problem.
The point is to see how LLMs implement algorithms internally, starting with this simple, easily understood algorithm.
The biology metaphor they make is interesting, because I think a biologist would be the first to tell you that you need more than one datapoint.
It makes it tedious to figure out what they actually did (which sounds interesting) when it's couched in such terms and presented in such an LLMified style.
Like the difference between Unicode code points and UTF-8 bytes: you can't just count UTF-8 bytes to know how many code points you have.
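To make the byte vs. code point distinction concrete, here is a minimal Python sketch. The sample string and the helper name `count_code_points` are made up for illustration, and the helper assumes its input is already valid UTF-8. The byte length alone does not give the answer; you have to look at each byte and skip the continuation bytes (those of the form `10xxxxxx`).

```python
# Illustrative sample: 'h', 'é', 'l', 'l', 'o', '→', and one emoji.
text = "h\u00e9llo\u2192\U0001F600"
encoded = text.encode("utf-8")


def count_code_points(data: bytes) -> int:
    """Count code points in valid UTF-8 by skipping continuation bytes.

    Hypothetical helper: it does not validate the input, it only counts
    bytes that are NOT of the form 0b10xxxxxx.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


byte_count = len(encoded)                      # raw UTF-8 length in bytes
code_point_count = count_code_points(encoded)  # lead bytes only

print(byte_count, code_point_count, len(text))
```

Here the two counts differ because 'é' takes 2 bytes, '→' takes 3, and the emoji takes 4, while each ASCII letter takes 1.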
There is no biology here, and there are so many other words that describe perfectly what they are doing here, without twisting the meaning of another word.