Posted by Anon84 5 days ago

Embarrassingly simple self-distillation improves code generation (arxiv.org)
653 points | 200 comments | page 6
aiiaro 4 days ago|
[flagged]
yubainu 4 days ago||
[dead]
dist-epoch 5 days ago||
[flagged]
avaer 5 days ago||
I definitely pay more attention to papers affiliated with Chinese companies; the economics seem to be more conducive to doing good academic work and publishing it. I would say the same for companies like Apple (where TFA came from).

But to filter based on authors' names sounds pretty darn racist.

ptidhomme 5 days ago|||
I used to have the opposite rule in my signal processing field: the more Chinese names, the less innovation there was.

They seemed like they had to be churning out papers and any little adaptation to existing research triggered a new publication.

But it may have changed now.

0x3f 5 days ago|||
That's... almost every AI paper.
amelius 5 days ago||
So

"Made in China, designed by Apple in California"

should be:

"Made in China, designed by Chinese people in California"?

jofzar 5 days ago||
> simple self-distillation (SSD):

Sorry apple, SSD is already taken, you can't use that acronym.

love2read 5 days ago||
You're right, I offer these alternatives:

Consistency Preservation Update (CPU)

Guided Probability Update (GPU)

History-aware Distillation Driving (HDD)

Probability Smoothing Update (PSU)

drittich 5 days ago|||
I used to invent TLAs on the spot for fun, and when someone asked what it was, would respond, "It's a PUA", eventually revealing that meant "previously unknown acronym". It was even more annoying than it sounds.
ape4 5 days ago||
ATT=All TLAs are Taken
politelemon 5 days ago||
It's cringe worthy to see that the original paper itself is editorialised.

Title should be: Simple Self-Distillation Improves Code Generation

StevenWaterman 5 days ago||
"Embarrassingly" has a history as a technically meaningful word roughly equivalent to "maximally", see "Embarrassingly parallel"

https://en.wikipedia.org/wiki/Embarrassingly_parallel

Aurornis 5 days ago||
The phrase embarrassingly parallel has a history in computer science.

Many computer science paper titles allude to past titles in other CS papers.

Calling it “cringe worthy” is unnecessarily mean. There is context and history you don’t understand.

gottheUIblues 5 days ago|||
"Embarrassingly" considered harmful?
cbm-vic-20 5 days ago|||
"Embarrassingly" considered harmful is all you need.
TeMPOraL 4 days ago||
Programming Introduction to "Embarrassingly" considered harmful is all you need in 21 hours.
rzzzt 4 days ago|||
Cringeworthily parallel, not even serial
ape4 5 days ago|
Shouldn't a scientific paper be using metric units (like 30T) rather than 30B?

There are two distinct billions. https://en.wikipedia.org/wiki/Billion

mikkupikku 5 days ago|
Objective one should be to communicate effectively, not confuse everybody.
unknownx113 4 days ago||
that disqualifies like 80% of papers lmao
mikkupikku 4 days ago||
Lol, you're probably not wrong. But have you ever noticed that the most important papers tend to be on the clear and readable side of things? It's as if researchers understand that being understood is important, but deemphasize that when the paper itself isn't important in the first place. (Maybe if they're only publishing to not perish, not being understood is actually a good thing from their perspective?)