Zen and the Art of Machine Learning Research

Posted by jxmorris12 3 days ago

Zen and the Art of Machine Learning Research (blog.jxmo.io)

188 points | 63 comments | page 2

jessinra98 5 hours ago|

Would either of you have a recommendation on where to start learning about either?

misiti3780 1 hour ago||

why is SVD so important? i know it's important in general ML but seems minor for LLMs (LoRA?)

lostdog 12 hours ago||

I have some coworkers that are similar in everything--education, work ethic, and intelligence--but some of them tick out ML ideas that work like clockwork, while others get hits rarely if ever. I cannot tell what makes it work for some and not others. Both groups' ideas sound equally good.

Sometimes a coworker will be an ML star for a year or two, but then suddenly run out of steam. It's brutal to watch.

I used to think most smart people had similar distributions of good ideas, and it was just that the hardest working tried out all 50 of their ideas to pick out the 2 good ones. But I've seen smart and hardworking people have a hit rate of 0.

fyredge 11 hours ago||

That's the nature of research. You try every idea that may be a good avenue and only a handful work out, if any. That's why quantifying research credibility via publication and citation counts inherently leads to toxic work cultures. The best ideas must be given time to be discovered, not forced out and contorted to fit the requirements of a journal.

bobmarleybiceps 11 hours ago||

this is part of why I think most researchers get less productive over time... Someone gets some big result during grad school or early career, gets some big job from it, and then struggles to get new results of similar quality :shrug:

With ML in particular, there's also the sheer volume of people basically all looking at (essentially) the same problems... so it's kind of like monkeys with typewriters spamming ideas until some work.

sdsdfsdff344sd 6 hours ago|||

It's not just ML research; that's just human nature.

We like to see hard-working, God-fearing people minting raw knowledge from Mount Olympus itself, whereby each shard of crystalline insight is carved meticulously by the Apprentice over the course of a productive and morally pure career.

The reality is it's some skill plus the occasional drive-by of an unknown force of nature, hitting you on the head with a shattered fragment of insight whose provenance you'll remain completely ignorant of. I'd say we just revert back to invoking the muses. It was a fine explanation.

jack_pp 11 hours ago|||

In spirituality it is believed that ideas and inspirations aren't our own. That our mind is like an LLM that gets prompted by higher beings. In research everyone has high param count minds, trained for many years by studying. But just like LLMs by themselves are useless at creating new original work, no matter the compute you have available, so the mind cannot create anything new without "inspiration".

59nadir 10 hours ago|||

Wow, this makes ML sound even more like voodoo than I thought. Can you give examples of what the nature of these ideas is?

stared 10 hours ago||

It revolves around the sentiment of "go deeper" - but I think it is a double-edged sword. Sure, entropy, tensors and gradients are important - and yes, they are pretty much requirements.

But from what I see, it is the opposite - a lot of (if not virtually all) progress in the last decade of deep learning was not because of a fundamental idea, but because of incremental, experimentally-verified practice. Even though I think there is good intuition for why ReLU is better than sigmoid (tl;dr: last layer is log(sigmoid) ~ ReLU, putting anything different inside kills the gradient), the original paper by Hinton himself was more or less "because it trains 3x faster".
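
A minimal Python sketch of that intuition, under one reading of the parenthetical: log(sigmoid(x)) tracks min(0, x), a mirrored ReLU, to within log 2, while the sigmoid's own derivative never exceeds 0.25, so a stack of sigmoids can shrink the gradient geometrically. The function names and the 10-layer depth are illustrative assumptions, not anything taken from the article or the original paper.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_grad(x):
    # d/dx sigmoid(x) = s * (1 - s); this peaks at 0.25 when x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    # ReLU passes the gradient through unchanged for positive inputs.
    return 1.0 if x > 0 else 0.0


def log_sigmoid(x):
    # Identity: log(sigmoid(x)) = min(0, x) - log(1 + exp(-|x|)).
    # The second term is at most log(2), so log(sigmoid) stays close
    # to the piecewise-linear min(0, x) = -ReLU(-x).
    return min(0.0, x) - math.log1p(math.exp(-abs(x)))


def sigmoid_chain_bound(depth):
    # Upper bound on the product of `depth` sigmoid derivatives
    # (weights ignored; a hypothetical simplification).
    return 0.25 ** depth


if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(x, log_sigmoid(x), min(0.0, x), sigmoid_grad(x), relu_grad(x))
    print("10 stacked sigmoids, gradient bound:", sigmoid_chain_bound(10))
```

The bound ignores weight matrices, which can rescale gradients in either direction, so treat it as a picture of the saturation effect rather than a statement about any particular network.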

Re-thinking fundamentals might help, but "let's change the fundamentals" is rarely how it works. Even the most seminal papers, e.g. AlexNet and "Attention Is All You Need", are refinements of existing ideas, and show how they help.

Machine learning is an experimental science. Many mathematically cool ideas do not work. Many engineering ones do.

> I've tweeted before that one of the most important traits in a researcher is healthy paranoia. Be paranoid!

I have seen so many PhDs burned out to cinders; I don't think it is any better a piece of advice than "depression is good for philosophers". Sure, be a relentless explorer.

> In short, holding on to ideas for too long can actually be counterproductive. Stay open-minded and refuse to let ego cloud your judgement.

Which I think is true.

nathaah3 11 hours ago||

This is gold!!!!
