
Posted by __rito__ 2 days ago

Auto-grading decade-old Hacker News discussions with hindsight(karpathy.bearblog.dev)
Related from yesterday: Show HN: Gemini Pro 3 imagines the HN front page 10 years from now - https://news.ycombinator.com/item?id=46205632
636 points | 261 comments
dschnurr 1 day ago|
Nice! Something must be in the air – last week I built a very similar project using the historical archive of all-in podcast episodes: https://allin-predictions.pages.dev/
sanex 1 day ago|
I'll use this as evidence supporting my continued demand for a Friedberg only spinoff.
sigmar 1 day ago||
Gotta auto grade every HN comment for how good it is at predicting stock market movement then check what the "most frequently correct" user is saying about the next 6 months.
Rychard 1 day ago||
As the saying goes, "past performance is not indicative of future results"
xpe 1 day ago||
I hope this is a joke.

Forecasting and the meta-analysis of forecasters is fairly well studied. [1] is a good place to start.

[1]: https://en.wikipedia.org/wiki/Superforecaster

sigmar 1 day ago||
> The conclusion was that superforecasters' ability to filter out "noise" played a more significant role in improving accuracy than bias reduction or the efficient extraction of information.

>In February 2023, Superforecasters made better forecasts than readers of the Financial Times on eight out of nine questions that were resolved at the end of the year.[19] In July 2024, the Financial Times reported that Superforecasters "have consistently outperformed financial markets in predicting the Fed's next move"

>In particular, a 2015 study found that key predictors of forecasting accuracy were "cognitive ability [IQ], political knowledge, and open-mindedness".[23] Superforecasters "were better at inductive reasoning, pattern detection, cognitive flexibility, and open-mindedness".

I'm really not sure what you want me to take from this article? Do you contend that everyone has the same competency at forecasting stock movements?

anshulbhide 1 day ago||
I often summarise HN comments (which are sometimes more insightful than the original article) using an LLM. Total game-changer.
SequoiaHope 1 day ago||
This is great! Now I want to run this to analyze my own comments and see how I score and whether my rhetoric has improved in quality/accuracy over time!
NooneAtAll3 1 day ago||
UX feedback: I wish clicking on a new thread scrolled right side to the top again

reading from the end isn't really useful, y'know :)

neilv 1 day ago||
> I spent a few hours browsing around and found it to be very interesting.

This seems to be the result of the exercise? No evaluation?

My concern is that, even if the exercise is only an amusing curiosity, many people will take the results more seriously than they should, and be inspired to apply the same methods to products and initiatives that adversely affect people's lives in real ways.

cootsnuck 1 day ago|
> My concern is that, even if the exercise is only an amusing curiosity, many people will take the results more seriously than they should, and be inspired to apply the same methods to products and initiatives that adversely affect people's lives in real ways.

That will most definitely happen. We've known for a while that algorithmic methods have been applied "to products and initiatives that adversely affect people's lives in real ways": https://www.scientificamerican.com/blog/roots-of-unity/revie...

I guess the question is whether LLMs will, for some reason, reinvigorate public sentiment / pressure for governing bodies to sincerely take up the ongoing responsibility of trying to lessen the unique harms that can be amplified by reckless implementation of algorithms.

godelski 1 day ago||

  > I was reminded again of my tweets that said "Be good, future LLMs are watching". You can take that in many directions, but here I want to focus on the idea that future LLMs are watching. Everything we do today might be scrutinized in great detail in the future because doing so will be "free". A lot of the ways people behave currently I think make an implicit "security by obscurity" assumption. But if intelligence really does become too cheap to meter, it will become possible to do a perfect reconstruction and synthesis of everything. LLMs are watching (or humans using them might be). Best to be good.
Can we take a second and talk about how dystopian this is? Such an outcome is not inevitable; it relies on us making it. The future is not deterministic; the future is determined by us. What's more, Karpathy has significantly more influence on that future than your average HN user.

We are doing something very *very* wrong if we are operating under the belief that this future is unavoidable. That future is simply unacceptable.

jacquesm 1 day ago||
Given the quality of the judgment I'm not worried, there is no value here.

Tossing an idea off like this, rather than properly executing it and putting in the work to make it valuable, is exactly what irritates me about a lot of AI work. You can be 900 times as productive at producing mental popcorn, but if there was value to be had here we're not getting it, just a whiff of it. Sure, fun project. But I don't feel particularly judged here. The funniest bit is the judgment on things that clearly could not yet have come to pass (for instance because there is an exact date mentioned that we have not yet reached). QA could be better.

godelski 1 day ago||
I think you're missing the actual problem.

I'm not worried about this project, but rather about harvesting and analyzing all that data and deanonymizing people.

That's exactly what Karpathy is saying. He's not being shy about it. He said "behave because the future panopticon can look into the past." Which makes the panopticon effectively exist now.

  Be good, future LLMs are watching
  ...
  or humans using them might be
That's the problem. Not the accuracy of this toy project, but the idea of monitoring everyone and their entire history.

The idea that we have to behave as if we're being actively watched by the government is literally the setting of 1984 lol. The idea that we have to behave that way now because a future government will use the Panopticon to look into the past is absolutely unhinged. You don't even know what the rules of that world will be!

Did we forget how unhinged the NSA's "harvest now, decrypt later" strategy is? Did we forget those giant data centers that were all the news talked about for a few weeks?

That's not the future I want to create, is it the one you want?

To act as if that future is unavoidable is a failure of *us*

jacquesm 1 day ago||
Yes, you are right, this is a real problem. But it really is just a variation on 'the internet never forgets', for instance in relation to teen behavior online. But AI allows for weaponization of such information. I wish the wannabe politicians of 2050 much good luck with their careers, they are going to be the most boring people available.
godelski 1 day ago||
The internet never forgets but you could be anonymous. Or at least somewhat. But that's getting harder and harder

If such a thing isn't already possible (it is to a certain extent), we are headed towards a point where your words alone will be enough to fingerprint you.

jacquesm 1 day ago||
Stylometry killed that a long time ago. There was a website, stylometry.net that coupled HN accounts based on text comparison and ranked the 10 best candidates. It was incredibly accurate and allowed id'ing a bunch of people that had gotten banned but that came back again. Based on that I would expect that anybody that has written more than a few KB of text to be id'able in the future.
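(Editor's note: stylometry.net's exact method isn't public, but one common technique it may resemble is comparing character n-gram frequency profiles with cosine similarity. A minimal sketch of that idea; the toy texts and the choice of n=3 are illustrative assumptions, not the site's actual data or algorithm:)

```python
from collections import Counter
from math import sqrt

def ngram_profile(text, n=3):
    """Character n-gram frequency profile of a text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine_similarity(a, b):
    """Cosine similarity between two n-gram profiles (0.0 .. 1.0)."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy corpus: a known account, a suspected returning user, and an unrelated account.
known = ngram_profile("I reckon the scheduler is fine; the allocator, however, is not.")
candidate = ngram_profile("I reckon the parser is fine; the lexer, however, is not.")
unrelated = ngram_profile("lol yeah totally agree with u, gonna try it 2nite!!")

# The stylistically similar account scores higher against the known profile.
print(cosine_similarity(known, candidate) > cosine_similarity(known, unrelated))
```

Real stylometry systems use far richer features (function-word rates, punctuation habits, sentence-length distributions), but even this crude profile separates writing styles on longer samples.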
godelski 1 day ago||
You need a person's text with their actual identity to pull that off. Normally that's pretty hard, especially since you'll get different formats. Like I don't write the same way on Twitter as HN. But yeah, this stuff has been advancing and I don't think it is okay.
jacquesm 1 day ago||
The AOL scandal pretty much proved that anonymity is a mirage. You may think you are anonymous but it just takes combining a few unrelated databases to de-anonymize you. HN users think they are anonymous but they're not, they drop factoids all over the place about who they are. 33 bits... it is one of my recurring favorite themes and anybody in the business of managing other people's data should be well aware of the risks.
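(Editor's note: the "33 bits" refers to the fact that about 33 bits of information suffice to single out one person among everyone alive. A back-of-the-envelope sketch; the factoid frequencies below are made-up illustrative numbers, and the sum naively assumes the factoids are independent:)

```python
from math import log2

world_population = 8_000_000_000
# Bits needed to uniquely identify one person out of everyone alive: ~33.
print(round(log2(world_population), 1))

# Each dropped factoid leaks -log2(p) bits, where p is the fraction of the
# population it matches. Fractions here are invented for illustration.
factoids = {
    "lives in the Netherlands": 18_000_000 / world_population,
    "works in tech": 0.01,
    "founded a startup in the 90s": 0.0001,
}
bits = sum(-log2(p) for p in factoids.values())
print(round(bits, 1))  # just a few casual details already burn most of the budget
```

The point of the exercise: seemingly harmless details compose, so a handful of offhand comments can narrow "anonymous" down to a handful of people.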
godelski 17 hours ago||
I think you're being too much of a conspiracy theorist here by making everything black and white.

Besides, the main problem is how difficult it is to deanonymize, not whether it's possible at all.

Neither privacy nor security has a perfect defense. For example, there are no passwords that are unhackable, only passwords that cannot be hacked within our current technology, budgets, and lifetimes. You could brute force my HN password; it would just take billions of years.
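(Editor's note: the "billions of years" figure is easy to sanity-check. A back-of-the-envelope sketch with assumed numbers: a random 16-character password over a 72-symbol alphabet, against an attacker making ten billion guesses per second:)

```python
# Assumed parameters, not a measurement of any real attack.
alphabet_size = 72          # letters, digits, common punctuation
password_length = 16        # random characters
guesses_per_second = 10_000_000_000

keyspace = alphabet_size ** password_length
seconds_per_year = 60 * 60 * 24 * 365

# On average an attacker finds the password after searching half the keyspace.
years = keyspace / (2 * guesses_per_second * seconds_per_year)
print(f"{years:.1e} years")  # on the order of hundreds of billions of years
```

Shrink the password to 8 random characters and the same math drops to hours, which is why length matters far more than cleverness.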

The same distinction is important here. My threat model on HN doesn't care if you need to spend millions of dollars or thousands of hours to deanonymize me. My handle is here to discourage that and to allow me to speak more freely about certain topics. I'm not trying to hide from nation states; I'm trying to hide from my peers in AI and tech, so I can freely discuss my opinions, which includes criticizing my own community (something I think everyone should do! Be critical of the communities we associate with). And more so, I want people to consider my points on their merit alone, not on my identity or status.

If I was trying to hide from nation states I'd do things very very differently, such as not posting on HN.

I'm not afraid of my handle being deanonymized, but I still think we should recognize the dangers of the future we are creating.

By oversimplifying you've created the position that this is a lost cause, as if we have already lost and, because we lost, cannot change. There are multiple fallacies here. The future has yet to be written.

If you really believe it is deterministic, then what is the point of anything? Of having desires or opinions? Are we just waiting to see which algorithm wins out? Or are we the algorithms playing themselves out? If it's deterministic, wouldn't you be happy if the freedom algorithm won and this moment is an inflection in your programming? I guess that's impossible to say in an objective manner, but I'd hope that's how it plays out.

jacquesm 1 hour ago||
I have enough industry insights to prove that your data is floating out there, unprotected, in plain text and that those that are not bound by the law are making very good use of it. Every breach leaks more bits about you.

This is the main driver behind the targeted scams that ordinary people now have to deal with. It is why people get voice calls from loved ones in distress, why they get 'tech support' calls that aim to take over their devices and why lots of people have lost lots of money.

If you think I am too conspiracy theorist by making everything black and white that is maybe simply because we live different lives and have different experience.

acyou 1 day ago||
I call this the "judgement day" scenario. I would be interested if there is some science fiction based on this premise.

If you believe in a God of a certain kind, you don't think that being judged for your sins is unacceptable, or even good or bad in itself; you consider it inevitable. We have already talked it over for 2000 years; people like the idea.

godelski 1 day ago||
You'll be interested in Clarke's "The Light of Other Days". Basically a wormhole where people can look back at any point in time, ending all notion of privacy.

God is different though. People like God because they believe God is fair and infallible. That is not true for machines or men. Similarly, I do not think people will like this idea. I'm sure there will be some; look at people today and their religious fervor, or look to the past. They'll want it, but it is fleeting. Cults don't last forever, even when they're governments. Sounds like a great way to start wars, and every one will be easily justified.

https://en.wikipedia.org/wiki/The_Light_of_Other_Days

mistercheph 1 day ago||
A majority don't seem to be predictions about the future, and it seems to mostly like comments that give extended air to what was then and now the consensus viewpoint, e.g. the top comment from pcwalton the highest scored user: https://news.ycombinator.com/item?id=10657401

> (Copying my comment here from Reddit /r/rust:) Just to repeat, because this was somewhat buried in the article: Servo is now a multiprocess browser, using the gaol crate for sandboxing. This adds (a) an extra layer of defense against remote code execution vulnerabilities beyond that which the Rust safety features provide; (b) a safety net in case Servo code is tricked into performing insecure actions. There are still plenty of bugs to shake out, but this is a major milestone in the project.

dw_arthur 1 day ago|
Reading this I feel the same sense of dread I get watching those highly choreographed Chinese holiday drone shows.