The work is written by western AI safety proponents, who often need to argue with important people who say we need to accelerate AI to “win against China” and don’t want us to be slowed down by worrying about safety.
From that perspective, there is value in exploring the scenario: ok, if we accept that we need to compete with China, what would that look like? Is accelerating always the right move? The article, by telling a narrative where slowing down to be careful with alignment helps the US win, tries to convince that crowd to care about alignment.
Perhaps people in China can make the same case about how alignment will help China win against the US.
Would love to read a perspective examining "what is the slowest reasonable pace of development we could expect." This feels to me like the fastest (unreasonable) trajectory we could expect.
Their research is consistent with a similar story unfolding over 8-10 years instead of 2.
That's kind of unavoidably what accelerating progress feels like.
The others include:
Eli Lifland, a superforecaster who is ranked first on RAND’s Forecasting Initiative. You can read more about him and his forecasting team here. He cofounded and advises AI Digest and co-created TextAttack, an adversarial attack framework for language models.
Jonas Vollmer, a VC at Macroscopic Ventures, which has done its own, more practical form of successful AI forecasting: they made an early stage investment in Anthropic, now worth $60 billion.
Thomas Larsen, the former executive director of the Center for AI Policy, a group which advises policymakers on both sides of the aisle.
Romeo Dean, a leader of Harvard’s AI Safety Student Team and budding expert in AI hardware.
And finally, Scott Alexander himself.
A lot of people (like the Effective Altruism cult) seem to have made a career out of selling their Sci-Fi content as policy advice.
There's hype and there's people calling bullshit. If you work from the assumption that the hype people are genuine, but the people calling bullshit can't be for real, that's how you get a bubble.
Sure, OpenAI put up with one of these safety larpers for a few years while it was part of their brand. Reasonable people can disagree on how much that counts for.
You're right that it's not a bunch of junior academics; it doesn't even rise to that level. This stuff would never pass muster in a reputable peer-reviewed academic journal, so from an academic perspective this is not even the JV team. That's why they have to found their own bizarro network of foundations and so on, to give the appearance of seriousness and legitimacy. This might fool people who aren't looking closely, but the trick does not work on real academics, nor does it work on the silent majority of those who are actually building the tech capabilities.
Which, to be fair, actually is kind of impressive if someone can make accurate predictions about the future that far ahead, but only because people are generally really bad at predicting the future.
Implicitly, when I hear "superforecaster" I think of someone who's really good at predicting the future, but deeper inspection often reveals that "the future" is constrained to the next 2 years. Beyond that they tend to be as bad as any other "futurist".
Note all these soft roles.
They are great at selling stories - they sold the story of the crypto utopia, and now they're switching their focus to AI.
This seems to be another appeal to enforce AI regulation in the name of 'AI safetyism', much like the one made 2 years ago whose predicted threats haven't really panned out.
For example, an oft-repeated argument is the dangerous ability of AI to design chemical and biological weapons. I wish some expert could weigh in on this, but I believe the ability to theorycraft pathogens that are effective in the real world is absolutely marginal - you need actual lab work and lots of physical experiments to confirm your theories.
Likewise, the danger of AI systems exfiltrating themselves onto the multi-million-dollar datacenter GPU systems everyone supposedly just has lying around is ... not super realistic.
The ability of AIs to hack computer systems is much less theoretical - however, as AIs get better at black-hat hacking, they'll get better at white-hat hacking as well, since there's literally no difference between the two other than intent.
And herein lies a crucial limitation of alignment and safetyism - sometimes there's no way to tell apart harmful and harmless actions other than whether the person undertaking them means well.
The funny part, to me, is that it won't. They'll continue to toil and move on to the next huck just as fast as they jumped on this one.
And I say this from observation. Nearly all of the people I've seen pushing AI hyper-sentience are smug about it and, coincidentally, have never built anything on their own (besides a company or organization of others).
Every single one of the rational "we're on the right path but not quite there" takes has come from seasoned engineers who at least have some hands-on experience with the underlying tech.
There are engineers with AI predictions, but you aren't reading them, because building an audience like Scott Alexander takes decades.
(That said, I agree with you. But I know I myself am biased to agree with Scott.)
This bullshit article is written for that audience.
Say bullshit enough times and people will invest.
The hubris is strong with some people, and a certain oligarch with a god complex is acting out where that can lead right now.
The only reason timelines are as short as they are is because of people at OpenAI and thereafter Anthropic deciding that "they had no choice". They had a choice, and they took the one which has chopped at the very least years off of the time we would otherwise have had to handle all of this. I can barely begin to describe the magnitude of the crime that they have committed -- and so I suggest that you consider that before propagating the same destructive lies that led us here in the first place.
Simply put, with the ever-increasing hardware speeds we were dumping out for other purposes, this day would have come sooner rather than later. We're talking about a difference of only a year or two, really.
"We have to nuke the Russians, if we don't do it first, they will"
"We have to clone humans, if we don't do it, someone else will"
"We have to annex Antarctica, if we don't do it, someone else will"
That said, this snippet from the bad ending nearly made me spit my coffee out laughing:
> There are even bioengineered human-like creatures (to humans what corgis are to wolves) sitting in office-like environments all day viewing readouts of what’s going on and excitedly approving of everything, since that satisfies some of Agent-4’s drives.
Depending on each individual's vantage point, these events might look closer or farther off than described here, but I have to agree nothing is off the table at this point.
The current coding capabilities of AI agents are hard to downplay. I can only imagine the chain reaction as this ability to create accelerates every other function.
I have to say one thing though: the scenario on this site downplays the amount of resistance that people will put up - not because they are worried about alignment, but because they are politically motivated by parties driven by their own personal motives.
There is some very careful thinking there, and I encourage people to engage with the arguments there rather than the stylized narrative derived from it.
Oh hey, it's the errant thought I had in my head this morning when I read the paper from Anthropic about CoT models lying about their thought processes.
While I'm on my soapbox, I will point out that if your goal is preservation of democracy (itself an instrumental goal for human control), then you want to decentralize and distribute as much as possible. Centralization is the path to dictatorship. A significant tension in the Slowdown ending is the fact that, while we've avoided AI coups, we've given a handful of people the ability to do a perfectly ordinary human coup, and humans are very, very good at coups.
Your best bet is smaller models that don't have as many unused weights to hide misalignment in, along with interpretability and faithful CoT research. Make a model that satisfies your safety criteria and then make sure everyone gets a copy, so subgroups of humans get no advantage from hoarding it.