48 Comments
Neo's avatar

> the general heuristic “assume we probably won’t die given that we never have before,” is a pretty good bet.

Strongly disagree – we've also never built anything close to superintelligence before. If this is why your P(doom) is low, I would reconsider it.

Other than that, great article.

Noah Birnbaum's avatar

I think I'm on Matthew's team here. It seems like part of one's prior about what happens in the world should be supported by trends, and different trends support different things (i.e., the reference class problem). One reference class, however, is that we haven't faced existential risks to humans at this level (or that existential risks, in general, are not that common). While this isn't to say it won't happen (especially if there are other extremely weird factors to take into account), it surely counts as some evidence for the proposition that we probably won't die.

Michael Dickens's avatar

I agree that *part* of your belief should be based on historical trends. But there are strong inside-view reasons to be concerned about AI risk, to the point that I don't think it's reasonable to have a single-digit P(doom). Historical trends aren't strongly informative because no technology like AI has ever existed before.

Neo's avatar

While the absolute number of x-risks is low, the number seems to be trending upward over time as technology progresses (climate change, nukes, bio-engineered pandemics), and this seems like stronger evidence to me.

Noah Birnbaum's avatar

Seems like we haven't had any reason to believe that x-risks have gone up over time (I also don't think that climate change or nukes will be x-risks; probably, at worst, catastrophic, but that's a side point). Interesting that you think the trend is more evidential than the general reference class -- I probably lean the other way (but haven't put much thought into it, so my mind can be easily changed). Why do you believe it is that way?

Neo's avatar

I actually agree that the risks I mentioned are very unlikely to cause *total* extinction, but it is very easy to imagine how they theoretically could.

But, all of these risks have emerged (over the last century or so) at the hands of human-level intelligence. It's easy to imagine how an AI that is able to improve its own intelligence and think at a million times per second could make way more dangerous technology.

I see this situation as a graph of potential x-risks over time, where the current number is very low, but the evidence strongly suggests that we should expect a huge jump right ahead of us.

John M's avatar

I don't think the heuristic "assume we probably won't die given that we never have before" is a very good one, because if you are going to die in the future, you necessarily haven't died yet. So not having died yet doesn't actually give you any information about how likely you are to die in the future.

Bentham's Bulldog's avatar

I don't think this is relevant. If lethal risks were common, it would be unlikely we'd be around today.

Imagine I have a theory that my wife might be feeding me poison with a 99.999% fatality rate. If I survive for a year, it seems I have very good evidence against this theory. This is so even though I couldn't observe myself not surviving.
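
To make the update explicit, here's a minimal sketch of the Bayes calculation. The 99.999% fatality rate is from the example above; the 50% prior on the poisoning theory is purely illustrative and assumed for the sketch:

```python
# Minimal sketch of the Bayesian update in the poisoning example.
# The 50% prior is illustrative; the 99.999% fatality rate comes from the example above.
def posterior_poison(prior, p_survive_if_poisoned=1e-5, p_survive_if_not=1.0):
    """Return P(poisoning theory | survived the year) via Bayes' theorem."""
    numerator = prior * p_survive_if_poisoned
    denominator = numerator + (1 - prior) * p_survive_if_not
    return numerator / denominator

# Even a 50% prior collapses to roughly 0.001% after a year of survival.
print(posterior_poison(0.5))  # ~1e-5
```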

John M's avatar

You could argue that the fact that we're still around today means that it's probably pretty hard to make a species, especially one like humans, go extinct. Still, we know that extinction events do happen, and the fact that one hasn't happened to humans yet could just be anthropic bias. In which case, the extent to which you can use that fact to predict the future is limited. You're probably more knowledgeable than me about this stuff, so perhaps you have a reason why it actually is useful, but to me, it doesn't seem like it offers useful information.

In your example, the fact that you've survived for a year is good evidence that your wife hasn't poisoned you yet, but it isn't good evidence against your wife potentially poisoning you in the future.

Bentham's Bulldog's avatar

I was just responding to the anthropic point, not the inductive one.

Sei's avatar

It's good evidence that, if things continue as they always have, you probably won't die next year.

If you have evidence that things are changing - say, you see your wife order research chemicals off the internet and mix them into your food when she's never done it before - the prior evidence isn't nearly as useful.

Neo's avatar
Jun 5 (edited)

In your example you have good evidence that your wife isn't *currently* feeding you poison – so in our world we have good evidence that dangerous AI systems don't currently exist.

Of course, the fact that your wife is currently not poisoning you is probably also evidence that she won't in the future.

But I think this changes when you consider rapid tech progress which likely makes our future look wildly different.

Bentham's Bulldog's avatar

I was addressing the anthropic point, not the inductive point. I agree the induction is a bit dicey, but I still think it has some weight. Past incorrect predictions mean that humans have a general tendency to overestimate dangers and forecast poorly.

Robert Long's avatar

"So I’d encourage you: give some money to places like the long-term future fund or EleosAI’s research on digital sentience." On behalf of Eleos, I wanted to say thank you so much for the shoutout!

Bentham's Bulldog's avatar

Thanks for the awesome work you do!

Robert Long's avatar

pretty unrelated, but I've never had occasion or reason to tell you this bit of substack trivia:

I went to the same school as Gavin Ortlund, and his dad Ray Ortlund was my church's pastor (the school is affiliated with the church; conservative Presbyterian).

So when you started engaging with him—I admire you both for how cordial and interesting those convos are, btw—it was a funny "worlds collide" moment for me. Now even more so, with Daniel Kokotajlo appearing in the thumbnail for Gavin's most recent video. It's a small world. (And so I guess this comment isn't, after all, that unrelated.)

Bentham's Bulldog's avatar

Wow!

Petrus's avatar

Good work! Congrats on 1000 posts. AI is scary.

I'll admit, however, that I'm a little bit skeptical about AI becoming conscious. As far as I understand, we are in agreement that consciousness is not reducible to the physical. The mind is fundamentally immaterial, not purely material. I suppose you could imagine AI becoming conscious even on this view, but it seems very implausible to me, since AI is a purely physical creation of mankind. But maybe I'm missing something. What are your thoughts?

Bentham's Bulldog's avatar

Thanks!

So first of all, the argument for AI risk doesn't assume AI will be conscious.

Second of all, I think AI could very well be conscious. Brains are also physical things but they give rise to consciousness. There are some laws by which physical things give rise to consciousness. It would seem weird if those laws were about the kind of material involved (so only stuff made out of carbon could be conscious). It's not impossible but it strikes me as pretty unlikely.

Petrus's avatar

Ah okay, that makes more sense now.

JoA's avatar

Would love to say hi at EAG and talk about AI's potential transformative impact on the future! Beyond digital minds, power grabs, and overall doom, there are also various colors of 'extinction' that would imply very different scenarios concerning the future of suffering on earth, something which I try to discuss here: https://forum.effectivealtruism.org/posts/gRnow2c4J93YuprwS/human-extinction-s-impact-on-non-human-animals-remains

I think AI is the one thing people who care about reducing suffering / improving the world should think about, but it's also so insanely hard to know if anyone can get anything "right" in that field. Anthony DiGiovanni discussed this recently: https://forum.effectivealtruism.org/posts/a3hnfA9EnYm9bssTZ/1-the-challenge-of-unawareness-for-impartial-altruist-action-1#Case_study__Severe_unawareness_in_AI_safety

PhilosophyNut's avatar

Every BB post: “Most people think X is benign. But in reality, X is the worst thing ever — even worse than the self-sampling assumption! This claim might seem crazy, but it directly follows from the assumption that suffering is bad. Remember, true moral claims often have surprising implications! To help prevent X, donate here, here, here, and here.”

Richard Y Chappell's avatar

And somehow he's right every time...

TheTechnoLiberal's avatar

What are the chances! I just wrote about AI as my first post at the same time you posted this, and my post is ALSO about AI 2027! https://thetechnoliberal.substack.com/p/a-third-ai-scenario

My take generally is we're gonna be screwed; my P(doom) is like 80%, but my P(super doom) is like 5%. And personally, I'd much much much much much much much much rather die a painful death over a few months than face what I wrote as a possible future of AI.

Dan Hooley's avatar

Why are you more optimistic than people who think the probability of doom is more like 10-30%? Is it moral realism? Do you reject the orthogonality thesis?

Russell Huang's avatar

You know, I have been wondering if theism gives you more sanguinity on this topic. After all, if one feels it is likely God loves us, then presumably there is a high chance God does not want us to be extinguished and replaced by AI.

Manuel del Rio's avatar

This was a rather non-confrontational and non-edgelordy take of yours, and I say that as a compliment. Even more strangely, I find myself mostly in agreement with you.

I've been trying to read a lot more about AI and consciousness of late (for example, Christian's The Alignment Problem, and Seth's Being You), but I am a complete ignoramus in these areas, so anything I could say is just poorly based speculation. Still, I'll be bold enough to contribute my 2¢.

First thing to say is that, for an outsider, the whole AI apocalyptic take feels like a weird Pascal's Mugging scenario built on a really tall tower of highly speculative ifs, each of them assigned plausibly low and really subjective guesses ('Bayesian priors') but which, when multiplied by the worst scenarios, yield absurdly large numbers. This makes sense from Utilitarian and Rationalist axioms, I guess, but much less so if you just deal with Pascal Muggings in a Gordian knot sort of way (i.e., whatever the imagined outcomes, any collection of low enough probabilities should be collapsed to 0). But I digress.

Let's talk about the core argument: are we building things that are smarter than us? What concept of intelligence is being used? That they are better at doing task x than any human expert? Is it reasonable to assume that these entities have any goals at all, or that they could plausibly have them? At least for LLMs, I find it really difficult to believe any of these. Like, stochastically parroting tokens given big training data can be impressive, but it doesn't feel like intelligence at all, at least not in the problem-solving, creative aspect of it. It just seems to be in a permanent state of hallucinating probabilistically possible answers and getting it right (as in plausible, not truthful) most of the time. Do they have goals? I mean, I imagine they exhibit goal-like behavior only as an emergent property of training on goal-laden human data.

But anyway, assume we're a few technological breakthroughs from something that *does* have goals and *does* have something like our intelligence, only greater (let's also forget for a moment that human intelligence really gives us the edge *as a big, intelligent group*, and not as individuals; a single, crazily smart human thrown into a real Robinson Crusoe scenario would, with high certainty, do terribly). From what I understand, the Rationalist argument would start from the idea that human intelligence is an arbitrary cut-off and that an AGI that overtakes us could easily get onto a fast, exponential self-improvement track. If you add the Orthogonality thesis, this would mean it would be misaligned by default and would destroy us as an aftereffect of pursuing its goals. I can agree with the cut-off part, but am not at all sure about the foom argument: LLMs already require absurd amounts of resources for (relatively) small improvements. It seems exponential growth can't be sustained for long (perhaps for long enough?). Same with all the resources needed. One could also be skeptical of the Orthogonality thesis, and consider it more likely that AIs trained on human input would not end up cooking up completely random and non-human goals.

You present a good narrative of AI improvement (personally, I was really surprised by ChatGPT and the things it can do), but I feel another, equally convincing narrative can be cooked up: one that looks at the field of AI as one of discrete jumps and starts, followed by long winters. And again, even if we were on the eve of radical take-off, one could ask how long such a process could last.

The consciousness part is another, completely different but equally difficult topic, I guess. Here I know and have read even less, but Seth (a recent read of mine, as I stated already) seems really skeptical. I don't know if there's a consensus in the field, but if his book is any decent approximation, it would seem that the consensus is closer to 'no'.

As you say, there are other issues and dangers even if conscious AGI is off the table, and they are worth worrying about. MacAskill's recent podcast and article on this is something I found illuminating and food for thought, not only because of its contents, but because of the high probability he's now giving to some AGI singularity any time soon.

TheKoopaKing's avatar

@grok Is this true? And please proceed to establish White House policy based on your findings

Uncertain Eric's avatar

Whether or not AI kills everyone, one thing’s certain:

shit's gonna get so fucking weird and terrible.

https://sonderuncertainly.substack.com/p/shits-gonna-get-so-fucking-weird

Philip's avatar

I'm not a chess expert, but I think a human could draw the best chess AI with sufficient preparation. Presumably you could prepare a few lines 50+ moves deep with computer assistance, then keep playing your opening with the white pieces, and wait until the AI opponent plays into your preparation in one of the 500,000 games.

Bentham's Bulldog's avatar

I don't think that's right.

blank's avatar

Your proposed methods of AI safetyism are actually horrible in practice at enacting any sort of AI safety or slowdown: https://www.palladiummag.com/2025/01/31/the-failed-strategy-of-artificial-intelligence-doomers/

Ari's avatar
Jun 5 (edited)

I realize this is mostly beside the point, but I take issue with saying there is a 5% chance of current LLMs being conscious. If LLMs implemented their matrix multiplication via pressure tubes and actuators (which can sum and multiply), yielding exactly the same text (via a gargantuan and unwieldy mechanical device), would you still say they have a 5% chance of being conscious?

I wrote a post about this since I realize it's too long for a comment: https://aril2.substack.com/p/consciousness-is-very-strange

...'s avatar

Good post. I agree with the risk of AI in general, but I think the longer term risk of superintelligence is overshadowed by the medium term risk of superpandering. As tech strips away the friction from life, our connections to each other evaporate and meaninglessness skyrockets. (You have been tricked into reading a vibes post, btw).
