What If Most Longtermists Are Wrong About The Primary Aim?
The report that most changed my worldview
1 The case for flourishing
I have gotten one-shotted by every Forethought (for whom I now work) report I’ve ever read. The first one I read was Preparing For The Intelligence Explosion, which very quickly convinced me that AI was likely to be a big deal soon, and that this has totally worldview-upending implications. The second one was Better Futures, which pretty dramatically changed my mind on what the best thing to do was. The third one made the case for the realistic possibility of AI-enabled coups, which I agreed with, but it didn’t radically shatter my worldview, because my pre-existing worldview did not depend on the assumption that AIs wouldn’t be useful for coups (that would be a sort of weird pillar of a worldview).
In this article, I’m going to talk about the Better Futures series. I think the ideas in it are both:
Hard to deny after you think about them.
Not obviously the sorts of things you’d think about.
Hugely important.
(I know I said both and then listed three things. Well fuck tha (grammar) police).
The core thesis of Better Futures is that we should be working more on trying to make good futures better. At the margin, that is a better thing to pursue than reducing existential risks. We should promote flourishing, not just survival. The core argument for this is very simple: most future value is lost from failure to get near-best futures, and yet almost no one is working to get a near-best future. If something is the source of the majority of lost future value and yet only like twelve people are working on it, it’s probably a pretty good thing to work on.
First, why think it’s where most value is lost? The answer is that value is fragile. Only a small fraction of futures are near-best. There’s no inevitable force that guarantees a really good future. Thus, we should be surprised to get a near-best future for the same reason we’d be surprised by any highly-specific future.
There’s also an inductive point: no society in history has been near-optimal. For most of history, people owned slaves, and this left society worse than it might have otherwise been (Source???). We’ve made giant torture farms where we mistreat hundreds of billions of animals. We’re not anywhere near trying to maximize value. So absent some highly-specific force driving things in the direction of a near-best future, we shouldn’t expect anything to be near-optimal by default.
Now, this isn’t totally hopeless. One could imagine the world, after getting advanced AI, seriously reflecting on how to make things really good and deferring to the superintelligent AI moral philosophers. But this is very far from a guarantee.
Worlds that aren’t near-best, by default, lose out on most value. Most moral theories imply that if we aren’t consciously optimizing for what they say is most important, we probably won’t get it in anything like the amounts we could. Losing out on half of future value because we only use half of space resources optimally incurs as much expected value loss as a 1/2 chance of extinction (assuming we’d otherwise be guaranteed a near-best future; if we wouldn’t, losing half of future value is even worse by comparison).
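To make the arithmetic behind that comparison explicit, here’s a minimal sketch (my own illustration, using my own normalization: a near-best future is worth 1 and extinction is worth 0):

```python
# Minimal sketch (my normalization): a near-best future is worth 1, extinction is worth 0.
v_near_best = 1.0

# Scenario A: we definitely survive, but only use half of space resources optimally,
# so we realize half the available value.
ev_half_optimal = 0.5 * v_near_best                     # = 0.5

# Scenario B: a 1/2 chance of extinction, otherwise a guaranteed near-best future.
ev_coinflip_extinction = 0.5 * 0.0 + 0.5 * v_near_best  # = 0.5

# Both scenarios lose the same expected value (0.5) relative to a guaranteed near-best future.
print(ev_half_optimal, ev_coinflip_extinction)          # 0.5 0.5
```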
For example, suppose that it’s good to create happy people. But suppose additionally that we won’t optimally use space resources for this noble goal—that we won’t use them to create as many happy people as we can. Then most future value is going to be lost! This would be a catastrophe vastly worse, by many orders of magnitude, than all bad things that have ever happened.
Or suppose that digital beings matter a lot morally. In the future, there’s no guarantee that we’ll take their interests seriously. We might create something like Omelas in reverse: a society built off the backs of vast numbers of suffering, intelligent digital beings. This would be bad by the lights of utilitarianism, deontology, and common sense.
Or we might have the wrong view about what makes a life go well. We might think what really matters is preference satisfaction, and spend space resources promoting that, even if there’s more to life than satisfying preferences. Or there might be clever ways to use space resources to create enormous amounts of value, and we might fail to use them for that purpose.
And there are many other ways to fall short. In order to have a near-best future, it isn’t enough to get some things right. We need to get the answer to every important moral question right. And that’s genuinely difficult.
Now, you might wonder: won’t we just defer to the superintelligent AIs? They should be able to do philosophy better than us, right? I agree there’s some chance of this (which is why I think the odds are near one in ten we get a near-best future) but it’s far from a guarantee.
In the real world, do we defer to moral philosophers before doing high-stakes things? Mostly we just don’t think about the surprising moral implications of our actions. There was no moral consultation before the first factory farm was built. If we consulted the superintelligent AI and it told us that meat-eating was morally terrible or that we should convert space resources into spamming happy digital minds, do you think the world would listen? Certainly doesn’t seem like a guarantee.
And in any case, it seems possible that we’ll settle space and take irreversible actions before we have the sorts of AIs that can solve crucial ethical questions.
This conclusion gets more dramatic the more you think that existential risks are low but that the odds of a near-best future are also low. For a simple model, let’s imagine that there’s a 10% chance of a near-best future given that we don’t go extinct, and a 10% chance that we go extinct. Let’s also assume that non-near-best futures are 1% as good as near-best futures.
On this model, the expected value loss from the 10% risk of extinction would be equivalent to the expected value loss of lowering the odds of a near-best future (conditional on survival) by only about 1.2 percentage points. In other words, if we’re likely to miss out on a near-best future by default, existential risk reduction efforts go down in value, because even if successful, they don’t guarantee a near-best future. Thus, it seems likely that guaranteeing a near-best future conditional on survival would bring about way more value than guaranteeing survival.
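Here’s a minimal sketch of that toy model in Python (my own illustration, not taken from the report), showing where the roughly 1.2-percentage-point equivalence comes from:

```python
# Toy model from the paragraph above: normalize a near-best future to 1, extinction to 0.
p_extinct = 0.10               # chance we go extinct
p_best_given_survival = 0.10   # chance of a near-best future, given survival
v_other = 0.01                 # non-near-best futures are 1% as good as near-best ones

# Expected value of the future conditional on survival.
ev_given_survival = p_best_given_survival * 1.0 + (1 - p_best_given_survival) * v_other  # = 0.109

# Expected value lost to the 10% extinction risk.
loss_from_extinction_risk = p_extinct * ev_given_survival  # ≈ 0.011

# How big a drop in P(near-best | survival) produces the same expected loss?
# Each unit of near-best probability (given survival) is worth (1 - v_other) * (1 - p_extinct).
equivalent_drop = loss_from_extinction_risk / ((1 - v_other) * (1 - p_extinct))

print(round(loss_from_extinction_risk, 4))  # 0.0109
print(round(equivalent_drop, 4))            # 0.0122 -> about 1.2 percentage points
```

By contrast, guaranteeing a near-best future conditional on survival would be worth about 0.8 on the same scale, which is why flourishing dominates on this model.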
So all this is to say: of the expected future value we lose out on, most is lost from failure to get near-best futures. A smaller fraction is lost to extinction risk. And yet despite this, the number of people specifically working on boosting the odds that we get a near-best future in the billions and trillions of years to come is around ten.
Note: other things people are doing often have the effect of increasing the odds of getting a near-best future. E.g., campaigning for justice might increase the odds of a just future. But there are very few people specifically aiming at securing a near-best future in the long term.
There are more detailed discussions to be had about precisely which actions can boost the odds of a near-best future, and there are some reasons for skepticism. But in general, if only around 10 people are working on the source of most of the world’s lost future value, that seems like an obviously good thing to work on. As an analogy, if there were a button which, when pressed, cut the value of the world in half, it would seem like more than nine people should be looking into ways to reverse the button’s effects. Even if you were pretty sure that the button’s impacts were irreversible, shouldn’t there be at least, like, 100 people working on seeing if anything could be done about the thing that destroys half the value of the world?
Alternatively, if there were an alien monster planning to swallow half the world—and to continue swallowing at regular intervals so that in total half of future value got swallowed—it seems like someone should look into stopping it.
2 Will our actions wash out?
Imagine cave men trying to promote flourishing. They reason that near-best futures are vastly better than non-near-best futures, and so they set off trying to steer the direction of the world so that it goes better. This would be a waste of time. Nothing that cave men could do would affect the long-term future.
One reason for skepticism about promoting flourishing is analogous. Perhaps we are like cave men, in that nothing we can do can affect how the far future goes. This was part of the standard Longtermist reply to cluelessness: sure we can’t really predict most long-term effects of our actions, but we can be pretty sure that if the world gets destroyed, it will have lower value than it would have otherwise, so we should work on making sure it isn’t destroyed.
In my view, this largely comes down to the question of whether we might soon enter a state that will likely persist into the far future. Some examples of possible states like this, just to give the idea:
Imagine that, absent some specific historical process, feudalism would have been expected to persist forever. Bringing about that process would be very important, because a feudalist world has much lower expected value than a non-feudalist world.
If we go extinct then we’ll persist in that state forever, and the value of the future will be vastly diminished.
If a stable global totalitarian regime takes over, a bit like the one envisioned in 1984, it might persist forever.
So the question is: are there states the world might enter that would permanently diminish future value? Alternatively, are there ones we could enter that would majorly increase future value? The paper Persistent Path Dependence argues that there are. Some examples:
Through the intelligence explosion, we might expect one single entity, or a small number of entities, to get the lion’s share of global power. Such an entity would be able to shape how the future goes. They could set out and begin taking over space resources in pursuit of their long-term aim. Or, if they were a totalitarian regime, they could ensure that the leader’s rule endures forever.
Even without a single hegemon, multiple powers who together share a large portion of global power could coordinate to take over.
There are various mechanisms by which the future might have vastly greater ability to lock in some state than the past:
A major source of change is old leaders dying. But digital beings could live forever. In theory, a digital being could be made to carry out the will of the leader even after he dies.
AGI could be used to enforce some plan for a very long period of time.
With the ability to create digital beings, we’ll have much greater ability to mold the aims of our descendants. This increases the odds that the ideology of the present will persist into the future.
Overall, I think it’s reasonably likely that we’ll lock in the future sometime soon. The biggest reason is as follows: I think odds are good that we’ll get very powerful AI soon. At some point, either we’ll venture out into space and begin transforming the world in whichever way we choose, or we’ll get some set of institutions that prevents that from ever happening. If we begin venturing out and transforming space, then I expect it to be hard to put the genie back in the bottle: I expect such a process to continue without much change for billions of years. Alternatively, if we lock in a mechanism to block space development, that choice would also be irreversible.
At some point, the world will decide the future plan for the universe (note: this doesn’t require an explicit global plan; if there’s no explicit global regulatory plan, then the plan in practice is “let the universe develop according to the whims of private actors”). I think it’s reasonably likely such a future plan will be decided soon, which makes now a critical period for making the future go better.
3 How can we make the future flourish?
Okay, so maybe some of the actions we’ll take near-term could affect the value of the far future. But what can we do practically?
I think there’s a decent chance we can make progress just because of how few people are working on the problem. Just as you could have had a big impact by being one of the few thinkers planning early-stage nuclear strategy, just as Locke had a big impact by being one of a small number of thinkers writing about how democracies could develop, and just as Bostrom’s Superintelligence hugely impacted the AI conversation, neglectedness might make progress easy. There is still low-hanging fruit to pick.
Some suggestions:
Writing about which institutions would be valuable in a post-AGI world seems important. For example, Will MacAskill’s paper about aiming for Viatopia (a world that puts us in a position to secure a near-best future, rather than a specific vision of a near-best future) seems pretty valuable. Similarly, we should get people thinking about which values to give to an AI and which norms we should have for governing space.
Further research about how to promote flourishing seems valuable. At this point, the terrain is so underexplored that lots of progress seems possible.
One of the sub-reports in the Better Futures series was titled How to Make the Future Better. It discussed concrete actions to promote flourishing. The suggestions were:
Working to prevent a post-AGI autocracy. Specific actions to promote this include: 1) working to stop AI-enabled coups; 2) helping preserve the democratic structure of current democracies (e.g. working to prevent the U.S. from going authoritarian); 3) slowing down the AGI development of autocracies (e.g. by blocking chip sales to China).
Working to improve space governance. Space is important because it could allow one actor to seize control early, and because it holds almost all of the resources. Working to prevent near-term space grabs, and to develop longer-term regimes for flourishing space governance, seems high-value. For example, it might be worth requiring that some portion of space resources be used for a high-value purpose.
Law could require sunset clauses on certain kinds of commitments made between countries. This would prevent the U.S. and China from locking in their current priorities by having them indefinitely enforced by AI.
Slowing the intelligence explosion could give us more time to prepare and lower the odds of autocrats seizing the reins.
Regulating AI to make sure that it’s safe, and to prohibit any one actor from seizing control, seems important.
Giving AIs robustly good moral values and preventing their scheming also seems valuable. If AIs will control much of how the future goes, increasing the odds that AIs with good values are the ones being used to make important decisions seems very high-impact.
Working on securing rights for AIs would be important because, in expectation, almost all future beings will be digital. This lowers the odds that we’ll have very large populations of mistreated and disenfranchised AIs. Right now, few people are thinking about AI rights, so this seems like an area in which lots of progress could be made.
Efforts to better integrate the will of the public could be valuable for preventing most people from being disenfranchised.
Working to prevent sub-extinction catastrophes could lead to a more stable world, thus making the intelligence explosion go well.
Using AIs to make important decisions would improve our ability to navigate hard tradeoffs. Given that it’s unlikely we’d get a near-best future from human ingenuity alone and that smart decision making improves our ability to navigate the myriad challenges arising in the near future, integrating AIs for making important decisions seems valuable.
Empowering morally responsible actors—e.g. imposing the kinds of regulations that benefit safety-conscious AI companies but penalize those whose AIs temporarily become mecha-Hitler—seems to raise the odds things go well, by shifting power into the hands of good decision-makers.
Given how hugely neglected many of these things are, it seems plausible that lots of progress can be made. I wouldn’t be surprised if a concerted research effort could meaningfully shift the odds that the world gets a near-best future, perhaps even by a few percentage points.


"Or suppose that digital beings matter a lot morally. In the future, there’s no guarantee that we’ll take their interests seriously. We might create an Omelas-like society built off the backs of suffering, intelligent, digital beings."
The reference here doesn't really make sense. In the original Omelas story, there was a single child being tortured for the benefit of everyone else in society. (Which is totally fine by utilitarian standards.) Whereas you're imagining a sort of reverse-Omelas society, where the vast majority of beings (digital minds) are tortured for the benefit of a tiny minority of beings (natural / human minds).
Otherwise, your point is correct.
The word 'climate' nowhere appears. AI nerds are funny.