Eliezer Yudkowsky Is Wrong About Zombies
The zombie argument has more going for it than he assumes
I enjoy much of what Eliezer Yudkowsky says. He has done a great deal to raise worries about AI alignment, writes tons of interesting posts on LessWrong, wrote the epic HPMOR, and has shaped my thinking in many ways. However, Yudkowsky is, as the title hints, wrong about zombies.
A zombie is a being physically identical to a conscious being in every way, minus the consciousness. The important thing to note is that, if consciousness is causally efficacious, the zombie would have something else filling the causal roles that consciousness plays in the person.
Yudkowsky writes
Your "zombie", in the philosophical usage of the term, is putatively a being that is exactly like you in every respect—identical behavior, identical speech, identical brain; every atom and quark in exactly the same position, moving according to the same causal laws of motion—except that your zombie is not conscious.
It is furthermore claimed that if zombies are "possible" (a term over which battles are still being fought), then, purely from our knowledge of this "possibility", we can deduce a priori that consciousness is extra-physical, in a sense to be described below; the standard term for this position is "epiphenomenalism".
Note that when we use "possibility" here, we mean metaphysical possibility, not physical possibility. So the question is whether there is a possible world that is atom-for-atom identical to this world but that lacks consciousness. Everything that consciousness does in this world would be done by other laws that are functionally identical to consciousness, but that don't involve any experiences.
Eliezer's claim that this view is epiphenomenalism is false. Epiphenomenalism says consciousness doesn't cause anything. One can accept the possibility of zombies without being an epiphenomenalist, because the zombie world could have something else do what your consciousness does in this world.
(For those unfamiliar with zombies, I emphasize that this is not a strawman. See, for example, the SEP entry on Zombies. The "possibility" of zombies is accepted by a substantial fraction, possibly a majority, of academic philosophers of consciousness.)
But it is a strawman!! The zombie argument doesn't entail epiphenomenalism. It's often made by interactionist dualists, panpsychists, and idealists. It's frustrating that Eliezer strawmans the argument while specifically insisting that he isn't strawmanning it. I'm not suggesting bad faith here; it's just a bit frustrating.
When you open a refrigerator and find that the orange juice is gone, you think "Darn, I'm out of orange juice." The sound of these words is probably represented in your auditory cortex, as though you'd heard someone else say it. (Why do I think this? Because native Chinese speakers can remember longer digit sequences than English-speakers. Chinese digits are all single syllables, and so Chinese speakers can remember around ten digits, versus the famous "seven plus or minus two" for English speakers. There appears to be a loop of repeating sounds back to yourself, a size limit on working memory in the auditory cortex, which is genuinely phoneme-based.)
Let's suppose the above is correct; as a postulate, it should certainly present no problem for advocates of zombies. Even if humans are not like this, it seems easy enough to imagine an AI constructed this way (and imaginability is what the zombie argument is all about). It's not only conceivable in principle, but quite possible in the next couple of decades, that surgeons will lay a network of neural taps over someone's auditory cortex and read out their internal narrative. (Researchers have already tapped the lateral geniculate nucleus of a cat and reconstructed recognizable visual inputs.)
So your zombie, being physically identical to you down to the last atom, will open the refrigerator and form auditory cortical patterns for the phonemes "Darn, I'm out of orange juice". On this point, epiphenomenalists would willingly agree.
But, says the epiphenomenalist, in the zombie there is no one inside to hear; the inner listener is missing. The internal narrative is spoken, but unheard. You are not the one who speaks your thoughts, you are the one who hears them.
If we look inside the brain, what we see happening involves the flow of electric signals from your brain to the muscles in your arm, resulting in the refrigerator opening. The point of the zombie argument is that you could imagine a world where all of that goes on in exactly the same way—it looks precisely the same from the outside in terms of the movement of all of the atoms—but you are not conscious when it goes on.
I'm not an epiphenomenalist (my credence in it is around 10%), but epiphenomenalists can explain this. If consciousness is just what it feels like for the brain to do things, then it will feel as though your consciousness is causing your actions, even though it really just is what it feels like for the brain to do them.
The Zombie Argument is that if the Zombie World is possible—not necessarily physically possible in our universe, just "possible in theory", or "imaginable", or something along those lines—then consciousness must be extra-physical, something over and above mere atoms. Why? Because even if you somehow knew the positions of all the atoms in the universe, you would still have to be told, as a separate and additional fact, that people were conscious—that they had inner listeners—that we were not in the Zombie World, as seems possible.
Zombie-ism is not the same as dualism. Descartes thought there was a body-substance and a wholly different kind of mind-substance, but Descartes also thought that the mind-substance was a causally active principle, interacting with the body-substance, controlling our speech and behavior. Subtracting out the mind-substance from the human would leave a traditional zombie, of the lurching and groaning sort.
This is false. When defining the main views in the philosophy of mind, Chalmers writes
10 Type-E Dualism
Type-E dualism holds that phenomenal properties are ontologically distinct from physical properties, and that the phenomenal has no effect on the physical.[*] This is the view usually known as epiphenomenalism (hence type-E): physical states cause phenomenal states, but not vice versa. On this view, psychophysical laws run in one direction only, from physical to phenomenal. The view is naturally combined with the view that the physical realm is causally closed: this further claim is not essential to type-E dualism, but it provides much of the motivation for the view.
Obviously epiphenomenalism is different from Descartes' dualism. Descartes was a substance dualist and an interactionist. But these extra views aren't required for dualism. Zombie-ism, as Eliezer calls it, can be dualist or panpsychist—it just has to reject physicalism.
Something will seem possible—will seem "conceptually possible" or "imaginable"—if you can consider the collection of statements without seeing a contradiction. But it is, in general, a very hard problem to see contradictions or to find a full specific model! If you limit yourself to simple Boolean propositions of the form ((A or B or C) and (B or ~C or D) and (D or ~A or ~C) ...), conjunctions of disjunctions of three variables, then this is a very famous problem called 3-SAT, which is one of the first problems ever to be proven NP-complete.
So just because you don't see a contradiction in the Zombie World at first glance, it doesn't mean that no contradiction is there. It's like not seeing a contradiction in the Riemann Hypothesis at first glance. From conceptual possibility ("I don't see a problem") to logical possibility in the full technical sense, is a very great leap. It's easy to make it an NP-complete leap, and with first-order theories you can make it arbitrarily hard to compute even for finite questions. And it's logical possibility of the Zombie World, not conceptual possibility, that is needed to suppose that a logically omniscient mind could know the positions of all the atoms in the universe, and yet need to be told as an additional non-entailed fact that we have inner listeners.
Just because you don't see a contradiction yet, is no guarantee that you won't see a contradiction in another 30 seconds. "All odd numbers are prime. Proof: 3 is prime, 5 is prime, 7 is prime..."
This is of course true. The question for zombies isn't just whether we can imagine them—I can imagine Fermat's Last Theorem being false, but it isn't—but whether it's metaphysically possible that they exist.
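To make the 3-SAT point concrete, here is a minimal, purely illustrative sketch (my own toy example in Python; nothing like it appears in Eliezer's post or the zombie literature). Glancing at a few clauses and "not seeing a contradiction" is cheap; certifying that some assignment really satisfies them all, or that none does, means in the worst case known today grinding through exponentially many assignments.

```python
from itertools import product

# A 3-SAT formula: a conjunction (AND) of clauses, each a disjunction (OR)
# of three literals. A literal is a pair (variable_name, wanted_truth_value).
def satisfiable(clauses, variables):
    """Brute force: try every truth assignment until one satisfies all clauses."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[var] == want for var, want in clause) for clause in clauses):
            return True   # found a model: no contradiction anywhere
    return False          # every one of the 2^n assignments breaks some clause

# (A or B or ~C) and (~A or C or ~B) and (~B or ~C or A)
clauses = [
    [("A", True), ("B", True), ("C", False)],
    [("A", False), ("C", True), ("B", False)],
    [("B", False), ("C", False), ("A", True)],
]
print(satisfiable(clauses, ["A", "B", "C"]))  # True: this little formula has a model
```

The gap between "looks consistent at a glance" and "provably has a consistent model" is exactly the gap Eliezer is pointing to, and the zombie-ist needs the latter, stronger claim.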
So let us ponder the Zombie Argument a little longer: Can we think of a counterexample to the assertion "Consciousness has no third-party-detectable causal impact on the world"?
If you close your eyes and concentrate on your inward awareness, you will begin to form thoughts, in your internal narrative, that go along the lines of "I am aware" and "My awareness is separate from my thoughts" and "I am not the one who speaks my thoughts, but the one who hears them" and "My stream of consciousness is not my consciousness" and "It seems like there is a part of me which I can imagine being eliminated without changing my outward behavior."
You can even say these sentences out loud, as you meditate. In principle, someone with a super-fMRI could probably read the phonemes out of your auditory cortex; but saying it out loud removes all doubt about whether you have entered the realms of testability and physical consequences.
This certainly seems like the inner listener is being caught in the act of listening by whatever part of you writes the internal narrative and flaps your tongue.
Imagine that a mysterious race of aliens visit you, and leave you a mysterious black box as a gift. You try poking and prodding the black box, but (as far as you can tell) you never succeed in eliciting a reaction. You can't make the black box produce gold coins or answer questions. So you conclude that the black box is causally inactive: "For all X, the black box doesn't do X." The black box is an effect, but not a cause; epiphenomenal; without causal potency. In your mind, you test this general hypothesis to see if it is true in some trial cases, and it seems to be true—"Does the black box turn lead to gold? No. Does the black box boil water? No."
But you can see the black box; it absorbs light, and weighs heavy in your hand. This, too, is part of the dance of causality. If the black box were wholly outside the causal universe, you couldn't see it; you would have no way to know it existed; you could not say, "Thanks for the black box." You didn't think of this counterexample, when you formulated the general rule: "All X: Black box doesn't do X". But it was there all along.
(Actually, the aliens left you another black box, this one purely epiphenomenal, and you haven't the slightest clue that it's there in your living room. That was their joke.)
If you can close your eyes, and sense yourself sensing—if you can be aware of yourself being aware, and think "I am aware that I am aware"—and say out loud, "I am aware that I am aware"—then your consciousness is not without effect on your internal narrative, or your moving lips. You can see yourself seeing, and your internal narrative reflects this, and so do your lips if you choose to say it out loud.
I have not seen the above argument written out that particular way—"the listener caught in the act of listening"—though it may well have been said before.
I think this is a pretty good argument against epiphenomenalism. However, it does nothing to show that consciousness is physical, and it doesn't answer the zombie argument. Consider an analogy: imagine that the cause of gravity is a god who wills gravity to be so, and who is defined as non-physical. Even though gravity is caused by this non-physical mind, we could imagine a world that's physically identical, where gravity is caused by something other than the non-physical mind. Consciousness is the same.
But it is a standard point—which zombie-ist philosophers accept!—that the Zombie World's philosophers, being atom-by-atom identical to our own philosophers, write identical papers about the philosophy of consciousness.
At this point, the Zombie World stops being an intuitive consequence of the idea of a passive listener.
Philosophers writing papers about consciousness would seem to be at least one effect of consciousness upon the world. You can argue clever reasons why this is not so, but you have to be clever.
You would intuitively suppose that if your inward awareness went away, this would change the world, in that your internal narrative would no longer say things like "There is a mysterious listener within me," because the mysterious listener would be gone. It is usually right after you focus your awareness on your awareness, that your internal narrative says "I am aware of my awareness", which suggests that if the first event never happened again, neither would the second. You can argue clever reasons why this is not so, but you have to be clever.
But again, there could be some functional analogue that does the same physical thing your consciousness does. Any physical effect that consciousness has on the world could, in theory, be caused by something else. If consciousness has an effect on the physical world, it's no coincidence that its stand-in in the zombie world would have to be hyper-specific and cause the zombies to talk about consciousness in exactly the same way; that's just what atom-for-atom identity requires.
One strange thing you might postulate is that there's a Zombie Master, a god within the Zombie World who surreptitiously takes control of zombie philosophers and makes them talk and write about consciousness.
A Zombie Master doesn't seem impossible. Human beings often don't sound all that coherent when talking about consciousness. It might not be that hard to fake their discourse, to the standards of, say, a human amateur talking in a bar. Maybe you could take, as a corpus, one thousand human amateurs trying to discuss consciousness; feed them into a non-conscious but sophisticated AI, better than today's models but not self-modifying; and get back discourse about "consciousness" that sounded as sensible as most humans, which is to say, not very.
But this speech about "consciousness" would not be spontaneous. It would not be produced within the AI. It would be a recorded imitation of someone else talking. That is just a holodeck, with a central AI writing the speech of the non-player characters. This is not what the Zombie World is about.
By supposition, the Zombie World is atom-by-atom identical to our own, except that the inhabitants lack consciousness. Furthermore, the atoms in the Zombie World move under the same laws of physics as in our own world. If there are "bridging laws" that govern which configurations of atoms evoke consciousness, those bridging laws are absent. But, by hypothesis, the difference is not experimentally detectable. When it comes to saying whether a quark zigs or zags or exerts a force on nearby quarks—anything experimentally measurable—the same physical laws govern.
This is not true. As this paper notes
[A]n interactionist dualist can accept the possibility of zombies, by accepting the possibility of physically identical worlds in which physical causal gaps go unfilled, or are filled by something other than mental processes. The first possibility would have many unexplained physical events, but there is nothing metaphysically impossible about unexplained physical events. Also: a Russellian "panprotopsychist", who holds that consciousness is constituted by the unknown intrinsic categorical bases of microphysical dispositions, can accept the possibility of zombies by accepting the possibility of worlds in which the microphysical dispositions have a different categorical basis, or none at all. (Chalmers 2004:184)
Chalmers himself notes in a comment below the original post
It seems to me that although you present your arguments as arguments against the thesis (Z) that zombies are logically possible, they're really arguments against the thesis (E) that consciousness plays no causal role. Of course thesis E, epiphenomenalism, is a much easier target. This would be a legitimate strategy if thesis Z entails thesis E, as you appear to assume, but this is incorrect. I endorse Z, but I don't endorse E: see my discussion in "Consciousness and its Place in Nature", especially the discussion of interactionism (type-D dualism) and Russellian monism (type-F monism). I think that the correct conclusion of zombie-style arguments is the disjunction of the type-D, type-E, and type-F views, and I certainly don't favor the type-E view (epiphenomenalism) over the others. Unlike you, I don't think there are any watertight arguments against it, but if you're right that there are, then that just means that the conclusion of the argument should be narrowed to the other two views. Of course there's a lot more to be said about these issues, and the project of finding good arguments against Z is a worthwhile one, but I think that such an argument requires more than you've given us here.
Thus, even if consciousness causes things, that's just a description of what consciousness does. One could imagine a world where all the atoms move in the same way, as if they were prompted by consciousness, but where nothing conscious is doing the causing. A subjective experience may do something causally, but even on interactionism you could imagine a physical law that does exactly the same things consciousness does. Next, Eliezer says
The Zombie World has no room for a Zombie Master, because a Zombie Master has to control the zombie's lips, and that control is, in principle, experimentally detectable. The Zombie Master moves lips, therefore it has observable consequences. There would be a point where an electron zags, instead of zigging, because the Zombie Master says so. (Unless the Zombie Master is actually in the world, as a pattern of quarks—but then the Zombie World is not atom-by-atom identical to our own, unless you think this world also contains a Zombie Master.)
Interactionism doesn't hold that consciousness is experimentally undetectable; undetectability is not a necessary entailment of dualism. And on interactionism, the zombie world wouldn't need an extra Zombie Master. Suppose the psychophysical law in this world is that when you get a bunch of neurons together, they become conscious, and their desires then exert some force. Well, the zombie world would have the same forces exerted, just minus the mental state of desire.
Why would anyone bite a bullet that large? Why would anyone postulate unconscious zombies who write papers about consciousness for exactly the same reason that our own genuinely conscious philosophers do?
The reason is that consciousness is not merely causal. It does cause things, but there's something it's like to see red, over and above what it causes. Thus, in theory you could take away that extra something and still have a causal isomorph. As for why some people postulate that consciousness is causally inert:
A) There are problems incorporating its causal role into physics.
B) All one has to posit is that when a person has a particular desire, a corresponding physical effect occurs. Epiphenomenalists argue that the simplest laws governing consciousness have the physical state that is about to raise your arm causing the conscious experience, rather than the other way around.
Zombie-ists are property dualists—they don't believe in a separate soul; they believe that matter in our universe has additional properties beyond the physical.
"Beyond the physical"? What does that mean? It means the extra properties are there, but they don't influence the motion of the atoms, like the properties of electrical charge or mass. The extra properties are not experimentally detectable by third parties; you know you are conscious, from the inside of your extra properties, but no scientist can ever directly detect this from outside.
One can be an interactionist property dualist. Property dualism just requires saying that consciousness is a property of matter, not its own separate substance.
Once you've postulated that there is a mysterious redness of red, why not just say that it interacts with your internal narrative and makes you talk about the "mysterious redness of red"?
Isn't Descartes taking the simpler approach, here? The strictly simpler approach?
Why postulate an extramaterial soul, and then postulate that the soul has no effect on the physical world, and then postulate a mysterious unknown material process that causes your internal narrative to talk about conscious experience?
Why not postulate the true stuff of consciousness which no amount of mere mechanical atoms can add up to, and then, having gone that far already, let this true stuff of consciousness have causal effects like making philosophers talk about consciousness?
I am not endorsing Descartes's view. But at least I can understand where Descartes is coming from. Consciousness seems mysterious, so you postulate a mysterious stuff of consciousness. Fine.
I lean towards interactionist dualism, so I’m in agreement with Eliezer here. However, the claim that dualism is motivated by finding something that seems mysterious and then just positing mysterious stuff is totally wrong. Dualists don’t just give up on explanations—there are lots of ways that specific dualists have experimentally tested their theories.
There are lots of reasons to posit dualism of some sort, which I lay out here. The fundamental reason is that physics explains things in terms of structure and function, yet none of that can explain the subjective experience of seeing red, for example. Subjective experience is neither structural nor functional, so a physics-based account, which explains things in terms of structure and function, is wholly inadequate to explain it.
Chalmers critiques substance dualism on the grounds that it's hard to see what new theory of physics, what new substance that interacts with matter, could possibly explain consciousness. But property dualism has exactly the same problem. No matter what kind of dual property you talk about, how exactly does it explain consciousness?
When Chalmers postulated an extra property that is consciousness, he took that leap across the unexplainable. How does it help his theory to further specify that this extra property has no effect? Why not just let it be causal?
This is not accurate. For one, Chalmers is now pretty undecided between different versions of non-physicalism. And Chalmers objects to substance dualism on the grounds that it violates the causal closure of the physical, that it has trouble explaining how consciousness would interact with matter, and that it is plausibly ruled out by physics.
Overall, I quite like Eliezer, as I said at the outset. However, it’s frustrating that when it comes to consciousness, he just seems very lost. This is particularly a problem given that consciousness is literally the most important thing in the universe—the only important thing in the universe. So it’s really, really, really important not to get things wrong, when it comes to consciousness.
Eliezer at one point says
That-which-we-name "consciousness" happens within physics, in a way not yet understood, just like what happened the last three thousand times humanity ran into something mysterious.
Yet within physics, the last 3,000 times, we haven't just posited the same old laws. Newton discovered brand-new laws, and so did Einstein. Consciousness is not any more fundamentally mysterious: there are simply some fundamental psychophysical laws that result in consciousness, and they have a causal effect on the world.
Trying to explain it with the same old stuff, when we have lots of knock-down arguments against the ability of the old stuff to explain it, is an appeal to magic. Eliezer's reductive account involves positing that when you have some physical things, they just produce experience, despite our inability to:
A) Understand how physics could go beyond explaining structure and function.
B) Provide any account of how brain stuff generates consciousness.
C) Provide a physical description of any type of conscious state.
All of the other successful reductions have involved explaining the behavior of things at a higher level by appealing to lower-level facts. But this just won't work for consciousness! Consciousness isn't about behavior. When we ask whether an AI is conscious, we don't care whether it verbally says that it's conscious. What we care about is whether the ineffable what-it's-like stuff is present in the AI.
This is a seriously important mistake for effective altruists not to make. We must not wish away and ignore the fundamental difficulty of the hardest problem in the universe. Saying "it just emerges" is not a good solution. And yet I fear that's the solution of many of my fellow effective altruists and rationalists—a mistake that could be very costly.