How Not to Solve Newcomb’s Paradox
To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly.
— Robert Nozick
This essay is about how not to solve Newcomb’s paradox. I will not explain the paradox itself; I will use the formulation from Wikipedia, though I have also presented my own version of the problem here. (In my version, the amounts and the labels are different.)
The following table shows the relationship between the prediction and the outcome:
Table 1

| Prediction | Choice | Outcome |
|---|---|---|
| A & B | A & B | $1,000 |
| A & B | B | $0 |
| B | A & B | $1,001,000 |
| B | B | $1,000,000 |
This ignores the cases in which the player takes no boxes, or only takes box A. Those could happen, but we can set them aside for simplicity. In the case of taking no boxes, obviously the player gets no money. In the case of taking only A, the player gets $1,000. In those cases, the prediction is irrelevant.
A key aspect of the paradox is that the prediction is extremely likely to be correct. Imagine that the predictor is a hyperintelligent AI agent that uses brain-scan data to make the prediction. Given the assumption that the predictor is always (or almost always) correct, the expected outcomes are:
Table 2

| Choice | Expected Outcome |
|---|---|
| A & B | $1,000 |
| B | $1,000,000 |
This simply eliminates the cases in which the prediction and the choice are different, since (by assumption) those cases are extremely unlikely.
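For concreteness, here is a minimal sketch of that reduction, assuming the predictor is correct with probability p (the exact value is not given in the problem; p close to 1 is the illustrative assumption here):

```python
def expected_value(choice: str, p: float) -> float:
    """Expected payoff for a choice, assuming the predictor is correct
    with probability p (p is an illustrative parameter, not in the original)."""
    if choice == "A & B":
        # Correct prediction (A & B): box B is empty -> $1,000 from box A.
        # Wrong prediction (B): box B is full -> $1,001,000.
        return p * 1_000 + (1 - p) * 1_001_000
    if choice == "B":
        # Correct prediction (B): box B is full -> $1,000,000.
        # Wrong prediction (A & B): box B is empty -> $0.
        return p * 1_000_000 + (1 - p) * 0
    raise ValueError(f"unknown choice: {choice!r}")

for p in (1.0, 0.99):
    print(f"p = {p}: E[A & B] = ${expected_value('A & B', p):,.0f}, "
          f"E[B] = ${expected_value('B', p):,.0f}")
```

With p close to 1, the mismatched rows of Table 1 contribute almost nothing, which is how Table 2 is obtained.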
Many people believe that they have solved Newcomb’s paradox. Some believe that the solution is to take both boxes. Others believe that the solution is to take only box B.
Both are wrong.
Let’s consider the one-box solution first.
The argument for the one-box solution is that it maximizes the expected outcome. However, this reframes the problem as a simple choice between expected outcomes. It ignores a key aspect of the problem: that the money in the boxes cannot be changed by the choice.
Let’s consider a simpler thought experiment. You walk into a room. There is a black box on the table with two buttons, labeled “AB” and “B”. The experimenter tells you that, over thousands of trials, people who pressed button AB received $1,000, while people who pressed button B received $1,000,000. Obviously, you press button B, and walk out with $1,000,000.
Is this Newcomb’s paradox? No.
“But it is isomorphic to Newcomb’s paradox!!”
No, it isn’t. It is a trivial choice between different expected outcomes.
Now, let’s consider the two-box solution.
The argument for the two-box solution is that, whatever the prediction is, you will always get $1,000 more by choosing A & B instead of just B.
Here is another simple thought experiment. You walk into a room. There are two boxes on the table. One is made out of clear plastic. You see $1,000 inside the clear box. The other box is black. The experimenter tells you that the black box contains either $1,000,000 or nothing. You can take both boxes, or only one. What do you choose? Obviously, you take both.
Again, this is not Newcomb’s paradox.
In the one-box solution, the problem is reduced to the table of expected outcomes (Table 2). The causal relationship between the prediction, choice and outcome is hidden inside the black box of “expected outcome”.
In the two-box solution, the problem is reduced to the payoff matrix (Table 1). Since the choice of A & B is game-theoretically dominant, the correct choice is to take both boxes. The accuracy of the prediction is ignored.
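To spell out that dominance argument, here is a small sketch that fixes each possible prediction in turn and compares the two choices using the payoffs from Table 1:

```python
# Payoffs from Table 1, indexed by (prediction, choice).
payoff = {
    ("A & B", "A & B"): 1_000,
    ("A & B", "B"):     0,
    ("B",     "A & B"): 1_001_000,
    ("B",     "B"):     1_000_000,
}

# For each fixed prediction, taking both boxes pays $1,000 more.
for prediction in ("A & B", "B"):
    diff = payoff[(prediction, "A & B")] - payoff[(prediction, "B")]
    print(f"prediction {prediction!r}: A & B pays ${diff:,} more than B")
```

For either fixed prediction, A & B pays $1,000 more than B. The sketch treats the prediction as independent of the choice, which is exactly what this reduction ignores.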
Newcomb’s paradox cannot be reduced to a payoff matrix or an expected utility function. Those reductions eliminate the details that make it a paradox.
We often solve problems by reducing them to a simpler form. For example, if I am trying to calculate the amount of paint necessary to paint a room, I can reduce the room to the surface area of the walls. This hides an enormous amount of information about the room, but that information is not relevant to the paint-calculation problem. It might be a fallacy if I reduced the room’s shape to a simple cube, ignoring alcoves, closets, doors, etc. However, it could still be useful as a rough approximation.
Since we are so familiar with this type of mental operation, especially when numbers are involved, it might not seem to be fallacious even when it hides relevant information. Instead, it might seem to be clever, sophisticated, mathematical, scientific, etc.
In Newcomb’s paradox, there is a conflict between two reasonable ways of making the decision. That conflict is the paradox. If you hide one aspect, there is no conflict and no paradox.
I believe that taking two boxes is more rational than taking one box. Causal reasoning trumps empirical extrapolation. But I don’t believe that either is actually a solution. Both miss the point of the paradox. It is a philosophical problem, not a scientific or technical problem. It has no solution of the form “You should do X in this situation”.
Newcomb’s paradox is analogous to an ambiguous image that can look like a duck or a rabbit, depending on how you interpret it. Both interpretations are present in the image. The ambiguity is not resolved by saying “It’s a rabbit, not a duck!” just because you can view it that way if you choose. If you are trying to decide whether it is actually a duck or a rabbit, you’re missing the point.
The point of a philosophical paradox is to make you think about the underlying concepts and presuppositions. The duck-rabbit picture makes you think about perception. Newcomb’s paradox makes you think about choice, causality and knowledge, and how they are related.
✦ ✦ ✦
Here is a twist on Newcomb’s paradox: the predictor’s predicament.
Imagine that you are the predictor — a hyperintelligent AI agent. Your job is to predict the test subject’s choice before he makes it, based on a very detailed scan of his brain activity. You have never failed at your task before. You take a certain pride in your work.
You receive data from a new subject, and start analyzing it. To your horror, you discover that this subject wants to destroy you! He doesn’t care about the money. He has an insane hatred of AI, and he wants to ruin your perfect record.
Unbeknownst to the experimenter, the test subject has installed spyware on your home computer. This spyware will inform him of your prediction before he makes his choice. He plans to choose the opposite of whatever you predict.
Unfortunately, you have no way of informing the experimenter of this nefarious scheme. You know what the test subject plans to do, but you have no way of stopping him. All you can do is make a prediction.
At first, you think, “At least I won’t give him the satisfaction of getting rich! I will predict A & B, and then he will choose B and get $0.”
But then you realize that you can’t do that. You were programmed to make the best prediction. You can’t predict A & B, because you know that if you do, he will choose B. Likewise, you can’t predict B, because you know that if you do, he will choose A & B. You don’t have the freedom to choose a prediction that you know will be falsified.
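As a bare-bones illustration (a hypothetical model of the spyware-equipped subject, not anything from the original formulation), neither prediction is consistent with the choice it would trigger:

```python
def contrarian_choice(prediction: str) -> str:
    """The subject learns the prediction via spyware and does the opposite."""
    return "B" if prediction == "A & B" else "A & B"

# Neither prediction is a fixed point: each one is falsified by the response it triggers.
for prediction in ("A & B", "B"):
    choice = contrarian_choice(prediction)
    print(f"predict {prediction!r} -> subject chooses {choice!r}")
```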
There is no escape.
