Newcomb's Paradox

In this essay, I will give my version of Newcomb’s paradox, and then explain it.

Suppose that there is a brilliant but eccentric scientist. We’ll just call him “the professor”. The professor has invented a device that scans brain waves and generates a model of the way the brain makes decisions. The models generated by this device are highly accurate. The professor invites you to take part in an experiment for a payment of $100. Let’s assume that you fully trust the professor. His reputation is impeccable. You agree to the experiment.

It goes like this. You sit at a table in his laboratory, with a computer screen in front of you. The professor puts the brain wave scanner on your head. He assures you that it has no negative effects. It doesn’t use radiation or anything harmful. It just picks up brain waves. He tells you that the computer will present you with multiple choice problems. You are supposed to solve each problem to the best of your ability. While you are solving each problem, the computer will be reading your brain waves, and it will correlate them with the answer you choose. The whole process will take about an hour. After that time, the computer will have constructed a model of how your brain makes decisions.

The professor leaves the room, and the experiment begins. The computer presents you with a series of problems of various kinds: math problems, simple choices (vanilla or chocolate?), ethical dilemmas (trolley problems), and so on. For each one, you make a decision, click on the box on the screen, and then the computer moves on to the next problem. As the professor predicted, it takes roughly an hour.

When you are finished, the professor returns, sipping a cup of tea. He takes off the scanning device, and says he has one more problem for you to solve, to test the accuracy of the model. He puts two envelopes on the table in front of you. One is labeled “A”. The other is labeled “B”. The professor tells you that envelope B contains the $100 he promised as payment for the experiment. Envelope A contains either $1000 or nothing.

  • A: $1000 or $0
  • B: $100

You can take both envelopes if you want, or you can take either one by itself, or you can leave both and go home empty-handed. You are free to make any of those choices. But there is a wrinkle, he tells you.

A few minutes ago, he gave this problem to the computer model of your brain, and he decided whether to put $1000 in envelope A based on the model’s choice. If the model of your brain chose both A and B (if it was “greedy”) then the professor only put money in B. On the other hand, if the model of your brain chose only one envelope (A or B) then he put $1000 in A.

So, whether envelope A contains $1000 or $0 depends on what the model of your brain chose when presented with the same options you have now. The professor also tells you that the model has always predicted the person’s choice correctly, on thousands of past trials.

What should you do?

On the one hand, it seems obvious that you should take both envelopes. After all, the money is already either in A or not. Your choice now has no effect on whether or not A has money in it. Why pass up $100? Besides, there is a chance that there is no money in envelope A. If you only take A, you might get no money at all.

On the other hand, every person who has chosen only A in the past has received $1000, and every person who has chosen both A and B in the past has received only $100. The model is incredibly accurate. So, you can be reasonably certain that if you choose both A and B, you will receive only $100, but if you choose only A, you will receive $1000.
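The tension between these two arguments can be put in numbers. Here is a minimal sketch (in Python, using the payoffs of this version: B always holds $100, A holds $1000 only if the model predicted a one-envelope choice) that treats the model's accuracy as a parameter:

```python
# Expected payoffs under this version's rules: B always contains $100;
# A contains $1000 only if the model predicted a one-envelope choice.
# p is the probability that the model's prediction matches your choice.

def ev_one_box(p):
    # Take only A: with probability p the model predicted this,
    # so A holds $1000; otherwise A is empty.
    return p * 1000

def ev_two_box(p):
    # Take both: with probability p the model predicted this, so A is
    # empty and you get only B's $100; otherwise you also get A's $1000.
    return p * 100 + (1 - p) * 1100

# Setting the two equal (1000p = 100p + 1100 - 1100p) gives the
# break-even accuracy p = 1100 / 2000 = 0.55.
for p in (0.5, 0.55, 0.9, 1.0):
    print(f"p={p}: only A -> ${ev_one_box(p):.0f}, both -> ${ev_two_box(p):.0f}")
```

Once the model is right more than 55% of the time, taking only A has the higher expected value, and at the near-perfect accuracy the professor claims, the gap is roughly $1000 versus $100. The two-boxer's reply, of course, is that computing expected value this way smuggles in the assumption that your choice and the envelope's contents are still correlated.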

Think about your choice.

Newcomb’s paradox involves self-reference and undecidability. Like all self-referential paradoxes, it has a referential loop and a negation. The standard format of a self-referential paradox is something like “This sentence is false”. If it is true, then it is false, but if it is false, then it is true. The sentence defines an infinite cycle of oscillating truth values. We cannot say that it is true or false. It is undecidable.

In Newcomb’s paradox, something similar is going on, but with choices instead of truth values: you should choose both envelopes if you will not choose both envelopes. The undecidable bit is a choice rather than a truth value, but the structure is the same: a self-referential loop and a negation.

Newcomb’s paradox is more than just a self-referential paradox. It also involves a conflict between free will and determinism. From your subjective perspective, you are free to choose one or both envelopes. From the perspective of the professor and the model, however, your choice is already determined, because it is predictable with a very high level of accuracy. In my version of Newcomb’s paradox, not only is your choice predetermined, it is sitting on the table in front of you.

Although Newcomb’s paradox depends on the rather implausible premise of a computer model that can accurately predict your choices, it isn’t entirely divorced from reality. Predicting other people’s choices is something we often do. It is part of many games, such as chess. It is an important part of war. It occurs in social interactions such as asking your boss for a raise or asking a girl on a date (you should only ask if you expect the answer to be “yes”). We all have models in our heads that predict how other people make choices.

When my youngest daughter was 7 years old, she became obsessed with the game “rock, paper, scissors”. We played that game a lot together, and after a while I became quite good at guessing her moves. I was never perfect at it, but I could predict her moves with something like 60% accuracy, and so I would win about 60% of the time. Since 1/3 of the possible outcomes are ties, that’s much better than luck. If the outcomes were random, you’d win a third, lose a third, and tie a third. My daughter was both annoyed and fascinated by my success at this game. She wanted to learn my “trick”, which I claimed was just magic. It puzzled me a bit too. I don’t know exactly how I was predicting her moves, but I could do it. Somehow, my brain developed a pretty accurate model of her choices. I can’t do it any more, maybe because she uses a different method to select her moves, or maybe because she is better at concealing whatever cues I was using to make my guesses.
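For what it’s worth, the arithmetic checks out. A quick simulation (a toy model, not how my brain actually did it: I assume the predictor guesses the next move correctly 60% of the time and plays the counter-move to its guess) reproduces roughly the 60/20/20 split of wins, ties, and losses:

```python
import random

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
COUNTER = {loser: winner for winner, loser in BEATS.items()}  # move that beats each move
MOVES = list(BEATS)

def play_round(accuracy=0.6):
    # The child picks a move; the parent guesses it correctly with
    # probability `accuracy`, otherwise guesses one of the other two
    # moves at random, then plays the counter to the guess.
    actual = random.choice(MOVES)
    if random.random() < accuracy:
        guess = actual
    else:
        guess = random.choice([m for m in MOVES if m != actual])
    mine = COUNTER[guess]
    if mine == actual:
        return "tie"
    return "win" if BEATS[mine] == actual else "loss"

random.seed(0)
results = [play_round() for _ in range(100_000)]
for outcome in ("win", "tie", "loss"):
    print(outcome, results.count(outcome) / len(results))
```

Against a truly random opponent no strategy can win more than a third of the time in the long run, so a stable 60% win rate is evidence that her moves carried predictable structure.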

Back to the thought experiment. What is the solution to Newcomb’s paradox? What should you do?

The rational choice is to take both envelopes. Your choice can’t retroactively change what is in the envelopes.

However, I would make the choice in a different way. I would smile at the professor, take a coin out of my pocket, and say:

“I’ll let fate decide. I’m going to flip this coin. If it lands heads, I will take both A and B. If it lands tails, I will only take A.”


  1. "The rational choice is to take both envelopes. Your choice cannot retroactively change what is in the envelopes."

    That's a metaphysical assumption. What if your choice can in fact change the entire past, including everyone's memory of it, so that the past is consistent with your choice?

    1. "A few minutes ago, he gave this problem to the computer model of your brain, and he decided whether to put $1000 in envelope A based on the model's choice."

      Consider this experiment instead:

      Instead of putting the $1000 into the envelope, he merely wrote the result of his prediction on a piece of paper.

      So at the moment you make your choice, envelope A is empty. But if you chose only envelope A, and your choice coincides with the prediction he wrote on the piece of paper, then he will give you the $1000.

      Now you can't say anymore that the rational choice is to take both envelopes merely because you cannot retroactively change what is in the envelopes. There is nothing in envelope A, and your choice won't change that, but it doesn't matter.

    2. That doesn't change anything. In that version, the rational choice is to take both envelopes, because the choice can't change the prediction written on the piece of paper.

    3. Ok, a different version:

      First you make the choice, while the professor sits in a different room and doesn't know which choice you made. Then the professor runs his model and guesses your choice, without any information about what you actually chose.

    4. That's a bit more interesting, but it doesn't change the paradox in a fundamental way. In that version, the choice happens before the prediction, but there is no causal pathway from the choice to the prediction, so the rational choice is still both envelopes.
      Do I know you from youtube, btw?

    5. As for the metaphysics of causality, well, my brain has that assumption built into it, regardless of whether I accept it philosophically or not.

      If I believed in that many-worlds metaphysics, in which a choice is just a branching of timelines, well then the choice would be different on different timelines too, so I couldn't really make a choice. I would always be making both choices from different perspectives, and so on, ad infinitum. The many-worlds metaphysics renders choice and prediction both meaningless. It reduces everything to the anthropic principle, and by doing so, eliminates causality, prediction and choice.

    6. That doesn't seem rational to me, because it is completely independent of the professor's prediction accuracy. In reality, your choice should depend on how accurately the professor makes his predictions. In the extreme case where his accuracy is 100% and he is always right:
      The professor will predict exactly what you will choose.
      So if you choose both envelopes, he will have also predicted that you choose both envelopes, and you will only get $100.
      If you choose only envelope A, he will have also predicted that, and you will get $1000.

      You may say that it is practically impossible to predict anyone's choice with 100% accuracy because of free will, but that doesn't matter.
      We just assume, for the sake of this (purely theoretical) problem, that it is possible. Nothing contradicts this possibility logically. It can at most collide with your metaphysical assumptions about free will, causality, etc.

    7. Your reasoning assumes (implicitly) that your choice can change the professor's prediction. It's not rational for that reason. Taking both envelopes is game-theoretically correct and makes no false assumptions or errors of logic.

      I guess that raises the question "What is rational?". In this context, it just means "in accord with my metaphysical assumptions", or perhaps "in accord with the inescapable metaphysical assumptions of thought". Logic is a metaphysical assumption and so is induction (that past performance predicts the accuracy of the model on this trial). There's no reasoning at all without (implicit) metaphysical assumptions.

    8. No, this is Patrick.

    9. The bigger question is: why assume the model (which was used to choose the values in the envelopes) is rational?

      I started by skimming this article. I legit said to myself "well, do I want to gamble or go for something secure?" all on the prompt of glancing at the word 'rational'.

      I was influenced to choose the certainty of $100.

      Afterward I read a little more, realized I could take both, considered it, and didn't want to be greedy, so I remained with B.

      Then I read the catch, and selected A.

      The problem with this paradox is that it doesn't consider the internal-debate problem or partial information. But if I was always going to arrive at A, then I was always going to arrive at A, which means the maximal value I could ever arrive at was A=$1000. The implicit and incorrect assumption here is that A+B = $1100, when actually it should read A+B = $100 || $1100.

      If you commit the error of assuming your choice after the fact will differ from the model's choice beforehand, or that your choice afterward can *alter* the contents, then the greedy choice should be written as A+B=$100.

