What is meant by the Prisoner’s Dilemma?
The Prisoner’s Dilemma is a game theory thought experiment about how players make decisions based on the possible outcomes presented to them. There are two players in the game, and neither of them is aware of the decision taken by the other. This is a crucial aspect of the game, because each player’s outcome depends on the independent decisions of both.
This example highlights the trade-off individuals face between being competitive and being cooperative. If the players cooperate, the outcomes are relatively better for both; if they do not, one of them might be better off while the other would be worse off.
Game theory, a branch of applied mathematics, is the study of competitive and strategic interaction between two or more players in an economic or philosophical situation, each playing according to set rules.
How does the game work?
The game starts with two prisoners: Prisoner A and Prisoner B. Both are suspected of committing a crime, to which they have not admitted yet. Each of them wants the lowest possible prison sentence and does not know whether the other prisoner would confess or not. However, all possible outcomes are presented to them, upon which they must make their decisions.
There are four possible outcomes:
- Neither A nor B confesses: If neither of the prisoners confesses to the crime, then they would both face one year of jail time (as they have been caught with weapons).
- Both A and B confess: Both would face 5 years of jail time if they both confess independently.
- A confesses and B does not confess: If only A confesses, then he would not face any jail time, while B would face 20 years in jail.
- B confesses and A does not confess: As in case 3, if only B confesses, then he would not face any jail time, while A would be sentenced to 20 years in jail.
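The four outcomes above can be written down as a small lookup table. This is a minimal sketch: the labels "confess" and "silent" and the names SENTENCES and sentence are chosen here for illustration, while the numbers are the jail terms listed above.

```python
# Years in prison for (A, B), keyed by (A's choice, B's choice).
# The labels "confess" and "silent" are illustrative names for this sketch.
SENTENCES = {
    ("silent", "silent"): (1, 1),
    ("confess", "confess"): (5, 5),
    ("confess", "silent"): (0, 20),
    ("silent", "confess"): (20, 0),
}


def sentence(choice_a, choice_b):
    """Return (years for A, years for B) for a pair of choices."""
    return SENTENCES[(choice_a, choice_b)]


# Print every outcome as one row of the table.
for (a, b), (years_a, years_b) in SENTENCES.items():
    print(f"A {a:7} | B {b:7} -> A: {years_a:2} years, B: {years_b:2} years")
```

Fewer years is better for a prisoner, so each player tries to make his own entry in the pair as small as possible.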
What do the different outcomes mean for the players?
A defining aspect of any game theory analysis, including the Prisoner’s Dilemma game, is that knowledge of the possible outcomes can alter the current decision-making of the players. However, this knowledge is partial: each player knows all the possible outcomes, but which one actually occurs depends not just on the player himself, but also on his accomplice.
Here the best possible outcome for either one is to go free. However, this is only possible at the cost of leaving the other behind to serve 20 years in jail. The worst possible outcome for either one of them, individually, is when the other confesses alone. Both these extreme outcomes for an individual player are limited to the win-lose cases.
If the symmetric cases are examined, where both make the same choice, then the punishment lies somewhere in the middle for both. Among these cases, if neither of them confesses (1 year each) then they are better off than when they both confess (5 years each).
What is the solution to this game theory experiment?
Both A and B know that if the other confesses, then the best strategy would be to confess as well (5 years instead of 20). Also, if either one of them thinks the other has not confessed, then the best strategy would again be to confess, as that serves his own personal interest (0 years instead of 1).
Thus, it can be argued that the safer outcome for both would be to confess, irrespective of what the other does.
The strategy of staying quiet can only be beneficial for either player when both are sure that the other will stay quiet as well. Therefore, to stay on the safe side, each can go by the rationale that admitting to the crime serves his own best interest, even though it does not serve their collective best interest.
Here, it is important to understand the concept of Nash equilibrium, developed by the American mathematician John Nash. This represents a state where neither player has an incentive to change his decision, given that the other player does not change his decision either.
To find the Nash Equilibrium, the first step is to find the dominant strategy for an individual player. A dominant strategy is a strategy that gives the player a better payoff than any other strategy, regardless of what the other player does.
For instance, if B confesses, then A faces 5 years if he confesses and 20 years if he stays silent. If B does not confess, then A faces 0 years if he confesses and 1 year if he stays silent. It can be clearly seen that A faces less prison time when he confesses in both cases (5 versus 20, and 0 versus 1). Thus, the dominant strategy for A is to confess.
Given this strategy by A, B now must make his decision. B has two options: if he confesses, then he would have to serve 5 years of jail time, and if he stays silent, then he would have to serve 20 years of jail time. Thus, the better option for him is also to confess.
Now, taking the reverse course of action, we would consider the actions of A, given B’s dominant strategy. This would follow the same process as was seen in the previous case.
Thus, both A and B can reach the Nash Equilibrium when they confess.
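The reasoning above can be checked mechanically. The following is a minimal sketch, not a general game solver: the function names best_responses, dominant_strategy, and nash_equilibria are hypothetical helpers written for this example, and the jail terms are the ones listed earlier.

```python
from itertools import product

CHOICES = ("confess", "silent")

# Years in prison for (A, B), keyed by (A's choice, B's choice).
SENTENCES = {
    ("silent", "silent"): (1, 1),
    ("confess", "confess"): (5, 5),
    ("confess", "silent"): (0, 20),
    ("silent", "confess"): (20, 0),
}


def best_responses(player, other_choice):
    """Choices that minimise this player's years, given the other's choice.

    player is 0 for A and 1 for B.
    """
    def years(own_choice):
        if player == 0:
            pair = (own_choice, other_choice)
        else:
            pair = (other_choice, own_choice)
        return SENTENCES[pair][player]

    fewest = min(years(c) for c in CHOICES)
    return {c for c in CHOICES if years(c) == fewest}


def dominant_strategy(player):
    """Return the single choice that is best against every opposing choice, or None."""
    common = set(CHOICES)
    for other_choice in CHOICES:
        common &= best_responses(player, other_choice)
    return next(iter(common)) if len(common) == 1 else None


def nash_equilibria():
    """Pairs of choices where each choice is a best response to the other."""
    return [
        (a, b)
        for a, b in product(CHOICES, CHOICES)
        if a in best_responses(0, b) and b in best_responses(1, a)
    ]


print(dominant_strategy(0))  # confess
print(dominant_strategy(1))  # confess
print(nash_equilibria())     # [('confess', 'confess')]
```

Note that the comparison is made case by case against each choice of the other player, which is what makes confessing a dominant strategy.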
How is the Prisoner’s Dilemma solved in real life?
A trade war, a price war between the producers of homogeneous goods, or a threat by one country to use weapons against another nation are some of the real-life examples of the prisoner’s dilemma game. However, the Nash equilibrium outcome is not Pareto optimal. This is because both players would be better off if they both deviated from it and stayed silent (1 year each instead of 5 years each).
For instance, both A and B know that it is possible to reach a relatively better outcome by not confessing at all. However, this would only be beneficial to both when they co-operate, and neither can be sure that the other will co-operate. Therefore, effectively, it might seem better to play safe rather than going for the Pareto optimal outcome.
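To see which outcomes are Pareto optimal, each outcome can be compared against every other. This is a minimal sketch under the jail terms given earlier; pareto_dominates and pareto_optimal are hypothetical helper names, and "better" here means fewer years in jail.

```python
# Years in prison for (A, B), keyed by (A's choice, B's choice).
SENTENCES = {
    ("silent", "silent"): (1, 1),
    ("confess", "confess"): (5, 5),
    ("confess", "silent"): (0, 20),
    ("silent", "confess"): (20, 0),
}


def pareto_dominates(x, y):
    """True if x is no worse than y for both players and strictly better for one."""
    no_worse = all(xi <= yi for xi, yi in zip(x, y))
    strictly_better = any(xi < yi for xi, yi in zip(x, y))
    return no_worse and strictly_better


def pareto_optimal():
    """Choice pairs whose sentences are not Pareto dominated by any other outcome."""
    return [
        choices
        for choices, years in SENTENCES.items()
        if not any(pareto_dominates(other, years) for other in SENTENCES.values())
    ]


print(pareto_optimal())
# [('silent', 'silent'), ('confess', 'silent'), ('silent', 'confess')]
```

Under these numbers, confess-confess is the only outcome that is not Pareto optimal, because silent-silent gives both players a shorter sentence.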
Nash Equilibrium strategy might not be the most efficient way to solve Prisoner’s Dilemma. A possible solution to tackle this is to iterate the Prisoner’s Dilemma game. This means repeatedly playing the game.
Consider first the case of finite iterations. Each player knows that there will be an end to the game, so in the final round there is no later round in which a confession could be punished, and both confess. Backward induction then carries the same reasoning to the round before, and so on back to the first round. As a result, finitely repeated iterations fetch the same strategy of confess-confess. In the case of infinite iterations, however, both players know that there is always going to be another stage to the game, so a confession today can be answered by the other player in later rounds, which gives both a reason to stay quiet. Therefore, these iterations will only work when they are done an infinite number of times.
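The pull towards confessing at the end of a finite game can be illustrated with a small calculation. This is only a sketch under assumptions that are not in the text above: the game is played for an arbitrary 5 rounds (ROUNDS), the sentences from each round are simply added up, and total_years is a hypothetical helper.

```python
# Years in prison for (A, B) in a single round, keyed by (A's choice, B's choice).
SENTENCES = {
    ("silent", "silent"): (1, 1),
    ("confess", "confess"): (5, 5),
    ("confess", "silent"): (0, 20),
    ("silent", "confess"): (20, 0),
}

ROUNDS = 5  # assumed number of rounds, chosen only for illustration


def total_years(plan_a, plan_b):
    """Add up each player's years over all rounds of two fixed plans."""
    total_a, total_b = 0, 0
    for choice_a, choice_b in zip(plan_a, plan_b):
        years_a, years_b = SENTENCES[(choice_a, choice_b)]
        total_a += years_a
        total_b += years_b
    return total_a, total_b


always_silent = ["silent"] * ROUNDS
confess_last = ["silent"] * (ROUNDS - 1) + ["confess"]

print(total_years(always_silent, always_silent))  # (5, 5)
print(total_years(confess_last, always_silent))   # (4, 24)
print(total_years(confess_last, confess_last))    # (9, 9)
```

If B stays silent throughout, A gains by confessing in the last round (4 years instead of 5). When both reason this way, the last round becomes confess-confess, and the same argument then applies to the round before it.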