A quantum Samaritan’s dilemma cellular automaton

The dynamics of a spatial quantum formulation of the iterated Samaritan’s dilemma game with variable entangling is studied in this work. The game is played in the cellular automata manner, i.e. with local and synchronous interaction. The game is assessed in fair and unfair contests, in noiseless scenarios and with disrupting quantum noise.


The classic and quantum Samaritan's dilemma
The Samaritan's dilemma (SD) is a non-zero-sum, asymmetric game played by two players: the charity player A and the beneficiary player B. Player A may choose Aid/No Aid, whereas player B may choose Work/Loaf. The Samaritan's dilemma arises in the act of charity. The charity wants to help (Aid) people in need. However, the beneficiary may simply rely on the handout (Loaf) rather than try to improve their situation (Work). This is not anticipated by the charity. Many people may have experienced this dilemma when confronted with people in need: although there is a desire to help them, there is the recognition that a handout may be harmful to the long-run interests of the recipient [1][2][3][4][5]. Following Huang et al. [6], Ozdemir et al. [7] and Rasmussen [8], we adopt here the pay-off matrices $P_A$ and $P_B$ given in figure 1a.

The classic context
In conventional classic games, both players decide their probabilistic strategies $\mathbf{x} = (x, 1 - x)$ and $\mathbf{y} = (y, 1 - y)$ independently. As a result, the expected pay-offs ($p$) in the SD game are $p_A = \mathbf{x} P_A \mathbf{y}^{\top} = 5xy - x - y$ and $p_B = \mathbf{x} P_B \mathbf{y}^{\top} = -2xy + 3x + y$. (1.1)
The SD belongs to the class of the so-called discoordination games, i.e. games with no pair of pure strategies in Nash equilibrium (NE), where one player's incentive is to coordinate (the charity: (A, W)), while the other player tries to avoid this (the beneficiary: (A, L)). Figure 1b shows the reaction functions whose intersection determines the NE, i.e. the point where $x$ is a best response to $y$ and $y$ is a best response to $x$. In our case, $(x = 0.5;\ y = 0.2)$, with associated pay-offs $p_A = -0.2$, $p_B = 1.5$. Figure 1c shows the pay-offs region of the studied game: the pay-off of player A is negative at points such as the NE, whereas that of player B is never negative.
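The classic NE quoted above can be checked numerically. The following is a minimal sketch: the pay-off matrices are inferred from the coefficients of equation (1.2), because figure 1a is not reproduced in this text, and the row/column ordering (Aid, No Aid) by (Work, Loaf) as well as the function names are assumptions made for illustration.

```python
import numpy as np

# Pay-off matrices inferred from equation (1.2) (assumption: figure 1a is not shown here).
# Rows: A plays Aid / No Aid. Columns: B plays Work / Loaf.
P_A = np.array([[3.0, -1.0], [-1.0, 0.0]])
P_B = np.array([[2.0, 3.0], [1.0, 0.0]])

def expected_payoffs(x, y):
    """Expected pay-offs for independent mixed strategies (x, 1-x) and (y, 1-y)."""
    xv = np.array([x, 1.0 - x])
    yv = np.array([y, 1.0 - y])
    return float(xv @ P_A @ yv), float(xv @ P_B @ yv)

def mixed_nash():
    """Mixed NE of the 2x2 game from the two indifference conditions."""
    # y must make A indifferent between his two rows.
    d_a = P_A[0] - P_A[1]
    y = -d_a[1] / (d_a[0] - d_a[1])
    # x must make B indifferent between his two columns.
    d_b = P_B[:, 0] - P_B[:, 1]
    x = -d_b[1] / (d_b[0] - d_b[1])
    return x, y

x_ne, y_ne = mixed_nash()
p_a_ne, p_b_ne = expected_payoffs(x_ne, y_ne)
```

Under these assumptions the indifference conditions return the pair (x, y) and the pay-offs stated in the text, which also supports the inferred matrices.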
Let us remark here then, that the pay-offs of the SD are biased towards the beneficiary player B. In other words, the dilemma of the Samaritan game (sometimes referred to as the Welfare game) is somehow only that of the charity (or Samaritan) player A.
In a different game scenario, that of correlated games, an external probability distribution $\Pi = \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix}$ assigns a probability to every combination of player choices [9]. Thus, the expected pay-offs in the SD are $p_A = 3\pi_{11} - \pi_{12} - \pi_{21}$ and $p_B = 2\pi_{11} + 3\pi_{12} + \pi_{21}$. (1.2)
The quantum game approach described in the next subsection shares features of both the independent-players model (1.1) and the correlated-games model (1.2).
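As a small illustration of how (1.2) relates to the independent-players model, the sketch below evaluates the correlated pay-offs for an arbitrary joint distribution and for a factorized one built as an outer product. The function names are hypothetical and chosen only for this example.

```python
import numpy as np

def correlated_payoffs(joint):
    """Pay-offs of equation (1.2) for a 2x2 joint distribution over choices."""
    (p11, p12), (p21, _p22) = joint
    return 3 * p11 - p12 - p21, 2 * p11 + 3 * p12 + p21

def factorized(x, y):
    """Joint distribution produced by independent strategies (x, 1-x), (y, 1-y)."""
    return np.outer([x, 1 - x], [y, 1 - y])

# Independent play at the classic NE, and a non-factorizable example.
p_independent = correlated_payoffs(factorized(0.5, 0.2))
p_correlated = correlated_payoffs(np.array([[0.5, 0.0], [0.0, 0.5]]))
```

The second distribution puts all the weight on the two diagonal outcomes, something no pair of independent mixed strategies can produce.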

Quantum games
In the quantization scheme introduced by Eisert et al. [10] (EWL for short), the classical pure strategies are assigned two basis vectors $|0\rangle$ and $|1\rangle$, respectively, in a two-level Hilbert space. The state of the game is a vector in the tensor product space spanned by the basis vectors $|00\rangle$, $|01\rangle$, $|10\rangle$, $|11\rangle$. The seminal paper [10] deals with the Prisoner's Dilemma, so that the classical pure strategies for both players are Cooperation and Defection, referred to as C and D, respectively. We will keep this strategy codification in this section, although in the context of the SD game C stands for Aid (Samaritan) or Work (beneficiary), and D stands for No Aid (Samaritan) or Loaf (beneficiary). The EWL quantum protocol, described below, includes the classical approach as a particular case, but with the purely classical strategies referred to as $\hat{D}$ and $\hat{C} = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, so that the hat on the operator indicates by itself that a quantum approach is being taken.
The two-parameter (2P) model using the two-parameter operators given in (1.4) may be criticized as being just a subset of the general SU(2) space of unitary strategies, which allows for three parameters; a subset that is not a closed set and fails to reflect any reasonable physical constraint [11,12]. As pointed out in the seminal paper dealing with the quantum Samaritan's dilemma [7], the 2P restriction in the original formulation of the EWL model is a subject of continuing discussion. But, in accordance with what is stated in [7], the 2P-EWL model is a good (and widely used) test-bed to show how the quantum approach in game theory may solve dilemmas, by allowing for NE strategies out of the scope of the classic approach [13]. Consequently, the structure of this study follows the way paved by the paper [10] dealing with the prisoner's dilemma game and, more importantly regarding this study, by [7]: one parameter (classic, §1.1) → two parameters (§§2 and 3) → three parameters (§4).
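For orientation, the noiseless 2P protocol can be sketched in a few lines. Equations (1.3) to (1.5) are not reproduced in this text, so the operator forms below are the standard EWL ones from [10] and are an assumption here: the entangler is taken as exp(iγ D⊗D/2), the initial state as |00⟩, and the pay-offs as those of (1.2). All function names are illustrative.

```python
import numpy as np

D = np.array([[0, 1], [-1, 0]], dtype=complex)  # D-hat, equal to U(pi, 0)

def U(theta, alpha):
    """Two-parameter strategy operator (standard EWL form, assumed)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[np.exp(1j * alpha) * c, s],
                     [-s, np.exp(-1j * alpha) * c]])

def J(gamma):
    """Entangling gate exp(i*gamma/2 * D(x)D); valid because (D(x)D)^2 = I."""
    return np.cos(gamma / 2) * np.eye(4) + 1j * np.sin(gamma / 2) * np.kron(D, D)

def joint_distribution(gamma, ua, ub):
    """2x2 matrix of outcome probabilities for J^dagger (ua (x) ub) J |00>."""
    psi0 = np.zeros(4, dtype=complex)
    psi0[0] = 1.0
    psi = J(gamma).conj().T @ np.kron(ua, ub) @ J(gamma) @ psi0
    return (np.abs(psi) ** 2).reshape(2, 2)

def payoffs(pi):
    """Expected pay-offs of equation (1.2)."""
    return (3 * pi[0, 0] - pi[0, 1] - pi[1, 0],
            2 * pi[0, 0] + 3 * pi[0, 1] + pi[1, 0])

Q = U(0.0, np.pi / 2)                      # the Q strategy: theta = 0, alpha = pi/2
pi_qq = joint_distribution(np.pi / 2, Q, Q)
pi_classic = joint_distribution(0.0, U(1.0, 0.3), U(2.0, 0.7))
```

In this sketch mutual Q puts all the probability on the first outcome for any γ, giving the (3, 2) pay-offs, and γ = 0 returns a factorized Π with x = cos²(θ_A/2), y = cos²(θ_B/2), independent of the α parameters.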

Quantum noise
Real-world quantum information processing systems inevitably interact with the environment. The environmental noise may destroy the quantum properties and hamper information processing, so it is important to take quantum noise into consideration [15].
In the presence of noise, the shared state is corrupted before the players apply their strategies $\hat{U}_A \otimes \hat{U}_B$. The noise effect on a single qubit can be described by a CPTP (completely positive trace-preserving) map, $\rho \mapsto \sum_i K_i \rho K_i^{\dagger}$ with $\sum_i K_i^{\dagger} K_i = I$. We assume both qubits of the shared state are affected by the same kind of noise. As a result, […] (1.6)
We will consider here the effect of amplitude-damping noise, with Kraus operators $K_1 = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\mu} \end{pmatrix}$ and $K_2 = \begin{pmatrix} 0 & \sqrt{\mu} \\ 0 & 0 \end{pmatrix}$, where $\mu \in [0, 1]$ represents the strength of noise. The final density matrix for $\mu = 1/2$ becomes […]. The dependence on $\gamma$ of the pay-offs in the SD of Q versus the pure strategies C and D with $\mu = 1/2$ may be checked in figure […] which leads to […]. Permuting the columns, the rows, and both the rows and the columns of $\Pi^{QQ}$ generates $\Pi^{QD}$, $\Pi^{DQ}$ and $\Pi^{DD}$, respectively. The dependence on $\gamma$ of the pay-offs in the SD of the pure strategies with full noise may be checked in figure 2 under the label $\mu = 1$. In particular, in the (Q, Q) context, $p_A^{QQ} = 3\cos^2(\gamma/2)$, $p_B^{QQ} = 2\cos^2(\gamma/2)$.
Let us remark that only amplitude-damping quantum noise will be under scrutiny in this study. The effect of other types of quantum noise [16] will be taken into account in future work, in particular that of phase damping, which is structurally close to amplitude damping: both damping noises have the same $K_1$ Kraus operator, differing in $K_2 = \begin{pmatrix} 0 & 0 \\ 0 & \sqrt{\mu} \end{pmatrix}$ with phase damping. Mirroring the proximity of the two noise types, with full phase-damping noise $\Pi^{QQ}$ becomes diagonal as with amplitude damping, but with $\pi$ […]
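The effect of the noise channel on the game can be explored with a short density-matrix sketch. It assumes the standard EWL operators of [10] (entangler exp(iγ D⊗D/2), initial state |00⟩), applies the Kraus operators to both qubits of the shared state before the strategies, and uses the pay-offs of (1.2); the function names and the `kind` switch are illustrative only.

```python
import numpy as np

D = np.array([[0, 1], [-1, 0]], dtype=complex)

def U(theta, alpha):
    """Two-parameter strategy operator (standard EWL form, assumed)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[np.exp(1j * alpha) * c, s],
                     [-s, np.exp(-1j * alpha) * c]])

def J(gamma):
    """Entangling gate exp(i*gamma/2 * D(x)D)."""
    return np.cos(gamma / 2) * np.eye(4) + 1j * np.sin(gamma / 2) * np.kron(D, D)

def damp(rho, mu, kind="amplitude"):
    """Apply the same single-qubit damping channel to both qubits."""
    k1 = np.array([[1, 0], [0, np.sqrt(1 - mu)]], dtype=complex)
    if kind == "amplitude":
        k2 = np.array([[0, np.sqrt(mu)], [0, 0]], dtype=complex)
    else:  # phase damping: same K1, diagonal K2
        k2 = np.array([[0, 0], [0, np.sqrt(mu)]], dtype=complex)
    out = np.zeros_like(rho)
    for ka in (k1, k2):
        for kb in (k1, k2):
            k = np.kron(ka, kb)
            out += k @ rho @ k.conj().T
    return out

def noisy_game(gamma, mu, ua, ub, kind="amplitude"):
    """Joint distribution and pay-offs when noise hits the shared state."""
    psi = J(gamma)[:, 0]                          # J|00>
    rho = damp(np.outer(psi, psi.conj()), mu, kind)
    m = J(gamma).conj().T @ np.kron(ua, ub)
    pi = np.real(np.diag(m @ rho @ m.conj().T)).reshape(2, 2)
    p_a = 3 * pi[0, 0] - pi[0, 1] - pi[1, 0]
    p_b = 2 * pi[0, 0] + 3 * pi[0, 1] + pi[1, 0]
    return pi, (p_a, p_b)

Q = U(0.0, np.pi / 2)
pi_full, p_full = noisy_game(np.pi / 3, 1.0, Q, Q)
pi_phase, p_phase = noisy_game(np.pi / 3, 1.0, Q, Q, kind="phase")
```

With full amplitude damping, the sketch returns the (Q, Q) pay-offs 3cos²(γ/2) and 2cos²(γ/2) quoted in the text, and with full phase damping the off-diagonal entries of Π vanish as well.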

The spatialized quantum Samaritan dilemma game
In the spatial version of the quantum SD (QSD-CA), each player occupies a site (i, j) in a two-dimensional N × N lattice. Two types of players, termed A and B, are considered. A and B players alternate in the site occupation in a chessboard form, so that every player is surrounded by four orthogonally adjacent partners (A − B, B − A) and four diagonally adjacent mates, i.e. players with the same role, either Samaritan or beneficiary (A − A, B − B). The game is played in the cellular automata (CA) manner, i.e. with uniform, local and synchronous interactions [17]. In this way, every player plays with his four adjacent partners, so that the pay-off p of a given individual is the sum over these four interactions. The evolution is ruled by the (deterministic) imitation of the best-paid neighbour, so that in the next generation every generic player (i, j) will adopt the parameters of the mate player (k, l) with the highest pay-off among his mate neighbours.
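The update rule just described can be sketched as follows. To keep the example short it uses the classic one-parameter game (pay-offs of (1.1)) rather than the quantum parameters, and it makes several assumptions that the text does not fix: A players sit on sites with even i + j, borders are periodic (so N must be even), and a player keeps his own parameter unless a mate is strictly better paid. Function names are illustrative.

```python
import numpy as np

def site_payoffs(s):
    """Sum of each site's pay-offs against its four orthogonal partners.

    s[i, j] is the probability of the first choice (Aid for A, Work for B).
    """
    i, j = np.indices(s.shape)
    is_a = (i + j) % 2 == 0                      # assumed chessboard colouring
    total = np.zeros_like(s, dtype=float)
    for shift, axis in ((1, 0), (-1, 0), (1, 1), (-1, 1)):
        nb = np.roll(s, shift, axis=axis)        # partner's strategy (periodic border)
        as_a = 5 * s * nb - s - nb               # site plays x = s against y = nb
        as_b = -2 * nb * s + 3 * nb + s          # site plays y = s against x = nb
        total += np.where(is_a, as_a, as_b)
    return total

def imitation_step(s):
    """Synchronous imitation of the best-paid mate (self and four diagonals)."""
    p = site_payoffs(s)
    shifts = ((0, 0), (1, 1), (1, -1), (-1, 1), (-1, -1))
    cand_p = np.stack([np.roll(p, sh, axis=(0, 1)) for sh in shifts])
    cand_s = np.stack([np.roll(s, sh, axis=(0, 1)) for sh in shifts])
    best = np.argmax(cand_p, axis=0)             # ties resolve to the player himself
    return np.take_along_axis(cand_s, best[None, ...], axis=0)[0]

rng = np.random.default_rng(0)
lattice = rng.random((6, 6))
next_lattice = imitation_step(lattice)
```

Because diagonal shifts preserve the parity of i + j, A sites only ever copy A sites and B sites only copy B sites, which is the mate-imitation constraint of the model.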
Before presenting the results obtained regarding the Samaritan's dilemma, let us point out that this study follows the way paved by previous studies dealing with CA simulations of prototypical games such as the symmetric Prisoner's Dilemma (PD) [18][19][20][21] and the asymmetric Battle of the Sexes (BoS) [22][23][24][25]. Unlike the discoordination SD game, both the PD and the BoS have NE based on pure strategies, which marks a decisive qualitative difference in the study of the SD compared with those of the PD and the BoS.
Figure 3 deals with the results obtained in a QSD-CA simulation free of noise. The beneficiary player B clearly outscores the charity player A for low values of the entanglement, so that for $\gamma$ up to just past $\pi/8$, $\bar{p}_A$ oscillates slightly below zero, whereas $\bar{p}_B$ oscillates around 1.5, thus not far from (−0.2, 1.5), the pay-offs in the classic NE. If $\gamma = 0$ (or if $\alpha_A = \alpha_B = 0$), $\Pi$ becomes factorizable as in the classic game with independent strategies (1.1), i.e. $\Pi = \mathbf{x}^{\top}\mathbf{y}$, with $x = \cos^2(\theta_A/2)$, $y = \cos^2(\theta_B/2)$, which makes the $\alpha$ parameters irrelevant. It is $\theta_A^{NE} = 2\arccos(\sqrt{0.5}) = \pi/2 = 1.571$ and $\theta_B^{NE} = 2\arccos(\sqrt{0.2}) = 2.214$, not far from $3\pi/4 = 2.356$. Over $\pi/8$ and before $\pi/4$, figure 3b shows that $\bar{\theta}_A$ gradually tends to zero as $\gamma$ increases, which is reflected in a stabilization of $\bar{p}_A$ in figure 3a. From $\gamma = \pi/4$ onwards, $\bar{\theta}_A = \bar{\theta}_B = 0$ and $\bar{\alpha}_A = \bar{\alpha}_B = \pi/2$, i.e. both players adopt the $\hat{Q}$ strategy. Remarkably, the pair $\{\hat{Q}, \hat{Q}\}$ is the only pair in NE for $\gamma > \pi/4$. This is so because (i) $p^{QQ}$ […]
Thus, the main finding derived from figure 3 is that the imitation dynamics in CA enables the emergence of NE in the SD: the NE of the classic game with low entanglement, and that of the pair (Q, Q) with high entanglement. The latter reverses the SD structural trend in favour of the beneficiary player, inducing the highest possible sum of pay-offs.
One may say that the Q strategy resolves the dilemma, though the sceptical reader might argue that as Q has no real component ($\theta = 0$), the dilemma is resolved in the imaginary ... world.
Figure 4 shows the dynamics in simulations in the QSD-CA scenario of figure 3 with $\gamma = 0$ (figure 4a), $\gamma = \pi/4$ (figure 4b) and $\gamma = \pi/2$ (figure 4c). As a result of the initial random assignment of the parameter values, initially in all frames $\bar{\theta} \simeq \pi/2 = 1.571$ and $\bar{\alpha} \simeq \pi/4 = 0.785$. With $\gamma = 0$, after a short transition time, both the parameters and the mean pay-offs stabilize fairly soon, the latter close to the pay-offs of the NE (−0.2, 1.5). In the $\gamma = \pi/2$ dynamics in figure 4c, both $\bar{\theta}$'s plummet to zero and both $\bar{\alpha}$'s rocket to $\pi/2$, i.e. both players play the Q strategy, and in consequence the mean pay-offs of A and B players stabilize at (3.0, 2.0) in a straightforward manner. Remarkably, the parameter tendencies emerge strongly from the very beginning, despite the full range of parameters initially accessible in the CA local interactions. In figure 4b, with $\gamma = \pi/4$, the threshold for the emergence of Q, the parameter stabilization is also achieved quickly, with the exception of $\bar{\theta}_A$, whose trend towards zero is completed only beyond T = 100 in a rather unexpected way, which determines the sudden leap of $\bar{p}_A$ towards 3.0.
The seminal reference [7] also analyses a variant of the original EWL model that considers as initial state $|\psi_i\rangle = \hat{J}|01\rangle$ instead of $|\psi_i\rangle = \hat{J}|00\rangle$. Four pairs of strategies in NE are reported in [7] in the fully entangled model starting from $|01\rangle$. The four NE pairs induce the (3, 2) pay-offs and have in common that $\alpha_A = 0$ and $\alpha_B = \pi/2$. Figure 5 shows that CA simulations tend to induce $\bar{\alpha}_A = 0$ and $\bar{\alpha}_B = \pi/2$ for not low $\gamma$, and select the NE pair with $\theta_A = 3\pi/4$ and $\theta_B = \pi/4$ for high entanglement.
The most distinctive differences of figure 5 compared with figure 3 are the fairly monotonic increase of $\bar{p}_A(\gamma)$ and the lower variation of the five values of $\bar{p}_A$ for low values of $\gamma$.
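The conversion between the classic NE probabilities and the θ parameters used in the discussion of figure 3 is a one-line inversion of x = cos²(θ/2). A minimal sketch, with an illustrative function name:

```python
import math

def theta_from_probability(p):
    """Invert p = cos^2(theta / 2) for theta in [0, pi]."""
    return 2.0 * math.acos(math.sqrt(p))

theta_a_ne = theta_from_probability(0.5)   # charity player, x = 0.5
theta_b_ne = theta_from_probability(0.2)   # beneficiary player, y = 0.2
```

The square root matters here: the probability is the squared cosine of the half angle, so the argument of the arccosine is √x, not x.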
Let us point out here that a non-factorizable $\Pi$ may be generated from independent strategies (x, y) in the classic context, without resorting to any quantum approach. Thus, for example, an ad hoc method based on the external parameter $0 \le k \le 1$ is given in [26], and shown as follows: […] (2.1)
With the joint probabilities generated as in (2.1), the pairs in NE in the SD game are given in (2.2), where the threshold $k^*$ emerges from the $x \le 1$ restraint. The pay-offs of both players in the SD game with strategy pairs in NE under the model (2.1) are plotted in figure 6a, being […] for $k < k^*$. For $k \ge k^*$ the CA simulations produce the exact values $\bar{x} = \bar{y} = 1$, which induce $\pi_{11} = (2k - 1)^2$, $\pi_{12} = \pi_{21} = 0$, $\pi_{22} = 4k(1 - k)$ and consequently the pay-offs of the (A, W) pair, i.e. $p_A = 3(2k - 1)^2$, $p_B = 2(2k - 1)^2$, which show a fairly linear aspect for $k > k^*$, reaching $p_A = 3$, $p_B = 2$ for $k = 1$ (as achieved with $k = 0$).

Quantum noise
Despite the structural proximity of amplitude-damping and phase-damping quantum noise mentioned at the end of §1 (only $K_2$ varies), the pair (Q, Q) is not in NE with phase-damping noise for high values of $\gamma$. Consequently, in QSD-CA simulations with $\mu = 0.5$ phase damping (not shown here), the discontinuity observed with $\mu = 0.5$ amplitude-damping noise in figure 7 when $\gamma$ approaches $\pi/3$ does not emerge in the pay-offs graph. The parameter graph shows in this scenario how the Samaritan player drifts to the Q strategy ($\bar{\theta}_A \to 0$, $\bar{\alpha}_A \to \pi/2$) when $\gamma$ increases beyond $\pi/3$, but the beneficiary player does not accompany the Samaritan in this trend.
Figure 8 deals with the results obtained in the QSD-CA scenario of figure 7 but with full $\mu = 1.0$ noise. Full noise seems to impede the emergence of the (Q, Q) pair, so that the oscillations of both pay-offs not far from (−0.2, 1.5) shown in figures 3 and 7 for low $\gamma$ remain here for higher entanglement.
In figure 8, with high entanglement the $(\theta, \alpha)$ parameters of both players tend to approach their middle levels $(\pi/2, \pi/4)$. In this scenario, it is $\Pi = \frac{1}{4}\begin{pmatrix} 1 - \sin\gamma & 1 \\ 1 & 1 + \sin\gamma \end{pmatrix}$, $p_A = \frac{1}{4}(1 - 3\sin\gamma)$, $p_B = \frac{1}{4}(6 - 2\sin\gamma)$. These pay-offs smoothly decrease as $\gamma$ increases: $p_A$ from 0.25 down to −0.5, $p_B$ from 1.5 down to 1.0. The actual mean pay-offs ($\bar{p}$) at $\gamma = \pi/2$ in figure 8a turn out to be above the expected (−0.5, 1.0). This is due to spatial effects that make it difficult to estimate the actual mean pay-offs from the mean parameters. An example of spatial structure is given in figure 9, where the parameter values (and the pay-offs in consequence) show a kind of maze-like aspect.
Figure 8a also shows the mean-field pay-offs ($p^*$) achieved in a single hypothetical two-person game with players adopting the mean parameters appearing in the spatial dynamic simulation, those given in figure 8b. Namely, […] In figure 8a, the mean-field pay-offs $(p_A^*, p_B^*)$ are marked (brown, green), somehow following the (red, blue) colours of the actual mean pay-offs $(\bar{p}_A, \bar{p}_B)$. As expected, the mean-field approaches fit the $p_A = \frac{1}{4}(1 - 3\sin\gamma)$, $p_B = \frac{1}{4}(6 - 2\sin\gamma)$ equations fairly well at high entanglement. Thus, at maximum $\gamma = \pi/2$, it is $p_A^* \simeq -0.5$, $p_B^* \simeq 1.0$. Although spatial effects arise in figures 3 and 7 before the emergence of the (Q, Q) pair, they are not particularly relevant, which is why the mean-field estimations do not appear in those figures. In the subsequent figures, the same criterion will be applied.
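The step from the full-noise Π at the middle parameter levels to the two pay-off expressions is just equation (1.2) applied entry by entry. A minimal sketch of that arithmetic, with an illustrative function name:

```python
import math

def full_noise_midlevel_payoffs(gamma):
    """Pay-offs from Pi = (1/4) [[1 - sin g, 1], [1, 1 + sin g]] via equation (1.2)."""
    s = math.sin(gamma)
    pi11, pi12, pi21 = (1 - s) / 4, 0.25, 0.25
    p_a = 3 * pi11 - pi12 - pi21        # equals (1 - 3 sin g) / 4
    p_b = 2 * pi11 + 3 * pi12 + pi21    # equals (6 - 2 sin g) / 4
    return p_a, p_b

p_at_zero = full_noise_midlevel_payoffs(0.0)
p_at_max = full_noise_midlevel_payoffs(math.pi / 2)
```

The two end points reproduce the ranges quoted in the text for γ = 0 and γ = π/2.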

Unfair contests
Let us assume the unfair situation in which one type of player is restricted to classical strategies $\tilde{U}(\theta, 0)$, whereas the other type may use quantum $\hat{U}(\theta, \alpha)$ ones [27,28]. Since the dilemma in the SD is basically that of the charity player, the question of whether he can overcome the dilemma when the beneficiary player is restricted to classical strategies will be given priority in this section, although the main results regarding the reversed unfair situation will also be reported.
Figure 10 deals with five noiseless simulations of a quantum $(\theta, \alpha)$-player A (red) versus a classic $\theta$-player B (blue) in a QSD-CA with variable entanglement factor $\gamma$. The structure of the asymptotic mean pay-offs across the lattice ($\bar{p}$) of both players in this unfair scenario resembles that found in the fair scenario of figure 3 for not high $\gamma$, but extended now to all entanglement, without any discontinuity. Thus, rather unexpectedly, despite the fact that the beneficiary player B is restricted to classical strategies, he outscores the charity player A regardless of $\gamma$, as $\bar{p}_B$ oscillates around 1.5, whereas $\bar{p}_A$ oscillates close to zero in most cases. Nevertheless, two simulations with high $\gamma$ show higher values of $\bar{p}_A$, which may grow up to 0.5, somehow resolving the dilemma of player A in the weak sense, with $\theta_A = \pi/2$ in the conventional (non-CA) game [7], or $\bar{\theta}_A \simeq \pi/2$ in CA simulations, as shown in figure 10b.
In simulations in the scenario of figure 10 but with reversed unfairness ($\theta$-player A versus $(\theta, \alpha)$-player B), the structure of the mean pay-offs versus $\gamma$ (not shown here) resembles that shown in figure 10 in the $(\theta, \alpha)$-player A versus $\theta$-player B scenario, although the quantum player B gets pay-offs slightly over 1.5, whereas the pay-offs of the classic player A remain slightly negative. The advantage of the quantum player (B) facing a classic player (A) is foreseeable; what may surprise here is that the quantum player B gains no substantial advantage from his privileged role, getting pay-offs not far from those achieved in the opposite unfair scenario where he is the classic one.

Quantum noise
The simulations in the $\alpha_B = 0$ unfair scenario with $\mu = 0.5$ noise (not shown here) produce roughly the same results as in the noiseless unfair scenario of figure 10, although the pay-offs of the beneficiary player show a notably smaller variation around 1.5, and those of the charity player become positive to a greater extent than in figure 10. Spatial effects are not relevant in figure 10, nor in the ($\alpha_B = 0$, $\mu = 0.5$) simulations, so that the mean-field pay-off estimations fit the actual ones well in these unfair scenarios.
Figure 11 shows the results in the $\alpha_B = 0$ unfair scenario of figure 10, but with $\mu = 1.0$ noise. The results regarding both the pay-offs and the mean parameters shown in figure 11 are qualitatively similar to those achieved in the fair, $\mu = 1.0$ scenario of figure 8. Spatial effects arise with high entanglement, but oddly they seem to affect only the charity player A. Consequently, the parameter and pay-off spatial structures with high entanglement and full noise show a much fuzzier maze-like aspect compared with those in the fair scenario of figure 9.
Figure 12 deals with the case of an unfair game where the beneficiary player is allowed to resort only to the parameter $\alpha$, instead of $\theta$ as in the unfair simulations considered so far. In such a scenario, the (Q, Q) pair emerges in NE regardless of $\gamma$ in the noiseless and $\mu = 0.5$ scenarios, as the beneficiary player cannot resort to strategies such as D (or Loaf), which demand $\theta > 0$. As a result, in the noiseless figure 12a it is $\bar{p}_A = 3$ and $\bar{p}_B = 2$, and with $\mu = 0.5$ noise (figure 12b), the $(\bar{p}_A, \bar{p}_B)$ pay-offs fit the expressions given in the quantum noise subsection of the introductory §1. In the full noise scenario of figure 12c, where […] the $\alpha$ parameters. This is so because $\theta = 0$ makes $\hat{U}$ diagonal and it turns out that […]

Three-parameter strategies
This section deals with the full space of strategies SU(2), operating with three parameters (3P) as given in (4.1), so that a new β parameter is available.
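Since equation (4.1) is not reproduced in this text, the sketch below uses one parametrization of SU(2) that is common in the EWL literature; it is an assumption, not necessarily the exact form adopted in the paper. It reduces to the usual two-parameter operator when β = 0. The function name is illustrative.

```python
import numpy as np

def U3(theta, alpha, beta):
    """Three-parameter SU(2) strategy operator (assumed parametrization)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[np.exp(1j * alpha) * c, np.exp(1j * beta) * s],
                     [-np.exp(-1j * beta) * s, np.exp(-1j * alpha) * c]])

u = U3(1.1, 0.4, 0.9)
is_unitary = np.allclose(u @ u.conj().T, np.eye(2))
determinant = np.linalg.det(u)
```

Unitarity and unit determinant hold for any (θ, α, β) in this form, so the three parameters sweep the whole of SU(2) rather than the 2P subset discussed earlier.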
In the noiseless 3P simulations of figure 13, player B outscores player A throughout the $\gamma$ range, albeit to a lesser degree as $\gamma$ grows. For low $\gamma$, the mean-field pay-off estimations fit the actual mean pay-offs fairly well, particularly for player A, but as $\gamma$ grows strong spatial effects emerge, so that the actual mean pay-offs $\bar{p}$ become fairly stable, in contrast with the variable behaviour of $p^*$, particularly for player A. No particular structure becomes apparent in the parameter patterns in figure 13b, which corresponds to a rather erratic behaviour of the mean-field estimations.

Quantum noise
In three-parameter QSD-CA simulations with $\mu = 0.5$ noise (not shown here), results roughly similar to those in the noiseless 3P scenario of figure 13 are achieved. Nevertheless, the form of the graphs of the actual mean and mean-field pay-offs is altered, largely anticipating the features observed in the 3P simulations with full noise shown in figure 14. This is particularly true of a notable decrease in the erratic behaviour of the mean-field estimations, particularly those of player B.
Figure 14 deals with the results achieved in the scenario of figure 13, but with full noise. The actual mean pay-offs of both players increase monotonically as the entanglement increases: in the case of player A from approximately zero up to approximately 1.5, in the case of player B from approximately 1.5 up to approximately 2.0. Thus, player B outscores player A throughout the $\gamma$ range, much as in figure 13, but with full noise in a crisper manner. In fact, the actual mean pay-offs in figure 14 may be very well fitted by the equations: p A = sin γ , p B = sin γ . At variance with what happens in figure 13, the mean-field pay-off estimations fit the actual mean pay-offs fairly well with full noise, so that such mean-field estimations have not been shown in figure 14. Again at variance with what happens in figure 13, some trends become apparent in the mean parameter graphs with full noise, as the mean parameters of player A and $\bar{\theta}_B$ drift towards $\pi/2$.
The form of the graphs in the noiseless unfair three-parameter simulations shown in figure 15 visibly differs from that of its two-parameter counterpart in figure 10. Thus, as soon as the entanglement takes off, the mean parameters, and the mean pay-offs in consequence, become fairly stable: both $\bar{\theta}$ close to $\pi/2$, and $\bar{\alpha}_A$ and $\bar{\beta}_A$ close to $\pi/4$. Besides, strong spatial effects emerge in figure 15, in contrast with their absence in figure 10. Spatial effects are particularly relevant regarding the charity player A, as the increase of the entanglement supports the increase of his actual mean pay-offs up to over 0.5, whereas his mean-field estimations decrease below −0.5. In simulations (not shown here) in the scenario of figure 15 but with reversed unfairness ($\theta$-player A versus $(\theta, \alpha, \beta)$-player B), the classic charity player A gets negative pay-offs regardless of $\gamma$, but the pay-offs of the beneficiary player B stay not far from 1.5 for all $\gamma$, thus, surprisingly, getting smaller pay-offs than those achieved in the opposite unfair scenario of figure 15, where he is the classic one.
Figure 16 shows the dynamics up to T = 100, together with the parameter and pay-off patterns at T = 100, in a simulation with $\gamma = \pi/2$ in the QSD-CA unfair scenario of figure 15. Figure 16a indicates that both the quantum parameters and the pay-offs quickly reach their fairly permanent values, without a relevant initial transition time. As a result, the actual (and mean-field) values shown at T = 200 in figure 15 do not significantly differ from those reached at T = 100 (and even before) in figure 16. The parameter patterns (and the pay-offs in consequence) in this figure show again, as in figure 9, a kind of maze-like aspect that is at the origin of the notable discrepancy between the actual and mean-field pay-offs, particularly that of the charity player A.
In three-parameter QSD-CA unfair simulations with $\mu = 0.5$ noise (not shown here), although the general form of the graphs of the pay-offs and parameters versus $\gamma$ is similar to that in the noiseless scenario of figure 15, the graphs are altered, anticipating the features observed in the 3P unfair simulations with full noise shown in figure 17. In particular, with $\mu = 0.5$ noise the actual mean pay-offs of both players increase with $\gamma$ to a lesser degree than in the noiseless simulations.
Figure 17b indicates that the just described quantum parameters scenario applies with high entanglement in the simulations of this figure, and in consequence the mean-field pay-off estimations in figure 17a correspond with the theoretical values approximately from $\gamma = \pi/4$. Spatial effects minimally influence the actual mean pay-offs of the beneficiary player B in the CA simulations, whereas they support those of the charity player A, so that he achieves no negative pay-offs.
In [7], two NE are reported in the noiseless scenario with full entangling when considering the initial density matrices $\rho_i = (|00\rangle\langle 00| + |11\rangle\langle 11|)/2$ and $\rho_i = (|01\rangle\langle 01| + |10\rangle\langle 10|)/2$. The strategies in NE in these scenarios have in common the parameters $\theta_A = \theta_B = \pi/2$ and $\alpha_B + \beta_B = \pi/2$; with the first density matrix it is $\alpha_A + \beta_A = \pi$ and with the second one $\alpha_A + \beta_A = 0$. The results of their corresponding CA simulations are not shown here, but they are able to detect such NE in a straightforward manner, i.e. free of spatial effects, reporting in both cases non-negative pay-offs to the Samaritan player. In the particular case of full entangling, it is $\bar{p}_A = 1.0$, $\bar{p}_B = 2.5$ in both scenarios. Thus, the general 3P-SU(2) operators may become a powerful tool when the players share a classically correlated state.

Conclusion
A spatial formulation of the iterated QSD game with arbitrary entangling is studied in this work. The game is played in the cellular automata (CA) manner, i.e. with local and synchronous players' interaction. The evolution is achieved via imitation of the best-paid neighbour of the two players confronted in the SD: the charity and the beneficiary player. The paper considers both the general case of players accessing the full space of quantum strategies, which allows for three parameters (3P), and the particular case of strategies restricted to a subset of them allowing for only two parameters (2P). Although the restriction to two parameters may be criticized, the 2P model is a good and widely used test-bed to show how the quantum approach in game theory may solve dilemmas, by allowing for NE strategies out of the scope of the classic approach as summarized below.
In fair contests (two quantum players facing each other), the CA simulations allow for the emergence of the NE strategies. Thus, with low entanglement the strategy pair in NE in the classic game dominates the scene, preserving the structural SD trend favouring the beneficiary player. But with high entanglement, the so-called Q strategy is adopted by both players in a sharply defined manner, somehow resolving the dilemma, as mutual Q provides the pay-offs of the (charity-Aid, beneficiary-Work) choice.
With quantum noise, the dynamics demands higher entanglement for the emergence of the (Q, Q) pair. In our study (Q, Q) emerges from γ > π/4 in noiseless simulations, and from γ > π/3 with amplitude damping quantum noise at middle level. In the extreme case, full noise impedes the emergence of the (Q, Q) pair. Besides, noise notably alters the pay-offs of mutual Q.
In the unfair quantum versus classic scenario, the beneficiary player outscores the charity player regardless of the degree of entanglement and the degree of noise. This is so even if the beneficiary player is the classic one, although in this case the charity player may achieve relatively high positive pay-offs in noiseless simulations with high entangling.
Mean-field approximations of the actual mean pay-offs fail as a tool for estimation, due to spatial effects, in unfair contests and in simulations with high noise.
Deviations from the canonical cellular automata paradigm adopted here may lead to more realistic models; in particular, structurally dynamic CA, asynchronous updating, spatial dismantling and, last but not least, dynamics with embedded tuneable memory of past pay-offs and parameter values [42].
Data accessibility. The data supporting the graphs in the figures of this article are available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.km170 [43].