Hidden variables, free choice, context-independence and all that

This paper provides a systematic account of the hidden variable models (HVMs) formulated to describe systems of random variables with mutually exclusive contexts. Any such system can be described either by a model with free choice but generally context-dependent mapping of the hidden variables into observable ones, or by a model with context-independent mapping but generally compromised free choice. These two types of HVMs are equivalent: one can always be translated into the other. They are also unfalsifiable, being applicable to all possible systems. These facts, the equivalence and unfalsifiability, imply that freedom of choice and context-independent mapping are no assumptions at all, and they tell us nothing about freedom of choice or physical influences exerted by contexts as these notions would be understood in science and philosophy. The conjunction of these two notions, however, defines a falsifiable HVM that describes non-contextuality when applied to systems with no disturbance or to consistifications of arbitrary systems. This HVM is most adequately captured by the term ‘context-irrelevance’, meaning that no distribution in the model changes with context. This article is part of the theme issue ‘Quantum contextuality, causality and freedom of choice’.


Introduction
Hidden variable models (HVMs) are arguably the main reason why contextuality and its nonlocality version have acquired prominence in the foundations of quantum mechanics (QM). Ever since it was accepted that results of a measurement, such as that of a spin, are almost always random variables (with the exception of repeated sharp measurements), physicists have been interested in the possibility of "explaining" such random variables as deterministic functions of some underlying sources of variability, even if as yet unknown to us, "hidden." This possibility is often presented as a belief famously held by Albert Einstein, and then famously ruled out by Bell's and Kochen-Specker's theorems juxtaposed with QM predictions [1,2].
However, even before any detailed analysis, there is a good reason to doubt that HVMs can play an explanatory role. The reason is that the existence of a random variable of which several jointly distributed random variables are deterministic functions is ensured trivially: the properties of being jointly distributed and being functions of a single random variable are one and the same property. Conversely, variables that are not jointly distributed, as they are predicated on mutually exclusive conditions, cannot be functions of a single random variable. This means that one must have as many hidden variables as there are mutually exclusive contexts, even if they all have the same distribution. This is not to say that HVMs cannot be meaningfully constructed and interpreted. This only means that one should be careful not to attach deep physical or otherwise substantive connotations to purely mathematical and universally satisfiable representations. This is a point elaborated throughout the paper.
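The identity between "jointly distributed" and "functions of a single random variable" can be illustrated with a minimal sketch. The joint probabilities below are hypothetical, and the helper name `pushforward` is introduced only for this illustration: the hidden variable is taken to be the pair of outcomes itself, and the two observables are its coordinate projections.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(hidden_pmf, functions):
    """Joint pmf of (f1(lam), ..., fn(lam)) induced by the pmf of one hidden variable."""
    joint = defaultdict(Fraction)  # Fraction() is zero
    for lam, p in hidden_pmf.items():
        joint[tuple(f(lam) for f in functions)] += p
    return dict(joint)


# Hypothetical joint pmf of two binary observables recorded in one context.
joint_ab = {
    (0, 0): Fraction(1, 2), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 4),
}

# Trivial hidden variable: Lambda is the pair itself; the observables are
# deterministic functions (coordinate projections) of Lambda.
hidden_pmf = dict(joint_ab)
recovered = pushforward(hidden_pmf, [lambda lam: lam[0], lambda lam: lam[1]])
```

The construction works for any joint pmf, which is why the mere existence of such a hidden variable carries no explanatory weight.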
In this paper I will synthesize some of my recent published work to provide a comprehensive and rigorous account of HVMs. The most restrictive HVM, one introduced by Bell and describing noncontextual systems with no disturbance, is known not to hold for many systems of random variables. When this happens, the constraints imposed on an HVM have to be relaxed, and this can be done in two ways: either by allowing for a dependence of the measurement outcome distributions on contexts or by allowing for an interdependence between the hidden variables and the choices of settings for the measurements. In Ref. [3] I proved the equivalence of these two options. In this paper I present an improved and more rigorous proof. I will argue that such assumptions as freedom of choice and context-independent mapping (of hidden variables into observable ones) are merely metaphorical depictions of some basic representations of jointly distributed random variables. Next, I discuss the problem of separating disturbance (or signaling) from contextuality in the situations in which Bell's HVM does not hold. While this is the central issue for the Contextuality-by-Default (CbD) theory [4-7], the difference between disturbance and contextuality is not apparent in the formulations of the HVMs. However, one can effectively separate disturbance from contextuality by using the consistified systems introduced in Refs. [8,9]. Any system of random variables can be reformulated as an equivalent, in a well-defined sense, system that has no disturbance (is consistently connected, in the CbD terminology). The equivalence of the HVMs with context-dependent mapping and the HVMs with violations of free choice holds for these consistified systems too, but now any such HVM indicates pure contextuality. At the conclusion of the paper I will discuss two assumptions that one could suspect to be required for the development presented, and show that, once again, they are not assumptions at all, because they are trivially satisfied in the language of random variables.

Conceptual and terminological set-up
A system of random variables is a double-indexed set

$$\mathcal{R} = \{R^c_q : c \in C,\ q \in Q,\ q \prec c\}, \tag{1}$$

where $Q$ is a set of contents, $C$ is a set of contexts, and $q \prec c$ means that content $q$ is measured in context $c$. A content $q$ in $R^c_q$ can be viewed as a question that the random variable $R^c_q$ answers (e.g., "is the spin along axis $q$ up?", answered "yes/no") or as a choice of measurements ("spin along axis $q$") whose outcomes ("up/down") are represented by $R^c_q$. The context $c$ in $R^c_q$ indicates conditions under which $R^c_q$ is recorded, such as the set of all other measurements made together with $R^c_q$ and the spatial and temporal relations among them. The matrix below provides an example of a system of random variables:

[matrix (2): the example system $\mathcal{R}_0$]

The subsystem of all random variables within a given context $c$ is called a bunch (of random variables),

$$R^c = \{R^c_q : q \in Q_c\},$$

where

$$Q_c = \{q \in Q : q \prec c\}.$$

For instance, the bunch $R^2$ in the system $\mathcal{R}_0$ is $\{R^2_3, R^2_4\}$. Any bunch $R^c$ is a random variable, which means that all components $R^c_q$ of $R^c$ are jointly distributed (are measurable functions on the same probability space). However, no two random variables from different bunches have a joint distribution: they are stochastically unrelated (are measurable functions on distinct probability spaces). Indeed, consider what a joint distribution of $R^c_q$ and $R^{c'}_{q'}$ with $c \neq c'$ could look like ($X$ and $Y$ being any measurable sets):

$$\begin{array}{c|cc} & R^{c'}_{q'} \in Y & R^{c'}_{q'} \notin Y \\ \hline R^c_q \in X & ? & ? \\ R^c_q \notin X & ? & ? \end{array}$$

The question marks cannot be all replaced with zeros because they must sum to one. At the same time, any nonzero joint probability would indicate that $R^c_q$ and $R^{c'}_{q'}$ co-occur, which would contradict the fact that $c$ and $c'$ are mutually exclusive contexts.

Hidden variable models

(Excessively) general HVM
Let us begin with the most general possible HVM, denoted HVM$_{\mathrm{Gen}}$:

$$R^c = \alpha(Q_c, c, \Lambda^c(c)).$$

The function $\alpha$ returns as its value an indexed set, and the dependence of $\alpha$ on $Q_c$ should be understood as its indexing, matching the indexing of $R^c$. Thus, for system $\mathcal{R}_0$ in (2), $Q_2 = \{3, 4\}$, and the HVM$_{\mathrm{Gen}}$ representation for $R^2$ is

$$R^2_3 = \mathrm{Proj}_3\,\alpha(\{3,4\}, 2, \Lambda^2(2)), \qquad R^2_4 = \mathrm{Proj}_4\,\alpha(\{3,4\}, 2, \Lambda^2(2)),$$

where $\mathrm{Proj}_q V$ stands for the $q$-indexed component of the indexed set $V$.

One can present this model graphically as

$$c \longrightarrow \Lambda^c \longrightarrow R^c, \qquad c \longrightarrow R^c.$$

The arrows $a \to b$ in this and subsequent diagrams (where $b$ is a random variable and $a$ is a random variable or a parameter) should be read as "different values of $a$ may result in different distributions of $b$." HVM$_{\mathrm{Gen}}$ is not a falsifiable model: it can be applied to any system of random variables. This can be demonstrated by simply putting $\Lambda^c(c) = R^c$, with the stand-alone $c$ in $\alpha$ becoming a dummy argument, and $Q_c$ extracted from $\Lambda^c(c)$ as its indexing set.
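The trivial construction in the last two sentences can be sketched in a few lines. The two-context system below is hypothetical (it is not the system in (2)), and the names `system`, `alpha`, and `model_bunch_pmf` are illustrative assumptions. Content `q2` is deliberately given different distributions in the two contexts, to show that the construction works even with disturbance.

```python
from fractions import Fraction as F

# Hypothetical system: context -> (contents measured, pmf of the bunch).
system = {
    "c1": (("q1", "q2"), {(0, 0): F(1, 2), (1, 1): F(1, 2)}),
    "c2": (("q2", "q3"), {(0, 1): F(1, 4), (1, 0): F(3, 4)}),
}


def alpha(contents, c, lam):
    """Trivial map: c is a dummy argument; lam already holds the bunch value."""
    return dict(zip(contents, lam))


def model_bunch_pmf(c):
    """Distribution of the bunch predicted by the trivial model in context c."""
    contents, bunch_pmf = system[c]
    hidden_pmf = dict(bunch_pmf)  # Lambda^c(c) := R^c
    out = {}
    for lam, p in hidden_pmf.items():
        values = alpha(contents, c, lam)
        key = tuple(values[q] for q in contents)
        out[key] = out.get(key, F(0)) + p
    return out
```

Because the hidden variable is simply a copy of the bunch, the model reproduces every bunch exactly, whatever the system is.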

Context-independent mapping without free choice
The argument just presented shows that the direct dependence of the distribution of $R^c$ on $c$ can be eliminated:

$$R^c = \beta(Q_c, \Lambda^c(c)), \tag{9}$$

or graphically,

$$c \longrightarrow \Lambda^c \longrightarrow R^c.$$

Although not obvious at first glance, this HVM would traditionally be interpreted as a model with a context-independent mapping of $\Lambda^c(c)$ into $R^c$ (no arrow from $c$ to $R^c$) but with generally compromised freedom of choice (the distribution of $\Lambda^c$ may depend on $c$).
I will denote this model HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$, using the self-evident abbreviations. We have established that HVM$_{\mathrm{Gen}}$ can always be reduced to HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$. Using again as an example system $\mathcal{R}_0$ in (2), the HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ representation for $R^2$ is

$$R^2_3 = \mathrm{Proj}_3\,\beta(\{3,4\}, \Lambda^2(2)), \qquad R^2_4 = \mathrm{Proj}_4\,\beta(\{3,4\}, \Lambda^2(2)).$$

Freedom of choice in the QM literature is usually discussed in terms of the relationship between one's choice of $c$ and the hidden variable $\Lambda^c(c)$. This means that $c$ is treated as a random variable (which is a dubious viewpoint, see Ref. [3]), and freedom of choice means that $c$ and $\Lambda^c$ are stochastically independent. In my representation of HVMs, $c$ is always a deterministic parameter, which, with respect to the traditional view, simply means that all random variables in the model are conditioned on fixed values of $c$. Any restriction of freedom of choice in the traditional sense then translates into a dependence of the distribution of $\Lambda^c$ on $c$. As a special case, this also applies to the possibility that $c$ is a function of the hidden variable, $c = f(\Lambda^c)$, which may possibly be interpreted as a depiction of superdeterminism: in an HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$, one simply replaces this function with $\Lambda^c(c)$, defined by $f(\Lambda^c(c)) = c$.

Free choice without context-independent mapping
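It is further possible to transform HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ into a model that is, in a sense, its reverse. Given (9), one can form an arbitrary coupling $\Gamma$ of $\Lambda^c$ for all contexts $c$, and then create, for every $c \in C$, a distributional copy $\Gamma^c$ of $\Gamma$, so that these copies are pairwise stochastically unrelated. Then

$$R^c = \beta(Q_c, \mathrm{Proj}_c\,\Gamma^c)$$

and

$$R^c = \gamma(Q_c, c, \Gamma^c), \tag{14}$$

where the variables $\Gamma^c$ (as indicated by the lack of $c$ as their argument) have one and the same distribution for all $c \in C$. Note that we cannot eliminate the index $c$ in $\Gamma^c$: a single $\Gamma$ shared by all contexts, as in $R^c = \gamma(Q_c, c, \Gamma)$, would make all $R^c$ jointly distributed. The traditional interpretation of the HVM described by (14) would be that the freedom of choice is not compromised here, but context-independence is generally violated. Using our graphical representation,

$$c \longrightarrow R^c \longleftarrow \Gamma^c.$$

I will denote this model HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$. We have established that HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ implies (can be translated into) HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$. Using our example of system $\mathcal{R}_0$ in (2), the HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ representation for $R^2$ is

$$R^2_3 = \mathrm{Proj}_3\,\gamma(\{3,4\}, 2, \Gamma^2), \qquad R^2_4 = \mathrm{Proj}_4\,\gamma(\{3,4\}, 2, \Gamma^2).$$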
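The translation just described can be sketched as follows. The text allows an arbitrary coupling; the sketch below uses the independent (product) coupling as one convenient choice, which is an assumption of the illustration rather than a requirement. The hidden distributions and the names `independent_coupling` and `marginal` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical context-dependent hidden distributions (free choice compromised).
hidden = {
    "c1": {"a": F(1, 2), "b": F(1, 2)},
    "c2": {"a": F(1, 4), "b": F(3, 4)},
}


def independent_coupling(hidden):
    """One possible coupling Gamma: a product pmf over tuples with one
    component per context."""
    contexts = sorted(hidden)
    gamma = {}
    for combo in product(*(hidden[c].items() for c in contexts)):
        prob = F(1)
        for _, p in combo:
            prob *= p
        gamma[tuple(v for v, _ in combo)] = prob
    return contexts, gamma


def marginal(contexts, gamma, c):
    """Distribution of Proj_c applied to the coupling."""
    i = contexts.index(c)
    out = {}
    for values, p in gamma.items():
        out[values[i]] = out.get(values[i], F(0)) + p
    return out


contexts, gamma = independent_coupling(hidden)
```

The same pmf `gamma` serves every context (free choice), while the context now enters through which component is projected out (context-dependent mapping).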

Free choice with context-independent mapping
Both HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ and HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ can be viewed as deviations from their special case

$$R^c = \delta(Q_c, \Gamma^c), \tag{17}$$

or, graphically,

$$\Gamma^c \longrightarrow R^c,$$

where the random variables $\Gamma^c$ for all $c \in C$ are identically distributed and pairwise stochastically unrelated. This model can be denoted HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$, as it satisfies both freedom of choice and context-independence in the mapping of $\Gamma^c$ into $R^c$. In our example of system $\mathcal{R}_0$ in (2), the HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ representation for $R^2$ is

$$R^2_3 = \mathrm{Proj}_3\,\delta(\{3,4\}, \Gamma^2), \qquad R^2_4 = \mathrm{Proj}_4\,\delta(\{3,4\}, \Gamma^2).$$

Unlike the previous two HVMs, this one is a true model, as it is falsifiable. The latter is demonstrated, for instance, by relating predictions of QM to the Bell-type [1] and Kochen-Specker-type theorems (in addition to the original Refs. [1,2] see, e.g., Refs. [10-12]). The Bell-type theorems establish necessary and sufficient conditions for a system of random variables to be described by HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$, which can then be shown to fail for some QM systems. In the Kochen-Specker-type theorems one constructs systems of random variables in accordance with QM, and then demonstrates that they cannot be described by HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$.

Equivalence theorem and its consequences
Combining the implications in Subsections 3.2 and 3.3, and observing that HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ is a special case of HVM$_{\mathrm{Gen}}$, we obtain the following statement.
Theorem 3.1. The models HVM$_{\mathrm{Gen}}$, HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$, and HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ are pairwise equivalent:

$$\mathrm{HVM}_{\mathrm{Gen}} \iff \mathrm{HVM}^{-\mathrm{FC}}_{+\mathrm{CIM}} \iff \mathrm{HVM}^{+\mathrm{FC}}_{-\mathrm{CIM}}.$$

Let us consider two consequences of this theorem. One of them is that when HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ is not applicable to a system, one can arbitrarily choose between describing the system in the language of HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ or in the language of HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$. In particular, one can always use one and the same measure for the degree of deviation of these two HVMs from HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$: [...] A special case of this corollary, for a particular system of random variables, is presented in Ref. [13].
The second consequence of the theorem is that HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ and HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ are both unfalsifiable: either of them can describe any system of random variables. This follows from the demonstration, at the end of Section 3.1, that HVM$_{\mathrm{Gen}}$ is unfalsifiable, in fact, even in the form of HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$. This, in combination with the inter-translatability of HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ and HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$, should make one skeptical about interpreting the dependence of the distribution of $\Lambda^c$ on $c$ in terms of "freedom of choice," in any substantive meaning of these words, and interpreting an arrow from $c$ to $R^c$ as a physical influence exerted by the context. Their complete equivalence and empirical emptiness (universal applicability) suggest the view that HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ and HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ are purely mathematical descriptions of the joint distributions within bunches of random variables and of the differences between them.
This view does not change if one constrains or even completely specifies all distributions and functions in the formulation of HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ or HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$, making them thereby predictive and falsifiable. The inter-translatability of the two types of models holds irrespective of their falsifiability. Moreover, a completely specified HVM can always be thought of as a corresponding unconstrained HVM after it has been applied to the system predicted by the completely specified HVM. Clearly, the ontological interpretation of a model (say, HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$) does not depend on whether it has been applied to a particular system of random variables, because this does not change the facts that (A) it could have been applied to any other system, and (B) it can be translated into an HVM of a completely different nature (in this case, HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$).

This is not to say that the notions of freedom of choice and context-(in)dependent mapping may not be assigned substantive meanings and be propitiously used in physical or other scientific theories. One should, however, distinguish HVMs per se from scientific theories that predict specific systems of random variables and therefore HVM representations thereof. My only point here is that these substantive meanings belong to the parts of theories extraneous to the HVMs the theories lead to. In other words, these meanings cannot be derived from the HVMs themselves, from the fact that a system can be described by HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$ or HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ (or even HVM$_{\mathrm{Gen}}$, combining the two), because any system can, and by any of them. The language of HVMs as understood in this paper (and in most discussions of the HVMs in the foundations of physics, beginning with Bell's work) is simply too crude to capture certain substantive notions and distinctions. (We will see below that it is sometimes too crude even to depict the difference between much more clear-cut notions of contextuality and signaling.)

A simple analogy may help to understand this. Any real-valued random variable $R$ can be generated by applying an appropriate transformation $f$ to a variable $U$ uniformly distributed between 0 and 1. As one observes values of $R$, it is possible that there is a computer program that de facto computes them by first generating values of $U$ and then applying to them the function $f$. If this is known from some extraneous source of knowledge, then we have a valid naturalistic interpretation of the model $R = f(U)$, which then acquires a privileged status over other representations of $R$ (such as $R = g(E)$, for an exponentially distributed $E$). However, such an interpretation cannot be derived from the fact that $R$ is representable as $f(U)$, because this representation is mathematically guaranteed, and moreover, is not unique (referring, e.g., to the same $R = g(E)$).
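The analogy can be made concrete with a minimal sketch. The choice of a standard logistic $R$ is an assumption made here purely for illustration, as are the function names: `f` is the quantile function applied to a uniform $U$, and `g` produces the same $R$ from a unit-rate exponential $E$, using the fact that $1 - e^{-E}$ is uniformly distributed.

```python
import math


def logistic_cdf(r):
    """CDF of the standard logistic distribution (the assumed target R)."""
    return 1.0 / (1.0 + math.exp(-r))


def f(u):
    """R = f(U): quantile function of the logistic law, U uniform on (0, 1)."""
    return math.log(u / (1.0 - u))


def g(e):
    """R = g(E): the same R from a unit-rate exponential E, since
    1 - exp(-E) is uniform and f(1 - exp(-e)) = log(exp(e) - 1)."""
    return math.log(math.expm1(e))
```

Both representations yield the same distribution of $R$, so observing $R$ alone cannot single out either of them as the "true" mechanism.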
The terms "freedom of choice" and "context-(in)dependent mapping" may still be conveniently used as labels for HVM components, provided one does not impute to them their colloquial, physical, or philosophical connotations. Moreover, the conjunction of these two notions does have a substantive meaning, because HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ is a falsifiable model which de facto does not apply to some QM systems of random variables. In Ref. [3] I argued that the notions in question should only be used in conjunction: "one cannot accept local causality without free choice, because denying free choice is equivalent to denying local causality" (local causality being the specific form of context-independent mapping used by Bell in the discussion published in Ref. [14]). While the present paper only strengthens this assertion, I would like to add here that one can very well decide to abandon the terms "freedom of choice" and "context-(in)dependent mapping" altogether, and use instead a simpler way to characterize HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$. Namely, this is the model in which context $c$ is irrelevant for determining any distributions involved (which includes the distribution of the hidden variable $\Lambda^c$ and the distribution of the observable bunch $R^c$). Therefore, HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ can be referred to as the model satisfying the assumption of context-irrelevance.

Contextuality in consistently connected systems
We have managed so far to discuss HVMs without involving the notion of (non)contextuality. It is now time to involve it. The traditional definition of this notion simply coincides with that of HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$: a system of random variables is noncontextual (or, for distributed systems, local) if it is described by this HVM; and a system that cannot be so described is contextual. One consequence of this definition is that a noncontextual system must be consistently connected. The latter is a CbD term for what is usually called in QM non-disturbance or non-signaling: in a consistently connected system, any two random variables sharing a content, $R^c_q$ and $R^{c'}_q$, have the same distribution. Inconsistent connectedness (disturbance, signaling) therefore makes a system contextual. This definition makes the class of contextual systems too large and heterogeneous, and CbD offers a more analytic approach, presented in the next section. For now, however, let us confine consideration to consistently connected systems.

The main consequence of $\mathcal{R}$ being described by HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ is as follows. With reference to (17), construct the random variable $S$ defined by

$$S = \{S_q : q \in Q\} = \delta(Q, \Gamma), \tag{23}$$

where $\Gamma$ has the same distribution as $\Gamma^c$ in (17). The variable $S$ is called a reduced coupling of the system $\mathcal{R}$ [15]. Its (jointly distributed) elements are indexed by the elements of $Q$, and for any $c \in C$, we have

$$\{S_q : q \in Q_c\} \overset{d}{=} R^c, \tag{24}$$

where $\overset{d}{=}$ indicates equality of distributions. Thus, for our system $\mathcal{R}_0$ in (2), the reduced coupling has the form $S = \{S_1, S_2, S_3, S_4\}$, and the condition (24) means that in the matrix

[matrix (25): the components of $S$ arranged in the layout of (2)]

the rows are distributed as the corresponding rows in (2).
It is clear that the implication HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ $\Rightarrow S$ can be reversed, whence we have the following criterion: system $\mathcal{R}$ is described by HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ if and only if it has a reduced coupling (23) subject to (24). For some simple systems this has been semi-formally derived as the "joint distribution criterion" by Fine [12], based on the idea of Suppes and Zanotti [16]. Note that the use of the language of random variables makes this criterion obtain essentially automatically.
For these and other simple systems (notably for the important class of so-called cyclic systems [17]) other criteria have been derived, primarily in the form of inequalities involving expected values of the products of the random variables within different bunches. These additional criteria should be viewed as mere shortcuts, because in all cases when they are available and in many cases when they are not, the existence or non-existence of a reduced coupling (17)-(24) can be established directly, by means of linear programming. This is a good place to note that some authors, having correctly observed that Bell-type inequalities require a system of jointly distributed variables, as in (25), and having also correctly observed that in a system of observable probabilities different bunches are not jointly distributed, have then erroneously concluded that the Bell-type theorems were wrong [18-20]. In fact, the only problem with these theorems, from the earliest ones in the 1960s all the way to the present, is that they are usually proved less than rigorously, with unacknowledged abuse of notation. When viewed as theorems about reduced couplings, their proofs are correct. The corrected proofs do not require that different bunches be jointly distributed. They only require that a system can be described by HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$, the model that does preserve stochastic unrelatedness of different bunches.
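The linear-programming check mentioned above can be sketched as a feasibility problem. This is an illustration under stated assumptions, not the procedure of any cited reference: all variables are taken to be binary, NumPy and SciPy are assumed to be available, and the function name `has_reduced_coupling` and the two example systems (a uniform cyclic system of rank 4 and a PR-box-like one) are hypothetical.

```python
from itertools import product

import numpy as np
from scipy.optimize import linprog


def has_reduced_coupling(contexts, bunch_pmfs, n_contents):
    """Feasibility LP: is there a pmf over all joint assignments of binary
    values to the contents whose context-wise marginals match the bunches?"""
    atoms = list(product((0, 1), repeat=n_contents))
    rows, rhs = [], []
    for ctx, pmf in zip(contexts, bunch_pmfs):
        for values in product((0, 1), repeat=len(ctx)):
            # Indicator of the atoms that agree with `values` on this context.
            rows.append([float(all(atom[q] == v for q, v in zip(ctx, values)))
                         for atom in atoms])
            rhs.append(pmf.get(values, 0.0))
    result = linprog(c=np.zeros(len(atoms)), A_eq=np.array(rows),
                     b_eq=np.array(rhs), bounds=(0, 1))
    return bool(result.success)


# Cyclic system of rank 4: each context contains two adjacent contents.
cyclic4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
uniform = [{v: 0.25 for v in product((0, 1), repeat=2)}] * 4
pr_box = [{(0, 0): 0.5, (1, 1): 0.5}] * 3 + [{(0, 1): 0.5, (1, 0): 0.5}]
```

With these inputs, `has_reduced_coupling(cyclic4, uniform, 4)` is feasible, whereas the PR-box-like system is infeasible: three perfect correlations force all four components to be equal, contradicting the perfect anticorrelation in the fourth context.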

Contextuality in inconsistently connected systems
CbD offers a generalized notion of (non)contextuality, one that applies to all systems of random variables, including inconsistently connected ones (those with disturbance, or signaling). Given a system $\mathcal{R}$ in (1), its (complete) coupling is defined as a random variable

$$S = \{S^c_q : c \in C,\ q \in Q,\ q \prec c\},$$

where

$$S^c = \{S^c_q : q \in Q_c\} \overset{d}{=} R^c \quad \text{for every } c \in C.$$

Note that calling $S$ a random variable implies that, unlike in the system $\mathcal{R}$, all components of $S$ are jointly distributed. A system $\mathcal{R}$ is noncontextual if it has [...]

For any context $\pi = (\cdot, c)$, the bunch in this context is defined as

$$R^{(\cdot,c)} = \{R^{(\cdot,c)}_{(q,c)} : q \in Q_c\} \overset{d}{=} R^c.$$

To define the bunch for a context $\pi = (q, \cdot)$, we need an auxiliary notion. For a given $q \in Q$, define a random variable

$$T_q = \{T^c_q : c \in C,\ q \prec c\}$$

such that $T^c_q \overset{d}{=} R^c_q$ for every component and, for any two components $T^c_q$, $T^{c'}_q$ in $T_q$, the probability

$$\Pr\left[T^c_q = T^{c'}_q\right] \tag{39}$$

is maximal possible. Let us assume, for simplicity, that such $T_q$ exists and is unique for all $q \in Q$. Then, for any context $\pi = (q, \cdot)$, the bunch in this context is defined as

$$R^{(q,\cdot)} = \{R^{(q,\cdot)}_{(q,c)} : c \in C,\ q \prec c\} \overset{d}{=} T_q.$$

This completes the construction of $\mathcal{R}^\dagger$. Clearly, a consistified system is (strongly) consistently connected: for any $\xi = (q, c)$ it contains two distributional copies of $R^c_q$, in the contexts $(\cdot, c)$ and $(q, \cdot)$. It should also be clear, by comparing the CbD definition of (non)contextuality with the traditional definition applied to the consistified equivalent of a system, that the system and its equivalent are always contextual or noncontextual together. For a more rigorous argument, see Ref. [8].
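For the simplest case of a content measured in exactly two contexts, the maximal probability in (39) and a coupling attaining it can be computed directly. The sketch below relies on the standard fact that the maximal probability of equality for two discrete distributions equals the sum of the pointwise minima of their probabilities; the function name and the example distributions are hypothetical, and the case of more than two contexts is not covered.

```python
from fractions import Fraction as F


def maximal_coupling(p, q):
    """Joint pmf with marginals p and q that maximizes Pr[T = T'];
    also returns that maximal probability (sum of pointwise minima)."""
    support = sorted(set(p) | set(q))
    overlap = {x: min(p.get(x, F(0)), q.get(x, F(0))) for x in support}
    agree = sum(overlap.values(), F(0))
    joint = {(x, x): overlap[x] for x in support if overlap[x] > 0}
    rest = 1 - agree
    if rest > 0:
        # Spread the leftover mass off the diagonal, in proportion to the
        # residuals of the two marginals.
        for x in support:
            for y in support:
                mass = (p.get(x, F(0)) - overlap[x]) * (q.get(y, F(0)) - overlap[y]) / rest
                if mass > 0:
                    joint[(x, y)] = joint.get((x, y), F(0)) + mass
    return joint, agree


# Hypothetical distributions of one content in two contexts (with disturbance).
p = {0: F(1, 2), 1: F(1, 2)}
q = {0: F(1, 4), 1: F(3, 4)}
joint, agree = maximal_coupling(p, q)
```

The residual mass never lands on the diagonal, because a value cannot have a positive residual in both marginals at once, so `agree` is exactly the probability of equality in the resulting coupling.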
For our example $\mathcal{R}_0$ in (2), the consistified equivalent is (omitting commas and brackets to save space)

[matrix (41): the consistified system $\mathcal{R}_0^\dagger$]

where the bunches in the first three rows are distributional copies of the corresponding rows in $\mathcal{R}_0$, the distributions of the two variables in each column are identical, and in each of the last four rows the probability of the pairwise equality of its elements is maximal possible.

Equivalence theorem for consistified systems
The main reason why the notion of a consistified system is useful is the fact that the inconsistent connectedness of a system $\mathcal{R}$ is eliminated in $\mathcal{R}^\dagger$ (more precisely, translated into the structure of its $(q, \cdot)$-bunches) while its contextuality status is preserved. One can ascertain therefore whether $\mathcal{R}^\dagger$ is describable by HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ as one can with any other consistently connected system. If it is not, then $\mathcal{R}^\dagger$ should be described by either of HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ and HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$, and this time there can be no confusion as to whether they depict inconsistent connectedness or contextuality: it is definitely the latter. However, the applicability of and deviations from HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ acquire a specific form in the case of consistified systems.
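It should be clear from the construction of $\mathcal{R}^\dagger$ that the indexing sets $Q^\dagger_{(\cdot,c)}$ of different $(\cdot, c)$-bunches are disjoint, and that the union of these indexing sets is the entire $Q^\dagger$ (consisting of all $\xi = (q, c)$ such that $q \prec c$). This means that we can use the same function to represent all $(\cdot, c)$-bunches,

$$R^{(\cdot,c)} = \tilde{u}\left(Q^\dagger_{(\cdot,c)}, X^{(\cdot,c)}(c)\right),$$

where $X^{(\cdot,c)}(c)$ for different $c \in C$ is a set of stochastically unrelated random variables whose distributions may vary with $c$. By forming an arbitrary coupling $X$ of $X^{(\cdot,c)}(c)$ for all $c \in C$, we can rewrite this as

$$R^{(\cdot,c)} = \tilde{u}\left(Q^\dagger_{(\cdot,c)}, \mathrm{Proj}_c\, X^{(\cdot,c)}\right),$$

where $X^{(\cdot,c)}$ are stochastically unrelated distributional copies of $X$. Since $Q^\dagger_{(\cdot,c)}$ uniquely determines $(\cdot, c)$, the function can be rewritten as

$$R^{(\cdot,c)} = u\left(Q^\dagger_{(\cdot,c)}, X^{(\cdot,c)}\right).$$

By the same argument, for all $(q, \cdot)$-bunches we have

$$R^{(q,\cdot)} = v\left(Q^\dagger_{(q,\cdot)}, X^{(q,\cdot)}\right).$$

The last two formulas represent the HVM$_{\mathrm{Gen}}$ for consistified systems.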
It can be easily shown that one can simplify this HVM by either making the two functions $u$ and $v$ one and the same function or making all $X^{(\cdot,c)}$ and $X^{(q,\cdot)}$ variables identically distributed. For the latter option, create an arbitrary [...] Using again our example (41), the HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ representation for R [...]

Hidden assumptions about hidden variables?
The literature on hidden variable models and contextuality contains many attempts to explicate various assumptions underlying HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$. We have seen that freedom of choice and context-independent mapping, taken separately, are not assumptions, as they are universally satisfiable. We have also seen that their conjunction is restrictive, but that it is conceptually simpler to replace it with a single assumption, one that I dubbed context-irrelevance. I will now briefly discuss two additional propositions that are sometimes presented as assumptions.
Outcome determinism is the assumption that hidden variables and parameters of the situation (contents and contexts) uniquely determine the observable outcomes. Some researchers find this assumption challengeable [21]. Did we not tacitly introduce this assumption somewhere in the course of the development above? The answer is no: once one consistently describes HVMs in the language of random variables, rather than events and their probabilities, outcome determinism is satisfied automatically. Unless one imposes constraints on the possible distributions of $\Lambda^c(c)$, either of the two unfalsifiable HVMs we have discussed, say, HVM$^{-\mathrm{FC}}_{+\mathrm{CIM}}$, can be constructed for any system of random variables. The very fact that the components of $R^c$ are jointly distributed means that there is a random variable of which all these components are measurable functions. This yields the representation (9).
Factorizability is another assumption that is often presented as central for HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ [22]. Its meaning is that, using HVM$^{+\mathrm{FC}}_{-\mathrm{CIM}}$ for definiteness,

$$\Pr\left[R^c = G \mid \Gamma^c = g\right] = \prod_{q \in Q_c} \Pr\left[R^c_q = G_q \mid \Gamma^c = g\right],$$

where $G$ is a set of values indexed by $Q_c$, and $g$ is a specific value of $\Gamma^c$. Did we not have to use this assumption? Within our conceptual framework, we did not. Once outcome determinism is accepted as trivially satisfied, factorizability has to be accepted too. Indeed, all probabilities in this equation are equal to 0 or 1, and the left-hand side probability is 1 if and only if all the right-hand side probabilities are 1.
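The argument in the last sentence can be checked mechanically for a toy deterministic map. Everything in the sketch below is hypothetical: the map `gamma_map`, the two contents `q1` and `q2`, and the ranges of the context and of the hidden value are chosen only to make the check finite.

```python
from itertools import product


def gamma_map(c, g):
    """Hypothetical deterministic map from a hidden value g to the bunch in
    context c (a context-dependent mapping)."""
    return {"q1": g % 2, "q2": (g // 2 + c) % 2}


def prob_bunch(c, g, G):
    """Pr[R^c = G | Gamma^c = g]: 0 or 1 under outcome determinism."""
    return 1 if gamma_map(c, g) == G else 0


def prob_component(c, g, q, value):
    """Pr[R^c_q = value | Gamma^c = g]: also 0 or 1."""
    return 1 if gamma_map(c, g)[q] == value else 0


def factorizes(c, g):
    """Check the factorization for every possible value G of the bunch."""
    for v1, v2 in product((0, 1), repeat=2):
        lhs = prob_bunch(c, g, {"q1": v1, "q2": v2})
        rhs = prob_component(c, g, "q1", v1) * prob_component(c, g, "q2", v2)
        if lhs != rhs:
            return False
    return True
```

The check succeeds for every context and hidden value because a product of zeros and ones equals one exactly when every factor equals one, which is the componentwise form of the equality on the left-hand side.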

Conclusion
Let us summarize.
1. The propositions that are usually presented as the assumption of free choice and the assumption of context-independent mapping in constructing HVMs, when taken separately, are not in fact assumptions. Rather they are two inter-translatable and universally satisfiable ways of describing joint distributions of random variables in a system. Because of their equivalence and their substantive emptiness these notions are mere technical labels in HVMs: one should not take them as saying anything about freedom of choice or physical influences exerted by contexts in the sense in which these notions would be discussed in science or philosophy.
2. The conjunction of free choice and context-independent mapping is a falsifiable (and de facto inapplicable to some systems) model. However, rather than being a conjunction of two assumptions (as they were viewed, e.g., in the historic discussion [14]), it is a single assumption in precisely the same sense in which a single sentence can consist of two parts neither of which is a sentence. One can avoid using the terminology of free choice and context-independent mapping altogether, even as technical labels, by interpreting HVM$^{+\mathrm{FC}}_{+\mathrm{CIM}}$ as an HVM with context-irrelevance: no distributions in this model may depend on context.
3. The positions just formulated are obtained almost automatically if one systematically and carefully uses the language of random variables in discussing HVMs. This also allows one to avoid the necessity of certain additional assumptions, such as outcome determinism and factorizability. To utilize the advantages of this language one has to pay meticulous attention to the distinction between jointly distributed variables and stochastically unrelated ones. "Hidden variables" are nothing more than a tool for representing jointly distributed variables as measurable functions defined on the same probability space, which is true essentially by definition. The variables from different contexts, however, cannot be presented as functions of a single source of randomness, even in the HVMs with context-irrelevance: the "hidden variables" in these models must still be indexed by contexts.