Density of convex intersections and applications

In this paper, we address density properties of intersections of convex sets in several function spaces. Using the concept of Γ-convergence, it is shown in a general framework, how these density issues naturally arise from the regularization, discretization or dualization of constrained optimization problems and from perturbed variational inequalities. A variety of density results (and counterexamples) for pointwise constraints in Sobolev spaces are presented and the corresponding regularity requirements on the upper bound are identified. The results are further discussed in the context of finite-element discretizations of sets associated with convex constraints. Finally, two applications are provided, which include elasto-plasticity and image restoration problems.


Introduction
Convex constraint sets K as subsets of an infinite-dimensional Banach space X are common to many fields in mathematics such as calculus of variations, variational inequalities and control theory. Such constraints are induced by physical limitations of control and/or state variables, but also emerge through Fenchel dualization of convex problems; see e.g. [1-3] for fundamental concepts in variational analysis. In this vein, given a set of functions satisfying an arbitrary convex constraint, density properties of more regular functions satisfying the same restriction are of utmost importance. In abstract terms, given some dense subspace Y of X, the central point of interest is whether the closure property
$\overline{K(Y)}^{X} = K$, (1.1)
with K(Y) = {u ∈ Y : u ∈ K} = K ∩ Y, is fulfilled, and how this problem is intimately linked to the solution of constrained optimization and variational inequality problems.
In the literature, problems of dense intersections appear in connection with the discretization of variational inequality problems in Sobolev spaces and the convergence analysis for finite-element methods under minimal regularity (e.g. [4-6]). Moreover, the limiting behaviour of singular perturbations of elliptic variational inequalities can be traced back to the density issue (see [7] and references therein). This also pertains to the deduction of a vanishing viscosity limit for hyperbolic variational inequalities with an obstacle constraint [8]. In the context of plasticity problems, certain density properties represent an important step towards the determination of appropriate relaxed formulations (cf. [9,10]). However, to the best of our knowledge, the investigation of problem (1.1) is restricted to special cases and the literature still lacks a general and systematic treatment of the density issue.
To motivate the study of the abstract problem (1.1), §2 provides a novel unifying framework for various perturbation approaches to non-smooth constrained optimization and variational inequality problems. The general setting includes regularization, Galerkin approximation and singular perturbations, and, most remarkably, it allows one to reduce the study of the corresponding limit problems for a wide range of practically relevant perturbations to the study of the density property (1.1). In particular, we prove that the dense intersection (1.1) is a necessary and sufficient stability condition for the retrieval of the original problem in the (joint) limit of vanishing regularization and/or discretization parameters.
Starting from §3, we focus on the setting where X = X(Ω) is an ($\mathbb{R}^d$-valued) vector space of functions over a bounded domain Ω of $\mathbb{R}^N$ and K = K(X) denotes the subset of elements in X(Ω) bounded pointwise by a prescribed measurable function $\alpha : \Omega \to \mathbb{R} \cup \{+\infty\}$, i.e.
$K(X(\Omega)) = \{w \in X(\Omega) : |w(x)| \le \alpha(x) \text{ a.e. (almost everywhere) in } \Omega\}$,
with | · | denoting a norm on $\mathbb{R}^d$. Particularly in this part, X(Ω) refers to a Lebesgue or Sobolev space and Y = Y(Ω) refers to the space of continuous or infinitely differentiable functions up to the boundary. We also use the notation K(X(Ω); | · |) whenever it is necessary to make the dependence on the norm | · | explicit. Although a small number of specific density results for very regular bounds α are available [4,9,11], a systematic investigation of density properties in terms of the regularity of α does not seem to be available in the literature.
In order to close this gap, we prove new density results for continuous obstacles (§4), and we also consider different classes of discontinuous obstacles. In fact, in §4a, the density issue is studied in the context of the regularity of the obstacle as a Sobolev function. More precisely, we prove that results of the type (1.1) cannot be expected if the obstacle is just a Sobolev function by providing a counterexample. The density results are then proved to be valid even for certain classes of lower semicontinuous obstacles; see §4b,c. Subsequently, in §4d, a different approach is considered for obstacles that originate from the solution of a partial differential equation (PDE).
In §5, we focus on the Mosco convergence of finite-element discretized convex sets, which, in general, is a delicate matter, and only a limited number of results for more regular obstacles are known (e.g. [4,5]). In this respect, the construction of a recovery sequence essentially reduces to the verification of density properties of the type (1.1). Making use of the density results provided by the preceding sections, we prove several new Mosco convergence results in the Hilbert spaces $L^2$, $H^1$ and $H(\mathrm{div})$ for different types of finite-element discretizations of K, even for discontinuous obstacles α. The results are extended to a more general constraint setting involving pointwise restrictions on partial derivatives. We conclude the paper by presenting two important applications that further highlight the significance of dense intersections. First, we consider the regularization of an elasto-plastic contact problem, where the closure property turns out to be fundamental for the efficient solution by a semismooth Newton method. Secondly, we discuss an example from total variation-based image restoration with a distributed non-smooth regularization parameter. Here, the density property arises as an essential condition for the equivalent reformulation of the problem in the Hilbert space $H(\mathrm{div})$ by means of Fenchel duality.

Motivation

(a) Optimization with convex constraints
In many variational problems, one seeks the solution in a given convex, closed and non-empty subset K of an infinite-dimensional Banach space $(X, \|\cdot\|)$. To illustrate the problem, let us consider the following abstract class of optimization problems:
inf F(u), over u ∈ K. (2.1)
We assume that $F : X \to \mathbb{R}$ is continuous, coercive and sequentially weakly lower semicontinuous, but not necessarily convex. Thus, problem (2.1) admits a solution provided X is reflexive. The problem class (2.1) is ubiquitous, encompassing numerous fields, such as the variational form of PDEs, variational inequality problems of potential type, optimal control of PDEs with constraints on the state and/or control, and many others. The analysis of (2.1) and the design of suitable solution algorithms often involve the general concepts of perturbation or dualization methods comprising regularization, penalization or discretization approaches or possibly a combination of the latter (e.g. [1-5] and references therein). The central result of this section is that the stability of (2.1) with respect to a large class of perturbations can be characterized by the closure property (1.1), i.e.
$\overline{K(Y)}^{X} = K$,
where Y is some dense subspace of X (in the norm topology of X), and K(Y) is given by K(Y) = K ∩ Y. In what follows, we will identify a very general class of perturbations for which the stability analysis effectively reduces to the study of the density property (1.1).

(i) A class of quasi-monotone perturbations
To subsume as many of the above-mentioned methods as possible, we consider the sequence of perturbed problems
inf F(u) + $R_n(u)$, over u ∈ X, (2.2)
defined by a given sequence of functions $(R_n)$ that are perturbations of the indicator function $i_K : X \to \mathbb{R} \cup \{+\infty\}$ in the following sense: there exist functions $\underline{R}_n : X \to \mathbb{R} \cup \{+\infty\}$ and $\overline{R}_n$ having the additional properties (2.3) and (2.4).

Theorem 2.1 (Sufficient condition). Let the Banach space X be reflexive or assume that the dual space $X^*$ is separable. For a closed, convex and non-empty set K ⊂ X, let $(R_n)$ be a sequence of quasi-monotone perturbations of $i_K$ with respect to the dense subspace Y according to (2.2). If the density property (1.1) holds true, then $F + i_K$ is the Γ-limit of $(F + R_n)$ in both the weak and the strong topology.
Under the assumptions of theorem 2.1, one may infer that, provided each problem (2.2) admits a global minimizer $u_n$, each weak cluster point of the sequence of minimizers $(u_n)$ is a global minimizer of (2.1); see [12] for an introduction to Γ-convergence. At the end of this section, it is further clarified that theorem 2.1 is sharp in the sense that the stability result in general fails if (1.1) does not hold. We also remark that in case the (sequential) weak and strong Γ-limits coincide, one usually speaks of Mosco convergence.
In the following, we present a selection of approximation methods of high practical relevance that fit into the general class of perturbations defined by (2.2). In favour of generality, we do not leave the abstract setting.

Example 2.2 (Tikhonov regularization).
Let $(Y, \|\cdot\|_Y)$ be a Banach space which is densely and continuously embedded into X. For a sequence of positive non-decreasing parameters $(\gamma_n)$ with $\gamma_n \to +\infty$ and fixed α > 0, consider in (2.2) the Tikhonov regularization. In fact, set $\underline{R}_n := i_K$ for all $n \in \mathbb{N}$ and $\overline{R}_n := R_n$. Obviously, (2.3) and (2.4) are satisfied such that $(R_n)$ fits into the context of quasi-monotone perturbations according to (2.2).

Example 2.3 (Conforming discretization).
Let X be a separable Banach space. Suppose (2.1) is approximated by a Galerkin approach using nested and conforming finite-dimensional subspaces $X_n$, i.e. $X_n \subset X$ and $X_n \subset X_{n+1}$ for all $n \in \mathbb{N}$, such that the Galerkin approximation property $\overline{\bigcup_{n \in \mathbb{N}} X_n}^{X} = X$ is fulfilled. The resulting discrete counterpart of problem (2.1) is given by (2.2) with $R_n = i_{K \cap X_n}$. Setting $\underline{R}_n = i_K$, (2.3) is clearly fulfilled. Define $Y = \bigcup_{n \in \mathbb{N}} X_n$; then (2.4) is fulfilled with $\overline{R}_n = R_n$.

Example 2.4 (Combined Moreau-Yosida/Tikhonov regularization).
Let X be a Hilbert space and $(Y, \|\cdot\|_Y)$ be a Banach space that is densely and continuously embedded into X. For two sequences of positive non-decreasing parameters $(\gamma_n)$, $(\gamma_n')$ with $\gamma_n, \gamma_n' \to +\infty$ and fixed α > 0, consider the simultaneous Moreau-Yosida and Tikhonov regularization; the required properties are verified as in the previous example.

Example 2.5 (Conforming discretization and Moreau-Yosida regularization).
Let X be a separable Hilbert space and $(\gamma_n)$ a sequence of positive non-decreasing parameters converging to +∞. The combination of regularization and discretization leads to the definition
$R_n(u) = \frac{\gamma_n}{2} \inf_{v \in K} \|u - v\|^2 + i_{X_n}(u)$,
where the sequence of spaces $(X_n)$ is defined as in example 2.3. Setting $\underline{R}_n(u) = (\gamma_n/2) \inf_{v \in K} \|u - v\|^2$ and $\overline{R}_n(u) = i_{K \cap X_n}(u)$, (2.3) and (2.4) are fulfilled with $Y = \bigcup_{n \in \mathbb{N}} X_n$ and the framework of (2.2) applies.

Consequently, each of these perturbations is stable with respect to (2.1) provided the density result (1.1) is satisfied. It should also be emphasized that these examples only represent an assorted variety of perturbations that fit into the problem class (2.2).
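The behaviour of the Moreau-Yosida term $(\gamma_n/2) \inf_{v \in K} \|u - v\|^2$ from example 2.5 can be illustrated in finite dimensions. The following Python sketch is only an illustration under stated assumptions: X is replaced by $\mathbb{R}^3$ with the Euclidean norm, K by a box {v : |v_i| ≤ α_i}, and the names `project_box` and `moreau_yosida` are hypothetical helpers, not part of the paper.

```python
import numpy as np

def project_box(u, alpha):
    """Euclidean projection onto the box K = {v : |v_i| <= alpha_i}."""
    return np.clip(u, -alpha, alpha)

def moreau_yosida(u, alpha, gamma):
    """(gamma/2) * inf_{v in K} ||u - v||^2; the infimum is attained at the projection."""
    residual = u - project_box(u, alpha)
    return 0.5 * gamma * float(residual @ residual)

# Assumed toy data: a pointwise bound alpha and two test vectors.
alpha = np.array([1.0, 2.0, 0.5])
u_feasible = np.array([0.5, -2.0, 0.25])    # lies in K
u_infeasible = np.array([1.5, 0.0, -1.0])   # violates the bound in two components

gammas = [1.0, 10.0, 100.0, 1000.0]
vals_feasible = [moreau_yosida(u_feasible, alpha, g) for g in gammas]
vals_infeasible = [moreau_yosida(u_infeasible, alpha, g) for g in gammas]

print(vals_feasible)    # the penalty vanishes on K for every gamma
print(vals_infeasible)  # the penalty grows linearly in gamma outside K
```

On K the penalty is zero for every parameter, while outside K it increases monotonically with γ, which mirrors the role of the Moreau-Yosida term as a lower approximation of the indicator function $i_K$.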
Moreover, the density property (1.1) is also a necessary condition for the stability of perturbation schemes in the following sense. First, the Γ-limit of the approximation schemes defined in examples 2.2 and 2.3 can be calculated using similar arguments as in the proof of theorem 2.1. In fact, under the same conditions on X, one obtains $F + i_{\overline{K \cap Y}}$ as the weak and strong Γ-limit in both cases. Secondly, in the combined approaches of examples 2.4 and 2.5, theorem 2.1 guarantees that $F + i_K$ is obtained as the weak-strong Γ-limit for any coupling of regularization parameter pairs $[\gamma_n, \gamma_n']$ and $[X_n, \gamma_n]$, respectively. Let us put this statement into perspective by means of the combined Galerkin/Moreau-Yosida approach (example 2.5). In this case, it is possible to prove the existence of a suitable combination of n and $\gamma_n$ to recover $F + i_K$ in the Γ-limit without resorting to the density property (1.1), see [13, Prop. 2.4.6]. However, the proof is non-constructive and thus not immediately useful for the design of a stable numerical algorithm. On the other hand, if (1.1) is violated, the Γ-convergence to the original problem (2.1) cannot be guaranteed independently of the choice of the regularization/discretization parameter pair. In fact, the following result, which we prove in appendix A, holds true.

Proposition 2.6 (Necessary condition). Consider example 2.5 with the corresponding definitions of Y and $(R_n)$. Further suppose that $\overline{K \cap Y} \subsetneq K$. Then for all $x \in K \setminus \overline{K \cap Y}$ there exists a strictly increasing sequence $(\gamma_n)$ with $\gamma_n \to \infty$ such that there exists no recovery sequence $(y_n) \subset X$, $y_n \to x$, at x in the norm topology.
The analogous statement is valid in the case of combined Moreau-Yosida/Tikhonov regularizations given a fixed sequence (γ n ); cf. example 2.4. In conclusion, theorem 2.1 is sharp with respect to condition (1.1) in the sense of proposition 2.6 and the preceding discussion.

(b) Elliptic variational inequalities
The density of convex intersections of the type (1.1) is also of fundamental importance for the analysis of perturbations of variational inequalities. Assuming X to be a Hilbert space and K ⊂ X non-empty, closed and convex, we consider the general variational inequality problem of the first kind (2.8); see e.g. [7,14] for an introduction. Here, $l \in X^*$ is a bounded linear functional and $A : X \to X^*$ denotes a, in general, nonlinear operator on X. We further assume A to be Lipschitz continuous and strongly monotone, i.e. there exists κ > 0 with
$\langle A(u) - A(v), u - v \rangle \ge \kappa \|u - v\|^2$ for all u, v ∈ X.
In the following, we investigate three main classes of perturbations of (2.8) and their relation to the density properties of convex intersections.

(i) Quasi-monotone perturbation
Consider the perturbed variational inequality problem (2.9), where $A_n$ and $l_n$ are appropriate perturbations of A and l, respectively, and $(R_n)$ is a quasi-monotone perturbation of $i_K$ with respect to a dense subspace Y of X. The stability of the approximation scheme (2.9) hinges on the density property (1.1). In fact, if the latter condition is fulfilled, then the sequence $(R_n)$ Mosco converges to $i_K$ provided $R_n$ is weakly lower semicontinuous. Under mild assumptions on $(A_n)$ and $(l_n)$ one may then invoke known stability results, cf. [7, p. 99] and [15], to conclude the consistency of the perturbation scheme with respect to the limit problem (2.8).

(ii) Galerkin approximation of variational inequalities
In general, finite-dimensional approximations of K are neither conforming nor nested as was the case in examples 2.3 and 2.5, where K was 'discretized' by $K \cap X_n$, which is numerically realizable only in special cases. Instead, it is often more favourable to consider non-nested approximations $K_n \subset X_n$ that may contain infeasible elements, such that $K_n \subset K$ does not hold true in general [4,5]. As a result, the finite-dimensional variational inequality problems (2.10) do not fit into the framework of (2.9). Again, under mild assumptions on $(A_n)$ and $(l_n)$, the Mosco convergence of $(K_n)$ to K ensures that the approximation (2.10) is stable with respect to the limit problem (2.8). However, Mosco convergence requires the existence of a recovery sequence (see definition 4.5) for any element u ∈ K. To construct this sequence in the context of finite-element methods, one may use an interpolation procedure which typically is only defined on

(iii) Singular perturbations
The closure property (1.1) also plays a role in the limiting behaviour of singular perturbations. In fact, let $A_1 : Y \to Y^*$ be a Lipschitz continuous and strongly monotone operator on a Hilbert space $(Y, \|\cdot\|_Y)$ that embeds densely and continuously into X. For a sequence of regularization parameters $(\gamma_n)$ with $\gamma_n \to +\infty$ consider the perturbed problems (2.11). Observe that problem (2.11) admits a unique solution $u_n \in K \cap Y$ provided that K ∩ Y is closed in Y. The appropriate limit problem is then given by (2.12). Note that (2.12) corresponds to the initial variational inequality problem if the density property (1.1) holds true. In this case, the sequence $(u_n)$ converges strongly in X to the solution of (2.8).
Here, the assumptions on $A_1$ may be alleviated. This type of application also plays a role in the analysis and the design of algorithms for hyperbolic variational inequalities through the vanishing viscosity approach. For details, [7, section 4.9] and [8] may be consulted.

Density results for continuous obstacles
We first fix some notation. In this section, $\Omega \subset \mathbb{R}^N$ denotes a bounded Lipschitz domain. The space of functions that are restrictions to Ω of smooth functions with compact support on $\mathbb{R}^N$ is denoted by D(Ω). The standard Lebesgue and Sobolev spaces over Ω are denoted by $L^p(\Omega)$, $W^{1,p}(\Omega)$ and $W^{1,p}_0(\Omega)$. In the recent paper [11], it has been shown that for any α ∈ C(Ω) with (3.1) the density result (3.2) holds, where the constraint set K(X(Ω)) with respect to a given subspace X(Ω) is defined by a pointwise constraint on an arbitrary norm | · | on $\mathbb{R}^d$, i.e.
$K(X(\Omega)) = \{w \in X(\Omega) : |w(x)| \le \alpha(x) \text{ a.e. in } \Omega\}$. (3.3)
When considering the case $X = W^{1,p}$ instead of $W^{1,p}_0$ in (3.2), the choice of the approximating sequence from [11, Theorem 1], which relies on the trivial extension of Sobolev functions, fails. As a result, a different extension operator has to be employed.
Since Ω is a bounded Lipschitz domain, we may extend w to a function in $W^{1,p}(\mathbb{R}^N)^d$ using for each component the extension-by-reflection operator. The resulting operator E has the properties $Ew|_{\Omega} = w$ for all $w \in W^{1,p}(\Omega)^d$ and $E \in L(W^{1,p}(\Omega)^d, W^{1,p}(\mathbb{R}^N)^d)$; see, for instance, [16]. Since E is obtained by a partition of unity argument using local reflection with respect to the Lipschitz graphs into which ∂Ω can be decomposed, the property |w(x)| ≤ α(x) in Ω is preserved by the extension in that $|Ew(x)| \le (E_{C(\Omega)}\alpha)(x)$ for a.e. $x \in \mathbb{R}^N$, where $E_{C(\Omega)} : C(\Omega) \to C(\mathbb{R}^N)$ denotes the application of the extension by reflection procedure to bounded uniformly continuous functions, i.e. $(E_{C(\Omega)}\alpha)|_{\Omega} = \alpha$. Further inspecting the construction of E, it may also be observed that the support of Ew is compactly contained in $\mathbb{R}^N$. Analogously, we define the approximating sequence $S_n(w, \Omega)$ to w by mollification of Ew with the standard mollifier. It is well known that $S_n(w, \Omega)$ converges to w in $W^{1,p}(\Omega)^d$ and, since Ew has compact support in $\mathbb{R}^N$, it holds that $S_n(w, \Omega)|_{\Omega} \in D(\Omega)^d$. In order to achieve feasibility, we use a scaling sequence $(\beta_n)$ defined in terms of α and $\alpha_n$, where $\alpha_n$ converges to $E_{C(\Omega)}\alpha$ uniformly in $\mathbb{R}^N$ and thus $\beta_n \to 1$ as n → ∞. In addition, (3.6) together with (3.8) yields $|\beta_n S_n(w, \Omega)(x)| \le \alpha(x)$ for all x ∈ Ω. As a result, $\beta_n S_n(w, \Omega) \in K(D(\Omega)^d)$ and, taking account of (3.9), the proof is accomplished.
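The mechanism of the proof, smoothing followed by a rescaling that restores feasibility, can be tried out on a one-dimensional grid. The sketch below is an illustration only and rests on several assumptions that are not taken from the paper: a discrete bump kernel stands in for the standard mollifier, reflection padding via `np.pad` stands in for the extension operator E, and the scaling factor is chosen as `min(1, min(alpha / alpha_n))`, which is one possible choice that enforces the bound.

```python
import numpy as np

def bump_kernel(r):
    """Non-negative, symmetric kernel with 2r+1 entries that sums to one."""
    t = np.arange(-r, r + 1) / (r + 1.0)
    k = np.exp(-1.0 / (1.0 - t**2))
    return k / k.sum()

def mollify_reflect(f, kernel):
    """Discrete mollification after extending f by reflection at both ends."""
    r = len(kernel) // 2
    f_ext = np.pad(f, r, mode="reflect")
    return np.convolve(f_ext, kernel, mode="valid")

def feasible_smoothing(w, alpha, r):
    """Smooth w, then rescale so that |w_n| <= alpha holds on the grid."""
    k = bump_kernel(r)
    s = mollify_reflect(w, k)        # smoothed function, |s| <= a pointwise
    a = mollify_reflect(alpha, k)    # smoothed bound
    beta = min(1.0, float(np.min(alpha / a)))  # assumed scaling factor
    return beta * s, beta

# Assumed toy data: continuous bound with positive minimum, discontinuous w.
x = np.linspace(0.0, 1.0, 401)
alpha = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)
w = alpha * np.sign(np.sin(6.0 * np.pi * x))

results = {r: feasible_smoothing(w, alpha, r) for r in (40, 10, 2)}
for r, (w_n, beta) in results.items():
    print(r, beta, bool(np.all(np.abs(w_n) <= alpha + 1e-12)))
```

Because the kernel is non-negative and sums to one, the smoothed function is bounded by the smoothed obstacle, and multiplying by the scaling factor then yields the original bound; the factor approaches one as the kernel radius shrinks.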

Remark 3.2 (boundary conditions).
(i) In order to incorporate a homogeneous Dirichlet boundary condition in the context of theorem 3.1, one may use an additional reparametrization to construct a suitable approximating sequence; see [11].
(ii) If the set $K(W^{1,p}(\Omega)^d)$ is additionally restricted by an inhomogeneous Dirichlet boundary condition given by a function g on ∂Ω, the proof of theorem 3.1 fails. In fact, the sequence (3.8), which is based on the standard mollifier, does not preserve a given trace condition. In any case, the regularity of g (and ∂Ω) determines an a priori regularity limitation for the functions in Y in order to be compatible with a closure property analogous to (3.4), e.g. if Y = C(Ω) and g ∉ C(∂Ω), then K ∩ Y = ∅. In this case, a different mollification approach needs to be pursued; cf. also §7 for an outlook on this matter.

Density results for discontinuous obstacles

(a) Obstacles in Sobolev spaces
Note that theorem 3.1 requires continuous obstacles. In some applications, such as in the regularization and discretization of elasto-plastic contact problems or image restoration problems (see §6), it may be useful to consider obstacles that are not continuous. Under such circumstances, the following example shows that density properties of the type (3.2) or (3.4) cannot be expected if the obstacle is just a Sobolev function. Without loss of generality, assume that $0 \in \Omega \subset \mathbb{R}^N$ with N ≥ 2 and denote by $B_\varepsilon(x)$ the open ball with centre $x \in \mathbb{R}^N$ and radius ε > 0 with respect to the Euclidean norm $|\cdot|_2$ in $\mathbb{R}^N$. Let $\{x_k : k \in \mathbb{N}\}$ be a countable dense subset of Ω.
We note that ϕ is non-negative with a singularity at the origin, and its zero extension belongs to $W^{1,N}(\mathbb{R}^N)$; cf. [17, Example 4.43]. Further set and note that $g \in W^{1,N}(\Omega)$ with g being unbounded at each belongs to $W^{1,N}(\Omega)$; e.g. [14, Lemma A.3]. Notice also that α is bounded away from zero and that it is basically equal to 1 on the dense set $\{x_k : k \in \mathbb{N}\}$. Consequently, any continuous function w with w ≤ α a.e. in Ω fulfils w ≤ 1 on Ω. Indeed, assume that the latter implication is false. Then there exist $k_0 \in \mathbb{N}$ as well as μ > 0, δ > 0 such that contradicting (4.4). Hence, any sequence of continuous functions approximating α from below is bounded above by 1. However, as α(x) > 1 for a.e. x ∈ Ω by definition, and convergence in the norm topology of $L^p(\Omega)$ implies convergence pointwise a.e. (along a subsequence), we obtain that for any 1 ≤ p ≤ +∞, and
Using the Baire category theorem, one may show that the set S is a non-meagre set with vanishing Lebesgue measure [19].
The previous construction of the counterexample is the basis for the following result.
Proof. We only prove assertion (iii) since (i) and (ii) follow immediately from (4.5) and (4.6). As a consequence of the Sobolev imbedding theorem, any w ∈ K(W 1,p it follows that $|w(x)| \le \tilde{\alpha}(x)$ a.e. in Ω. Since $\tilde{\alpha} \in C(\Omega)$ and (3.1) holds with $\tilde{\alpha}$ instead of α, we may invoke theorem 3.1 to infer that there exists a sequence $(w_n)$ with $w_n \in D(\Omega)^d$, $w_n \to w$ in $W^{1,p}(\Omega)^d$ and $|w_n(x)| \le \tilde{\alpha}(x) \le \alpha(x)$ a.e. in Ω. This entails that $w_n \in K(D(\Omega)^d)$ for all $n \in \mathbb{N}$, which accomplishes the proof.
We immediately infer the corresponding statements for Sobolev spaces incorporating homogeneous Dirichlet boundary conditions.
the inclusion being strict. Proof.

(b) Lower semicontinuous obstacles and Lebesgue spaces
The preceding counterexample provides a regularity limit in terms of the upper bound α for which the density property (3.2) in the space $X(\Omega) = L^p(\Omega)^d$ can be expected to hold. In this regard, however, uniform continuity is far from being a necessary condition. In order to enlarge the space of obstacles compatible with (3.2), we first consider upper bounds that allow for a lower semicontinuous representative, i.e. there exists a lower semicontinuous function in the equivalence class of functions that are Lebesgue-almost everywhere equal to α.
Proof. Let $w \in K(L^p(\Omega)^d)$. Consider a lower semicontinuous function $\alpha : \Omega \to \mathbb{R} \cup \{+\infty\}$ that fulfils (3.1). Without loss of generality, we may assume that $\inf_{x \in \Omega} \alpha(x) > 0$. Denote by $\tilde{\alpha}$ the extension of α given by for all x ∈ Ω, $n \in \mathbb{N}$ and $\alpha_n(x) \to \alpha(x)$ a.e. in Ω; see, e.g. [2, Theorem 9.2.1]. Now consider the functions where it is understood that $w_n(x) := 0$ if w(x) = 0. It follows from Lebesgue's theorem on dominated convergence that $w_n \to w$ in $L^p(\Omega)^d$. Further observe that $w_n \in K_n(L^p(\Omega)^d)$ where Let ε > 0. According to (3.2), for each $n \in \mathbb{N}$, $w_n$ can be approximated by a smooth function $\tilde{w}_n$ ∈ For sufficiently large n, we conclude that which concludes the proof.
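The proof relies on continuous functions $\alpha_n$ that stay below a lower semicontinuous bound α and converge to it pointwise. One standard way to obtain such a sequence is the Lipschitz regularization $\alpha_n(x) = \inf_y \{\alpha(y) + n|x - y|\}$; it is used here as an assumption for illustration, and the construction in the proof may differ. The Python sketch below evaluates it on a grid for a step function; the helper name `lipschitz_envelope` is hypothetical.

```python
import numpy as np

def lipschitz_envelope(alpha, x, n):
    """Grid version of alpha_n(x_i) = min_j (alpha(x_j) + n * |x_i - x_j|)."""
    dist = np.abs(x[:, None] - x[None, :])
    return np.min(alpha[None, :] + n * dist, axis=1)

# Assumed toy obstacle: a lower semicontinuous step (lower value at the jump).
x = np.linspace(0.0, 1.0, 201)
alpha = np.where(x <= 0.5, 1.0, 2.0)

envelopes = {n: lipschitz_envelope(alpha, x, n) for n in (1, 10, 100, 1000)}

for n, a_n in envelopes.items():
    slope = np.max(np.abs(np.diff(a_n))) / (x[1] - x[0])
    # Each envelope stays below alpha and has grid slope at most n.
    print(n, bool(np.all(a_n <= alpha)), slope <= n + 1e-8)
```

Each envelope is n-Lipschitz on the grid, lies below α, and the sequence increases with n towards α, so that it may serve as a sequence of continuous obstacles approximating α from below.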
We proceed by considering the important special case of a piecewise continuous upper bound; suppose there exists a partition of Ω into open subsets Ω l ⊂ Ω with Lipschitz boundary such that

(c) Lower semicontinuous obstacles and Sobolev spaces
Conditions on the obstacle α so that the density results for Sobolev spaces hold can be relaxed from assuming that α ∈ C(Ω) to lower regularity requirements with the aid of Mosco convergence of closed and convex sets. The following definition goes back to [15].
Here, $(K_{n_k})$ denotes an arbitrary subsequence of $(K_n)$ and the subset notation $(v_k) \subset X$ has to be understood in the sense that $\{v_k\} \subset X$. The following class of obstacles encompasses functions in $W^{1,q}(\Omega)$ that fulfil a generalized lower semicontinuity condition.
Definition 4.6. We denote by $W_q(\Omega)$ for q ≥ 1 the set of functions $\alpha \in W^{1,q}(\Omega)$ for which there exists a sequence of functions $(\alpha_n)$ with $\alpha_n$ satisfying (3.1), $\alpha_n \le \alpha$ a.e. in Ω and $\alpha_n \in C(\Omega) \cap W^{1,q}(\Omega)$ for all $n \in \mathbb{N}$ such that $\alpha_n \rightharpoonup \alpha$ in $W^{1,q}(\Omega)$.
Note that the class $W_q(\Omega)$ is strictly contained in $W^{1,q}(\Omega)$. Additionally, any obstacle $\alpha \in W_q(\Omega)$ has a lower semicontinuous representative, which follows easily from definition 4.6 and by extraction of a pointwise almost everywhere converging subsequence. However, the functions in $W_q(\Omega)$ are not necessarily continuous: it suffices to consider the example from (4.1) for $\Omega = B_r(0)$, N > 1 and
$\alpha(x) = \ln(\ln(c|x|^{-1}))$, $c \ge e\,r$ fixed. (4.9)
It follows that $\alpha \in W^{1,q}(\Omega)$ for all q ≤ N (see [17, Example 4.43]), α ∉ C(Ω), and the sequence $(\alpha_n)$ defined as $\alpha_n(x) = \min(\alpha(x), n)$ for $n \in \mathbb{N}$ satisfies the requirements of the definition of $W_q(\Omega)$.

Theorem 4.7.
Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain. Let 1 ≤ p < ∞ and $\alpha \in W_p(\Omega)$. Then the following density results hold true:
Proof. Without loss of generality, consider the scalar case d = 1. Let $w \in K(W^{1,p}_0(\Omega); |\cdot|_\infty)$ and $(\alpha_n) \subset W^{1,p}(\Omega)$ according to definition 4.6. By Mazur's lemma, we may as well assume that $(\alpha_n)$ converges strongly to α in $W^{1,p}(\Omega)$ since convex combinations preserve order and continuity. Hence, one obtains the Mosco convergence result for the unilateral constraint sets Consequently, there exist two recovery sequences, $w_n^{\pm}$ ∈ K ± n (W it follows that the sequence $w_n = \max(w_n^+, 0) + \min(w_n^-, 0)$ converges to w in $W^{1,p}_0(\Omega)$. Moreover, it holds that $|w_n| \le \alpha_n$ for all $n \in \mathbb{N}$. For each $n \in \mathbb{N}$, the assumptions on $\alpha_n$ allow us to use (3.2) to infer the existence of a smooth function $\tilde{w}_n \in C_c^\infty(\Omega)$ with $|\tilde{w}_n| \le \alpha_n \le \alpha$ a.e. in Ω that approximates $w_n$ arbitrarily well. Using $w_n \to w$ in $W^{1,p}_0(\Omega)^d$, the assertion follows by an ε/2-argument as in (4.7). The proof for the case $X(\Omega) = W^{1,p}(\Omega)^d$ follows analogously by invoking theorem 3.1.

(d) Supersolutions of elliptic partial differential equations
By now, density properties for pointwise constraints in Sobolev spaces have been obtained on the basis of mollification and a subsequent procedure to enforce feasibility. An alternative approach is the approximation of a function via the solution of an appropriate sequence of elliptic PDEs. Using standard regularity theory, one may prove higher regularity of the approximating sequence and one is left to prove feasibility. In this section, we focus on obstacles which are solutions of an elliptic PDE. Therefore, consider a general second-order differential operator A in divergence form (4.11). Here, the matrix $[a_{ij}(x)]$ is symmetric a.e. and uniformly elliptic, i.e. there exists a $\kappa_a > 0$ such that
$\sum_{i,j=1}^{N} a_{ij}(x)\,\xi_i \xi_j \ge \kappa_a |\xi|^2$ for all $\xi \in \mathbb{R}^N$
for a.e. x ∈ Ω. It is further assumed that $a_{ij}$, $b_i$, c are such that $A : H^1_0(\Omega) \to H^{-1}(\Omega)$ is strongly monotone, i.e. there exists κ > 0 such that
$\langle Av, v \rangle \ge \kappa \|v\|^2_{H^1(\Omega)}$ for all $v \in H^1_0(\Omega)$,
where $\langle \cdot, \cdot \rangle$ denotes the duality pairing in $H^{-1}(\Omega)$. For example, this is the case if $b_i \equiv 0$ for 1 ≤ i ≤ N and c(x) ≥ 0 a.e. in Ω. We call a function $\alpha \in H^1(\Omega)$ a weak supersolution with respect to the elliptic operator A if Aα ≥ 0 in the $H^{-1}(\Omega)$-sense, that is
$\langle A\alpha, v \rangle \ge 0$ for all $v \in H^1_0(\Omega)$ with v ≥ 0 a.e. in Ω. (4.12)
The subsequent theorem covers density properties for obstacles that are weak supersolutions of an elliptic PDE of type (4.11).

Theorem 4.8. Let Ω be a bounded domain. Let $\alpha \in H^1(\Omega)$ be a weak supersolution for some A as in (4.11) in the sense of (4.12),
in the following cases:
Proof. Without loss of generality, assume d = 1. First observe that the maximum principle implies α(x) ≥ 0 a.e. in Ω. Let w ∈ K(X(Ω)) be arbitrary. Consider the sequence $(w_n)$, where $w_n$ is defined as the unique solution $y \in H^1_0(\Omega)$ to the problem (4.13). We denote by $T_n$ the solution mapping to (4.13), i.e. $w_n = T_n(w)$.
Step 2: Some convergence results for singular perturbations. The desired convergence modes of the approximating sequences rely on standard arguments for singular perturbations, cf. [7, Theorems 9.1 and 9.4] for the case of singularly perturbed variational inequalities. First, for $y \in L^2(\Omega)$ it holds that $\lim_{n\to\infty} y_n = y$ in $L^2(\Omega)$ implies $\hat{y}_n := T_n(y_n) \to y$ in $L^2(\Omega)$. In fact, since $y_n \in H^1_0(\Omega)$ and A is strongly monotone, we observe that where we have used that $\hat{y}_n$ solves (4.13) with $y_n$ as right-hand side. Hence $(\hat{y}_n)$ is bounded in $H^1_0(\Omega)$. Employing (4.15) one obtains that $\hat{y}_n \rightharpoonup y$ in $H^1_0(\Omega)$ along a subsequence, and by uniqueness, it holds that $\hat{y}_n \rightharpoonup y$ for the entire sequence $(\hat{y}_n)$. Finally, from the inequalities above, we have for $w \in H^1_0(\Omega)$, respectively.
Step 3: Regularity and convergence of the approximating sequences. The extra regularity of the $H^1_0(\Omega)$-solution $T_n(w)$ to (4.13) is different with respect to the statement cases: if $a_{ij} \in C^{0,1}(\Omega)$ or $a_{ij} \in C^1(\Omega)$ for 1 ≤ i, j ≤ N, the solution $T_n(w)$ belongs to $H^1_0(\Omega) \cap H^2_{\mathrm{loc}}(\Omega)$ (see [21] for the first case and [18] for the second one). The solution $T_n(w)$ belongs to [21] or when Ω is convex [22]. In case $w \in K(L^2(\Omega))$, (4.15) with $y_n \equiv w$ ensures that $w_n \to w$ in $L^2(\Omega)$. In conjunction with the regularity and the feasibility of $w_n = T_n(w)$ described above, we have then established (i) and (ii) for $X(\Omega) = L^2(\Omega)$. Secondly, note that if $w \in K(H^1_0(\Omega))$, then $w_n \to w$ in $H^1_0(\Omega)$ by (4.16) with $y_n \equiv w$, and as seen above, $w_n \in K(H^1_0(\Omega))$. This, together with the regularity of $w_n = T_n(w)$ established above, proves in turn (i) and (ii) for $X(\Omega) = H^1_0(\Omega)$. It is left to argue for (iii) and (iv) as follows. If $a_{ij}, b_i, c \in C^{m+1}(\Omega)$ for 1 ≤ i, j ≤ N, then for each $n \in \mathbb{N}$, the operator $T_n$ has the following increasing regularity properties [18], and if $a_{ij}, b_i, c \in C^{m+1}(\Omega)$ for 1 ≤ i, j ≤ N and ∂Ω is of class $C^{m+2}$, for each $n \in \mathbb{N}$, Finally, this proves (iii) given that $w^q_n \in H^{m+2}_{\mathrm{loc}}(\Omega) \cap H^1_0(\Omega)$ for 2q ≥ m + 2, $w^q_n \in K(H^1_0(\Omega))$, and $w^q_n \to w$ as n → ∞ in $L^2(\Omega)$ or $H^1_0(\Omega)$ depending on the regularity of w, cf. (4.17) and (4.18). The analogous reasoning applies to (iv).
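The feasibility and convergence of the elliptic approximations $w_n = T_n(w)$ can be observed in a simple discrete setting. The following Python sketch is an illustration under explicit assumptions that are not taken from the paper: problem (4.13) is read as the singularly perturbed equation $(1/n) A y + y = w$ with homogeneous Dirichlet data, A is the one-dimensional operator $-d^2/dx^2$ on (0, 1) discretized by finite differences, and the obstacle is the concave, non-negative function α(x) = 0.2 + x(1 - x), which is a supersolution for this A.

```python
import numpy as np

def dirichlet_laplacian(m, h):
    """Finite-difference matrix of -d^2/dx^2 on m interior nodes, zero boundary values."""
    return (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2

def regularize(w, n, h):
    """Solve (I + A/n) w_n = w; an assumed discrete analogue of (4.13)."""
    m = len(w)
    return np.linalg.solve(np.eye(m) + dirichlet_laplacian(m, h) / n, w)

# Assumed toy data on the interior nodes of a uniform grid on (0, 1).
m = 199
h = 1.0 / (m + 1)
x = np.linspace(h, 1.0 - h, m)
alpha = 0.2 + x * (1.0 - x)                    # -alpha'' = 2 >= 0, alpha >= 0
w = alpha * np.sign(np.sin(8.0 * np.pi * x))   # |w| <= alpha, discontinuous

feasible = {}
errors = {}
for n in (10, 100, 1000, 10000):
    w_n = regularize(w, n, h)
    feasible[n] = bool(np.all(np.abs(w_n) <= alpha + 1e-10))
    errors[n] = float(np.sqrt(h) * np.linalg.norm(w_n - w))  # discrete L2 error

print(feasible)
print(errors)
```

In this setting the matrix I + A/n has a non-negative inverse, so the supersolution property of α carries the bound |w| ≤ α over to $w_n$, and the discrete $L^2$ error decreases as n grows, in line with the convergence statement for $X(\Omega) = L^2(\Omega)$.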
Let us briefly comment on the relation to the density results from theorem 4.4 and theorem 4.7. First, note that we do not require the obstacle to be bounded away from zero as we did in the preceding paragraphs. Secondly, the maximal regularity of the feasible approximation hinges on the coefficients of the elliptic operator associated with the obstacle and the smoothness of the boundary. Concerning the semicontinuity requirements of the upper bound, a classical result from Trudinger [23,Cor. 5.3] for the case without lower order terms (b i ≡ 0, c ≡ 0) states that any weak supersolution in the sense of (4.12) is upper semicontinuous. By contrast, the consideration of upper bounds that are weak subsolutions of an elliptic PDE is not useful as these functions may easily fail to be non-negative on Ω. For example, this is the case if a weak subsolution satisfies a Dirichlet boundary condition.

Application to finite elements

(a) Finite-element discretized convex sets
In the following, we investigate the issue of the Mosco convergence (definition 4.5) of finite-dimensional approximations $K_n$ of a convex constraint set K(X(Ω)) of the type (3.3); see §2b(ii) for a general motivation in the context of variational inequality problems. In this section, it is assumed that the sets $(K_n)$ result from a suitable finite-element discretization such that the parameter n is associated with a sequence of mesh widths $(h_n)$ tending to zero. The convergence of $(K_n)$ in the sense of definition 4.5 ensures that the solutions of the discrete problems converge to the solution of the original infinite-dimensional problem irrespective of the regularity of the data or the obstacle defining K(X(Ω)); see [7, ch. 4, Theorem 4.1]. Mosco convergence results of this type are rarely found in the literature and are typically confined to simpler constraint sets and higher regularity assumptions on the obstacle; see, for instance, [4] for the case of an $H^1(\Omega) \cap C(\Omega)$-bound in the context of the obstacle problem. The density results from the preceding sections provide the basis for new Mosco convergence results under minimal regularity (of the solution) and under weaker assumptions on the regularity of the obstacle α. We further provide novel Mosco convergence results for discretized constraints on partial derivatives, including Raviart-Thomas finite-element approaches for problems in $H(\mathrm{div})$. As a general rule, density results of the type (1.1) represent a powerful means to verify the convergence of finite-element methods for convex constrained problems under minimal regularity. Applications involving constraint sets of the type (3.3) with low regularity of α are manifold and comprise, for instance, the discretization of variational problems in mechanics, such as in elasto-plasticity with hardening [24], and in image restoration, with regard to the predual problem of TV-regularization [25].
Moreover, the issue occurs in fixed point-based approaches to the solution of quasi-variational inequalities through the implicit definition of obstacles.
In some textbooks on finite-dimensional approximations of variational inequalities, cf. e.g. [4,6], condition (M2) is replaced by the following criterion:
(M2′) there exists a dense subset K̃ ⊂ K and an operator r_n : K̃ → X such that for all v ∈ K̃ it holds that r_n v → v in X and there exists n_0 ∈ N such that r_n v ∈ K_n for all n ≥ n_0.
It is easy to show that (M2′) implies (M2). In fact, let v ∈ K and denote by π_{K_n} v its (not necessarily uniquely determined) projection onto K_n. By density, for ε > 0, there exists v_ε ∈ K̃ with ‖v − v_ε‖ ≤ ε. Since r_n v_ε ∈ K_n for sufficiently large n, it follows that ‖v − π_{K_n} v‖ ≤ ‖v − r_n v_ε‖ ≤ ε + ‖v_ε − r_n v_ε‖ for such n, and therefore limsup_{n→∞} ‖v − π_{K_n} v‖ ≤ ε, where ε was arbitrary.
The condition (M2′) turns out to be convenient especially in the context of finite-dimensional approximations, where (r_n) is given by suitable interpolation operators, which typically are only well defined on a dense subset Y(Ω) of X(Ω), giving rise to sets K̃ of the type K(Y(Ω)). This is precisely the point where the density results of §3 are needed.
Note that Mosco convergence is a powerful tool whenever the discrete spaces are fixed a priori, i.e. regardless of the data of the specific problem. The resulting sequence of finite-dimensional problems can be understood as an approximation of any problem in a given problem class.
By contrast, adaptive finite-element methods intend to design the sets K n in order to approximate the solution of a specific problem. However, rigorous convergence proofs with regard to adaptive discretizations of variational inequalities are restricted to special cases and usually rely on rather strong assumptions. For instance, in the case of the obstacle problem with a piecewise affine obstacle, we mention the article [26]. Moreover, density results may still be useful in the convergence analysis of adaptive schemes which require interpolation operators (cf. [27]).

(b) Finite-element spaces and interpolation operators
In this section, we assume that Ω ⊂ R^N is polyhedral. Together with Ω, a sequence of geometrically conforming affine simplicial meshes (T_h)_{h>0} of Ω with mesh size h := max_{T∈T_h} diam(T) is assumed to be given. For details, we refer to [28]. In analogy to the case N = 2, we refer to each T_h as a triangulation. The (N-dimensional) Lebesgue measure of an element T ∈ T_h is denoted by λ(T). We also admit the standard assumption that the sequence (T_h) is shape-regular, i.e. there exists a constant c > 0 such that
$$\frac{\operatorname{diam}(T)}{\rho_T} \le c \quad \text{for all } T \in \mathcal{T}_h,\ h > 0,$$
where diam(T) = max_{x,y∈T} |x − y| denotes the diameter of T and ρ_T designates the diameter of the largest ball that is contained in T. We further write x_T for the (barycentric) midpoint of an element T, and M_h = {x_T : T ∈ T_h}, N_h and E_h for the set of element midpoints, triangulation nodes and edges with respect to T_h, respectively. By abuse of notation, we write |M_h| and |N_h| for the cardinality of the respective set. Let χ_T : Ω → R designate the characteristic function of T with respect to Ω, that is
$$\chi_T(x) = \begin{cases} 1 & \text{if } x \in T, \\ 0 & \text{otherwise.} \end{cases}$$
We further make use of the standard H^1(Ω)-conforming finite-element space of globally continuous, piecewise affine functions denoted by
$$P_{1,h}(\Omega) := \{ v \in C(\bar{\Omega}) : v|_T \in P_1 \ \text{for all } T \in \mathcal{T}_h \}.$$
Here, P_1 denotes the space of polynomials of degree less than or equal to one. Together with the finite-dimensional subspace P_{1,h}(Ω) and its standard nodal basis [...] where RT := {w ∈ P_1^d : ∃ a ∈ R^d, b ∈ R : w(x) = a + bx} and ν denotes the unit outer normal to T. To incorporate homogeneous Neumann boundary conditions, one uses the H_0(div; Ω)-conforming subspace [...] The construction of suitable edge-based basis functions {ϕ_E : E ∈ E_h} can be found in the literature, cf., for instance, [29], such that the boundary condition in the definition of RT_{0,h}(Ω) can be easily accounted for. The global Raviart-Thomas interpolation operator is given by [...]
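The shape-regularity condition can be checked numerically for a single element. The following is a minimal sketch for the planar case N = 2 only; the function name `shape_ratio` and the sample triangles are illustrative assumptions, not part of the paper. It uses the elementary fact that the inradius of a triangle equals twice its area divided by its perimeter, so that ρ_T is twice that value.

```python
import math

def shape_ratio(p0, p1, p2):
    """Return diam(T) / rho_T for the triangle T with vertices p0, p1, p2.

    diam(T) is the longest edge; rho_T is the diameter of the inscribed
    circle, i.e. twice the inradius 2 * area / perimeter.
    """
    edges = [math.dist(p0, p1), math.dist(p1, p2), math.dist(p2, p0)]
    diam = max(edges)
    # Area from the 2D cross product of two edge vectors.
    area = 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p2[0] - p0[0]) * (p1[1] - p0[1]))
    rho = 2.0 * (2.0 * area / sum(edges))
    return diam / rho

# Hypothetical sample elements: a right isosceles triangle and a thin sliver.
right_ratio = shape_ratio((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
sliver_ratio = shape_ratio((0.0, 0.0), (1.0, 0.0), (0.5, 0.01))
```

The ratio is invariant under scaling of the element, so a family of meshes obtained by uniform refinement of well-shaped triangles keeps it bounded, whereas a family containing ever thinner slivers does not.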

(c) Mosco convergence results under minimal regularity
We emphasize that the subsequent results may be extended to finite elements of higher order, which are typically useful when the solution to the variational problem, e.g. (2.8), displays a higher regularity. In this regard, higher regularity assumptions on the data and the obstacle are required; the concept of Mosco convergence is then not needed to prove the convergence of the finite-element method, and a priori error estimates with a rate can be derived (cf. e.g. [30]). However, we do not want to deviate from minimal regularity assumptions on the data. Further, even for simple variational problems such as the classical elasto-plastic torsion problem, there is a regularity limitation for the solution regardless of the smoothness of the data (cf. [4]). Note also that the subsequently covered problems comprise situations where the discrete feasible sets K_h are not necessarily nested and are non-conforming in the sense that they are in general not contained in the feasible set K(X). In the following, c denotes a positive constant, which may take different values on different occasions.

Lemma 5.2.
Let Ω ⊂ R^N be a polyhedral domain and α ∈ C(Ω̄) with α(x) ≥ 0 in Ω. Further let (w_h) be a sequence that fulfils for all h, w_h ∈ P_{1,h}(Ω)^d and |w_h(x_T)| ≤ α(x_T) for all T ∈ T_h. If w_h ⇀ w for h → 0 in L^2(Ω)^d then it holds that |w| ≤ α a.e. in Ω.
Proof. It suffices to show that i_K(w) = 0, where i_K denotes the indicator function of K = K(L^2(Ω)^d). Moreover, it holds that i_K = j*, where j* denotes the Fenchel conjugate of the mapping j : L^2(Ω)^d → R, j(v) := ∫_Ω α|v|_* dx. Here, |·|_* denotes the dual norm of |·|. From the definition of j*, we obtain that i_K(w) = 0 is equivalent to [...] (5.5). Let α_h and v_h denote the piecewise constant interpolants of α and v, respectively. By definition of (α_h) and (v_h) as well as the uniform continuity of α and v it follows that α_h → α and v_h → v, both in L^∞(Ω). By the weak convergence of (w_h), the strong convergence of (α_h) and (v_h) as well as the midpoint quadrature rule, we obtain [...] which proves (5.5).
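The limit argument that closes the proof of lemma 5.2 can be sketched as follows. This is a reconstruction under stated assumptions and not necessarily the authors' exact display: v is taken to be a uniformly continuous test function, α_h and v_h are the piecewise constant interpolants taking the values α(x_T) and v(x_T) on each element T, and the hypothesis on (w_h) is read as the midpoint bound |w_h(x_T)| ≤ α(x_T), by analogy with lemma 5.11.

```latex
\begin{align*}
\int_\Omega w \cdot v \,dx
  &= \lim_{h \to 0} \int_\Omega w_h \cdot v_h \,dx
   = \lim_{h \to 0} \sum_{T \in \mathcal{T}_h} \lambda(T)\, w_h(x_T) \cdot v(x_T) \\
  &\le \lim_{h \to 0} \sum_{T \in \mathcal{T}_h} \lambda(T)\, \alpha(x_T)\, |v(x_T)|_*
   = \lim_{h \to 0} \int_\Omega \alpha_h\, |v_h|_* \,dx
   = \int_\Omega \alpha\, |v|_* \,dx .
\end{align*}
```

The first equality combines the weak convergence of (w_h) in L^2(Ω)^d with the uniform convergence of (v_h). The second holds because the midpoint rule is exact for the affine function w_h on each simplex when v_h is constant there. The inequality uses w_h(x_T) · v(x_T) ≤ |w_h(x_T)| |v(x_T)|_* together with the midpoint bound, and the last equality follows from α_h → α and v_h → v in L^∞(Ω).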
Proof. The assertion follows by a slight modification of the proof of lemma 5.2. Instead of the piecewise constant interpolant we define α_h as the piecewise affine interpolant of α, i.e. α_h = I_h α, [...]
Theorem 5.4. Let Ω ⊂ R^N be a polyhedral domain and α ∈ C(Ω̄) such that (3.1) holds true. Then the sets [...]
Proof. Since weak convergence in H^1(Ω) implies weak convergence in L^2(Ω), the preceding lemma 5.2 shows that (M1) is fulfilled. We now show (M2′). To prove the assertion, we may use a strategy that is similar to the one in [4, ch. II] and requires (3.4). Note that theorem 3.1 implies that the set K̃ [...] for a suitable modification of c. This implies [...] Since any w ∈ K̃ is uniformly bounded away from α, there exists h_0 = h_0(w) such that r_h w ∈ K_h for all h ≤ h_0, which implies (M2′).
Proof. Again, lemma 5.2 implies that (M1) with X = L^2(Ω)^d holds true. For K̃ defined in (5.9) it holds that K̃ is also dense in K(L^2(Ω)^d) with respect to the L^2(Ω)^d-norm (cf. (3.2)). Thus, (M2′) follows analogously to the proof of theorem 5.4.
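The role of the threshold h_0 = h_0(w) in the verification of (M2′) can be illustrated in one space dimension. The following is a minimal sketch under stated assumptions: Ω = (0, 1) with a uniform mesh, a hypothetical continuous obstacle `alpha`, a hypothetical smooth function `w` lying strictly below it, r_h taken as the nodal piecewise affine interpolant, and the discrete constraint imposed at element midpoints (our reading of the discrete sets K_h).

```python
import math

def alpha(x):
    """Hypothetical continuous obstacle, bounded below by 0.5 on [0, 1]."""
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * x)

def w(x):
    """Hypothetical smooth function with |w| < alpha everywhere."""
    return 0.9 * alpha(x)

def midpoint_feasible(n):
    """Check |r_h w(x_T)| <= alpha(x_T) on a uniform mesh with n elements.

    r_h w is the piecewise affine nodal interpolant of w, so its value at
    an element midpoint is the mean of the two nodal values.
    """
    h = 1.0 / n
    for k in range(n):
        a, b = k * h, (k + 1) * h
        interp_mid = 0.5 * (w(a) + w(b))
        if abs(interp_mid) > alpha(0.5 * (a + b)):
            return False
    return True

# On a very coarse mesh the interpolant may violate the midpoint constraint,
# while on sufficiently fine meshes it is feasible.
coarse_ok = midpoint_feasible(2)
fine_ok = all(midpoint_feasible(n) for n in (16, 32, 64, 128))
```

For this particular choice of data the interpolant violates the constraint on the two-element mesh, since the obstacle dips to 0.5 at x = 0.75 while the interpolant stays at 0.9, and it is feasible on the finer meshes tested. This matches the statement that r_h w ∈ K_h only for h below a threshold depending on w.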

Corollary 5.6. Under the conditions of theorem 5.4, the node-based discrete sets
Proof. The proof is analogous to the proof of theorem 5.4, noting that (5.12) also implies r h w ∈ K h ∀h ≤ h 0 with K h according to the node-based definition (5.13).

Remark 5.7.
With the help of the density property (3.2) for uniformly continuous upper bounds, the above results on the Mosco convergence of discretized convex sets carry over to spaces involving homogeneous Dirichlet boundary conditions. In this context, the set P_{1,h}(Ω) in the definitions of the discretized sets K_h in (5.8) and (5.13) has to be replaced by the space [...] The resulting discrete sets K_h incorporate the zero boundary condition and the corresponding results on Mosco convergence for h → 0 remain valid replacing [...]
With the help of the density result (3.2), one obtains the following result for the discrete approximation of pointwise constraint sets in H(div; Ω) by the Raviart-Thomas finite-element space RT_h(Ω) (cf. (5.3)).
[...] |w| ≤ α a.e. in Ω. The continuity of the normal trace mapping H(div; Ω) ∋ w ↦ ⟨w · ν, v⟩_{H^{−1/2}(∂Ω),H^{1/2}(∂Ω)} ∈ R for fixed v ∈ H^1(Ω) implies w · ν = 0 in H^{−1/2}(∂Ω). We conclude that w ∈ K(H_0(div; Ω)), whence it follows that (M1) is satisfied. Secondly, note that the closure of [...] in H(div; Ω) equals K(H_0(div; Ω)); cf. (3.2). For the global Raviart-Thomas interpolation operator defined in (5.4), the following interpolation error estimate holds true [28, Corollary 1.115]: [...] for all u ∈ W^{2,∞}(Ω)^N. Setting r_h w := I^{RT}_h w for any w ∈ K̃, where K̃ [...] and taking account of the fact that I^{RT}_h w → w in H(div) for all w ∈ K̃, we may proceed analogously to the proof of theorem 5.4 to verify (M2′).
Next we consider pointwise constraints on the divergence. For X(Ω) ⊂ H(div; Ω) let
K_div(X(Ω)) := {w ∈ X(Ω) : |div w| ≤ α a.e. in Ω}. (5.16)
Using Raviart-Thomas finite elements, a discrete realization of the inequality constraint in (5.16) can be achieved by imposing the inequality on the midpoints of the triangulation. The following statement ensures that the resulting approach is stable as the mesh width goes to zero.
Proof. Taking account of the fact that w_h ⇀ w in H(div; Ω), w_h ∈ K_h, implies div w_h ⇀ div w in L^2(Ω), (M1) follows analogously to the corresponding part of the proof of corollary 5.9. Since K_div(C_c^∞(Ω)^N) is dense in K_div(H_0(div; Ω)) [11, Theorem 4], the set K̃ := {w ∈ C_c^∞(Ω)^d : |div w(x)| < α(x) ∀x ∈ Ω} is also dense in K_div(H_0(div; Ω)). Setting r_h = I^{RT}_h, the estimate (5.14) implies r_h w → w in H(div; Ω) and ‖div w − div r_h w‖_{L^∞(Ω)} ≤ c h ‖w‖_{W^{2,∞}(Ω)^N} for all w in K̃. In particular, one may argue as in the proof of theorem 5.4 to verify (M2′).
For a general L^p-function as upper bound, a point-based discretization is obviously not possible. As a remedy, the construction of the discrete sets K_h typically involves some kind of averaging process. For this purpose, we define the integral mean ⨍_T α dx := (1/λ(T)) ∫_T α dx over some given subset T ⊂ Ω (with positive measure). Now we have to take into account that the density results of the type (3.2) and (3.4), which represent the main ingredient to prove the consistency of the finite-element approximation, may fail to hold true (e.g. theorem 4.2). On the other hand, the results from §4 indicate that the density property is still guaranteed for a large class of discontinuous obstacles. To maintain the greatest level of generality, we assume that the non-negative measurable function α : Ω → R ∪ {+∞} allows for the density property [...] (5.17) Here, we concentrate on the consistency in the L^2-topology but an extension to the other cases is possible by appropriately modifying assumption (5.17). We stress the fact that assumption (5.17) is fulfilled in relevant situations (cf. e.g. theorem 4.4).
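The element-wise integral mean can be made concrete for a discontinuous bound. The following is a minimal one-dimensional sketch; the step-shaped obstacle, its jump location and the helper names are illustrative assumptions. It computes the mean of α over each element of a uniform mesh of (0, 1) exactly.

```python
def step_alpha_mean(a, b, jump=0.5, left=1.0, right=3.0):
    """Exact integral mean over [a, b] of a hypothetical step obstacle that
    equals `left` for x < jump and `right` for x >= jump."""
    len_left = max(0.0, min(b, jump) - a)    # part of [a, b] left of the jump
    len_right = max(0.0, b - max(a, jump))   # part of [a, b] right of the jump
    return (left * len_left + right * len_right) / (b - a)

def element_means(n):
    """Integral means of the step obstacle on a uniform mesh with n elements."""
    h = 1.0 / n
    return [step_alpha_mean(k * h, (k + 1) * h) for k in range(n)]

# With three elements, the middle element straddles the jump at x = 0.5.
means = element_means(3)
```

On the middle element the mean is 2, whereas a point evaluation at the midpoint x = 0.5 would depend on the chosen representative of α. This is why averaging, rather than point evaluation, is used to define the discrete bound in this setting.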
Lemma 5.11. Let Ω ⊂ R^N be a polyhedral domain and α ∈ L^2(Ω) with α(x) ≥ 0 a.e. in Ω. Let (w_h) be a sequence that fulfils for all h, w_h ∈ P_{1,h}(Ω)^d and |w_h(x_T)| ≤ ⨍_T α dx for all T ∈ T_h. If w_h ⇀ w for h → 0 in L^2(Ω)^d then it holds that |w| ≤ α a.e. in Ω.