Simultaneous inpainting and denoising by directional global three-part decomposition: connecting variational and Fourier domain-based image processing
Abstract
We consider the very challenging task of restoring images (i) that have a large number of missing pixels, (ii) whose existing pixels are corrupted by noise, and (iii) that ideally contain both cartoon and texture elements. The combination of these three properties makes this inverse problem a very difficult one. The solution proposed in this manuscript is based on directional global three-part decomposition (DG3PD) (Thai & Gottschlich 2016 EURASIP J. Image Video Process. 2016, 1–20 (doi:10.1186/s13640-015-0097-y)) with a directional total variation norm, a directional G-norm and an ℓ∞-norm in the curvelet domain as key ingredients of the model. Image decomposition by DG3PD enables decoupled inpainting and denoising of the cartoon and texture components. A comparison with existing approaches for inpainting and denoising shows the advantages of the proposed method. Moreover, we regard the image restoration problem from the viewpoint of a Bayesian framework and discuss the connections between the proposed function-space solution and related image representations by harmonic analysis and pyramid decomposition.
1. Introduction and related work
Image enhancement and image restoration are two superordinate concepts in image processing which encompass a plethora of methods to solve a multitude of important real-world problems [1,2]. Image enhancement has the goal of improving an input image for a specific application, e.g. in areas such as medical image processing, biometric recognition, computer vision, optical character recognition, texture recognition or machine inspection of surfaces [3–5]. Methods for image enhancement can be grouped by the domain in which they perform their operations: images are processed in the spatial domain or Fourier domain, or modified, e.g. in the wavelet or curvelet domain [6]. The types of enhancement methods include contextual filtering, e.g. for fingerprint image enhancement [7–9], contrast enhancement, e.g. by histogram equalization [10], and image super-resolution [11]. Image restoration is connected to the notion that a given input image suffers from degradation, and the goal is to restore an ideal version of it. Degradations are caused by various types of noise, missing pixels or blurring, and their countermeasures are denoising, inpainting and deblurring. In general, one has to solve a linear or nonlinear inverse problem to reconstruct the ideal image from its given degraded version. Denoising aims to remove noise from an image, and denoising methods include total variation (TV) minimization-based approaches [12], the application of non-local means (NL means) [13] or other dictionaries of image patches for smoothing, and adaptive thresholding in the wavelet domain [14]. Inpainting [15] is the filling-in of missing pixels from the available information in the image, and it is applied for scratch removal from scanned photographs, for occlusion filling, for removing objects or persons from images (in image forgery [16] or for special effects), and for filling-in of pixels which were lost during the transmission of an image or left out on purpose for image compression [17].
Deblurring [18] addresses the removal of blurring artefacts and is not the focus of this paper.
Rudin et al. [19] pioneered two-part image decomposition by TV regularization for denoising. Shen & Chan [20] applied TV regularization to image inpainting, called the TV inpainting model, and they also suggested image inpainting by curvature-driven diffusions [21]. Starck et al. [22] defined a model for two-part decomposition based on the dictionary approach. Then, Elad et al. [23] applied this decomposition idea for image inpainting by introducing the indicator function in the ℓ2 norm of the residual; see eqn (6) in [23]. Esedoglu & Shen [24] introduced two inpainting models based on the Mumford–Shah model [25] and its higher order correction—the Mumford–Shah–Euler image model. They also presented numerical computation based on the Γ-convergence approximations [26,27]. Shen et al. [28] proposed image inpainting based on bounded variation and elastica models for non-textured images.
Image inpainting can be an easy or difficult problem depending on the amount of missing pixels [21], the complexity of the image content and whether prior knowledge about the image content is available. Methods have been proposed which perform only cartoon inpainting (also referred to as structure inpainting) [20,28,29] or only texture inpainting [30]. Images which consist of both cartoon (structure) and texture components are more challenging to inpaint. Bertalmio et al. [31], Elad et al. [23] and Cai et al. [32] have proposed methods for inpainting which can handle images with both cartoon (structure) and texture components.
In this paper, we tackle an even more challenging problem. Consider an input image f which has the following three properties:
(i) a large percentage of pixels in f are missing and shall be inpainted,
(ii) the known pixels in f are corrupted by noise,
(iii) f contains both cartoon and texture elements.
The co-occurrence of noise and missing pixels in an image with cartoon and texture components increases the difficulty of both the inpainting problem and the denoising problem. A multitude of methods have been proposed for inpainting and denoising. Existing inpainting methods in the literature typically assume that the non-missing pixels in a given image contain only a small amount of noise or are noise free, and existing methods for denoising typically assume that all pixels of the noisy image are known. The proposed method for solving this challenging problem is inspired by the works of Efros & Leung [30], Bertalmio et al. [31], Vese & Osher [33], Aujol & Chambolle [34], Buades et al. [13] and Elad et al. [23], and is based on the directional global three-part decomposition (DG3PD) [35]. The DG3PD method decomposes an image into three parts: a cartoon image, a texture image and a residual image. The advantages of the DG3PD model lie in the properties which are enforced on the cartoon and texture images. The geometric objects in the cartoon image have a very smooth surface and sharp edges. The texture image contains oscillating patterns on a defined scale and is both smooth and sparse. Recently, the texture image has been applied as a very useful feature for fingerprint segmentation [35–37].
We address the challenging task of simultaneous inpainting and denoising in the following way. The advanced DG3PD model introduced in the next section decomposes a noisy input image f (with missing regions D) into cartoon u, texture v and residual ϵ components. At the same time, the missing regions D of the cartoon component u are interpolated and the available regions of u are denoised, owing to the multi-directional bounded variation. This effect benefits from the indicator function in the measurement of the residual, i.e. in (2.1). However, the texture v is not interpolated, because of the ‘cancelling’ effect of the supremum norm on the residual in the unknown regions. Therefore, the texture component v is inpainted and denoised by a dictionary-based approach instead. The DG3PD decomposition drives noise into the residual component ϵ, which is discarded. The reconstruction of the ideal version of f is obtained by summation of the inpainted and denoised cartoon and texture components (see figure 1 for a visual overview).
Figure 1. Overview of the DG3PD image inpainting and denoising process.
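The restoration pipeline described above can be sketched as follows. This is a minimal sketch: the helper callables are hypothetical stand-ins for the DG3PD-inpainting minimization of §2–§3 and the texture steps of §4, not the actual implementations.

```python
import numpy as np

def dg3pd_restore(f, known, decompose, inpaint_texture, denoise_texture):
    """Sketch of the overall pipeline (helper names are hypothetical).

    f               : degraded image (noise + missing pixels)
    known           : boolean mask, True where pixels of f are observed
    decompose       : stands in for the DG3PD-inpainting minimization; returns
                      (u, v, eps) with the cartoon u already interpolated/denoised
    inpaint_texture : dictionary-based texture inpainting (section 4)
    denoise_texture : NL-means-style texture denoising (section 4)
    """
    u, v, eps = decompose(f, known)   # three-part decomposition
    v = inpaint_texture(v, known)     # texture is inpainted separately ...
    v = denoise_texture(v)            # ... and denoised
    return u + v                      # the residual eps (noise) is discarded
```

The point of the sketch is the decoupling: the cartoon is handled inside the decomposition itself, while the texture is treated in a separate dictionary-based stage.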
Moreover, we uncover the link between the calculus of variations [38–40] and filtering in the Fourier domain [41] by analysing the solution of the convex minimization in equation (2.1). Roughly speaking, the solution of the DG3PD inpainting model can be understood as the response of a lowpass filter, a highpass filter and a bandpass filter, and the unity condition is satisfied; in particular:
— the scaling function and wavelet-like function for the cartoon u are from the effect of the multi-directional TV norm,
— the scaling function and wavelet-like function to extract the texture v are reconstructed by the effect of the multi-directional G-norm,
— the effect of the ℓ∞ norm is to remove the remaining signal in the known regions of the residual ϵ (due to the duality property of ℓ∞).
We also describe flowcharts to show that the method of variational calculus (i.e. the DG3PD inpainting) is a closed-loop pyramidal decomposition, which differs from an open-loop one, e.g. the wavelet [46] or curvelet [47] transform; see §7. Numerically, we observe that the closed-loop filter design by the calculus of variations results in lowpass, highpass and bandpass filters which are ‘unique’ for different images (figure 17). We also analyse the DG3PD inpainting model from the perspective of a Bayesian framework and then define a discrete innovation model for this inverse problem.
This paper is organized as follows. In §2, we describe the DG3PD model for image inpainting and denoising. In §3, we show how to compute the solution of the convex minimization in the DG3PD inpainting problem by the augmented Lagrangian method. In §4, we describe the proposed method for texture inpainting and denoising. In §5, we compare the proposed method with existing ones (TVL2 inpainting [48,49]). Moreover, in order to enhance our evaluation in terms of simultaneous inpainting and denoising effects, we also perform a comparison with an NL-means-type filter for denoising, namely block-matching and three-dimensional filtering (BM3D) [50]. In §6, we consider our inverse problem from a statistical point of view, i.e. the Bayesian framework, to describe how to select priors for cartoon u and texture v. We analyse the relation between the calculus of variations and the traditional pyramid decomposition scheme, e.g. the Gaussian pyramid, in §7. We conclude the study in §8. For more detailed notation and mathematical preliminaries, we refer the reader to [36,51].
2. Inpainting by DG3PD
We define a method for the restoration of an original noisy or noise-free image f with a set of known missing regions D. The proposed model is a generalized version of DG3PD [35] for the inpainting and denoising problem. This modification for our DG3PD inpainting is inspired by Elad et al. [23].
In particular, the discrete image f (of size m × n) with missing regions D is simultaneously decomposed into cartoon u, texture v and noise ϵ, while the cartoon u is interpolated on the missing regions owing to the indicator function χD in the modified model.
The set of missing regions D on a bounded domain Ω is described by the indicator function χD, whose complement is
Instead of applying χcD inside the ℓ2 norm of the residual (see eqns (4) and (6) in [23]), we introduce χcD for the residual in the DG3PD model via the point-wise multiplication operator · × , i.e. . We propose the DG3PD inpainting model as follows:
Note that if there are no missing regions, i.e. χcD[k] = 1, ∀k∈Ω, the minimization problem (2.1) becomes the DG3PD model [35]. Next, we discuss how to solve the DG3PD-inpainting model (2.1).
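As a minimal illustration of the role of the indicator function: the residual is measured only on the known pixels via point-wise multiplication, which leaves the model free to interpolate inside the missing regions D. Variable and function names here are ours, not the paper's.

```python
import numpy as np

def masked_residual(f, u, v, known):
    """chi_Dc .* (f - u - v): the residual is measured only on known pixels,
    so the model is free to interpolate u inside the missing regions D."""
    chi_c = known.astype(f.dtype)   # indicator of the complement of D
    return chi_c * (f - u - v)      # point-wise multiplication (the .x operator)
```

Wherever `known` is False (i.e. inside D), the term vanishes and imposes no fidelity constraint.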
3. Solution of the DG3PD inpainting model
In this section, we present a numerical algorithm for obtaining the solution of the DG3PD-inpainting model stated in (2.1). To simplify the calculation, we introduce three new variables
and the Lagrange multipliers are updated after every step t. We also initialize $u^{(0)} = f$ and $v^{(0)} = \epsilon^{(0)} = e^{(0)} = [r_l^{(0)}]_{l=0}^{L-1} = [w_s^{(0)}]_{s=0}^{S-1} = [g_s^{(0)}]_{s=0}^{S-1} = [\lambda_{1l}^{(0)}]_{l=0}^{L-1} = [\lambda_{2a}^{(0)}]_{a=0}^{S-1} = \lambda_3^{(0)} = \lambda_4^{(0)} = \lambda_5^{(0)} = \mathbf{0}$, where $\mathbf{0}$ is an m × n zero matrix.
In each iteration, we first solve the seven subproblems in the listed order: the $[r_l]_{l=0}^{L-1}$-problem, $[w_s]_{s=0}^{S-1}$-problem, $[g_s]_{s=0}^{S-1}$-problem, v-problem, u-problem, e-problem and ϵ-problem; then we update the five Lagrange multipliers, namely $[\lambda_{1l}]_{l=0}^{L-1}$, $[\lambda_{2a}]_{a=0}^{S-1}$, $\lambda_3$, $\lambda_4$, $\lambda_5$ (see algorithm 1).
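The alternating scheme can be sketched as follows. The subproblem solvers and the multiplier update are placeholders for the closed-form solutions of [35] and this section; only the control flow (ordered subproblems, then multiplier updates) reflects the algorithm.

```python
import numpy as np

def dg3pd_iterate(f, solve, update_multipliers, n_iter=50):
    """Skeleton of the augmented Lagrangian iteration (a sketch).

    solve              : dict mapping a subproblem name to a minimizer
                         state -> new value, applied in the listed order
    update_multipliers : updates the five Lagrange multipliers in `state`
    """
    zero = np.zeros_like(f)
    # u is initialized to f; all other variables and multipliers to zero
    state = {"u": f.copy(), "v": zero.copy(), "e": zero.copy(), "eps": zero.copy()}
    order = ["r", "w", "g", "v", "u", "e", "eps"]   # the seven subproblems
    for _ in range(n_iter):
        for name in order:
            if name in solve:
                state[name] = solve[name](state)    # solve each subproblem in turn
        update_multipliers(state)                   # then the multiplier updates
    return state
```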
In this section, we only present the solution of the ‘e-problem’ and ‘ϵ-problem’ as well as the updated Lagrange multiplier λ5. We refer readers to [35] for a detailed explanation of the other subproblems and the other Lagrange multipliers.
The e-problem: Fix u, v, ϵ, $[r_l]_{l=0}^{L-1}$, $[w_s]_{s=0}^{S-1}$, $[g_s]_{s=0}^{S-1}$ and

The solution of (3.3) is defined as a projection operator onto a convex set A(ν), i.e. :
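If the convex set A(ν) is the ℓ∞ ball of radius ν in the transform domain, the projection amounts to coefficient-wise clipping; a sketch under that assumption (function name ours):

```python
import numpy as np

def project_linf_ball(x, nu):
    """Projection onto {y : ||y||_inf <= nu}: clip every coefficient to [-nu, nu]."""
    return np.clip(x, -nu, nu)
```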
The ϵ-problem: Fix u, v, e, $[r_l]_{l=0}^{L-1}$, $[w_s]_{s=0}^{S-1}$, $[g_s]_{s=0}^{S-1}$ and
Updated Lagrange multiplier $\lambda_5 \in X$:
Choice of parameters
To adapt to specific images and to balance the smoothing terms against the update terms in the solutions of the above subproblems, the parameters (μ1, μ2, β1, β2, β3) are selected as described in [35], and β5 is defined as
The term in (2.1) serves as a good measure of the amount of noise, due to the subtraction in equation (3.4) [35,36,53] (owing to the multi-scale and multi-direction properties of the curvelet transform and its duality). However, because of the indicator function χcD in the noise measurement, the texture interpolated by DG3PD inpainting ends up almost entirely in the residual; see figure 2a, and its smoothed version by curvelet shrinkage (figure 2b). Because of the subtraction operator in equation (3.4), these interpolated textures are cancelled out in e (figure 2c). Figure 2d illustrates the estimated texture before the subtraction in equation (3.4) as

Figure 2. The ‘cancelling effect’ is caused by the noise measurement .
In order to analyse the effects of the proposed model in terms of interpolation (for u) and decomposition, we consider a one-dimensional signal (extracted along the red line in figure 3a–d). After DG3PD inpainting, the mean values of the cartoon u (figure 3e) remain almost unchanged. The homogeneous regions have ‘sharp’ edges and a ‘staircase’ effect does not occur (owing to the directional TV). The texture v (figure 3f) is extracted in areas which contain a repeated pattern (figure 3b). Moreover, small-scale objects, e.g. noise, are removed from v.
Figure 3. Interpolation for a one-dimensional signal after DG3PD inpainting (without texture inpainting): (c) a one-dimensional signal extracted along the red line in the original image (a). (d) The signal in (c) corrupted by Gaussian noise and missing regions D; see (b). Because of the directional TV norm, the DG3PD inpainting produces ‘sharp’ edges without a ‘staircase’ effect in the cartoon u (e), and the mean values of u remain almost unchanged. Owing to noise and the shrinkage soft-thresholding operator induced by ∥v∥ℓ1 in (2.1), the texture v is reconstructed with a ‘sharp’ transition in areas where texture is present in the original signal (c). However, owing to ‘heavy’ noise ϵ on the known domains in (e), the DG3PD fails to reconstruct a ‘full’ texture. In general, recovering an oscillating (or weak) signal corrupted by heavy noise is a challenging problem. (h) A reconstructed image (without the texture inpainting step) and (i) a quantized signal, i.e. truncated to [0, 255] and quantized.
However, the term causes a cancelling effect which removes the interpolated texture in the unknown areas during the decomposition process. Therefore, texture inpainting and denoising are tackled separately, as described in the next section, by a generalized version of dictionary-based texture synthesis [30].
4. Texture inpainting and denoising
The proposed method for reconstructing the texture component combines the following ideas in consecutive steps:
— texture extraction by DG3PD [35],
— texture inpainting inspired by the work of Efros & Leung [30] or [54],
— texture denoising by NL means [13],
— image synthesis by summation with the cartoon component u.
The DG3PD model enforces sparsity and smoothness on the obtained texture component v. Sparsity means that the vast majority of coefficients in v are equal to zero (the percentage of zero coefficients depends on how much texture a specific image contains). We make good use of this property to answer the following question: which of the missing pixels shall be inpainted with texture? First, the texture component v is segmented into zero (grey pixels in figure 4a) and non-zero coefficients (black and white pixels in figure 4a). Morphological processing (as described in [36,37] for fingerprint segmentation) yields the texture regions (figure 4b). The mask of missing pixels D shown in figure 4c is dilated (we used a circular structuring element with a radius of 5 pixels to avoid border effects on the margin between existing and missing pixels) and then intersected with the texture region segmentation to obtain the mask I shown in figure 4d, which marks the pixels to be inpainted with texture.
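The construction of the mask I can be sketched as follows. This is a simplified stand-in: the dilation uses a square structuring element instead of the circular one from the text, and the "texture region" is just the slightly dilated support of v rather than the full morphological processing of [36,37].

```python
import numpy as np

def dilate(mask, radius):
    """Naive binary dilation with a square element (a simple stand-in for the
    circular structuring element used in the text)."""
    m, n = mask.shape
    padded = np.pad(mask, radius, mode="constant")
    out = np.zeros_like(mask)
    for i in range(m):
        for j in range(n):
            out[i, j] = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1].any()
    return out

def texture_inpainting_mask(v, missing, radius=5):
    """Mask I: (dilated missing-pixel mask D) intersected with the texture region.
    The texture region here is the support of v slightly dilated, a crude
    stand-in for the morphological processing of [36,37]."""
    texture_region = dilate(v != 0, 1)
    return dilate(missing, radius) & texture_region
```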
Figure 4. All coefficients of the obtained texture component v equal to zero are visualized by grey pixels in (a); positive coefficients are indicated by white pixels and negative coefficients by black pixels. (b) The resulting segmentation. The mask of missing pixels D shown in (c) is dilated and then intersected with (b), leading to the (white) pixels which are to be inpainted with texture in (d).
The proposed texture inpainting proceeds in two phases. First, we build a dictionary of texture patches and, second, we inpaint all pixels of mask I in a specific order. For the dictionary, we select all patches of size s × s pixels (we used s = 15 in our experiments) from the texture component v which meet the following two criteria: (i) at least p1 per cent of the pixels are known (also the central pixel of the patch has to exist) and (ii) at least p2 per cent of the coefficients are non-zero (we used p1 = 70% and p2 = 60% in our experiments). The first criterion excludes patches which contain too many missing (unknown) pixels and the second criterion excludes patches without texture from the dictionary. Next, we iterate over all pixels of mask I and consider the pixel to be inpainted as the central pixel of an image patch of size s × s. We count the number of known pixels inside the patch and compute the percentage of known pixels. Pixels are inpainted in the following order. We start with a threshold percentage t = 90%; this is decreased in steps of 5% per iteration and, in each iteration, we inpaint all pixels of mask I for which at least t per cent are known. The rationale behind this ordering is that the more pixels are known (or already inpainted) in the neighbourhood around a missing pixel, the better it can be inpainted, because this additional information improves the chances of finding a good match in the texture dictionary. A third constraint ensures that the overlap of known pixels between a dictionary patch and the patch around the missing pixel contains at least p3 per cent of known pixels (we used p3 = 30% in our experiments). Inpainting a pixel means that we find the best-fitting patch in our texture dictionary and set the pixel value to the value of the central pixel of that patch. Best-fitting is defined as the minimum sum of squared differences per pixel, divided by the number of overlapping pixels.
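The best-match lookup for a single pixel can be sketched as follows. Names are ours, and we make the simplifying assumption that the dictionary patches are fully known; the real method also tracks missing pixels inside dictionary patches.

```python
import numpy as np

def best_match_value(patch, known, dictionary, s=15, p3=0.30):
    """Inpaint one pixel: value of the centre pixel of the best-fitting patch.

    patch      : s x s neighbourhood around the missing pixel
    known      : boolean mask of known / already-inpainted pixels in `patch`
    dictionary : list of s x s texture patches (assumed fully known here)
    Fit = sum of squared differences over the overlap, divided by its size;
    the lookup is skipped if the overlap is below p3 (the third constraint).
    """
    n_ov = known.sum()
    if n_ov < p3 * s * s:
        return None                              # overlap constraint violated
    best_val, best_dist = None, np.inf
    for cand in dictionary:
        d = ((patch[known] - cand[known]) ** 2).sum() / n_ov
        if d < best_dist:
            best_val, best_dist = cand[s // 2, s // 2], d
    return best_val
```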
After all missing texture pixels have been inpainted, the texture region is denoised by n iterations of NL means. In each iteration, we first construct a dictionary considering all patches in the texture region with known and previously inpainted pixels. Next, for each pixel to be denoised, we find the top k fitting patches in the dictionary using the same distance function (sum of squared pixel-wise differences) and set the denoised pixel to the average value of the top k central pixel values. In our experiments, we used k = 5 and we have observed that, after about n = 10 iterations, the image has reached a steady state. See figure 5 for a visualization of the status at several intermediate steps during the inpainting and denoising process.
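The per-pixel denoising step can be sketched as a simplified variant of the top-k averaging described above (function name and setup are ours):

```python
import numpy as np

def topk_denoise_pixel(patch, dictionary, k=5):
    """Average the centre values of the k best-fitting dictionary patches,
    using the same distance (sum of squared pixel-wise differences)."""
    s = patch.shape[0]
    dists = [float(((patch - cand) ** 2).sum()) for cand in dictionary]
    idx = np.argsort(dists)[:k]
    return float(np.mean([dictionary[i][s // 2, s // 2] for i in idx]))
```

Iterating this over the texture region, rebuilding the dictionary each pass, corresponds to the n ≈ 10 NL-means iterations reported above.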
Figure 5. Images (a)–(e) visualize inpainting progress (added to the cartoon component) from 0% to 100% in steps of 25%. Images (f)–(h) display the denoising results after one, five and 20 iterations, respectively.
Finally, the full image is reconstructed by summation of the inpainted cartoon component u with the inpainted and denoised texture component v (figure 10).
5. Comparison of DG3PD with further inpainting and denoising methods
Figure 6 shows inpainting results for the challenging problem under consideration (figure 7c) obtained by well-known inpainting methods from the literature: an approach based on Navier–Stokes equations from fluid dynamics proposed by Bertalmio et al. [49], an inpainting method suggested by Telea [48] and a well-known TVL2 inpainting approach with its synthesis image shown in figure 6c,d.
Figure 6. Comparison of inpainting results obtained by (a) Navier–Stokes [49], (b) Telea [48], (c,d) TVL2 and (e) DG3PD.
Figure 7. An ideal image (a) with cartoon and texture components. Missing pixels (b) are shown in white. Image (c) combines missing pixels and noise.

The image shown in figure 7c has the three properties discussed in §1:
(i) a large percentage of pixels in f are missing and shall be inpainted,
(ii) the known pixels in f are corrupted by noise,
(iii) f contains both cartoon and texture elements.
Furthermore, a noise-free, ideal version f0, depicted in figure 7a, is available, which serves as the ground truth for evaluating the quality of the inpainted and denoised images produced by different methods. The availability of a noise-free image is an advantage for comparing and evaluating different inpainting and denoising methods. By contrast, images taken by a digital camera contain a certain amount of noise and, for them, an ideal version is not available; see, for example, the Barbara image we used in previous comparisons [35]. The choice of the ideal image (shown in figure 7) enables us to clearly see and understand the advantages and limitations of the compared methods (figure 6).
Owing to the fidelity term, the L2-norm, in the TVL2 inpainting model, noise is reduced and the homogeneous areas are well interpolated. However, the method is known to cause a ‘staircase’ effect on homogeneous regions and texture, while small-scale objects tend to be eliminated. If the parameter β5 is chosen smaller (e.g. β5 = 0.005), the resulting inpainted image is smoother, i.e. the ‘staircase’ effect is reduced, but more texture parts are also removed.
DG3PD inpainting produces a smooth result without the staircase effect. In comparison with the other approaches, it is the only method which can reconstruct the texture regions in a satisfactory way.
In order to evaluate the proposed model in terms of image denoising, we compared it with BM3D [50]. This NL-means-type filter enhances the sparse representation of an image in a transform domain by grouping similar two-dimensional image blocks into three-dimensional ‘groups’. Collaborative filtering is applied to these three-dimensional groups in three successive steps: three-dimensional transformation of a group, shrinkage of the transform spectrum, and inverse three-dimensional transformation. Finally, aggregation (i.e. an averaging procedure) is applied to the different estimates of each pixel arising from the overlapping blocks.
BM3D is a state-of-the-art method for image denoising. However, it is not designed for inpainting and, in order to be able to run the BM3D code available at http://www.cs.tut.fi/foi/GCF-BM3D/, we have to provide input values for the missing pixels; therefore, we
— set the missing pixels of the corrupted image in figure 7c to uniformly random values in [0, 255] (figure 8a),
— set the standard deviation of the noise in the known regions to σ = 100, and
— linearly rescale the input image to [0, 1].
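The input preparation for BM3D can be sketched as follows (function name and seed are ours):

```python
import numpy as np

def prepare_bm3d_input(f, missing, seed=0):
    """Fill missing pixels with uniform random values in [0, 255], then
    linearly rescale the image to [0, 1] as described above."""
    rng = np.random.default_rng(seed)
    g = f.astype(float).copy()
    g[missing] = rng.uniform(0.0, 255.0, size=int(missing.sum()))
    return g / 255.0
```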

Figure 8. Performance of BM3D for image denoising. The input image (a) is obtained by setting the pixels in the missing regions of figure 7c to uniformly random values in [0, 255]. Image (b) is the output of BM3D applied to (a).
In the end, we found that, while BM3D can separate noise reasonably well from the corrupted image in figure 8a, the resulting image shown in figure 8b is not satisfactory. In particular,
— from the inpainting perspective, BM3D cannot interpolate the missing areas in homogeneous or textured regions, and
— from the denoising perspective, the areas reconstructed by BM3D are blurred, without sharp edges, with a loss of contrast and with several artefacts in homogeneous regions.
6. From the Bayesian framework to the DG3PD inpainting model
Following [55], in this section, we describe the DG3PD inpainting model from the perspective of a Bayesian framework through a maximum a posteriori (MAP) estimator. We assume that u and v are independent random variables (r.v.).
— Prior of cartoon image u: A cartoon $u = [u[k]]_{k\in\Omega}$ consists of pixels u[k], each of which is considered as an independent r.v. with a prior $p_{U_k}$.
Given k∈Ω, denote $r[k] = \nabla^+_L u[k] = [r_l[k]]_{l=0}^{L-1}$. To describe the distribution of the r.v. $U_k$, we first define the distribution of an L-variate r.v. $R_k = [R_{0,k}, \ldots, R_{L-1,k}]$. We assume that the r.v. $R_k$ has a multivariate Laplace distribution, which is a part of the multivariate power exponential family [56–58], i.e. with and the Γ( · ) function:
Thus, the distribution of the r.v. $U_k$ is a multi-dimensional Laplace distribution under the operator $\nabla^+_L$:

The joint probability density function (p.d.f.) of the prior u is

We choose . The potential function of u in matrix form is

Note that the original u does not have a Laplace distribution; rather, the transform of u under the operator $\nabla^+_L$ has an independent multi-dimensional Laplace distribution.

— Prior of texture image v: Under a transform operator , the texture image v is decomposed into different orientations. The operator is suitably chosen to capture the texture components of the original image f. As in the definition of the discrete multi-dimensional G-norm [35], the transform (in a direction s, s = 0, …, S − 1) is defined by its inversion (note that ):
We assume that the texture v is sparse in a transform domain and sparse in the spatial domain (non-zero coefficients of v are only due to texture). To satisfy these conditions, we assume that the r.v. $V_k$ has a mixture of a Laplace distribution in the spatial domain with p.d.f. $p_{V_k,2}(v[k])$ (for sparsity in the spatial domain) and a multivariate Laplace distribution (under an operator ) with p.d.f. $p_{V_k,1}(v[k])$ (for sparsity in the transform domain). Thus, the p.d.f. of the r.v. $V_k$ is as follows: (6.1)

Since $V_{2k} \sim \mathrm{Laplace}(0, \gamma_2^2)$, we have (6.2)

To define the distribution of the r.v. $V_{1k}$, we define the distribution of the S-dimensional r.v. $G_k = [G_{0,k}, \ldots, G_{S-1,k}]$. Similarly to $U_k$, we assume that the r.v. $G_k$ has a multivariate Laplace distribution, i.e. $G_k \sim \mathrm{Laplace}_S(0, \Sigma)$ with . It is easy to see that (with ) or (6.3)

Putting (6.2) and (6.3) into (6.1), we obtain the p.d.f. of the r.v. $V_k$ as

The joint p.d.f. of the texture image $v = [v[k]]_{k\in\Omega}$ is

and we choose $\mu_2 = 1/2\gamma_2$ and $\mu_1 = 1/2\gamma_1$. The potential function of v, in an anisotropic version in matrix form, is defined as

— Likelihood (the joint p.d.f. $p_{F\mid U,V}(f\mid u, v)$): If we assume that the residual ϵ in an image f has a power exponential distribution [56,57], i.e. $E_k \sim \mathrm{PE}(\mu, \sigma^2, \xi)$, e.g. Gaussian (ξ = 1) or Laplacian , then its density function at k∈Ω is
Owing to the inpainting problem with a missing region D, the likelihood is defined on the known domain Ω\D as

— A posteriori: Since u and v are independent of each other, the posterior is written as
Let ϵ = f − u − v; then the MAP estimator, i.e. , is defined as
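Taking the negative logarithm turns the MAP estimation into an energy minimization; schematically, under the independence assumption above,

```latex
(\hat{u}, \hat{v})
  = \arg\max_{u,v}\; p_{F \mid U,V}(f \mid u, v)\, p_U(u)\, p_V(v)
  = \arg\min_{u,v}\; \bigl[ -\log p_{F \mid U,V}(f \mid u, v)
      - \log p_U(u) - \log p_V(v) \bigr],
```

so the Laplace-type priors contribute the ℓ1-type potential functions of u and v, and the likelihood contributes the residual term on Ω\D.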
Following [59], if the noise in image f is Gaussian then, by extreme value theory, the r.v. has a Gumbel distribution. Thus, there is a condition for choosing ν. However, in practice, if the noise cannot be identified, ν is chosen as an α-quantile. Note that the condition is similar to that of the Dantzig selector [60].
Figure 9 illustrates the empirical density functions of the solution of the minimization (3.2). The statistical properties of the solution can be characterized in the Bayesian framework with the priors, likelihood and posterior as mentioned above.
Figure 9. The empirical density functions of the solution of (3.2) by the alternating direction method of multipliers: (a)–(g) are the empirical density functions of u, v, ϵ, r0, g0, w0 and the QQ-plot of the residual ϵ, respectively. We assume that the prior of $\nabla^+_L u$ has a multivariate Laplace distribution, i.e. the changed variable $r = [r_l]_{l=0}^{L-1} = \nabla^+_L u$ has a multivariate Laplace distribution, or rl has a univariate Laplace distribution. This causes the sparseness of rl through the shrinkage operator; see (d). The sparsity of $\nabla^+_L u$ causes the smoothness of the objects in the cartoon u (a). The smooth and sparse texture v (due to the expansion of the whole intensity range of its non-zero coefficients caused by ∥v∥ℓ1 in (2.1)) is illustrated in (b). Because additive white Gaussian noise is used in the simulation, the distribution of ϵ in Ω\D is approximately normal; see plot (c) and its QQ-plot (g). The differences near the boundary in (g) may arise from: (1) the simulation of the Gaussian noise, i.e. the tail of the simulated Gaussian noise does not extend to infinity; (2) the texture remaining in ϵ (due to the selection of ν). The change of variable w = g causes non-sparsity in g and sparsity in w (which is shrunk from g); cf. (e) and (f).
Note that the DG3PD inpainting model (2.1) can be described as an inverse problem in image restoration (e.g. image inpainting and denoising); see figure 10 for the discrete innovation model.
Figure 10. Overview of the discrete innovation model for DG3PD-based image inpainting and denoising. In this study, the blurring operator H is an identity matrix. Note that the inverse does not exist, in general, but the reconstructed image is obtained via the function-space formulation of the inverse problem.
7. Relation between variational analysis and pyramid decomposition
In this section, we discuss connections and similarities between the proposed DG3PD inpainting model and existing work in the area of pyramid representations, multi-resolution analysis and scale-space representation. Historically, Gaussian pyramids and Laplacian pyramids have been developed for applications such as texture synthesis, image compression and denoising. Early works in this area include a paper by Burt & Adelson [42] applying Laplacian pyramids to image compression. Pyramidal decomposition schemes [42–44] can be grouped into two categories, their main property being either lowpass or bandpass filtering. Closely related are transforms from multi-resolution analysis, including the wavelet, curvelet [47,53], contourlet [61] and steerable wavelet [62] transforms. A difference in the proposed DG3PD inpainting model is that its basis elements for obtaining the decomposition are constructed by discrete differential operators, and the decomposition is solved in a nonlinear way which adapts to each image (enabled by update iterations, corresponding to the loop in figure 15). The discussion of commonalities and differences is made more precise and detailed in the rest of the section.
Let us denote as a discrete Fourier transform pair. Given $z_1 = e^{j\omega_1}$ and $z_2 = e^{j\omega_2}$ and the Dirac delta function δ( · ), the impulse responses of the discrete directional derivative operators (l = 0, …, L − 1) and their spectra are
— the forward operator:
— the backward operator:
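For the axis-aligned case, the forward difference $(\partial^+ u)[k] = u[k+1] - u[k]$ has impulse response $\delta[k+1] - \delta[k]$ and spectrum $z - 1$, and the backward difference has spectrum $1 - z^{-1}$. This can be checked numerically on an FFT grid (a sketch for one axis only; the paper's operators are the L directional versions):

```python
import numpy as np

# Forward difference (one axis): (d+ u)[k] = u[k+1] - u[k], i.e. convolution with
# h[k] = delta[k+1] - delta[k].  Its spectrum is z - 1 with z = exp(j*omega).
n = 8
h = np.zeros(n)
h[0], h[-1] = -1.0, 1.0            # h[-1] wraps around on the periodic grid
H = np.fft.fft(h)
omega = 2 * np.pi * np.arange(n) / n
z = np.exp(1j * omega)
assert np.allclose(H, z - 1.0)     # spectrum of the forward difference

# Backward difference: (d- u)[k] = u[k] - u[k-1], spectrum 1 - 1/z.
hb = np.zeros(n)
hb[0], hb[1] = 1.0, -1.0
assert np.allclose(np.fft.fft(hb), 1.0 - 1.0 / z)
```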
7.1. The ‘u-problem’
In order to describe the variational analysis for the u-problem from the point of view of filtering in the Fourier domain for a pyramid decomposition scheme, we define the discrete inverse Fourier transform of and in the ‘u-problem’ in [35] as
Given β1 = c1β4, the spectra and impulse responses of the scaling and frame functions in (7.1) (l = 0, …, L − 1) are
— scaling function φ( · ) and its dual with the interpolant ,
— frame function ψl( · ) and its dual ,
— proximity operator for frame elements, i.e. Shrink( · , · ).
Note that the frame soft-thresholding FST( · , · , · ) in (7.1) consists of the frame element ψl, its dual and the shrinkage operator Shrink( · , · ), together with the updated Lagrange multiplier. Although FST( · , · , · ) is similar to the wavelet/curvelet shrinkage operator, this operator is obtained from the calculus of variations.
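The FST structure (analyse with the frame element, shrink the coefficients, synthesise with the dual) can be sketched generically. Here a periodic forward difference stands in for ψl and its adjoint for the dual synthesis step; this is an illustrative sketch, not the actual operator of [35].

```python
import numpy as np

def shrink(x, tau):
    """Soft-thresholding: Shrink(x, tau) = sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fst_1d(u, tau):
    """Frame soft-thresholding sketch: analyse with psi, shrink the frame
    coefficients, synthesise with the dual (here the adjoint difference)."""
    d = np.roll(u, -1) - u            # analysis: periodic forward difference
    d = shrink(d, tau)                # shrinkage of the frame coefficients
    return np.roll(d, 1) - d          # synthesis: adjoint of the forward difference
```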
It is easy to obtain the unity property of frame elements in the Fourier domain as
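One way to see how such a unity property can arise is from a quadratic (ADMM-type) u-update: if the scaling function is Φ = β1/(β1 + β4 Σ_l |D_l|²) and the dual frame elements are Ψ′_l = β4 conj(D_l)/(β1 + β4 Σ_l |D_l|²) with Ψ_l = D_l, the sum telescopes to one. The construction below is a sketch under these assumptions, with toy 1-D difference spectra of varying step length standing in for the directional derivative spectra; the β values follow figure 11.

```python
import numpy as np

beta1, beta4, L = 0.04, 0.04, 4
N = 256
omega = 2 * np.pi * np.arange(N) / N

# toy stand-ins for the L directional derivative spectra (different step lengths)
D = [1.0 - np.exp(-1j * (l + 1) * omega) for l in range(L)]
energy = sum(np.abs(Dl) ** 2 for Dl in D)

# scaling function and dual frame elements of a quadratic u-update
Phi = beta1 / (beta1 + beta4 * energy)
Psi = D
Psi_dual = [beta4 * np.conj(Dl) / (beta1 + beta4 * energy) for Dl in D]

unity = Phi + sum(p * q for p, q in zip(Psi, Psi_dual))
print(np.max(np.abs(unity - 1.0)))  # ~ machine precision
```

The identity Φ + Σ_l Ψ_l Ψ′_l = 1 holds algebraically for any choice of the D_l, which is the Fourier-domain counterpart of perfect reconstruction in the pyramid view.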

Figure 11. The spectra of the frame functions in the ‘u-problem’ and the ‘g-problem’ with the parameters L = S = 4, β1 = β4 = 0.04, β2 = β3 = 0.3 and . The unity properties for these frame functions are and Ξ_a(z) + Ψ_a(z)Ψ′_a(z) = 1, a = 0, …, S − 1.

Figure 12. The spectrum of the frame functions in the u and g problems with the parameters L = S = 8, β1 = β4 = 0.04, β2 = β3 = 0.3 and ; see figure 13 for more directional frame functions of the g problem.

Figure 13. The spectrum of the frame functions in the g-problem with S = 8 directions.
7.2. The ‘[gs]S−1s=0-problem’
In analogy to the u-problem in the variational framework, we define the frame operators to extract the directional features of the texture v in the pyramid decomposition scheme. The discrete inverse Fourier transforms of and in the solution of the ‘v-problem’ [35] are defined as
Similar to the u-problem, it is easy to obtain the unity property of frame elements (ξ, ψa, ψ′a) with a∈[0, S − 1] in the Fourier domain:
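A toy way to picture how S directional frame elements can satisfy such a unity property is an angular partition of unity: triangular windows centred on S equally spaced orientations, where any angle is covered by exactly two adjacent windows whose weights sum to one. This is only a sketch of the partition idea, not the actual directional frames of the DG3PD model.

```python
import numpy as np

S = 4
thetas = 2 * np.pi * np.arange(S) / S

def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def window(theta, s):
    """Triangular angular window centred on direction s, support width 4*pi/S."""
    d = np.abs(wrap(theta - thetas[s]))
    return np.maximum(0.0, 1.0 - d / (2 * np.pi / S))

theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
total = sum(window(theta, s) for s in range(S))
print(np.allclose(total, 1.0))  # True: the S windows tile the circle
```

In the actual model the directional weighting enters through the directional total variation and G-norm terms, but the tiling property plays the same role: no orientation is over- or under-counted in the reconstruction.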
7.3. The ‘v-problem’
The solution to the v-problem is rewritten as
7.4. The ‘e-problem’
The noise term in equation (2.1) is suitable for capturing highly oscillating patterns, especially small-scale types of noise, because of the multi-orientation and multi-scale structure of the curvelet domain and the supremum norm. For convenience of calculation, we introduce a variable e for the residual ϵ restricted to the known domain Ω\D. The rationale for using the supremum norm of ϵ in the curvelet domain can be found in (3.4). In particular, (3.4) contains two terms: the residual in Ω\D with the updated Lagrange multiplier λ5, i.e. (χ_{Dᶜ} .× ϵ − λ5/β5), and its curvelet smoothing at level ν, i.e. CST(χ_{Dᶜ} .× ϵ − λ5/β5, ν). Assume that, at iteration t, some signal, e.g. texture, remains in the residual ϵ in Ω\D. The curvelet soft-thresholding operator CST( · , · ) reduces noise (or small-scale objects) in (χ_{Dᶜ} .× ϵ − λ5/β5) at level ν across the different scales and orientations. By the subtraction in (3.4), e at iteration t contains mainly noise or small-scale objects (figure 14).
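The shrink-and-subtract mechanism can be sketched in 1-D: soft-thresholding keeps the large-amplitude (structured) part, and subtracting it from the input leaves a residual whose samples are capped at the threshold, i.e. the small-scale content. In the model this happens on curvelet coefficients per scale and orientation; here it is applied directly in the signal domain, and the threshold value is illustrative.

```python
import numpy as np

def shrink(x, nu):
    """Soft-thresholding, the proximity operator of nu * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - nu, 0.0)

rng = np.random.default_rng(0)
n = 256
texture = np.sin(np.linspace(0, 16 * np.pi, n))   # structured, large-amplitude part
noise = 0.1 * rng.standard_normal(n)              # small-scale oscillations
x = texture + noise

nu = 0.1
smoothed = shrink(x, nu)    # large coefficients survive, small ones are set to zero
residual = x - smoothed     # the subtraction keeps only the small-scale content

# every residual sample is capped at the threshold: x - shrink(x, nu) = clip(x, -nu, nu)
print(np.max(np.abs(residual)) <= nu)  # True
```

This bound on the residual is what makes e a good container for noise and small-scale objects while the cartoon and texture survive thresholding, as illustrated in figure 14.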
Figure 14. The effect of the noise measurement on Ω\D in (2.1) obtained by introducing the new variable e = χ_{Dᶜ} .× ϵ. The first and second rows show the images e1, e2 and e and their one-dimensional signals along the red lines at iteration 20, respectively. Similarly, the third and fourth rows are at iteration 100. We observe that cartoon and texture still remain in e1 and its curvelet smoothing term e2 at iteration 20. However, these geometrical signals are reduced in their difference e as a result of the subtraction. At iteration 100, there is almost no cartoon left in e1, e2 and e. Note that there is some information in the missing domain D due to the updated λ5.
7.5. The ‘ϵ-problem’
At iteration t, we denote the updated residual by the unity condition as

Figure 15. The pyramid scheme for the DG3PD inpainting model.

Figure 16. The three-dimensional filters for the fingerprint image in figure 17a obtained by DG3PD with the same parameters as in figure 18. The resulting linear filters are unique to each image (in terms of computation), which is required to obtain a ‘good’ approximate solution. The filter design thus depends on the characteristics of the image: the variational method automatically designs the filters in a ‘nonlinear way’ (through minimization via an iterative method) and adapts them to each image. MSE, mean squared error.

Figure 17. (a–h) Decomposition obtained by applying the same filters as in figure 19 (as a linear convolution) to a fingerprint image similar to that in figure 18. We observe that applying the same linear filters to a ‘similar’ image does not give a good decomposition. Thus, the linear filters are ‘unique’ to each image if a suitable decomposition is to be obtained. Applying DG3PD with the iterative method yields the decomposition in (i–o) with the same parameters as in figure 19, and the MSE for the reconstructed image in (i) is much smaller than that in (b); see figure 16 for the corresponding three-dimensional linear filters.

Figure 18. To test the directional property of the texture and the piecewise smooth cartoon in a decomposition, we apply DG3PD [35] to a fingerprint image. Parameters: β4 = 0.03, θ = 0.9, L = 100, S = 4, c1 = 1, c2 = 1.3, cμ1 = cμ2 = 0.03, Iter = 200 and ν = 15. The cartoon u resembles a lowpass signal and the directional texture [v_s]_{s=0}^{3} resembles a set of directional bandpass signals. With traditional linear filter design it is difficult to obtain a smooth lowpass signal with sharp edges as in (b), because it is difficult to specify the frequencies (i.e. locations in the Fourier domain) and magnitudes needed to approximate a cartoon-like image; see figure 19.

Figure 19. (a–f) The Fourier spectra of u, ϵ and [v_i]_{i=0}^{3} in figure 18 obtained by DG3PD [35]. Given the spectrum F(z) of the original image, the corresponding lowpass, highpass and directional bandpass filters in (g–l) are defined as ratios of these spectra, e.g. [BP_s(z)]_{s=0}^{3} = [V_s(z)]_{s=0}^{3}/F(z); see (m–r) for their three-dimensional filters. The error in (s) is due to the iterative scheme of the augmented Lagrangian method (ALM). In theory, as the number of iterations goes to infinity, the solution of the decomposition satisfies the perfect reconstruction condition. With N = 200 iterations, the MSE for the unity condition of all filters is small enough, i.e. the unity property is closely satisfied and the reconstructed signal approximates the original one well.
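The unity check in the figure can be sketched numerically: whenever the components sum exactly to the image, the filters defined as ratios of the component spectra sum exactly to one, and any ALM iteration error appears as a deviation from 1. The masks below are toy stand-ins for a DG3PD decomposition, chosen only so that f = u + v + ϵ holds by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 128
f = rng.standard_normal((n, n))
F = np.fft.fft2(f)

# toy stand-in decomposition f = u + v + eps in the Fourier domain
k = np.fft.fftfreq(n)
r2 = k[:, None] ** 2 + k[None, :] ** 2
LPmask = np.exp(-r2 / 0.01)        # toy lowpass (cartoon-like)
BPmask = 0.7 * (1.0 - LPmask)      # toy bandpass (texture-like)
U = LPmask * F
V = BPmask * F
E = F - U - V                       # residual takes whatever is left

# empirical filters as spectral ratios, as in the figure
LP, BP, HP = U / F, V / F, E / F
unity_error = np.max(np.abs(LP + BP + HP - 1.0))
print(unity_error)  # ~ machine precision, since the decomposition is exact here
```

With an iterative decomposition the sum U + V + E only approaches F as the iterations proceed, which is exactly the small but nonzero MSE reported in panel (s).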
8. Conclusion
In this paper, we have addressed the very challenging task of restoring images that have a large number of missing pixels, whose existing pixels are corrupted by noise and which, importantly, contain both cartoon and texture elements. The proposed DG3PD model for inpainting and denoising can cope with this threefold difficulty. The task of simultaneous inpainting and denoising of the cartoon and texture components is solved by DG3PD decomposition, followed by inpainting and denoising of the two components separately and, finally, restoration of the image by synthesis of the restored components. More specifically, the DG3PD inpainting model is based on regularization in Banach spaces in a discrete setting, solved by the augmented Lagrangian method and the alternating direction method of multipliers.
In summary, the decomposition step of the proposed method has two major advantages for tackling this challenging problem. Firstly, it denoises the cartoon and texture components simultaneously. Secondly, it allows the cartoon and the texture to be handled in different ways. The cartoon image is inpainted by DG3PD, as described in §§2 and 3. Separately, the texture component is inpainted and then further denoised as described in §4. The proposed DG3PD decomposition can therefore also be understood as breaking the full problem down into ‘smaller’ subproblems (e.g. texture-only inpainting) which are easier to solve.
Image restoration (or image denoising and inpainting) can be understood as an inverse problem and it can be described by a discrete innovation model (figure 10). The underlying assumption is that the signal has a sparse representation in suitable transform domains. It is known from probability theory that sparsity promoted by the ℓ1 norm corresponds to a Laplace prior on the signal, which is a heavy-tailed distribution. Note that the ℓ1 norm, with its associated Laplace distribution, is one of the best convex approximations of the sparsity measured by the ℓ0 norm.
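The ℓ1/Laplace connection can be made concrete for a single coefficient: with a Gaussian likelihood and a Laplace prior, the MAP estimate minimizes ½(x − y)² + λ|x|, whose closed-form minimizer is the soft-threshold of y. The brute-force check below verifies this under that standard assumption; the value of λ is illustrative.

```python
import numpy as np

lam = 0.5  # ratio of noise variance to prior scale (illustrative value)

def neg_log_posterior(x, y):
    # Gaussian negative log-likelihood + Laplace negative log-prior (constants dropped)
    return 0.5 * (x - y) ** 2 + lam * np.abs(x)

def soft_threshold(y):
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

grid = np.linspace(-3.0, 3.0, 60001)   # grid step 1e-4
worst = 0.0
for y in (-2.0, -0.3, 0.0, 0.7, 2.5):
    brute = grid[np.argmin(neg_log_posterior(grid, y))]
    worst = max(worst, abs(brute - soft_threshold(y)))
print(worst < 1e-3)  # True: the MAP estimate is the soft-threshold of y
```

This is the probabilistic reading of the shrinkage operators used throughout the model: each soft-thresholding step is a scalar MAP estimate under a Laplace prior.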
Moreover, by choosing the priors and the posterior according to the Bayesian framework and a maximum a posteriori (MAP) estimate, we can understand the selection of the parameters (see figure 9 for the results of selecting a heavy-tailed distribution, e.g. the Laplace distribution). Note that MAP estimation requires an assumption on the noise, e.g. Gaussian or Laplacian, to establish a minimization. In our proposed model, no assumption on the noise distribution is required, e.g. independent and identically distributed or correlated, Gaussian or non-Gaussian (due to the measurement, which is similar to the Dantzig selector [60]).
Data accessibility
Data are available from https://figshare.com/s/76d30e70e6e131da2a08.
Authors' contributions
D.H.T. and C.G. wrote the manuscript and conducted the experiments. Both authors gave their final approval for publication.
Competing interests
We declare we have no competing interests.
Funding
D.H.T. is supported by the National Science Foundation under grant DMS-1127914 to the Statistical and Applied Mathematical Sciences Institute.
Acknowledgments
C.G. gratefully acknowledges the support of the Felix–Bernstein Institute for Mathematical Statistics in the Biosciences and the Niedersachsen Vorab of the Volkswagen Foundation.
References
- 1 Gonzalez RC, Woods RE. 2002 Digital image processing. Upper Saddle River, NJ: Prentice Hall.
- 2 Szeliski R. 2011 Computer vision: algorithms and applications. London, UK: Springer.
- 3 Sonka M, Hlavac V, Boyle R. 2008 Image processing, analysis, and machine vision. Toronto, Canada: Thomson.
- 4 Steger C, Ulrich M, Wiedemann C. 2008 Machine vision algorithms and applications. Weinheim, Germany: Wiley.
- 5 Davies ER. 2012 Machine vision: theory, algorithms, practicalities. Waltham, MA: Academic Press.
- 6 Ma J, Plonka G. 2010 The curvelet transform. IEEE Signal Process. Mag. 27, 118–133. (doi:10.1109/MSP.2009.935453)
- 7 Gottschlich C. 2012 Curved-region-based ridge frequency estimation and curved Gabor filters for fingerprint image enhancement. IEEE Trans. Image Process. 21, 2220–2227. (doi:10.1109/TIP.2011.2170696)
- 8 Gottschlich C, Schönlieb C-B. 2012 Oriented diffusion filtering for enhancing low-quality fingerprint images. IET Biometrics 1, 105–113. (doi:10.1049/iet-bmt.2012.0003)
- 9 Bartůněk JS, Nilsson M, Sällberg B, Claesson I. 2013 Adaptive fingerprint image enhancement with emphasis on preprocessing of data. IEEE Trans. Image Process. 22, 644–656. (doi:10.1109/TIP.2012.2220373)
- 10 Marukatat S. 2015 Image enhancement using local intensity distribution equalization. EURASIP J. Image Video Process. 2015, 1–18. (doi:10.1186/s13640-015-0085-2)
- 11 Yang J, Wright J, Huang TS, Ma Y. 2010 Image super-resolution via sparse representation. IEEE Trans. Image Process. 19, 2861–2873. (doi:10.1109/TIP.2010.2050625)
- 12 Vese LA, Osher S. 2004 Image denoising and decomposition with total variation minimization and oscillatory functions. J. Math. Imaging Vis. 20, 7–18. (doi:10.1023/b:jmiv.0000011316.54027.6a)
- 13 Buades A, Coll B, Morel JM. 2005 A review of image denoising algorithms, with a new one. Multiscale Model. Simul. 4, 490–530. (doi:10.1137/040616024)
- 14 Chang SG, Yu B, Vetterli M. 2000 Adaptive wavelet thresholding for image denoising and compression. IEEE Trans. Image Process. 9, 1532–1546. (doi:10.1109/83.862633)
- 15 Schoenlieb C-B. 2009 Modern PDE techniques for image inpainting. PhD thesis, University of Cambridge, Cambridge, UK.
- 16 Battisti F, Carli M, Neri A. 2012 Image forgery detection by using no-reference quality metrics. In Proc. SPIE 8303, Media Watermarking, Security, and Forensics 2012, Burlingame, CA, 22–26 January 2012, 83030K. Bellingham, WA: SPIE.
- 17
- 18 Joshi N, Kang SB, Zitnick CL, Szeliski R. 2010 Image deblurring using inertial measurement sensors. In Proc. Special Interest Group on Computer Graphics and Interactive Techniques Conference (SIGGRAPH), Los Angeles, CA, 26–30 July 2010. New York, NY: ACM.
- 19 Rudin L, Osher S, Fatemi E. 1992 Nonlinear total variation based noise removal algorithms. Physica D 60, 259–268. (doi:10.1016/0167-2789(92)90242-F)
- 20 Shen J, Chan TF. 2002 Mathematical models for local nontexture inpaintings. SIAM J. Appl. Math. 62, 1019–1043. (doi:10.1137/S0036139900368844)
- 21 Chan TF, Shen J. 2001 Non-texture inpainting by curvature-driven diffusions (CDD). J. Vis. Commun. Image Represent. 12, 436–449. (doi:10.1006/jvci.2001.0487)
- 22 Starck J-L, Elad M, Donoho DL. 2005 Image decomposition via the combination of sparse representations and a variational approach. IEEE Trans. Image Process. 14, 1570–1582. (doi:10.1109/TIP.2005.852206)
- 23 Elad M, Starck J-L, Querre P, Donoho DL. 2005 Simultaneous cartoon and texture image inpainting using morphological component analysis (MCA). Appl. Comput. Harmon. Anal. 19, 340–358. (doi:10.1016/j.acha.2005.03.005)
- 24 Esedoglu S, Shen J. 2002 Digital inpainting based on the Mumford-Shah-Euler image model. Eur. J. Appl. Math. 13, 353–370.
- 25 Mumford D, Shah J. 1989 Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure Appl. Math. 42, 577–685. (doi:10.1002/cpa.3160420503)
- 26 Ambrosio L, Tortorelli VM. 1990 Approximation of functional depending on jumps by elliptic functional via Γ-convergence. Commun. Pure Appl. Math. 43, 999–1036. (doi:10.1002/cpa.3160430805)
- 27 Giorgi ED. 1991 Some remarks on Γ-convergence and least squares method. Compos. Media Homogenization Theory 5, 135–142. (doi:10.1007/978-1-4684-6787-1_8)
- 28 Shen J, Kang SH, Chan TF. 2002 Euler's elastica and curvature-based inpainting. SIAM J. Appl. Math. 63, 564–592. (doi:10.1137/s0036139901390088)
- 29 Ballester C, Bertalmio M, Caselles V, Sapiro G, Verdera J. 2001 Filling-in by joint interpolation of vector fields and grey levels. IEEE Trans. Image Process. 10, 1200–1211. (doi:10.1109/83.935036)
- 30 Efros AA, Leung TK. 1999 Texture synthesis by nonparametric sampling. In Proc. of the 7th IEEE Int. Conf. on Computer Vision, Kerkyra, Greece, 20–27 September 1999, pp. 1033–1038. New York, NY: IEEE.
- 31 Bertalmio M, Vese L, Sapiro G, Osher S. 2003 Simultaneous structure and texture image inpainting. IEEE Trans. Image Process. 12, 882–889. (doi:10.1109/TIP.2003.815261)
- 32 Cai J-F, Chan RH, Shen Z. 2010 Simultaneous cartoon and texture inpainting. Inverse Probl. Imag. 4, 379–395. (doi:10.3934/ipi.2010.4.379)
- 33 Vese LA, Osher S. 2003 Modeling textures with total variation minimization and oscillatory patterns in image processing. J. Sci. Comput. 19, 553–572. (doi:10.1023/A:1025384832106)
- 34 Aujol J-F, Chambolle A. 2005 Dual norms and image decomposition models. Int. J. Comput. Vis. 63, 85–104. (doi:10.1007/s11263-005-4948-3)
- 35 Thai DH, Gottschlich C. 2016 Directional global three-part image decomposition. EURASIP J. Image Video Process. 2016, 1–20. (doi:10.1186/s13640-015-0097-y)
- 36 Thai DH, Gottschlich C. 2015 Global variational method for fingerprint segmentation by three-part decomposition. IET Biometrics 5, 120–130. (doi:10.1049/iet-bmt.2015.0010)
- 37 Thai DH, Huckemann S, Gottschlich C. 2015 Filter design and performance evaluation for fingerprint image segmentation. (http://arxiv.org/abs/1501.02113)
- 38 Aubert G, Kornprobst P. 2002 Mathematical problems in image processing. New York, NY: Springer.
- 39 Scherzer O, Grasmair M, Grossauer H, Haltmeier M, Lenzen F. 2009 Variational methods in imaging. New York, NY: Springer.
- 40 Scherzer O (ed.). 2011 Handbook of mathematical methods in imaging. New York, NY: Springer.
- 41 Prandoni P, Vetterli M. 2008 Signal processing for communications. Boca Raton, FL: CRC Press.
- 42 Burt PJ, Adelson EH. 1983 The Laplacian pyramid as a compact image code. IEEE Trans. Commun. 31, 532–540. (doi:10.1109/TCOM.1983.1095851)
- 43 Simoncelli EP, Freeman WT. 1995 The steerable pyramid: a flexible architecture for multi-scale derivative computation. In Proc. of the Int. Conf. on Image Processing, Washington, DC, 23–26 October 1995, pp. 444–447. New York, NY: IEEE.
- 44 Unser M, Chenouard N, Van De Ville D. 2011 Steerable pyramids and tight wavelet frames in . IEEE Trans. Image Process. 20, 2705–2721. (doi:10.1109/TIP.2011.2138147)
- 45 Khalidov I, Unser M. 2006 From differential equations to the construction of new wavelet-like bases. IEEE Trans. Signal Process. 54, 1256–1267. (doi:10.1109/TSP.2006.870544)
- 46
- 47 Candès E, Donoho D. 2004 New tight frames of curvelets and optimal representations of objects with piecewise singularities. Commun. Pure Appl. Math. 57, 219–266. (doi:10.1002/cpa.10116)
- 48 Telea A. 2004 An image inpainting technique based on the fast marching method. J. Graphics Tools 9, 25–36. (doi:10.1080/10867651.2004.10487596)
- 49 Bertalmio M, Bertozzi AL, Sapiro G. 2001 Navier–Stokes, fluid dynamics, and image and video inpainting. In Proc. of the 2001 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR 2001), Kauai, HI, 8–14 December 2001, pp. 355–362. New York, NY: IEEE.
- 50 Dabov K, Foi A, Katkovnik V, Egiazarian K. 2007 Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Trans. Image Process. 16, 2080–2095. (doi:10.1109/tip.2007.901238)
- 51 Thai DH. 2015 Fourier and variational based approaches for fingerprint segmentation. PhD thesis, University of Goettingen, Goettingen, Germany.
- 52 Meyer Y. 2001 Oscillating patterns in image processing and nonlinear evolution equations: the fifteenth Dean Jacqueline B. Lewis memorial lectures. Boston, MA: American Mathematical Society.
- 53 Candès E, Demanet L, Donoho D, Ying L. 2006 Fast discrete curvelet transforms. Multiscale Model. Simul. 5, 861–899. (doi:10.1137/05064182X)
- 54 Criminisi A, Perez P, Toyama K. 2004 Region filling and object removal by exemplar-based image inpainting. IEEE Trans. Image Process. 13, 1200–1212. (doi:10.1109/TIP.2004.833105)
- 55 Mumford D. 1994 Bayesian rationale for energy functionals, pp. 141–153. Dordrecht, The Netherlands: Kluwer Academic.
- 56 Kotz S, Kozubowski TJ, Podgórski K. 2001 The Laplace distribution and generalizations. Berlin, Germany: Springer.
- 57 Kotz S, Nadarajah S. 2004 Multivariate T-distributions and their applications. Cambridge, UK: Cambridge University Press.
- 58 Lindsey JK, Lindsey PJ. 2006 Multivariate distributions with correlation matrices for nonlinear repeated measurements. Comput. Stat. Data Anal. 50, 720–732. (doi:10.1016/j.csda.2004.09.011)
- 59 Haltmeier M, Munk A. 2014 Extreme value analysis of empirical frame coefficients and implications for denoising by soft-thresholding. Appl. Comput. Harmon. Anal. 36, 434–460. (doi:10.1016/j.acha.2013.07.004)
- 60 Candès E, Tao T. 2007 The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35, 2313–2351. (doi:10.1214/009053606000001523)
- 61 Do MN, Vetterli M. 2005 The contourlet transform: an efficient directional multiresolution image representation. IEEE Trans. Image Process. 14, 2091–2106. (doi:10.1109/TIP.2005.859376)
- 62 Unser M, Van De Ville D. 2010 Wavelet steerability and the higher-order Riesz transform. IEEE Trans. Image Process. 19, 636–652. (doi:10.1109/TIP.2009.2038832)
- 63 Unser M. 2000 Sampling—50 years after Shannon. Proc. IEEE 88, 569–587. (doi:10.1109/5.843002)


