Adaptive recursive algorithm for optimal weighted suprathreshold stochastic resonance

Suprathreshold stochastic resonance (SSR) is a distinct form of stochastic resonance, which occurs in multilevel parallel threshold arrays with no requirements on signal strength. In the generic SSR model, an optimal weighted decoding scheme shows its superiority in minimizing the mean square error (MSE). In this study, we extend the proposed optimal weighted decoding scheme to more general input characteristics by combining a Kalman filter and a least mean square (LMS) recursive algorithm, wherein the weighted coefficients can be adaptively adjusted so as to minimize the MSE without complete knowledge of input statistics. We demonstrate that the optimal weighted decoding scheme based on the Kalman–LMS recursive algorithm is able to robustly decode the outputs from the system in which SSR is observed, even for complex situations where the signal and noise vary over time.


Introduction
Stochastic resonance in multi-threshold systems was initially investigated in [1], where the input signal is subthreshold. More interestingly, the concept of suprathreshold stochastic resonance (SSR) was also introduced in multi-threshold systems [2][3][4]. Note that SSR is an important variation of stochastic resonance [5,6], where the output is counterintuitively enhanced by noise,  operating with signals of arbitrary magnitude, not restricted to weak or subthreshold signals. Since its introduction, SSR has received considerable attention in diverse areas concerned with the transmission of signals, and has also been considered in the design of cochlear implants [7], analogue-to-digital converter circuits [8][9][10], nonlinear detectors [11,12], digital accelerometers [13] and stochastic quantizers [14]. In the seminal works of Stocks and co-workers [2][3][4], the nonlinearity in each element of the model is assumed to be a binary quantizer. Thus, such threshold systems can be described as stochastic signal quantizers that have been analysed in terms of lossy source coding and quantization theory [14][15][16]. Following the works in [14][15][16], we investigated the decoding scheme of quantized signal, named as optimal weighted decoding, by using weighted coefficients [17,18]. Our results show that under certain conditions the performance of the optimally weighted quantizer response is superior to that of the original unweighted arrays [17,18].
However, the previous studies [17,18] have been undertaken assuming that the input characteristics are statistically stationary, and the decoding schemes are based on a priori knowledge of the inputs. Specifically, the noise is modelled as white Gaussian distributed [17,18]. However, in practical applications, the statistical characteristics of the input signals are generally unknown or are often varied with time. Furthermore, noise in real systems is coloured, and the idealization of white noise is never exactly realized [5,19]. These constraints severely limit the decoding operation based on performance enhancement of real systems. A highly successful solution to this more difficult problem is found in adaptive filtering, which is a powerful approach with a wide variety of engineering applications [20][21][22][23][24]. Adaptive filtering has the ability to adjust system parameters automatically with no a priori knowledge of inputs, and allows processing the case wherein the properties of inputs are unknown, non-stationary or time variable [25][26][27]. Interestingly, a toy model has been established that illustrates a process of optimization that works without a priori knowledge of the input statistical distribution-so we do have a precedent to show mathematical tractability of this class of problem [28].
Specifically, the Kalman filter and the least mean square (LMS) algorithm are two of the most popular adaptive estimation methods in adaptive signal processing, with the former as a realization of the optimal Bayesian estimator and the latter as a recursive solution to the optimal Wiener filtering problem [27]. Recently, Mandic et al. subtly developed a joint perspective on these two algorithms, and proposed the Kalman-LMS recursive algorithm that permits the implementation of Kalman filters without any notion of Bayesian statistics [29].
The purpose of this paper is to extend optimal-weighted decoding to more general input characteristics by using the Kalman-LMS recursive algorithm, wherein the weighting coefficients can be adaptively adjusted based on real-time measurements of the input signals. For the typical SSR model of threshold arrays, we find this Kalman-LMS recursive algorithm can deal with not only the simple situation of stationary signals, but also more complicated cases of non-stationary signals and coloured noise. The decoding performance of the mean square error (MSE) distortion obtained by the Kalman-LMS recursive algorithm illustrates interesting progress in optimal weighted SSR that may be of benefit in adaptive signal processing algorithms applied to nonlinear noisy systems.

Model
We consider an adaptively weighted summing array of N noisy nonlinear elements, as shown in figure 1. All elements receive the same input signal x k representing the kth element in the time series, which is assumed to be a deterministic signal or a stationary stochastic process [29]. Each element of the array is endowed with the same input-output characteristic, modelled by the static (memoryless) function g. The i-th nonlinear element is subject to independent and identically distributed (i.i.d.) additive noise component η i,k with standard deviation σ η , which is independent of the signal x k . Accordingly, each element produces the output signal y i,k = g[x k + η i,k ]. The output signal y i,k is multiplied by the weighting coefficient w i,k (w i,k ∈ ). w i,k are adjusted by the Kalman-LMS recursive algorithm aiming to minimize the performance metric of MSE. All weighted outputs are summed to give the overall output of arraŷ y k = N i=1 w i,k y i,k . The Kalman-LMS recursive algorithm combines Kalman filtering with LMS-type algorithms to control both the direction and magnitude of adaptation steps along the shortest path so as to achieve the global minimum of MSE [29]. Our purpose is to adaptively adjust the weights by this algorithm to make the decoding outputŷ k approximate the input x k , i.e. minimizing the MSE distortion.  . Each element operates on a common signal x k subject to additive noise η i,k at time k. The output of each individual element y i,k is multiplied by the weighting coefficient w i,k , and the overall outputŷ k = N i=1 w i,k y i,k .

Method
To begin, we introduce a column vector y k = [y 1,k , y 2,k , . . . , y N,k ] to represent the N output signals of nonlinear elements for each random input x k . We denote a vector of weights as w k = [w 1,k , w 2,k , . . . , w N,k ] and the optimal weights as w o k that corresponds to the minimum of MSE.

Performance evaluation criteria
Ideally, we wish to achieve where the aim is to estimate the optimal weight vector w o k for minimizing the MSE distortion. It can be fixed, i.e. w o k = w o , or time varying as in equation (2.1) [29]. For the stationary input signal x k and noise η k , equation (2.1) can be written as: When the inverse matrix (y k y T k ) −1 exists, the Wiener optimal weight vector w o is given by w o = (y k y T k ) −1 y k x k [16,21]. For the stationary input x k , the memoryless nonlinearity g and the stationary noise η k , we assume the optimal weight vector w o k converges to the Wiener solution of w o . We desire to estimate the optimal weight vector w o recursively, based on the existing weight vector w k−1 , the input signal x k and output signals of nonlinear elements y k , i.e.ŵ o = w k = f (w k−1 , x k , y k ). Then the decoding output iŝ y k = y k w k−1 . The error between the input x k , and the decoding outputŷ k is given by Similarly, the weight error vectorw k between the weight vector estimate w k and the optimal weight vector w o is written by [29]w and its contribution to the error e k is given by The decoding performance metric of MSE represents the power of the output error e k , and is expressed as where the statistical expectation E(·) introduced in equation (2.6) is defined in terms of the joint probability of the output y k and the weight error vectorw k . The theoretical calculation of MSE is difficult, and we will numerically obtain the MSE distortion for a sufficiently large observation time in the following experiments. For the memoryless function g, we assume the output y k at the recursion step k is not related to the weight error vectorw k−1 at the time step k − 1. Thus, the instantaneous time-varying MSE is given in [29] is the symmetric and positive semi-definite weight error covariance matrix, and the statistical expectation E(·) is calculated in terms of the probability density ofw k−1 . In [29], another performance evaluating metric-the mean square deviation J k -is introduced, which represents the power of the weight error vectorw k and is given by The mean square deviation J k is related to the instantaneous MSE in equation (2.7) through the weight error covariance matrix P k . Therefore, minimizing J k is equivalent to minimizing the instantaneous MSE [29]. Based on equation (2.8), we will deduce the weight vector estimate w k at each recursion step k.

Optimal learning gain
The LMS algorithm uses the stochastic gradient descent and uses a recursive estimation of the optimal weight vector, where μ k is a step size and ∇ w is a gradient vector [29]. Based on the instantaneous estimate E[e 2 k ] ≈ e 2 k , the LMS solution is given in [21] Note that the second term in equation (2.9) the weight update, μ k y k e k , has the same direction as the vector y k . It turns out that the gradient descent performs locally optimal steps but has no means to follow the globally optimal shortest path to the solution w o . Therefore, it is necessary to control both the direction and magnitude of the adaptive steps μ k to follow the shortest, optimal path to w o [29]. In this way, Mandic et al. introduce a positive definite learning gain matrix G k , in the context of Kalman filters, to replace the scalar step size μ k so as to control both the direction and magnitude of the gradient descent [29]. Thus, the weight update recursion in equation (2.9) generalizes to where the gain vector g k = G k y k [29]. Subtracting w o from both sides of equation (2.10) and replacing the error with equation (2.5), we can rewrite equation (2.10) in terms of the weight error vector as [29] w k =w k−1 − g k y kw k−1 . (2.11) Utilizing equation (2.11), we can obtain the recursion for the weight error covariance matrix P k as [29] Substituting equation (2.12) into equation (2.8), the mean square deviation is obtained as [29] This derivation of equation (2.13) uses the facts that tr(P k−1 y k g k ) = tr(g k y k P k−1 ) = g k P k−1 y k and tr(g k g k ) = g k g k = g k 2 .
Based on equation (2.13), the optimal learning gain vector g k can be obtained by differentiating J k with respect to g k , setting to zero and solving for g k , to give [29] This optimal gain vector is precisely the Kalman gain [22]. Substituting equation (2.14) into equation (2.12), the update for P k is then obtained

Kalman-LMS recursive algorithm
From equations (2.10), (2.14) and (2.15), Kalman-LMS recursive algorithm of estimating the optimal weights w o is outlined as: At each instant k > 0, based on measurements (x k , y k ) (1) compute the optimal learning gain (Kalman gain): where the constant δ > 0 is an initial disturbance for preventing from recursion stop. Without loss of generality, the symmetric and positive semi-definite matrix P 0 is chosen as unit matrix. (2) update the weight vector estimate: (3) update the weight error covariance matrix: The above operating process exhibits that Kalman-LMS recursive algorithm iteratively updates the weights after each sample, which requires little or no a priori knowledge of the signal or noise characteristics. A proof of convergence of the mean square deviation for stationary inputs is provided in appendix A. In the meanwhile, the fastest convergence time corresponding to the array of N static nonlinear elements in figure 1 is also analysed. Our analysis shows that the fastest convergence time is about N times sampling time t. It means that, for a given sampling time t, the larger the parallel array size N is, the longer the convergent time is. This argument will be shown in the following experiments.

Results
It is interesting to note that the above-mentioned decoding scheme using the Kalman-LMS recursive algorithm can be applied to an array composed of arbitrary nonlinear elements. Here, we consider the static function g of figure 1 as Heaviside function that is a typical threshold element [2,3]. The individual output y i,k is given by the response function where θ i (i = 1, 2, . . . , N) is the threshold level for each Heaviside function g.
In the following, we will explore two cases of input characteristics, i.e. stationary and non-stationary, to examine the MSE distortion performance of optimal weighted decoding scheme based on the Kalman-LMS recursive algorithm. In addition, for the noise components η i,k in figure 1, white Gaussian and coloured noises are considered, respectively.

Gaussian noise with identical thresholds
We first consider the case where all threshold levels in equation (3.1) are identical, i.e. θ i = θ, and the noises η i,k are Gaussian distributed.
It is well known that the stationary assumption of inputs is ideal and is inadequate for dealing with situations in which non-stationarity of the signal and or noise is intrinsic to the problem. The ability to adapt in a non-stationary environment is an important function of an adaptive algorithm. Specifically in a non-stationary environment it can offer a tracking capability for the time-varying input signal, provided that the variations are sufficiently slow [24].
We now consider the case where the input x k is non-stationary stochastic signal. For instance, x k is Gaussian distributed, but the standard deviation σ x (t) is time varying. Here, the signal standard deviation is chosen as σ x (t) = √ 2 sin(2π ft) with the modulation frequency f , and the sampling time t = 10 −3 s. Of course, we can have other forms of σ x (t) provided that it is slow time-varying. In appendix A, we investigate the tracking capability of the proposed Kalman-LMS algorithm, and an intrinsic time scale of N t is approximately provided for analysing the temporal dynamics of the adaptive process.  Unless specifically mentioned, all results in the following parts are obtained for the example case where the input signal for non-stationary characteristics is Gaussian distributed with standard deviation σ x (t) = √ 2 sin(2π t), and the input signal for stationary characteristics is Gaussian distributed with σ x = 1.
The MSE distortions for the non-stationary stochastic input are plotted in figure 3 (dashed red lines). Figure 3 clearly illustrates that the optimal value of σ η for minimizing the MSE distortion is nonzero for N > 1, and thus SSR occurs. As N increases, the MSE distortion at the optimal σ η decreases, while the optimal value of σ η will gradually increase, even to a value larger than unity for N = 63.
For comparison, the solid blue lines correspond to the MSE distortion curves for stationary Gaussian input signal with standard deviation σ x = 1. It is noted that these two inputs have been chosen to give the same average power. For a sufficient time duration, their average powers are all unity. It is clear in  figure 3 that the MSE value is far larger for the non-stationary case than stationary case. In addition, for two cases of the input characteristics, very similar qualitative behaviours are all seen. Thus, this also validates that the SSR effect is quite general, not restricted to the stationary signals considered in [17,18].

Coloured noise with identical thresholds
In the above studies, noise is assumed to be white Gaussian distributed. In the physical world, however, such an idealization is never exactly realized. The effects of various noise correlation times on stochastic resonance have been previously investigated [5,19,[30][31][32][33][34][35][36][37]. Owing to the practical importance of coloured noise, we next evaluate the optimal weighted decoding performance under the coloured noise circumstances.
We consider the model driven by an additive exponentially correlated Gaussian noise, i.e. Ornstein-Uhlenbeck noise (OU noise). The archetypal source for OU noise is given in [19] where ξ w (t) denotes Gaussian white noise with autocorrelation ξ w (t)ξ w (s) = 2δ(t − s) and D is the noise intensity. The stationary autocorrelation of ξ (t) with correlation time τ is then represented by [19] When correlation time τ → 0, equation (3.3) reproduces the white-noise source frequently used in stochastic resonance studies. Figure 4 shows the MSE distortion against the standard deviation σ η of the OU noise with different correlation time τ = 0.1 s, 1 s and 10 s. Here, the array size is N = 63. The dashed red lines represent the MSE distortion performance for non-stationary inputs, while the solid blue lines correspond to the decoding performance for stationary inputs. As the correlation time τ increases, it is seen in figure 4 that the minimum MSE value at the optimal σ η increases. The characteristic behaviour for stochastic resonance is in good agreement with results in [5,19,[30][31][32]. The reason is that, since the variance σ 2 η of the OU noise is D/τ in equation (3.3), the noise strength D increases proportionally with the increase of τ for a given σ 2 η . However, we assume the output y k at the recursion step k is not related to the weight error vectorw k−1 at the time step k − 1 in equation (2.7). This restriction will hinder the application of this adaptive algorithm to the case of coloured noise. Despite this, this adaptive algorithm still plays, and presents the corresponding MSE distortions in figure 4. It is seen that, as the correlation time τ increases, the corresponding MSE distortions increase for both the non-stationary (dashed lines) and stationary (solid lines) inputs. Under coloured noise conditions, to compare the MSE distortion performance for stationary and nonstationary input situations, figure 5 shows MSE for OU noise with the correlation time τ = 0.1 s and the array sizes N = 1, 3, 15 and 63.
Note from figure 5 that, as N increases, very similar decoding performance appears when compared with figure 3. Specifically, these two figures both indicate that the MSE is far larger for the non-stationary case than in the stationary case. The results in figure 5 show that this Kalman-LMS recursive algorithm can deal with not only the simple situation of stationary signals, but also more complicated cases of non-stationary signal buried in coloured noise.

Gaussian noise with grouped thresholds
Our recent work [18] showed that the decoding performance for multigroup parameter settings is superior to that of identical parameter settings. Following the approach presented in [18], we next study the decoding performance of the Kalman-LMS recursive algorithm for the case of grouped thresholds.
We divide the set of threshold elements into M (M ≤ N) groups. Within each group, the threshold element size is equal, i.e. N/M, and the threshold levels are equally spaced and set as θ m = mσ x /(M + 1) for m = 1, 2, . . . , M. Figure 6a exhibits the MSE distortion for Gaussian noise with various group sizes M in the cases of non-stationary and stationary inputs. Here, the array size N = 120, and the group sizes M = 1, 2, 3, 5, 10 and 120. It is apparently illustrated in figure 6a that, for small noise levels, the decoding performance greatly improves as the group size M increases. While, for very large noise levels, all of the MSE values that correspond to different groups tend to the same observation, which equals the MSE distortion of the identical threshold level setting. This fact tells us that, for weak and moderate noise intensities, the multigroup setting scheme can reduce the MSE distortion. Figure 6a also reveals that the grouped threshold setting can be extended to adaptively weighted summing arrays.

Coloured noise with grouped thresholds
It is interesting to study the performance of weighted decoding for the case of grouped thresholds under the coloured noise circumstances. Similarly, we also illustrate the MSE curves for OU noise with various group sizes M in the cases of non-stationary and stationary inputs in figure 6b. It can be seen from figure 6a,b that, as the group sizes M increase, the grouped decoding performance under the coloured noise circumstances is similar to that under Gaussian noise circumstances.

Conclusion and discussion
In this paper, we extend the optimal weighted decoding approach to more general input characteristics in the model of a weighted summing array of N noisy nonlinear elements based on the Kalman-LMS recursive algorithm. A proof of convergence of the mean square deviation for stationary inputs is derived. We especially apply the algorithm to a parallel array of threshold elements and investigate the decoding performance for inputs with stationary, non-stationary characteristics under Gaussian noise and coloured noise circumstances. In the case of stationary inputs, after successive iterations of the Kalman-LMS recursive algorithm it converges to the optimum Wiener solution in some statistical sense [24]. Our previous work [17] has shown that, for the case of identical thresholds, the optimal weighted decoding is equivalent to Wiener linear decoding. Therefore, for stationary inputs, the decoding performance exploiting Kalman-LMS recursive algorithm is consistent with that of optimal weighted decoding in the case of identical thresholds. Figure 7 clearly illustrates that the results for these two methods are the same, and MSE distortion curves completely overlap. Notably, although for the case of stationary inputs, the decoding performance of these two methods is equal under the condition of identical thresholds, the Kalman-LMS recursive algorithm is simple and generally easy to implement since the output weights are updated following each input signal sample, instead of calculating the output weights using all the training data in one shot. Therefore, compared with the optimal weighted decoding scheme [17], this Kalman-LMS recursive algorithm is a practical method for finding close approximate solutions to equation (2.10) in real time. Moreover, this Kalman-LMS recursive algorithm can adaptively adjust the weighting coefficients so as to minimize the MSE distortion without complete knowledge of input statistics and thus can be applicable to the non-stationary inputs and the coloured noise situations as shown in figures 2 and 3-6 (dashed lines). Additionally, we also apply the Kalman-LMS recursive algorithm to grouped threshold setting for minimizing MSE distortion. The obtained results show that the multigroup setting scheme can be extended to adaptively weighted summing array, and is a significant scheme for many potential applications inspired by SSR mechanism. Although the Kalman-LMS recursive algorithm is applied in the summed array of N noisy nonlinear elements, we expect that our findings may be important for future work on more complex models that will stimulate further studies on SSR.
Beyond the memoryless nonlinearity considered in figure 1, we argue that this Kalman-LMS adaptive algorithm may potentially be extended to other nonlinear systems, for instance, the Hodgkin-Huxley neuron model [5], the reduced FitzHugh-Nagumo neuron model [4,32,34], the auditory model [6,7,16] or biomedical devices [38]. However, the dynamical nonlinear system has its intrinsic time scale that controls the evolution of the system state. The system outputs at adjacent time steps are also correlated with each other to a certain extent, even the input signal or the input noise are independently identically distributed. These factors will lead to larger misalignment between the true and estimated weights. Thus, a more general Kalman-LMS adaptive algorithm for estimating the time-varying weight vector needs to be developed.