A new proof of the generalized Hamiltonian–Real calculus

The recently introduced generalized Hamiltonian–Real (GHR) calculus comprises, for the first time, the product and chain rules that make it a powerful tool for quaternion-based optimization and adaptive signal processing. In this paper, we introduce novel dual relationships between the GHR calculus and multivariate real calculus, in order to provide a new, simpler proof of the GHR derivative rules. This further reinforces the theoretical foundation of the GHR calculus and provides a convenient methodology for generic extensions of real- and complex-valued learning algorithms to the quaternion domain.


Introduction
Quaternions were introduced by William Hamilton in 1843 as an associative but non-commutative algebra over the reals, and have since been used in many fields, such as physics, computer graphics and signal processing.
Because of the non-commutativity of the quaternion product, definitions of the quaternion derivative differ substantially from those of real and complex derivatives. For example, Sudbery [1] establishes that only linear quaternion functions fulfil the requirements of the traditional derivative definition. However, such a derivative is too stringent for practical optimization problems, in which the cost functions are often real-valued and therefore non-analytic. In order to relax the derivative condition, the recently introduced Hamiltonian-Real (HR) calculus [2] deals with both analytic and non-analytic quaternion functions within a unified framework. However, the HR calculus does not comprise effective derivative rules (product, chain) for dealing with complicated derivation operations in practical problems. By exploiting quaternion rotation, the recently introduced generalized HR (GHR) calculus builds upon the HR calculus to provide the powerful product and chain derivative rules that facilitate the calculation of quaternion gradients and Hessians in quaternion optimization problems [3]. In particular, we note that the product rule is a distinctive feature of functional calculi. However, the original proof of the product and chain rules for the GHR calculus is quite involved and difficult to follow for non-experts.
The aim of this paper is to further elucidate the main ideas of the GHR calculus, and to give simpler proofs of the main results in [3]. The new proof is based on the duality between the GHR calculus and multivariate real calculus. One of the advantages of such an approach is that it allows for a direct treatment of quaternion-valued functions, without an intermediate transition to real functions. Based on these relationships, we illustrate the effectiveness of such an approach by providing a dual version of the widely linear quaternion least mean square (WL-QLMS) adaptive learning algorithm [4,5], and show that it produces the same output as the primal WL-QLMS, but with a reduced computational complexity.

Preliminaries
In the following, we state some basic definitions of the GHR calculus [3,6,7].

Definition 2.1 (Quaternion rotation [8]). For any quaternion q, the quaternion rotation is defined as

q^μ ≜ μqμ^{-1},    (2.1)

where μ is any non-zero quaternion.

Definition 2.2 (Real-differentiability [1]). A function f : H → H, written as f(q) = f_a(q) + if_b(q) + jf_c(q) + kf_d(q), is called real-differentiable if f_a(q), f_b(q), f_c(q) and f_d(q) are differentiable functions with respect to the four real components q_a, q_b, q_c and q_d of a quaternion variable q = q_a + iq_b + jq_c + kq_d.

Definition 2.3 (The GHR derivatives [3]). If f : H → H is real-differentiable, then the GHR derivatives of the function f with respect to q^μ and q^{μ*} (μ ≠ 0, μ ∈ H) are defined as

∂f/∂q^μ = (1/4)(∂f/∂q_a − (∂f/∂q_b)i^μ − (∂f/∂q_c)j^μ − (∂f/∂q_d)k^μ)
and
∂f/∂q^{μ*} = (1/4)(∂f/∂q_a + (∂f/∂q_b)i^μ + (∂f/∂q_c)j^μ + (∂f/∂q_d)k^μ),    (2.2)

where q = q_a + iq_b + jq_c + kq_d, the quaternion components q_a, q_b, q_c, q_d ∈ R, and ∂f/∂q_a, ∂f/∂q_b, ∂f/∂q_c and ∂f/∂q_d are the partial derivatives of f with respect to q_a, q_b, q_c and q_d, whereas the set {1, i^μ, j^μ, k^μ} is an orthogonal basis of H.
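To make definitions 2.1 and 2.3 concrete, the following minimal Python sketch (ours, not part of the original development) stores a quaternion as a numpy array [q_a, q_b, q_c, q_d], implements the rotation q^μ = μqμ^{-1} and approximates the GHR derivatives by central differences; the helper names qmul, qrot and ghr are our own choices. For μ = 1, it reproduces the known GHR values ∂q/∂q = 1 and ∂q/∂q* = −1/2.

```python
import numpy as np

def qmul(p, q):  # Hamilton product, quaternions stored as arrays [a, b, c, d]
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z, a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x, a*z + b*y - c*x + d*w])

def qinv(q):  # quaternion inverse q^{-1} = q*/|q|^2
    return np.array([q[0], -q[1], -q[2], -q[3]]) / np.dot(q, q)

def qrot(q, mu):  # definition 2.1: q^mu = mu q mu^{-1}
    return qmul(qmul(mu, q), qinv(mu))

E = np.eye(4)  # the basis 1, i, j, k as coordinate vectors

def ghr(f, q, mu, conj=False, h=1e-6):
    # definition 2.3, with the partial derivatives taken by central differences
    d = [(f(q + h*E[n]) - f(q - h*E[n])) / (2*h) for n in range(4)]
    s = 1.0 if conj else -1.0
    return (d[0] + sum(s * qmul(d[n], qrot(E[n], mu)) for n in (1, 2, 3))) / 4.0

q = np.array([0.3, -1.2, 0.7, 0.5])
one = E[0]
print(ghr(lambda x: x, q, one))             # -> [1, 0, 0, 0],    i.e. dq/dq  = 1
print(ghr(lambda x: x, q, one, conj=True))  # -> [-0.5, 0, 0, 0], i.e. dq/dq* = -1/2
```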

Lemma 2.5 (Duality of quaternion gradients and real gradient vectors [7]).
For a real-differentiable function f : H → R, define the quaternion basis matrix

A = ( 1   i   j   k
      1   i  −j  −k
      1  −i   j  −k
      1  −i  −j   k ),    (2.3)

which satisfies A^H A = AA^H = 4I_4. Then the relation between the quadrivariate real gradient vector ∇^r_q f = (∂f/∂q_a, ∂f/∂q_b, ∂f/∂q_c, ∂f/∂q_d)^T and the augmented quaternion gradient vector ∇^a_{q*} f = (∂f/∂q*, ∂f/∂q^{i*}, ∂f/∂q^{j*}, ∂f/∂q^{k*})^T is given by

∇^a_{q*} f = (1/4)A∇^r_q f  and  ∇^r_q f = A^H∇^a_{q*} f.    (2.4)

Proof. The proof follows directly from definition 2.3.
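The duality (2.4) can be spot-checked numerically. In the sketch below (same conventions and helper names as in the previous snippet, repeated so that it runs standalone; the real-valued test function is an arbitrary choice of ours), the augmented gradient computed from definition 2.3 is compared with (1/4)A∇^r_q f, with A encoded through its sign pattern.

```python
import numpy as np

def qmul(p, q):  # Hamilton product, quaternions as arrays [a, b, c, d]
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z, a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x, a*z + b*y - c*x + d*w])

def qrot(q, mu):  # q^mu = mu q mu^{-1}
    mi = np.array([mu[0], -mu[1], -mu[2], -mu[3]]) / np.dot(mu, mu)
    return qmul(qmul(mu, q), mi)

E = np.eye(4)

def ghr(f, q, mu, conj=False, h=1e-6):  # GHR derivative of definition 2.3
    d = [(f(q + h*E[n]) - f(q - h*E[n])) / (2*h) for n in range(4)]
    s = 1.0 if conj else -1.0
    return (d[0] + sum(s * qmul(d[n], qrot(E[n], mu)) for n in (1, 2, 3))) / 4.0

# real-valued test function, returned as a quaternion with zero vector part
f = lambda x: np.array([x[0]**2 + x[1]*x[2] - 3.0*x[3], 0.0, 0.0, 0.0])

S = np.array([[1, 1, 1, 1], [1, 1, -1, -1],
              [1, -1, 1, -1], [1, -1, -1, 1.0]])  # sign pattern of A in (2.3)
q = np.array([0.4, -0.3, 0.8, -0.1])

lhs = [ghr(f, q, E[m], conj=True) for m in range(4)]  # augmented gradient of (2.4)
dr = [(f(q + 1e-6*E[n]) - f(q - 1e-6*E[n])) / 2e-6 for n in range(4)]  # real gradient
rhs = [sum(qmul(S[m, n]*E[n], dr[n]) for n in range(4)) / 4.0 for m in range(4)]
print(all(np.allclose(a, b) for a, b in zip(lhs, rhs)))  # -> True
```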

Theorem 3.1 (Chain rule of GHR derivatives). Let S ⊆ H and let g : S → H be real-differentiable at an interior point q of the set S. Let T ⊆ H be such that g(q) ∈ T for all q ∈ S. Assume that f : T → H is real-differentiable at an interior point g(q) ∈ T. Then, the composite function h(q) = f(g(q)) satisfies the following chain rule

∂h/∂q^μ = (∂f/∂g)(∂g/∂q^μ) + (∂f/∂g^i)(∂g^i/∂q^μ) + (∂f/∂g^j)(∂g^j/∂q^μ) + (∂f/∂g^k)(∂g^k/∂q^μ).    (3.1)

Proof. Using the chain rule of multivariate real calculus, we have

∂h^r/∂(q^r)^T = (∂f^r/∂(g^r)^T)(∂g^r/∂(q^r)^T),    (3.2)

where u^r = (u_a, u_b, u_c, u_d)^T denotes the quadrivariate real vector of a quaternion u = u_a + iu_b + ju_c + ku_d, and ∂u^r/∂(q^r)^T is the corresponding 4 × 4 real Jacobian matrix. Multiplying both sides of (3.2) by A and (A^μ)^{-1}, where A^μ is the matrix A in (2.3) with the basis {1, i, j, k} replaced by the rotated basis {1, i^μ, j^μ, k^μ}, we have

A(∂h^r/∂(q^r)^T)(A^μ)^{-1} = [A(∂f^r/∂(g^r)^T)A^{-1}][A(∂g^r/∂(q^r)^T)(A^μ)^{-1}].    (3.3)

Using lemma 2.6 further yields

[∂h^ν/∂q^{μτ}] = [∂f^ν/∂g^τ][∂g^ν/∂q^{μτ}],  ν, τ ∈ {1, i, j, k},    (3.4)

where [∂h^ν/∂q^{μτ}] denotes the 4 × 4 quaternion matrix whose (ν, τ) entry is the GHR derivative ∂h^ν/∂q^{μτ}, with ν indexing the rows and τ the columns. Equating the (1, 1) entries on both sides of (3.4), that is, multiplying the first row of [∂f^ν/∂g^τ] by the first column of [∂g^ν/∂q^{μτ}], gives exactly the chain rule (3.1). This completes the proof of theorem 3.1.
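Theorem 3.1 can be verified numerically for polynomial functions. The following sketch (helpers as before; the test functions f and g are arbitrary choices of ours) compares the two sides of (3.1) for a randomly chosen non-zero rotation μ.

```python
import numpy as np

def qmul(p, q):  # Hamilton product, quaternions as arrays [a, b, c, d]
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z, a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x, a*z + b*y - c*x + d*w])

def qrot(q, mu):  # q^mu = mu q mu^{-1}
    mi = np.array([mu[0], -mu[1], -mu[2], -mu[3]]) / np.dot(mu, mu)
    return qmul(qmul(mu, q), mi)

E = np.eye(4)

def ghr(f, q, mu, conj=False, h=1e-6):  # GHR derivative of definition 2.3
    d = [(f(q + h*E[n]) - f(q - h*E[n])) / (2*h) for n in range(4)]
    s = 1.0 if conj else -1.0
    return (d[0] + sum(s * qmul(d[n], qrot(E[n], mu)) for n in (1, 2, 3))) / 4.0

f = lambda p: qmul(p, p)                                        # f(p) = p^2
g = lambda x: qmul(x, qmul(x, x)) + np.array([0.5, 0, 0, 0.0])  # g(q) = q^3 + 0.5
h_ = lambda x: f(g(x))                                          # h = f(g(q))

q = np.array([0.4, -0.3, 0.8, -0.1])
mu = np.array([0.9, 0.2, -0.5, 0.3])  # arbitrary non-zero rotation

lhs = ghr(h_, q, mu)
rhs = np.zeros(4)
for nu in E:                                     # nu ranges over the basis {1, i, j, k}
    dfg = ghr(f, g(q), nu)                       # df/dg^nu, evaluated at g(q)
    dgq = ghr(lambda x: qrot(g(x), nu), q, mu)   # dg^nu/dq^mu
    rhs = rhs + qmul(dfg, dgq)
print(np.allclose(lhs, rhs, atol=1e-5))          # -> True
```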

Theorem 3.2 (Product rule of GHR derivatives).
If the functions f, g : H → H are real-differentiable, then the derivative of their product fg satisfies the following product rule

∂(fg)/∂q^μ = f(∂g/∂q^μ) + (∂f/∂q^{gμ})g,    (3.5)

where g = g(q) in the rotation gμ.

Proof.
Denote h(q) = f(q)g(q), f = f_a + if_b + jf_c + kf_d and g = g_a + ig_b + jg_c + kg_d. Then, expanding the quaternion product, the quadrivariate real vector h^r = (h_a, h_b, h_c, h_d)^T can be expressed as

h^r = Fg^r = Gf^r,    (3.6)

where

F = ( f_a  −f_b  −f_c  −f_d
      f_b   f_a  −f_d   f_c
      f_c   f_d   f_a  −f_b
      f_d  −f_c   f_b   f_a )    (3.7)

and

G = ( g_a  −g_b  −g_c  −g_d
      g_b   g_a   g_d  −g_c
      g_c  −g_d   g_a   g_b
      g_d   g_c  −g_b   g_a )    (3.8)

are the real matrix representations of the left multiplication by f and of the right multiplication by g, respectively. Since h^r in (3.6) is bilinear in f^r and g^r, using the product rule of real calculus yields

∂h^r/∂(q^r)^T = F(∂g^r/∂(q^r)^T) + G(∂f^r/∂(q^r)^T).    (3.9)

Upon multiplying both sides of (3.9) by A and (A^μ)^{-1} and using lemma 2.6, we have

[∂h^ν/∂q^{μτ}] = (AFA^{-1})[∂g^ν/∂q^{μτ}] + AG(∂f^r/∂(q^r)^T)(A^μ)^{-1}.    (3.10)

Upon extracting the first row of the matrix equation (3.10), we arrive at

([∂h^ν/∂q^{μτ}])_1 = (AFA^{-1})_1[∂g^ν/∂q^{μτ}] + (AG)_1(∂f^r/∂(q^r)^T)(A^μ)^{-1},    (3.11)

where (X)_1 denotes the first row of the matrix X. A direct calculation gives (AFA^{-1})_1 = (f, 0, 0, 0) and (AG)_1 = (g, ig, jg, kg); moreover, since the entries of ∂f^r/∂(q^r)^T are real, (g, ig, jg, kg)(∂f^r/∂(q^r)^T) = ((∂f/∂q_a)g, (∂f/∂q_b)g, (∂f/∂q_c)g, (∂f/∂q_d)g), so that (3.11) becomes

([∂h^ν/∂q^{μτ}])_1 = f([∂g^ν/∂q^{μτ}])_1 + ((∂f/∂q_a)g, (∂f/∂q_b)g, (∂f/∂q_c)g, (∂f/∂q_d)g)(A^μ)^{-1}.    (3.12)

The first element of the above vector equation, together with the identity ν^{gμ}g = gν^μ for ν ∈ {i, j, k}, finally yields

∂h/∂q^μ = f(∂g/∂q^μ) + (∂f/∂q^{gμ})g.    (3.13)

This completes the proof of theorem 3.2.
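The product rule lends itself to the same kind of numerical check. In the sketch below (helpers and caveats as before; f and g are again our arbitrary polynomial test functions), the rotated derivative ∂f/∂q^{gμ} on the right-hand side of (3.13) is obtained by simply passing the product g(q)μ as the rotation argument.

```python
import numpy as np

def qmul(p, q):  # Hamilton product, quaternions as arrays [a, b, c, d]
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z, a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x, a*z + b*y - c*x + d*w])

def qrot(q, mu):  # q^mu = mu q mu^{-1}
    mi = np.array([mu[0], -mu[1], -mu[2], -mu[3]]) / np.dot(mu, mu)
    return qmul(qmul(mu, q), mi)

E = np.eye(4)

def ghr(f, q, mu, conj=False, h=1e-6):  # GHR derivative of definition 2.3
    d = [(f(q + h*E[n]) - f(q - h*E[n])) / (2*h) for n in range(4)]
    s = 1.0 if conj else -1.0
    return (d[0] + sum(s * qmul(d[n], qrot(E[n], mu)) for n in (1, 2, 3))) / 4.0

f = lambda x: qmul(x, x) + np.array([0.0, 1.0, 0.0, 0.0])  # f(q) = q^2 + i
g = lambda x: qmul(x, x) + np.array([2.0, 0.0, 0.0, 0.0])  # g(q) = q^2 + 2 (non-zero at q)
h_ = lambda x: qmul(f(x), g(x))                            # h = f g

q = np.array([0.4, -0.3, 0.8, -0.1])
mu = np.array([0.9, 0.2, -0.5, 0.3])

lhs = ghr(h_, q, mu)                                       # d(fg)/dq^mu
rhs = qmul(f(q), ghr(g, q, mu)) + qmul(ghr(f, q, qmul(g(q), mu)), g(q))  # eq. (3.13)
print(np.allclose(lhs, rhs, atol=1e-5))                    # -> True
```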

Application example
The WL-QLMS algorithm is based on the quaternion widely linear model y(n) = w^T(n)q(n), where q(n) = (x^T(n), x^{iT}(n), x^{jT}(n), x^{kT}(n))^T ∈ H^{4N} is the augmented input vector and w(n) ∈ H^{4N} the associated weight (parameter) vector. The cost function to be minimized is a real-valued function of quaternion variables, given by

J(n) = |e(n)|^2 = e(n)e*(n),    (4.1)

where e(n) = d(n) − y(n) is the error between the desired signal d(n) and the filter output y(n). The weight update of WL-QLMS is then given by [3]

w(n + 1) = w(n) + αe(n)q*(n),    (4.2)

where α > 0 is the step size. Similar to the scalar duality in lemma 2.4, the duality between the augmented quaternion vector q(n) and its dual quadrivariate real vector r(n) is given by [7]

q(n) = Jr(n),  q^H(n) = r^H(n)J^H = r^T(n)J^H,    (4.3)

where J = A ⊗ I_N denotes the Kronecker product of A (cf. (2.3)) and the N × N identity matrix I_N. The filter output y(n) can now be rewritten as

y(n) = w^T(n)q(n) = w^T(n)Jr(n) = v^T(n)r(n),    (4.4)

where v^T(n) = w^T(n)J, with v(n) ∈ H^{4N} an alternative augmented quaternion weight vector. Transposing (4.2), multiplying both sides by J from the right and noting that J^HJ = 4I_{4N}, so that q^H(n)J = r^T(n)J^HJ = 4r^T(n), we have

v(n + 1) = v(n) + 4αe(n)r(n),    (4.5)

which gives the dual WL-QLMS (D-WL-QLMS) update

v(n + 1) = v(n) + αe(n)r(n),    (4.6)

where the constant 4 in (4.5) is absorbed into the step size α.
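The duality between the primal update (4.2) and the dual updates (4.5) and (4.6) can be illustrated numerically. The following sketch is a minimal single-tap case (N = 1, so that J reduces to A); all variable names are ours and the data are random placeholders. It runs one step of the primal WL-QLMS, maps the weights to the dual domain via v^T = w^TJ, and confirms both the equal filter outputs of (4.4) and the dual recursion (4.5).

```python
import numpy as np

def qmul(p, q):  # Hamilton product, quaternions as arrays [a, b, c, d]
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z, a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x, a*z + b*y - c*x + d*w])

def qconj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

E = np.eye(4)
S = np.array([[1, 1, 1, 1], [1, 1, -1, -1],
              [1, -1, 1, -1], [1, -1, -1, 1.0]])
A = [[S[m, c] * E[c] for c in range(4)] for m in range(4)]  # matrix A of (2.3); J = A for N = 1

rng = np.random.default_rng(0)
x = rng.standard_normal(4)                       # quaternion input sample (N = 1)
r = x                                            # dual real vector: q(n) = J r(n)
q_aug = [S[m] * x for m in range(4)]             # augmented vector (x, x^i, x^j, x^k) = A r
w = [rng.standard_normal(4) for _ in range(4)]   # augmented quaternion weights
d = rng.standard_normal(4)                       # desired signal
alpha = 0.05

# primal WL-QLMS step, eq. (4.2): y = w^T q, then w <- w + alpha e q*
y = sum(qmul(w[m], q_aug[m]) for m in range(4))
e = d - y
w_new = [w[m] + alpha * qmul(e, qconj(q_aug[m])) for m in range(4)]

# dual weights v^T = w^T J and dual output, eq. (4.4)
dual = lambda wv: [sum(qmul(wv[m], A[m][c]) for m in range(4)) for c in range(4)]
v = dual(w)
y_dual = sum(v[c] * r[c] for c in range(4))      # r is real, so scalar products suffice
print(np.allclose(y, y_dual))                    # -> True: same filter output

# dual update, eq. (4.5): v <- v + 4 alpha e r, matches the transformed primal update
v_new = [v[c] + 4 * alpha * r[c] * e for c in range(4)]
print(all(np.allclose(a, b) for a, b in zip(dual(w_new), v_new)))  # -> True
```

Note that the dual recursion operates on the real-valued input r(n), which is the source of the complexity savings discussed in remark 4.1.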
Remark 4.1. From (4.4), we can see that the D-WL-QLMS and the WL-QLMS have the same filter output. Therefore, the D-WL-QLMS exhibits the same performance as the WL-QLMS, which is better than that of the strictly linear QLMS and the real LMS (RLMS) [5,11] when dealing with non-circular inputs. Table 1 shows that the D-WL-QLMS also has a lower computational complexity than the WL-QLMS, owing to the use of the real-valued input vector r(n) in the calculations. This is similar to the operation of the RLMS; however, the weight vector v(n) of the D-WL-QLMS is quaternion-valued, which is different from the RLMS.

Remark 4.2.
Note that if we start from y(n) = w^H(n)q(n), then an alternative form of the WL-QLMS is given by w(n + 1) = w(n) + αq(n)e*(n) [3]. Denoting v^T(n) = w^H(n)J, the filter output becomes y(n) = w^H(n)Jr(n) = v^T(n)r(n), that is, the same as in (4.4). Consequently, the proposed D-WL-QLMS keeps the same form of the update rule as in (4.6), and has the same computational complexity as that shown in table 1.

Conclusion
We have provided a new proof for the GHR calculus through the duality relationships between the GHR calculus and multivariate real calculus. These results complement the original proof given in [3], are easier to understand and are physically meaningful, and thus provide additional insights into the operation of the GHR calculus. An application example in adaptive learning theory demonstrates the advantages of the proposed approach.