Three-dimensional localization of point acoustic sources using a planar microphone array combined with beamforming

This paper presents a beamforming-based acoustic imaging (BBAI) method employing a two-dimensional (2D) microphone array that can not only locate an acoustic source in the XY plane parallel to the array, but also identify the distance between the source and the array in the Z direction, denoted as the source depth, thus providing three-dimensional (3D) localization ability. In this method, the acoustic field is reconstructed on virtual XY planes at different distances along the Z direction. The source depth is then determined according to the virtual plane providing the maximum response of the acoustic field. The location of the source in the X and Y directions of the identified virtual plane can then be easily determined based on the standard beamforming principles of a planar array. The proposed BBAI method is evaluated by simulations involving single- and multiple-point sources, and corresponding experimental evaluations are conducted in an anechoic chamber. Both simulation and experimental results demonstrate that the proposed method is capable of locating acoustic sources in 3D space.


Introduction
Beamforming [1,2] is an advanced acoustic imaging technique that has been applied effectively for localizing and identifying acoustic sources on moving objects [3], such as high-speed trains and civil aircraft in aeroacoustics [4][5][6]. At present, beamforming methods employing microphone arrays combined with signal-processing technology are widely used for both two-dimensional (2D) and three-dimensional (3D) acoustic-source localization in numerous fields [7][8][9]. In this paper, layer-by-layer scanning of the sound field is performed, realizing 3D acoustic-source localization and 3D sound-source image output.
Beamforming methods have been extensively applied to 2D localization using planar microphone arrays, which can locate an acoustic source in the XY plane parallel to the array but cannot identify the distance between the source and the array in the Z direction, denoted as the source depth. For example, generalized cross-correlation (GCC) beamforming has achieved precise 2D acoustic-source localization in the time domain [10]. Similarly, chirp Z-transform (CZT) digital beamforming has been proposed for far-field acoustic-source localization in the frequency domain [11], and was demonstrated to overcome typical problems affecting other frequency-domain beamforming techniques, such as zero-padded fast Fourier transform beamforming. In particular, the accelerated proximal gradient singular value thresholding-based linearly constrained singular canceler (APG-LCSC) algorithm [12] has been demonstrated to provide highly accurate 2D beamforming using a sparse array.
In 3D beamforming methods based on 3D microphone arrays, the concept of spherical harmonics has been employed with a spherical microphone array [13], and GCC has been employed with a polyhedral microphone array [14] for near-field reconstruction. Deconvolution based on spherical harmonics [15] and functional delay and sum (FDAS) [16] beamforming methods with spherical arrays have been shown to provide good spatial resolution and low sidelobes in the near-field. Moreover, FDAS with ridge detection (RD) and FDAS with RD and a deconvolution approach for the mapping of acoustic sources (DAMAS) [17] realized rapid acoustic-source localization as well as high resolution. Similarly, both generalized inverse beamforming (GIB) [18] and functional GIB (FGIB) [19] exhibited these characteristics using a double-layer microphone array.
In terms of acoustic sources, monopole and dipole sources are typically of great interest in aeroacoustics. For dipole sources, high-quality source maps have been established using orthogonally aligned planar microphone arrays [11,20,21] and non-planar microphone arrays [21]. For monopole sources, a planar-phased array [22] has provided good resolution for 3D acoustic imaging with Fourier deconvolution in the near field.
According to the above discussion, previous beamforming methods employing planar microphone arrays have mainly focused on acoustic-source localization on a 2D surface. While these methods provide an acoustic field hologram, they cannot determine the source depth and are therefore inappropriate for 3D source localization [1,23-29]. However, present applications are increasingly concerned with acoustic sources located on the surfaces of complex objects or on complicated structures in 3D space. Yet, research regarding 3D acoustic-source localization remains relatively rare, and beamforming methods employing 3D microphone arrays remain limited to near-field reconstruction. Moreover, compared with a 3D microphone array [30], the commercially available 2D planar array employed in this study can also achieve 3D recognition with greater adaptability. Furthermore, quantitative analyses of localization error and the influence of frequency have rarely been investigated [31,32].
Deconvolution algorithms [33,34], especially the DAMAS algorithm [35], have been the principal methods in recent years, providing precision unattainable by traditional beamforming algorithms. However, deconvolution introduces a new problem: a misleading point can replace the position of a continuous point distribution [36], and its unavoidable iterations make it far more computationally expensive than traditional beamforming. For these reasons, some researchers have abandoned deconvolution and returned to traditional beamforming, proposing advanced variants such as orthogonal beamforming [37], robust adaptive beamforming [38] and functional beamforming [39]. In this paper, 3D recognition is realized with the traditional beamforming algorithm and its faster computation, in contrast to 3D recognition based on deconvolution [40,41].
To address these issues, this paper presents a beamforming-based acoustic imaging (BBAI) method employing a planar microphone array for the localization of point sources, which are similar to monopole sources. In the proposed method, the acoustic field is reconstructed on virtual XY planes at different distances along the Z direction. The source depth is then determined according to the virtual plane providing the maximum response of the acoustic field. The location of the source in the X and Y directions of the identified virtual plane can then be easily determined based on the standard beamforming principles of a planar array. As such, the proposed method can not only locate an acoustic source in the 2D XY plane parallel to the array, but can also determine the source depth, and thus provides 3D localization ability. The localization error and the influence of frequency on the proposed BBAI method are quantitatively evaluated by simulations and corresponding experiments in an anechoic chamber involving single- and multiple-point sources and a planar microphone array in the form of a 60-channel Brüel & Kjær WA-1558 sliced wheel array.

Experimental method
2.1. Three-dimensional localization of acoustic point sources using the BBAI method

As shown in figure 1, the BBAI method reconstructs the entire 3D acoustic field on virtual planes perpendicular to the Z axis at different distances according to the spherical-wave hypothesis. The virtual plane spacing is defined as ΔZ, a meshed virtual plane is defined as a reconstruction plane, a mesh node is defined as a reconstruction point, and the spacing intervals of adjacent points along the X and Y axes of a given reconstruction plane are defined as ΔX and ΔY, respectively. Acoustic field reconstruction is conducted by calculating the normalized beamforming power output at all reconstruction points, which is also referred to as the acoustic field response. Based on the spherical-wave hypothesis, the distance a wave travels between an acoustic source located at (x_s, y_s, z_s) and a microphone is equal to the straight-line distance between (x_s, y_s, z_s) and that microphone. Here, we assume a planar microphone array consisting of a total of M microphones with coordinates (x_m, y_m, z_m), m = 1, 2, ..., M, and designate the microphone denoted by m = 1 as the reference microphone with coordinates (x_1, y_1, z_1). The pressure signal received at the reference microphone, P_1(ω), can be defined as a function of the angular frequency ω of the source as follows [1]:

P_1(\omega) = \frac{P_0}{r_1} e^{-j\kappa r_1},    (2.1)

where P_0 is the source strength, r_1 is the distance between the reference microphone and the source and κ = ω/c is the wave number, with c the propagation velocity of sound. In addition, we define Δ'_m(r) as the delay in the wave arrival times between the reference microphone and the mth microphone, given as

\Delta'_m(r) = \frac{r_m - r_1}{c},    (2.2)

where r_m is the distance between the mth microphone and the source.
Because a spherical sound wave is assumed to be radiated by the source, and the planar microphone array is far from the source, the pressure signal received at the mth microphone, P_m(ω), undergoes attenuation relative to P_1(ω), which can be expressed as follows [1]:

P_m(\omega) = \frac{r_1}{r_m} P_1(\omega)\, e^{-j\omega \Delta'_m(r)}.    (2.3)

The values of P_m(ω) are then employed to reconstruct the fth reconstruction point (x_f, y_f, z_k) on the kth reconstruction plane, k = 1, 2, ..., K, where K represents the total number of reconstruction planes. First, we define the time delay Δ_m(r) for signals associated with the distance r_c between (x_f, y_f, z_k) and (x_1, y_1, z_1) and the distance r_fm between (x_f, y_f, z_k) and (x_m, y_m, z_m) as follows:

\Delta_m(r) = \frac{r_{fm} - r_c}{c}.    (2.4)

According to the principle of delay and sum, the complex normalized beamforming pressure output B(r, ω) at reconstruction point f, relative to the actual output, is given as follows [1]:

B(r, \omega) = \frac{1}{M} \sum_{m=1}^{M} g_m \frac{r_{fm}}{r_c}\, P_m(\omega)\, e^{j\omega \Delta_m(r)},    (2.5)

where g_m is the weighting coefficient of microphone m. According to the triangle inequality in complex form [29], the normalized beamforming power output is bounded, from equation (2.5), as follows:

|B(r, \omega)|^2 \le \left( \frac{1}{M} \sum_{m=1}^{M} g_m \frac{r_{fm}}{r_c}\, |P_m(\omega)| \right)^{2}.    (2.6)

From equation (2.6), |B(r, ω)|^2 attains its maximum, denoted in this paper as |B(r, ω)|^2_max, only if the following condition is met:

\Delta_m(r) = \Delta'_m(r), \quad m = 1, 2, \ldots, M.    (2.7)

The values of |B(r, ω)|^2_max are compared for all virtual reconstruction planes, and the position along the Z direction of the plane with the largest value represents the source depth. The source location in the X and Y directions can then be easily identified based on standard beamforming principles. We considered four additional cases to verify the acoustic-source localization performance of the proposed method. For single-source conditions, the X, Y and Z positions of a source are denoted as x_s, y_s and z_s, respectively. For multi-source conditions, the individual sources are denoted according to subscripts s1 and s2.
We also consider varying differences in source depth between the two acoustic sources, Δz_s = z_s2 − z_s1, and separations between the two sources in the X direction, Δx_s = x_s2 − x_s1.
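As a concrete illustration, the delay-and-sum computation above can be sketched in Python. This is a minimal sketch rather than the authors' implementation: it simulates the spherical-wave pressures received by a planar array from a synthetic monopole and evaluates the normalized beamforming power of equations (2.5) and (2.6) at a chosen reconstruction point; the uniform weights g_m = 1 and the array geometry used in the test are illustrative assumptions.

```python
import numpy as np

C = 343.0  # propagation velocity of sound c (m/s)

def mic_pressures(src, mics, f, p0=1.0):
    """Spherical-wave pressures P_m at each microphone for a monopole at `src`."""
    kappa = 2 * np.pi * f / C                    # wave number omega / c
    r = np.linalg.norm(mics - src, axis=1)       # source-to-microphone distances r_m
    return (p0 / r) * np.exp(-1j * kappa * r)

def beam_power(p, mics, focus, f, g=None):
    """Normalized delay-and-sum power |B(r, omega)|^2 at one reconstruction point."""
    omega = 2 * np.pi * f
    g = np.ones(len(mics)) if g is None else g
    r_fm = np.linalg.norm(mics - focus, axis=1)  # reconstruction-point-to-mic distances
    r_c = r_fm[0]                                # m = 1 is the reference microphone
    delay = (r_fm - r_c) / C                     # time delays Delta_m(r)
    b = np.sum(g * (r_fm / r_c) * p * np.exp(1j * omega * delay)) / np.sum(g)
    return np.abs(b) ** 2
```

Focusing at the true source position aligns all the delay-compensated phases, so the power there exceeds that at an offset point; scanning this quantity over a reconstruction grid therefore yields the source map.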

Simulation procedure
Case 1: Simulations were conducted with various source depths z_s to evaluate the localization capability in the Z direction for a single acoustic source. Case 2: Simulations were conducted with various differences in source depth Δz_s between the two acoustic sources to evaluate the localization capability in the Z direction under multi-source conditions. Case 3: Simulations were conducted with different separations Δx_s between the two sources in the X direction at an equivalent source depth z_s to evaluate the localization capability in the X and Y directions under multi-source conditions.

[Flow chart of the BBAI procedure: input the acoustic source coordinates, frequency, strength and microphone array position; input the boundary conditions of the reconstruction space, ΔX, ΔY and ΔZ; calculate |B(r,ω)|^2 according to equation (2.6) for all reconstruction points on the (k + 1)th reconstruction plane; compare the values of |B(r,ω)|^2_max for each reconstruction plane to determine the source depth, and then localize the source in the X and Y directions.]

The experimental conditions, such as the positions of the source and array, the dimensions of the reconstruction space and the space coordinate system, were equivalent to those employed in the simulation.
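The layer-by-layer procedure above (evaluate the beamforming power on every reconstruction point of every virtual plane, then take the plane and grid point with the maximum response) can be sketched as follows. This Python sketch is an illustration under an assumed array geometry, uniform microphone weights and a synthetic monopole, not the authors' code; the grid vectors play the roles of ΔX, ΔY and ΔZ.

```python
import numpy as np

C = 343.0  # speed of sound (m/s)

def scan_reconstruction_space(p, mics, f, xs, ys, zs):
    """Evaluate |B(r, omega)|^2 on every reconstruction point of every virtual
    plane z_k and return the grid point with the largest response."""
    omega = 2 * np.pi * f
    best_pw, best_pt = -1.0, None
    for z_k in zs:                                   # K virtual reconstruction planes
        for x_f in xs:
            for y_f in ys:
                focus = np.array([x_f, y_f, z_k])
                r_fm = np.linalg.norm(mics - focus, axis=1)
                delay = (r_fm - r_fm[0]) / C         # reference microphone is m = 1
                b = np.mean((r_fm / r_fm[0]) * p * np.exp(1j * omega * delay))
                if abs(b) ** 2 > best_pw:
                    best_pw, best_pt = abs(b) ** 2, focus
    return best_pt  # estimated source position (x, y, z)
```

The Z coordinate of the returned point is the estimated source depth, and its X and Y coordinates localize the source within the identified plane.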
To evaluate the resolution of the proposed method quantitatively, we define the localization error δ as follows:

Results and discussion

Simulation results
The single-source simulation results for |B(r,ω)|^2 are shown in figure 3, and the multi-source results are shown in figure 4. We note from the figures that both the single source and the multiple sources can be effectively localized according to the position of |B(r,ω)|^2_max. The obtained relationships between |B(r,ω)|^2_max and z_k for each value of z_0 at different frequencies are shown in figure 6.
The obtained relationships between |B(r,ω)|^2_max and z_k for each value of Δz_s are shown in figure 7. The distributions of |B(r,ω)|^2 on the XY plane obtained at a source frequency of 4 kHz are shown in figure 8.
The distributions of |B(r,ω)|^2 on the XZ or YZ planes are shown in figure 9.

Experimental results
As discussed, the experimental conditions were equivalent to the simulation conditions to provide reliable verification of the proposed localization method. The experimental results are presented in the same manner as the simulation results in figures 3 and 4.

[Figure 11: Experimental results for two acoustic sources.]

Discussion
Comparison of figures 4 and 10 and of figures 5 and 11 shows that the simulation and experimental results place |B(r,ω)|^2_max at equivalent positions for both single-source and multi-source localization. Comparing figures 6 and 12 and figures 7 and 13, the Z coordinates of |B(r,ω)|^2_max obtained from the simulations and experiments are also equivalent. Moreover, the source can be located in the Z direction regardless of the source frequency or source depth. In this regard, it should be noted that the resolution in the Z direction is related to the distance between the array and the source, where the resolution decreases with increasing distance. In addition, the resolution in the Z direction is related to the value of ΔZ, where the resolution increases with decreasing ΔZ, although decreasing ΔZ also increases the computational burden of the method, resulting in an increased computation time. Comparing figures 8 and 14, the simulation and experimental results provide equivalent X and Y coordinate positions for |B(r,ω)|^2_max, where z_k is equal to z_0. This indicates that the proposed method can locate the source along the X and Y directions after determining its position along the Z direction. Accordingly, the concept of a recognition direction must be introduced, and the optimal condition is that the array plane is parallel to the measurement plane; under this condition, these conclusions are applicable to the whole search area. The resolution in the X and Y directions is related to the distance between the array and the source in the same manner as discussed for the Z direction. Comparing figures 9 and 15, both the simulation and experimental results indicate that the resolution along the X, Y and Z directions is significantly related to the source frequency.
The experimentally obtained values of δ for a single source (f = 4.0 kHz) are given in figure 16 with respect to source depth. The value of δ is maintained within 15% in all directions. In particular, δ is less than 10% for a source depth less than 2 m. The value of δ increases with increasing source depth, which reflects the discussed decrease in resolution with increasing source depth owing to the increasing distance between the source and the array. The experimentally obtained values of δ for a single source (z_s = 1.5 m) are given in figure 17 with respect to source frequency.
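The per-axis error values discussed here can be computed with a small helper. Because the defining equation for δ is not reproduced in this excerpt, the normalization below (coordinate deviation divided by the source depth, expressed as a percentage) is purely a hypothetical assumption for illustration, not the paper's definition.

```python
def localization_error(estimated, true, z_s):
    """Hypothetical per-axis localization error delta (%): coordinate deviation
    normalized by the source depth z_s (assumed form, not the paper's equation)."""
    return [abs(e - t) / z_s * 100.0 for e, t in zip(estimated, true)]

# Example: a source at depth 1.5 m whose X estimate is 0.15 m off.
err = localization_error([0.15, 0.0, 1.5], [0.0, 0.0, 1.5], 1.5)
```

Under this assumed definition, a 0.15 m deviation at a 1.5 m source depth corresponds to a 10% error in that direction.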

Conclusion
This paper presented a BBAI method to locate acoustic sources using a planar microphone array. The proposed BBAI method was evaluated by simulations and physical experiments in an anechoic chamber employing single and multiple monopole acoustic sources. The results obtained from both simulations and experiments demonstrated that the proposed method can effectively locate monopole sources in 3D. Based on the obtained results, we can conclude that the localization error (δ) in the X, Y and Z directions increases with increasing source depth (z_0) for a single acoustic source or with increasing difference between source depths (Δz_s) for two acoustic sources, and the spatial resolution correspondingly decreases. Good spatial resolution can be expected for a source depth less than 2.0 m. In addition, the values of δ tended to decrease with increasing source frequency (f_s), particularly for f_s greater than 3.0 kHz, resulting in increased spatial resolution with increasing f_s. Furthermore, the values of δ tended to decrease as the interval between the X coordinates of the acoustic sources (Δx_s) increased, and δ was maintained within 15% for Δx_s greater than 0.5 m. Moreover, fluctuations in δ flattened and δ decreased under these conditions with increasing f_s. However, the fact that the localization precision declined with increasing Δz_s under multi-source localization indicates that the proposed BBAI method includes some limitations that must be addressed. Therefore, the proposed method should be subjected to further development in several aspects, such as multi-source identification and parameter optimization in terms of the shape and size of the focusing plane or the mesh size employed in the analysis. In addition to the influence of measurement noise, the influences of array sensor installation errors, confusion error and other measurement errors [31,32,42] should also be considered.
We hope that this method can be used in more industrial applications. In future work, we will extend this research to the fields of medical ultrasound [8] and underwater acoustic sensing [43,44].
Data accessibility. This article does not contain any additional data. Authors' contributions. H.D. conceived the basic idea, designed the study and drafted the manuscript. Q.H. refined the