The local mean decomposition and its application to EEG perception data
Abstract
This paper describes the local mean decomposition (LMD), a new iterative approach to demodulating amplitude and frequency modulated signals. The new method decomposes such signals into a set of functions, each of which is the product of an envelope signal and a frequency modulated signal from which a time-varying instantaneous frequency can be derived. The LMD method can be used to analyse a wide variety of natural signals such as electrocardiograms, functional magnetic resonance imaging data, and earthquake data. The paper presents the results of applying LMD to a set of scalp electroencephalogram (EEG) visual perception data. The LMD instantaneous frequency and energy structure of the EEG is examined, and compared with results obtained using the spectrogram. The nature of visual perception is investigated by measuring the degree of EEG instantaneous phase concentration that occurs following stimulus onset over multiple trials. The analysis suggests that there is a statistically significant difference between the theta phase concentrations of the perception and no perception EEG data.
1. Introduction
The nature of information is one of the central issues in modern science. Given a signal, or a set of data, how do we extract information from it, or give meaning to it? Fourier analysis is a particularly powerful method of describing signals in terms of their energy and frequency. Representing a signal as being the sum of a set of plane waves, each of which has a constant phase, frequency, and envelope, can often be very revealing. For example, Fourier analysis can show that an apparently very complicated quasiperiodic signal consists merely of the sum of two tones with incommensurate frequencies. However, it is generally the case that complicated amplitude and frequency modulated signals, such as most biological signals, have correspondingly complicated Fourier spectra. Such signals can also be represented as being the product of an envelope signal and a frequency-modulated signal, or as the sum of a finite set of such product functions. A well-behaved and physically meaningful time-varying instantaneous frequency can then be calculated from the frequency modulated signal.
The local mean decomposition (LMD) was developed recently by the author to decompose amplitude and frequency modulated signals into a small set of product functions, each of which is the product of an envelope signal and a frequency modulated signal from which a time-varying instantaneous phase and instantaneous frequency can be derived. The instantaneous frequency and envelope values can then be plotted together in the form of a demodulated signal time–frequency representation (tfr). Although a form of instantaneous frequency can be calculated from the spectrogram (the conditional mean frequency, investigated by Loughlin 1999), it is window-dependent. The conventional method of deriving a time-varying frequency from a signal is to use the so-called analytic signal developed by Gabor (1946). The instantaneous frequency is defined as being the derivative of the phase of the analytic signal. The problem with this approach is that, particularly for amplitude and frequency modulated signals, the resulting instantaneous frequency can be wildly erratic, even negative at times (Mandel 1974; Melville 1983; Boashash 1992). Such an instantaneous frequency is extremely hard to interpret, and is of debatable physical significance. Consequently, the analytic signal instantaneous frequency has been of limited practical utility. By contrast, the new LMD scheme is a robust, and conceptually simple way of analysing complicated data in terms of time-varying frequency, phase and energy. Most importantly, the results of the LMD analysis are physically plausible; the instantaneous frequency generally corresponds to the oscillatory frequency we can actually see in the signal.
The LMD method can be used to analyse any data, but is of particular relevance with regard to the analysis of amplitude and frequency modulated biological signals. It could be used, for example, to analyse changes in blood pressure, or to detect small fluctuations in the frequency of the electrocardiogram (ECG). In neuroscience, potential applications include the analysis of datasets as diverse as action potential spike-trains and functional magnetic resonance imaging (fMRI) data. Perhaps one of the most challenging signals to analyse in terms of frequency and energy is the electroencephalogram (EEG) signal. The complexity of the EEG makes it highly suitable as a test signal for analysis using LMD. Furthermore, the decomposition of EEG data using LMD into a set of product functions with associated time-varying frequencies is of particular interest, because various frequency bands of the EEG have been correlated with particular cognitive states (Basar et al. 1999; Makeig et al. 1999; Kahana et al. 2001). For this paper, the LMD method was applied to a set of EEG visual perception data. A subject viewed a screen showing a running figure made up of black dots on a white background. Depending on the number of distractor dots, the subject either perceived, or did not perceive the running figure. Several issues regarding the nature of perception are addressed using the LMD results. Recent research has focussed on the role of gamma oscillations in perception (Desmedt & Tomberg 1994; Tallon-Baudry et al. 1996, 1997; Rodriguez et al. 1999). In this paper the general EEG frequency–energy structure of the response to the experimental stimulus is examined. In addition, the significance of differences in the degree of instantaneous phase concentration between perception and no perception trials are assessed statistically using the ratio of the corresponding circular variances.
2. The local mean decomposition
Essentially the LMD scheme involves progressively separating a frequency modulated signal from an amplitude modulated envelope signal. This separation is achieved by smoothing the original signal, subtracting the smoothed signal from the original signal, and then amplitude demodulating the result using an envelope estimate. The envelope estimate and the smoothed version of the original signal are both obtained by using moving averaging weighted by the time-lapse between the successive extrema of the original signal. If the resulting signal does not have a flat envelope, the procedure is repeated until a frequency modulated signal with a flat envelope is obtained. The instantaneous frequency can then be calculated from the frequency modulated signal. The corresponding envelope estimates are multiplied together to form a final envelope. This envelope is then multiplied by the frequency modulated signal to form a product function, which is subtracted from the original signal. The whole process is then repeated on the resulting signal, to produce a second product function, with an associated envelope and instantaneous frequency. This decomposition continues until the remaining signal contains no more oscillations. The resulting instantaneous frequency and envelope values can be plotted together in the form of a demodulated signal tfr.
The LMD of a signal is accomplished by progressively smoothing the signal using moving averaging. This averaging is weighted by the distance between the successive extrema of the signal in the following way. Consider the sample portion of EEG data shown in figure 1a. The first step of the decomposition involves calculating the mean of the maximum and minimum points of each half-wave oscillation of the signal. So the ith mean value mi of each two successive extrema ni and ni+1 is given by
So the initial envelope estimate shown in figure 1b is denoted by a11(t), and the initial mean shown in figure 1a is denoted by m11(t). m11(t) is subtracted from the original data x(t),
The demodulated signal tfr of the sample EEG data (figure 4) is markedly different from the representation provided by the spectrogram of the same data (figure 5). Details of the calculation of the spectrogram can be found in §3.1. The product functions should perhaps be regarded as being ‘best fit’ functions; they are the best match of the product of a frequency modulated signal and an envelope to the data. The aim of the decomposition has been to produce a continuous positive time-varying frequency. Inspection of the EEG sample signal (figure 1a) shows that it contains both high and low frequency oscillations during the period of the experimental stimulus from 0 to 0.5 s. This is reflected in the variation in the instantaneous frequency values corresponding to the various product functions following stimulus onset (figure 3). The instantaneous frequency values track the frequency of the oscillations that we actually see in the data (figure 1a). We can see that the evoked response to the stimulus constitutes a low frequency oscillation, and this is reflected in the dip in the instantaneous frequency values during this period. It is these large continuous variations in instantaneous frequency which distinguish the LMD approach from the Fourier-based description of the same signal. Such a decomposition could generate new information about the frequency and energy structure of the EEG and other biological signals. It should be stressed that it is not the intention to present the LMD scheme as being superior to Fourier analysis. LMD constitutes an alternative representation of the data in terms of continuously varying positive instantaneous frequency and envelope values.
It is important to point out that the EEG sample data was Fourier band-pass filtered between 1 and 45 Hz. If the data had been filtered between 1 and 20 Hz, for example, then the LMD instantaneous frequency and envelope results would have been somewhat different. It can be argued that such Fourier filtering fundamentally alters the nature of the waveforms of the data, and should therefore be avoided. In the case of EEG data, where there is known 50 Hz narrow-band noise in the data, it is, however, legitimate to remove such added noise using a narrow band-pass Fourier filter.
Finally, it should be noted that the LMD scheme, implemented in this paper using Matlab, is computationally expensive compared to, for example, the short-time Fourier transform (STFT). It took 20 iterations to generate the frequency modulated signal shown in figure 2c. However, as mentioned above, for practical purposes it is not necessary to iterate until the envelope of the frequency modulated signal is exactly equal to one. If the envelope is approximately equal to one, then the extrema of the frequency modulated signal can simply be set equal to ±1. Another approach to reducing the computation time could be to alter the moving averaging of the smoothing regime described in the electronic supplementary material.
3. Application to EEG data
This part of the paper concerns the analysis of a set of scalp EEG data recorded from a single subject during a perception experiment. The first section describes the methods used to analyse the data, and the nature of the perception experiment. The second section looks at the time–frequency–energy structure of the EEG as revealed by various different methods of analysis. The third section examines differences between the instantaneous phase concentrations of the perception and no perception EEGs following the onset of the experimental stimulus.
3.1 Methods
The EEG data used in this study was recorded from 64 electrodes attached to the scalp of a single male subject. The EEG sampling rate was 1000 Hz and the data was Fourier band-pass filtered between 1 and 45 Hz. For this paper only the data from the three occipital electrode sites O1, O2 and OZ, the three parietal electrode sites P3, P4 and PZ, and the three frontal electrode sites F3, F4 and FZ has been analysed. The visual perception experiment involved the subject viewing a computer screen showing a running figure composed of black dots on a white background. The screen to eye distance was 55 cm. A variable number of randomly distributed distractor dots obscured the motion of the running figure. If the subject was able to perceive the running figure, he pressed a button, and the number of dots was increased by one for the next trial, otherwise the number of dots was reduced by one. The image on the screen lasted for 0.5 s and there were 2.7 s between each trial. There were 101 trials for which the subject perceived the running figure, and 96 trials for which the running figure was not perceived.
For this paper the EEG data for each trial was decomposed using LMD into four product functions. Each demodulated signal tfr was created using envelope and instantaneous frequency information associated with these four product functions. Having derived the instantaneous frequencies, for display purposes it is necessary to give them some spread or ‘width’ in the time–frequency plane. The average demodulated signal tfrs in figure 7 were produced by dividing the time–frequency plane into frequency bins, each of width 1 Hz. For all the instantaneous frequency values allocated to a particular bin at a particular time, the average of the corresponding envelope values was calculated. A similar approach was used to create the phase concentration and circular variance ratio plots in figure 9. A smoother presentation of the demodulated signal tfr can be achieved by the convolution of the envelope values with a Gaussian in the time–frequency plane. This approach was used to produce figure 4.
The demodulated signal tfr results have been compared with corresponding results obtained using the spectrogram, and, in figure 6, the squared Wigner–Ville distribution (WVD), and the Hilbert spectrum developed by Huang et al. (1998). The spectrograms were calculated using a 0.6 s Gaussian window. Each Hilbert spectrum was created using the instantaneous frequency and instantaneous amplitude values from a total of five intrinsic mode functions (IMFs) generated using empirical mode decomposition (EMD).
In addition to representing the data in the form of time–frequency–energy distributions, the instantaneous phase concentration following stimulus onset has been examined. The phase concentration is given by
3.2 The time–frequency–energy structure of the EEG
Figure 6 compares four different average tfrs obtained using the experimental EEG data. Each tfr is an average of 591 tfrs of the perception and no perception EEGs recorded at the three occipital electrode sites O1, O2 and OZ. The duration of the visual stimulus was 0.5 s, and is indicated by the black line on each plot. All four of the average tfrs shown in figure 6a–d indicate that there is a strong alpha (10 Hz) oscillation in the pre-stimulus EEG. Following the stimulus at 0 s, there is a dip in the alpha energy which lasts for about 1 s (alpha blocking). The average squared WVD (figure 6b) suggests that the response to the stimulus takes the form of an amplitude modulated chirp. The peak energy of the chirp occurs at about 5 Hz, 0.2 s after stimulus onset. This result is echoed by the average demodulated signal tfr (figure 6d). The chirp-like nature of the response to the stimulus is not so apparent in the average spectrogram (figure 6a). Compared with the average WVD and the average demodulated signal tfr, the energy of the evoked response has a greater spread over the time–frequency plane of the spectrogram. Similarly, the energy of the evoked response as represented by the average Hilbert spectrum (figure 6c) is less concentrated around 0.2 s than it is for either the average WVD or the average demodulated signal tfr. It can be argued that the repeated iterations using cubic splines in the EMD of the EEG data causes a loss of amplitude and frequency information in the resulting Hilbert spectrum compared with the gentler filtering effected by LMD. The LMD approach tends to result in a tfr which retains more of the amplitude and frequency information present in the original signal than is the case for EMD and the Hilbert spectrum. Further discussion of this issue can be found in the electronic supplementary material.
In recent years, much attention has been focussed on gamma energy increases following the onset of a perceptual stimulus. The spectrogram, the Hilbert spectrum and the demodulated signal tfr all provide evidence of the existence of a high frequency gamma (30–40 Hz) band of energy in the EEGs recorded at the three occipital electrode sites. This gamma band of energy is not so apparent in the averaged tfrs of EEGs recorded at parietal and frontal electrode sites. Figure 7, for example, shows the averaged demodulated signal tfrs of the perception and no perception EEG data recorded at O1, O2 and OZ (figure 7a), P3, P4 and PZ (figure 7b) and F3, F4 and FZ (figure 7c). The averaged demodulated signal tfr of the frontal EEG data, in particular, has a noticeable beta (20 Hz) band of energy. As a comparison, spectrograms of the same data are shown in figure 8. It is interesting to note that although the demodulated signal tfr and the spectrogram of an individual segment of EEG data are quite different (figures 4 and 5), the average tfrs (figures 7 and 8) are similar.
3.3 Instantaneous phase concentration
There has recently been an increased interest in analysing the phase content of EEGs (Brandt 1997; Kolev et al. 2001; Makeig et al. 2002; Rizzuto et al. 2003). LMD instantaneous phase values can be used in a number of ways to examine the nature of the evoked response to a perceptual stimulus. For example, as described previously, the time–frequency plane can be divided into frequency bins, each of width 1 Hz. For the perception or no perception results at a particular electrode site, the average phase of all the instantaneous phase values at a particular time, for a particular frequency bin can be calculated. With regard to the nature of perception, it is particularly interesting to compare the degree to which perception and no perception phase values are synchronized following stimulus onset. An effective method of assessing the degree of phase synchronization at the various electrode sites is to measure phase concentration. Details of the calculation of phase concentration can be found in §3.1. The phase concentration plots shown in figure 9 are calculated by dividing the time–frequency plane into frequency bins each of width 1 Hz, and then calculating the phase concentration of the phase values in each bin at each instant in time. Figure 9a–c shows the phase concentrations for the perception data at the three occipital electrode sites (figure 9a), the three parietal electrode sites (figure 9b) and the three frontal electrode sites (figure 9c). Figure 9d–f shows the corresponding phase concentration plots for the no perception data. Randomization of trials was used to assess the significance of the differences between the perception and no-perception phase concentrations. Figure 9g–i shows the circular variance ratios (equation (3.3)) for the perception and no perception phase concentrations at the occipital, parietal and frontal electrode sites. Superimposed on the circular variance ratio plots are black contour lines which represent the 1% significance level. Details concerning the calculation of the significance levels can be found in the electronic supplementary material. Clearly, the most significant differences between the perception and no perception phase concentrations occur at the occipital electrode sites for the theta frequencies at 0.2 s after stimulus onset (figure 9g). A complementary set of results for the same data obtained using STFT phases corresponding to the spectrograms shown in figure 8, are shown in figure 10. Again, the most significant differences between the perception and no perception phase concentrations occur following stimulus onset. However, unlike for the LMD phases, there appears to be a significant difference between the perception and no perception STFT phase concentrations at the parietal electrode sites. The discrepancy between the results may be due to differences in the calculation of the significance levels. Alternatively, it may be the case that the STFT method is a slightly more sensitive way of detecting any differences between EEGs corresponding to different experimental conditions, than the current LMD approach. It is possible that there is some loss of phase information during the LMD of the data. However, this would be somewhat surprising in view of the fact that the LMD phase information in figure 9 appears to be more concentrated in the time–frequency plane than is the case with the STFT phases (figure 10). The STFT windowing of the data has caused the phase information to have a greater spread over the time–frequency plane compared with the LMD results, which suggests that the LMD approach ought to be a more precise method of analysing the data than the STFT.
4. Discussion
The purpose of this paper has been to describe the new LMD method and illustrate its utility in the analysis of amplitude and frequency modulated signals. LMD can separate such signals into a set of functions each of which is the product of an envelope signal and a purely frequency modulated signal from which a physically meaningful time-varying instantaneous frequency can be derived. It can be argued that this approach to decomposing signals into a set of components characterized by their frequency is particularly suitable for EEGs, since the frequency content of EEGs (gamma, alpha, theta and delta) has been correlated with specific cognitive states. It is this separation of the EEG into a small set of components, each of which could be associated with some aspect of cognitive function, which makes schemes such as EMD and LMD potentially so interesting.
Using LMD on simple test signals produces similar results to those produced using EMD. However, the following significant differences between the two schemes should be noted. Firstly, the LMD iteration process is designed to separate frequency modulated signals from their corresponding envelope signals. Effectively, the iteration process consists of flattening the envelope of the frequency modulated signal estimate (cf. Kumaresan & Rao 1999). A well-behaved, positive instantaneous frequency can then be derived from the final frequency modulated signal. By contrast, the EMD iteration process is designed to produce a function (an IMF) which can be Hilbert transformed to generate an analytic signal from which the instantaneous frequency can be derived. If the instantaneous frequency is calculated from an IMF with a time-varying envelope, it may be distorted. In view of the somewhat notorious nature of the analytic signal instantaneous frequency, it is perhaps sensible to try to avoid using the Hilbert transform and the analytic signal, if possible. The LMD iteration process is designed to produce a frequency modulated signal and a corresponding envelope signal without recourse to the Hilbert transform. Because the instantaneous frequency is calculated from a frequency modulated signal, it should be well-behaved. Secondly, IMFs are not the same as the LMD product functions. By definition, IMFs do not contain oscillations which do not cross zero between their successive extrema. By contrast, since the LMD product functions are the product of a frequency modulated signal and a possibly time-varying envelope signal, they may well contain oscillations which do not cross zero. Thirdly, the LMD iteration process which uses smoothed local means and local magnitudes appears to be a gentler way of decomposing the data than the cubic spline approach used in EMD. Consequently, the LMD product functions seem to retain more of the frequency and amplitude variation present in the original signal than is the case for the EMD IMFs. The average Hilbert spectrum shown in figure 6c, suggests that the EMD spline-based iteration process has caused a certain degradation of the amplitude and frequency information in the data, compared with the average demodulated signal tfr shown in figure 6d (see also electronic supplementary material, figure 3).
The main advantage of a scheme such as LMD compared with tfrs such as the spectrogram and the scalogram, is that it provides a natural, a priori way of decomposing data into physically meaningful components (the product functions) each of which can have an associated time-varying positive instantaneous frequency and envelope. These components must be physically meaningful because they contain, or are an approximation of, the oscillations that we see in the data. This is what distinguishes LMD from Fourier-based tfrs. Whereas Fourier analysis can reveal hidden relationships between components, such as an apparently complicated quasiperiodic signal actually being the sum of two components with incommensurate frequencies, LMD quantifies the oscillations that we see in the data in terms of variable frequency and energy. Such an approach can offer an alternative interpretation of data to that provided by Fourier analysis. In the case of the EEG signal shown in figure 1a, we can see that the evoked response to the stimulus at 0 s takes the form of a low frequency, large amplitude oscillation. This is reflected in the decrease in the derived instantaneous frequency values and the increase in energy following stimulus onset shown in figure 3a–h. In short, the LMD process naturally generates components with changing frequency and amplitude, and the derived instantaneous frequency and energy values track the changing frequency and energy of the oscillations that we actually see in the data. Finally, it is important to emphasize that although the average demodulated signal tfrs shown in figure 7 are similar to the average spectrograms shown in figure 8, it is in unaveraged plots such as figures 4 and 5 where the differences between the results of the LMD and the Fourier approach are most striking.
5. Conclusion
The extraction of meaning from data is fundamental to our interpretation of the physical world. The dominant method of analysing data over the past 200 years has been Fourier analysis. LMD provides an alternative method of analysing amplitude and frequency modulated signals. Such signals can be represented as being the sum of a set of functions, each of which is the product of an envelope signal and a frequency modulated signal. Representing signals in product form not only leads naturally to the definition of physically meaningful time-varying frequency and energy values, but it also offers the possibility of extracting genuinely new information from data.
The author thanks Drs C. Guy and K. Parker of Imperial College London for the many helpful discussions which contributed to the formulation of the LMD scheme. The author also thanks Dr D. H. ffytche of the Institute of Psychiatry for providing the EEG data. The author has filed for a patent on the LMD method.
Footnotes
The electronic supplementary material is available at http://dx.doi.org/10.1098/rsif.2005.0058 or via http://www.journals.royalsoc.ac.uk.
It should be noted that this definition of instantaneous frequency does not violate Heisenberg's uncertainty principle. As pointed out by Cohen (1989), in signal processing the uncertainty principle merely relates the time duration of a signal to the frequency spread of its Fourier spectrum.