Ratiometric analysis using Raman spectroscopy as a powerful predictor of structural properties of fatty acids

Raman spectroscopy has been used extensively for the analysis of biological samples in vitro, ex vivo and in vivo. While important progress has been made towards using this analytical technique in clinical applications, there is a limit to how much chemically specific information can be extracted from the spectrum of a biological sample, which consists of multiple overlapping peaks from the large number of species present. To elucidate more specific information about individual biochemical species, as opposed to very broad assignments by species class, we propose a bottom-up approach beginning with a detailed analysis of pure biochemical components. Here, we demonstrate a simple ratiometric approach applied to fatty acids, a subset of the lipid class, that allows key structural features, in particular degree of saturation and chain length, to be predicted. This is proposed as a starting point for extracting more chemically specific and species-specific information from the highly multiplexed spectrum of overlapping signals found in a real biological sample. The power of simple ratiometric analysis is also demonstrated by comparing ratiometric and multivariate analysis techniques for predicting the degree of unsaturation in food oil samples, an approach that could be used for food oil authentication.

Raman spectra of five selected saturated fatty acids ranging in chain length from C14 to C22. Spectra were acquired using a 20× objective, 633 nm wavelength excitation, 10 s acquisition time and 50%/10 mW laser power, followed by smoothing, baseline subtraction and min-max scaling. Spectra are offset for clarity and each spectrum represents the mean of 3 acquisitions (solid line) with shaded standard deviation. Low wavenumber region spectra indicated that the peak position at ~1100 cm⁻¹ (indicated by black dashed line) was sensitive to chain length (a). A closer view of this region of the low wavenumber spectra shows this shift more clearly (b). High wavenumber region spectra where the peak positions at 2850 cm⁻¹ (C−H stretch CH₂) and 2935 cm⁻¹ (C−H stretch CH₃) are highlighted with black dashed lines (c). Pink: myristic acid (C14:0); green: palmitic acid (C16:0); blue: stearic acid (C18:0); cyan: arachidic acid (C20:0); red: behenic acid (C22:0).

Figure S2
Raman spectra of five selected saturated fatty acids ranging in chain length from C14 to C22. Spectra were acquired using a 20× objective, 785 nm wavelength excitation, 10 s acquisition time and 50%/95 mW laser power, followed by smoothing, baseline subtraction and min-max scaling. Spectra are offset for clarity and each spectrum represents the mean of 3 acquisitions (solid line) with shaded standard deviation. Low wavenumber region spectra indicated that the peak position at ~1100 cm⁻¹ (indicated by black dashed line) was sensitive to chain length (a). A closer view of this region of the low wavenumber spectra shows this shift more clearly (b). High wavenumber region spectra where the peak positions at 2850 cm⁻¹ (C−H stretch CH₂) and 2935 cm⁻¹ (C−H stretch CH₃) are highlighted with black dashed lines (c). Pink: myristic acid (C14:0); green: palmitic acid (C16:0); blue: stearic acid (C18:0); cyan: arachidic acid (C20:0); red: behenic acid (C22:0).
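
The captions name three preprocessing steps (smoothing, baseline subtraction and min-max scaling) without specifying the algorithms. The sketch below shows one possible chain in Python, assuming a centred moving-average smoother and a straight-line baseline drawn between the two ends of the spectrum. Both choices, the function names and the sample values are illustrative assumptions, not the method used to produce the figures.

```python
def moving_average(values, window=3):
    """Smooth with a centred moving average (assumed smoother).

    The window is truncated at the edges so the output keeps its length.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def subtract_linear_baseline(values):
    """Subtract the straight line joining the first and last points (assumed baseline)."""
    n = len(values)
    if n < 2:
        return list(values)
    slope = (values[-1] - values[0]) / (n - 1)
    return [v - (values[0] + slope * i) for i, v in enumerate(values)]


def min_max_scale(values):
    """Rescale so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def preprocess(values, window=3):
    """Apply the three steps in the order given in the captions."""
    smoothed = moving_average(values, window)
    flattened = subtract_linear_baseline(smoothed)
    return min_max_scale(flattened)


# Synthetic intensities with a single band on a sloping background (not measured data).
raw = [10.0, 11.0, 12.5, 30.0, 55.0, 31.0, 14.0, 13.0, 14.5]
processed = preprocess(raw)
```

Because min-max scaling comes last, every processed spectrum spans exactly 0 to 1, which is what makes the offset spectra in the figures directly comparable in height.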

Table S1
Straight line fit parameters for linear regression on the plots in Figure 2, Figure S1 and Figure

Figure S5
The ratio of the peak intensity at 1665 cm⁻¹ relative to the intensity of the peak at 1448 cm⁻¹ (a) and the ratio of the peak intensity at ~3005 cm⁻¹ relative to the intensity of the peak at 2850 cm⁻¹ (b) showed poorer linear regression fits when plotted against the number of C=C bonds instead of the ratio of C=C to CH₂ groups (Figure 4(b)) and H-C= to CH₂ groups (Figure 4(c)) respectively. R² values of 0.93 using 532 nm excitation (blue), 0.98 using 633 nm excitation (red) and 0.95 using 785 nm excitation (green) were obtained for the plots in (a) while R² values of 0.91 using 532 nm excitation (blue), 0.88 using 633 nm excitation (red) and 0.84 using 785 nm excitation (green) were obtained for the plots in (b).
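
The ratiometric analysis described in this caption can be sketched as follows: read the intensity at two band positions, take their ratio, and regress that ratio against a structural descriptor by ordinary least squares. The function names, the nearest-sample intensity lookup and the tiny synthetic spectrum are assumptions made for illustration. The four C18 fatty acids are used only because their C=C and CH₂ counts are well known; they are not necessarily the samples behind the figure.

```python
def intensity_at(wavenumbers, intensities, target):
    """Return the intensity at the sampled wavenumber closest to target."""
    nearest = min(range(len(wavenumbers)), key=lambda i: abs(wavenumbers[i] - target))
    return intensities[nearest]


def peak_ratio(wavenumbers, intensities, numerator, denominator):
    """Ratio of two band intensities, e.g. 1665 cm^-1 over 1448 cm^-1."""
    return (intensity_at(wavenumbers, intensities, numerator)
            / intensity_at(wavenumbers, intensities, denominator))


def linear_fit(x, y):
    """Least-squares straight line; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic six-point spectrum (not measured data) with bands near 1448 and 1665 cm^-1.
wavenumbers = [1440, 1448, 1456, 1657, 1665, 1673]
intensities = [0.6, 0.8, 0.6, 0.2, 0.4, 0.2]
ratio = peak_ratio(wavenumbers, intensities, 1665, 1448)

# Two candidate x-axes: (number of C=C bonds, number of CH2 groups) per molecule.
structures = {
    "stearic acid (C18:0)": (0, 16),
    "oleic acid (C18:1)": (1, 14),
    "linoleic acid (C18:2)": (2, 12),
    "alpha-linolenic acid (C18:3)": (3, 10),
}
n_cc = [cc for cc, _ in structures.values()]
cc_per_ch2 = [cc / ch2 for cc, ch2 in structures.values()]

# Sanity check of the fitting routine on an exact line y = 2x + 1.
slope, intercept, r_squared = linear_fit([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0])
```

Measured intensity ratios would be passed to `linear_fit` once with `n_cc` and once with `cc_per_ch2` as the x-values, and the two R² values compared as in the caption.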

Table S2
Straight line fit parameters for linear regression on the plots in Figure 4, Figure S3 and Figure S4, along with R² values for each plot.

Figure S6
Estimated mean squared prediction error plotted against number of components for partial least squares regression analysis performed on a training set (70%) of a series of fatty acid spectra (3 replicates of each) for prediction of number of C=C bonds per fatty acid using low wavenumber spectra (a) and high wavenumber spectra (b).

Figure S7
Predicted number of C=C per fatty acid vs. known number of C=C per fatty acid for a partial least squares regression (PLSR) model using 2 principal components of a series of fatty acid spectra (3 replicates of each) split randomly into 70% training and 30% test data for low wavenumber spectra. Mean squared prediction error was 0.47, and R² was 0.98 for the training dataset and 0.96 for the test dataset.
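
As a rough illustration of the workflow in this caption, the sketch below fits a two-component, single-response PLS model with the NIPALS algorithm and scores it on a random 70/30 split. The software and PLS variant used for the figure are not stated, so the algorithm choice, all function names and the six-channel synthetic "spectra" are assumptions. The numbers it produces are unrelated to the reported results.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_components=2):
    """Fit single-response PLS by NIPALS; returns (x_mean, y_mean, components)."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                              # centred response
    components = []
    for _ in range(n_components):
        w = [dot(col, f) for col in zip(*E)]  # channel-wise covariance with y
        norm = dot(w, w) ** 0.5
        if norm == 0.0:
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]        # sample scores
        tt = dot(t, t)
        if tt == 0.0:
            break
        p = [dot(col, t) / tt for col in zip(*E)]  # spectral loadings
        q = dot(f, t) / tt                         # response loading
        # Deflate: remove what this component explains before the next one.
        E = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [v - q * ti for v, ti in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict_pls1(model, X):
    """Predict by repeating the same projection and deflation on new spectra."""
    x_mean, y_mean, components = model
    predictions = []
    for row in X:
        e = [v - m for v, m in zip(row, x_mean)]
        y_hat = y_mean
        for w, p, q in components:
            t = dot(e, w)
            y_hat += q * t
            e = [v - t * pj for v, pj in zip(e, p)]
        predictions.append(y_hat)
    return predictions


def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def train_test_split(X, y, train_fraction=0.7, seed=0):
    """Random split; the seed is fixed only to make the sketch repeatable."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    cut = int(train_fraction * len(indices))
    train, test = indices[:cut], indices[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


# Synthetic data: two channels scale with the C=C count, four do not.
rng = random.Random(1)
X, y = [], []
for n_cc in range(5):
    for _ in range(3):  # 3 replicates per sample, as in the caption
        base = [1.0, 0.5, 0.2 * n_cc, 0.1 * n_cc, 0.8, 0.3]
        X.append([v + rng.gauss(0.0, 0.01) for v in base])
        y.append(float(n_cc))

X_train, y_train, X_test, y_test = train_test_split(X, y)
model = fit_pls1(X_train, y_train, n_components=2)
test_mse = mean_squared_error(y_test, predict_pls1(model, X_test))
```

Repeating the fit for a range of `n_components` values and recording the prediction error for each gives a curve of the kind shown in Figure S6.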