Application of fingerprint combined with quantitative analysis and multivariate chemometric methods in quality evaluation of dandelion (Taraxacum mongolicum)

A quality assessment method based on quantitative analysis of multi-components by single marker (QAMS) and fingerprint was constructed from 15 batches of dandelion (Taraxacum mongolicum), using multivariate chemometric methods (MCM). MCM comprised hierarchical cluster analysis (HCA) and factor analysis (FA). HCA was performed using both the R language and SPSS 22.0 software. The relative correction factors of chlorogenic acid, caffeic acid, p-coumaric acid, luteolin and apigenin were calculated with cichoric acid as the reference, and their contents were determined. The results of the external standard method (ESM) and QAMS were compared. There was no significant difference (t-test, p > 0.05) in quantitative determination, indicating that the two methods (QAMS and ESM) are consistent. The chromatogram of dandelion material from Yuncheng, Shandong was used as the reference chromatogram. The fingerprints of the 15 batches of dandelion were established by HPLC analysis. The similarity of the fingerprints of the different batches of dandelion material was greater than or equal to 0.82. A total of 10 common peaks were identified. This strategy is simple, rapid and efficient for multiple-component detection in dandelion. It helps simplify the quality control of dandelion and provides a reference for improving the quality control of other herbal medicines.

CL, 0000-0003-1583-6436; CZ, 0000-0003-2827-1999

Introduction
Dandelion (Taraxacum mongolicum Hand.-Mazz.) is a perennial plant in the Compositae family. Its flowering period is from April to October [1]. Dandelion is widely distributed in many countries. There are more than 2000 varieties of dandelion, of which about 70 are distributed across the provinces of China [2]. The edible, medicinal and nutritional value of dandelion has been highly praised and affirmed in the Compendium of Materia Medica and other ancient medical texts [3]. The edible portion of dandelion reaches 84%; the leaves, consumed as a vegetable, contain vitamin C, vitamin D, carotene and substantial amounts of iron, calcium and other nutrients [4]. Dandelion has been reported to slow oxidative damage [5], suppress or reduce inflammation [6], fight cancer [7], counteract high blood sugar [8], prevent or impair coagulation [9], soothe soreness [10] and reduce the pathological reactions caused by strong stimulation of the body [11].
Dandelion is rich in phenolic compounds and flavonoids, which are known to promote health [12]. At present, HPLC and HPLC-MS are used for qualitative and quantitative analysis of the main bioactive components of dandelion [13,14]. These external standard methods (ESM) suffer from reliance on relative retention time, weak ultraviolet absorption, complex background interference and other shortcomings, which limit their application [15][16][17]. Above all, ESM is unable to determine multiple components in the target sample concurrently, resulting in a complicated process and low efficiency [18]. The quantitative analysis of multi-components by single marker (QAMS) requires only one reference compound selected from the sample. Establishing its relationship with the other components in the sample makes simultaneous determination of the contents of multiple components feasible [19]. This can reduce the time and cost of quality control of herbal products and further improve the practicality of HPLC [20,21]. QAMS has therefore been applied extensively to regulate the quality of traditional Chinese medicine (TCM) [22], but it has not been reported for the quality control of dandelion.
Recently, researchers have used chromatographic fingerprints to analyse the quality of TCM, an approach accepted by many drug administrations (FDA, SFDA, EMA) [23]. The chromatographic fingerprint method has been used to identify substitutes and adulterants according to a limited number of characteristic peaks of genuine materials [24], but the characteristic fingerprint cannot express the content of the active natural ingredients of dandelion. The fingerprint alone therefore gives an incomplete picture of dandelion, and its multiple components also need to be determined. In this study, the characteristic fingerprint was combined with QAMS, and multivariate chemometric methods (MCM) were used to compare the similarity of dandelion fingerprints. MCM comprised hierarchical cluster analysis (HCA) and factor analysis (FA) [25], and HCA was performed using both the R language and SPSS 22.0 software.

Chemicals and materials
A total of 15 dandelion samples (S1-S15) were collected from different Chinese provinces. Table 1 lists the detailed locality information. Six reference standards (chlorogenic acid, caffeic acid, p-coumaric acid, cichoric acid, luteolin and apigenin) with purity greater than 98% were from Chengdu MUST Biotech Co., Ltd [26]. HPLC-grade formic acid, acetonitrile and methanol were acquired from DIKMA Technologies (Beijing, China). Other chemicals used in the experiment were from Tianjin Tianli Reagents Co., Ltd (Tianjin, China).

Instruments and chromatographic conditions
The analytical instrument was an Agilent 1260 series HPLC system. Analytes were separated on an Eco-silC18 column (5 µm, 250 × 4.6 mm). The flow rate was 0.8 ml min−1, the column temperature was set at 35°C and the injection volume was 10 µl. The measurement wavelength was set at 254 nm. Mobile phase A was 0.2% phosphoric acid aqueous solution and B was acetonitrile. The elution gradient was 0-5 min, 20-27% B; 5-12 min, 27

Preparation of sample solutions
One gram of dandelion powder was accurately weighed, soaked in 30 ml of 70% methanol-water solution in a conical flask and ultrasonicated (25°C, 250 W, 60 kHz) for 30 min. After the extract was thoroughly mixed by shaking, it was centrifuged at 10 000 r.p.m. The collected supernatant was filtered through a 0.45 µm filter membrane, and the resulting sample solution was analysed directly by HPLC.

Preparation of standard solution
The chlorogenic acid, caffeic acid, p-coumaric acid, cichoric acid and luteolin reference standards were each weighed and dissolved in methanol to give stock solutions of 1.0 mg ml−1. The apigenin reference standard was weighed and dissolved in methanol to give a 0.5 mg ml−1 stock solution. The mixed standard solution was prepared by combining 0.2 ml of each individual stock solution. In the mixed solution, the concentration of apigenin was 0.083 mg ml−1 and that of each of the other reference standards was 0.167 mg ml−1.
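The stated concentrations of the mixed standard follow from simple dilution arithmetic. The sketch below assumes that the six 0.2 ml aliquots are pooled without further dilution (total volume 1.2 ml); the function name is illustrative and not taken from the paper.

```python
def mixed_concentration(stock_mg_per_ml, aliquot_ml, n_stocks):
    """Concentration of one analyte after equal aliquots of n_stocks stocks are pooled.

    Assumption: no extra solvent is added, so the final volume is aliquot_ml * n_stocks.
    """
    total_volume_ml = aliquot_ml * n_stocks
    return stock_mg_per_ml * aliquot_ml / total_volume_ml


# Stock concentrations (mg/ml) as stated in the text.
stocks = {
    "chlorogenic acid": 1.0,
    "caffeic acid": 1.0,
    "p-coumaric acid": 1.0,
    "cichoric acid": 1.0,
    "luteolin": 1.0,
    "apigenin": 0.5,
}

mixed = {
    name: mixed_concentration(conc, aliquot_ml=0.2, n_stocks=len(stocks))
    for name, conc in stocks.items()
}

for name, conc in mixed.items():
    print(f"{name}: {conc:.3f} mg/ml")
```

Under this assumption the five 1.0 mg ml−1 stocks come out at 0.167 mg ml−1 and apigenin at 0.083 mg ml−1, matching the values reported above.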

Computation of relative conversion factors
A sample contains a variety of components. A component that is stable, easy to obtain and readily separated from the others is selected as the single marker, so that the other components can be determined accurately from it. Cichoric acid is also abundant in dandelion [27] and is thus suitable as a quality indicator of dandelion. Using cichoric acid as the single marker [28], the relative conversion factor between the marker and each other analyte, ƒ_si, was calculated using formula (2.1) [29]. The concentration of each other analyte (C_i) in the sample was then calculated according to formula (2.2) [30], where A_s is the peak area of cichoric acid, A_i is the peak area of the other analyte, C_s is the concentration of cichoric acid and C_i is the concentration of the other analyte (mg ml−1).
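Formulas (2.1) and (2.2) are not reproduced in this text. A minimal sketch, assuming the form commonly used in QAMS work, ƒ_si = (A_s/C_s)/(A_i/C_i) and C_i = ƒ_si × C_s × A_i/A_s, is given below. The peak areas and concentrations are hypothetical and serve only to show the arithmetic.

```python
def relative_correction_factor(a_s, c_s, a_i, c_i):
    """Assumed form of formula (2.1): f_si = (A_s / C_s) / (A_i / C_i).

    Computed from a standard solution in which both concentrations are known.
    """
    return (a_s / c_s) / (a_i / c_i)


def qams_concentration(f_si, c_s, a_s, a_i):
    """Assumed form of formula (2.2): C_i = f_si * C_s * A_i / A_s.

    Applied to a sample in which only the marker concentration C_s is measured directly.
    """
    return f_si * c_s * a_i / a_s


# Hypothetical standard run: marker (cichoric acid) and one other analyte.
f_si = relative_correction_factor(a_s=1200.0, c_s=0.167, a_i=800.0, c_i=0.167)

# Hypothetical sample run: marker concentration known from its calibration curve.
c_i_sample = qams_concentration(f_si, c_s=0.30, a_s=2100.0, a_i=350.0)

print(f"f_si = {f_si:.3f}")
print(f"C_i in sample = {c_i_sample:.4f} mg/ml")
```

A useful sanity check on this pair of definitions is that applying the second function to the standard run itself returns the known C_i of the standard.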

Statistical analysis
The data were analysed with the Similarity Evaluation System for Chromatographic Fingerprint of TCM (2012, China), which is recommended by the SFDA [31]. The similarity between samples was quantified by calculating the correlation coefficients between their chromatograms. HCA was conducted in the R language according to the degree of similarity of each component among different samples. IBM SPSS Statistics 22.0 software (IBM, New York, USA) was used to perform HCA based on the squared Euclidean distance computed from the content of each component in the sample. HCA in both R and SPSS was used to differentiate the herbal materials. To verify the feasibility of QAMS, the other five active components in the dandelion samples were determined using cichoric acid as the internal reference.
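The idea of a correlation-based similarity between a sample chromatogram and the reference can be sketched as follows. The exact algorithm of the similarity evaluation software is not described in the text, so this example assumes a plain Pearson correlation over the areas of the common peaks. All peak areas are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical areas of the 10 common peaks (reference vs. one sample batch).
reference = [120.0, 850.0, 95.0, 430.0, 60.0, 75.0, 240.0, 9100.0, 55.0, 30.0]
sample = [110.0, 790.0, 102.0, 455.0, 48.0, 80.0, 215.0, 8800.0, 61.0, 27.0]

similarity = pearson(sample, reference)
print(f"similarity to reference: {similarity:.3f}")
```

Because the correlation coefficient is insensitive to uniform scaling, a batch whose peaks are all proportionally larger or smaller than the reference still scores close to 1. This is one reason a fingerprint similarity needs to be complemented by quantitative determination.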

Screening of chromatographic conditions
Extraction methods and HPLC parameters were tested to obtain the optimal chromatographic fingerprint. Extraction efficiency was optimized over three levels each of column temperature (30°C, 35°C, 40°C), solid-liquid ratio (1 : 25, 1 : 30, 1 : 35 g ml−1), solvent concentration (60%, 70%, 80%) and extraction time (15, 30, 45 min). Soaking one gram of dandelion powder in 70% methanol-water with ultrasonication for 30 min proved simpler and more effective for the extraction of dandelion (table 2). The final gradient solvent system consisted of 0.2% phosphoric acid in water (eluent A) and acetonitrile (eluent B), run at a column temperature of 35°C and a flow rate of 0.8 ml min−1, with the detection wavelength set at 254 nm. These conditions gave the best performance (reconstruction and separation) in the chromatographic fingerprint.

Linearity
Six standard solutions (chlorogenic acid, caffeic acid, p-coumaric acid, cichoric acid, luteolin and apigenin) were each diluted with methanol to six different concentrations. According to the relationship between the peak area (Y) and the concentration of each analyte (X), the partial least square method

The evaluation of quantitative analysis of multi-components by single marker and external standard method
To assess and validate the feasibility of QAMS for the determination of multiple compounds in dandelion, the contents of chlorogenic acid, caffeic acid, p-coumaric acid, cichoric acid, luteolin and apigenin in 15 batches of dandelion (S1-S15) were determined by ESM and QAMS, respectively. The relative conversion factors (RCF, ƒ_si) between the selected reference and the other analytes in QAMS can be affected by changes in experimental conditions, such as flow rate, column temperature and standard concentration, and ƒ_si therefore affects the final analytical result. The RCF (ƒ_si) calculated from the linear regression equations was relatively stable (table 5). Errors arising from instruments, reagents, experimental methods or environmental conditions during an experiment can cause the two methods to deviate; the relative error (RE) between QAMS and ESM was calculated using formula (3.1) to examine these deviations. The contents of the six compounds in dandelion determined by the two methods are shown in table 6. The RE and RSD values were within 5%, and there was no significant difference (t-test, p > 0.05) in quantitative determination, indicating the consistency of QAMS and ESM. Among these six components, the average contents were 0.7456, 0.4048, 0.2242, 9.1278, 0.0566 and
Figure 2. HPLC characteristic fingerprints of 15 dandelion samples. 2: chlorogenic acid, 4: caffeic acid, 7: p-coumaric acid, 8: cichoric acid, 9: luteolin and 10: apigenin.
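Formula (3.1) is not reproduced in this text. A minimal sketch of the comparison between the two methods, assuming the usual signed definition RE (%) = (C_QAMS − C_ESM)/C_ESM × 100, is shown below. The content values are hypothetical and are not taken from table 6.

```python
def relative_error_percent(c_qams, c_esm):
    """Assumed form of formula (3.1): signed relative error of QAMS against ESM, in percent."""
    if c_esm == 0:
        raise ValueError("ESM content must be non-zero")
    return (c_qams - c_esm) / c_esm * 100.0


# Hypothetical contents of one analyte in three batches (same units for both methods).
esm_contents = [0.742, 0.401, 0.226]
qams_contents = [0.751, 0.395, 0.229]

errors = [
    relative_error_percent(q, e) for q, e in zip(qams_contents, esm_contents)
]

for batch, err in enumerate(errors, start=1):
    print(f"batch {batch}: RE = {err:+.2f}%")

# Acceptance check mirroring the 5% range mentioned in the text.
all_within_limit = all(abs(err) <= 5.0 for err in errors)
print("all within 5%:", all_within_limit)
```

Some reports use the absolute value of the difference instead of the signed form. The acceptance check above is unaffected by that choice.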

Quality evaluation of dandelion by fingerprint
A 10 μl aliquot of the treated solution from each of the 15 batches of dandelion was taken for HPLC determination, and a characteristic reference chromatogram with 10 common peaks was generated using the Similarity Evaluation System for Chromatographic Fingerprint of TCM (2012) (figure 1a). Six common peaks (chlorogenic acid, caffeic acid, p-coumaric acid, cichoric acid, luteolin and apigenin) were identified by comparing their retention times with those of the mixed standard reference. The chromatographic fingerprint of the mixed standard reference is shown in figure 1b. To obtain a representative fingerprint, a sample of good quality (S15) was selected as the reference chromatogram. HPLC characteristic fingerprints of the 15 dandelion samples are shown in figure 2. The similarity of the 15 batches of dandelion samples was evaluated (table 1). The calculated similarity values were all greater than or equal to 0.82, indicating a high degree of agreement among samples from different regions.

Hierarchical cluster analysis and factor analysis result
To highlight the differences among dandelion from different areas, the 15 batches of dandelion collected from different areas were classified by HCA according to their similarities. Both the R language and SPSS software were used for HCA. The results are shown in figures 3 and 4.
The R language heat map used the degree of similarity of the contents of the six active components in dandelion for HCA. The 15 batches of samples were mainly divided into two categories according to the differences in luteolin and apigenin. S14, S15, S5, S10 and S11 formed the first category, and the remaining batches formed the second. According to the differences in cichoric acid content, the first category could be further divided into two groups: S14, S15 and S5 as one group, and S10 and S11 as another. The R language heat map thus resolved the content differences of different components in different batches of dandelion. The contents of the six active components in the 15 batches of dandelion were taken as variables, and HCA was performed using SPSS 22.0 software with the between-groups (average) linkage method and squared Euclidean distance. At a squared Euclidean distance of 5, the samples were divided into three groups: S2, S5, S9, S10 and S11; S3, S7, S8 and S12; and S1, S4, S6, S13, S14 and S15. This result corresponds to the FA ranking, with batches of similar scores classified into one group. Both HCA results show that dandelion from the same province is not always placed in the same category, which may be related to planting methods, harvesting methods, harvesting time and preliminary processing methods.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210614
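The clustering settings described above (component contents as variables, squared Euclidean distance, average linkage) can be sketched with SciPy, used here as a stand-in for R and SPSS. The content matrix is hypothetical. The z-score standardization and the cut into three clusters by count (`maxclust`) are assumptions, since SPSS reports a rescaled distance that is not directly comparable to the raw linkage height.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical contents: rows are batches, columns are the six components
# (chlorogenic, caffeic, p-coumaric, cichoric acid, luteolin, apigenin).
batches = ["B1", "B2", "B3", "B4", "B5", "B6"]
contents = np.array([
    [0.70, 0.40, 0.20, 9.0, 0.05, 0.03],
    [0.72, 0.41, 0.21, 9.1, 0.05, 0.03],
    [1.20, 0.60, 0.35, 12.0, 0.09, 0.06],
    [1.22, 0.62, 0.34, 12.2, 0.09, 0.06],
    [0.30, 0.20, 0.10, 5.0, 0.02, 0.01],
    [0.31, 0.21, 0.11, 5.1, 0.02, 0.01],
])

# Assumption: standardize each component so that the abundant cichoric acid
# does not dominate the distance.
z_scores = (contents - contents.mean(axis=0)) / contents.std(axis=0, ddof=1)

# Squared Euclidean distances between batches, then average (between-groups) linkage.
distances = pdist(z_scores, metric="sqeuclidean")
tree = linkage(distances, method="average")

# Cut the dendrogram into three clusters.
labels = fcluster(tree, t=3, criterion="maxclust")

for name, label in zip(batches, labels):
    print(f"{name}: cluster {label}")
```

Cutting by cluster count avoids guessing how the distance of 5 on the SPSS dendrogram scale maps onto SciPy's linkage heights.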
FA simplifies the indices through dimensionality reduction while retaining as much of the original data information as possible. In this experiment, the 10 common peak areas of the 15 batches of samples were assessed by SPSS. The results are shown in table 7. The Kaiser-Meyer-Olkin (KMO) test and Bartlett's test of sphericity gave a KMO statistic of 0.542, a Bartlett sphericity statistic of 45.487 and a p-value of 0.000 (i.e. p < 0.001). This shows that the data are correlated and can be used for FA. FA was carried out after data conversion. The six factors were simplified into three main factors, and the loading matrix of the rotated factors was obtained by orthogonal rotation with maximum variance (varimax). As can be seen from table 7, the eigenvalues of the first three principal components are greater than 0.8, and the cumulative variance contribution rate is 89.283%. Therefore, the multiple components of dandelion can be simplified into three principal components for analysis. The first factor played the major role, with a contribution rate of 38.006%, and was mainly determined by luteolin and apigenin. The contribution rate of the second factor was 27.720%, mainly determined by cichoric acid and chlorogenic acid. The contribution rate of the third factor was 23.556%, mainly determined by p-coumaric acid and caffeic acid. According to the scoring coefficients of each factor after rotation, the scores of the first three main factors were calculated as F1, F2 and F3 (table 8). The comprehensive scoring model of dandelion quality, F = (38.006F1 + 27.720F2 + 23.556F3)/89.283, was established with the contribution rate of each main factor as the weight (table 1). The dandelion (S14) from Nanjing city, Jiangsu province, had the best quality, with the highest overall score. The overall score of dandelion from East China was higher, possibly owing to the superior natural environmental conditions in this region: the terrain is mainly plain, with a monsoon climate and abundant water resources. Differences in growing environment, such as sunlight, soil and climatic conditions, have a great influence on the quality of dandelion.
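The comprehensive scoring model F = (38.006F1 + 27.720F2 + 23.556F3)/89.283 is a weighted average of the three factor scores. The sketch below applies it to hypothetical factor scores (the real ones are in table 8) and ranks the batches. The batch labels are illustrative.

```python
# Contribution rates (%) of the three rotated factors and their reported cumulative rate.
WEIGHTS = (38.006, 27.720, 23.556)
CUMULATIVE = 89.283


def composite_score(f1, f2, f3):
    """Comprehensive score F as the contribution-weighted mean of the factor scores."""
    weighted_sum = WEIGHTS[0] * f1 + WEIGHTS[1] * f2 + WEIGHTS[2] * f3
    return weighted_sum / CUMULATIVE


# Hypothetical factor scores (F1, F2, F3) for three batches.
factor_scores = {
    "batch A": (1.20, 0.35, -0.40),
    "batch B": (-0.50, 0.90, 0.10),
    "batch C": (-0.70, -1.25, 0.30),
}

scores = {name: composite_score(*f) for name, f in factor_scores.items()}

# Rank batches from highest to lowest comprehensive score.
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: F = {scores[name]:.3f}")
```

The three weights sum to 89.282 and not 89.283, presumably because of rounding in the reported rates, so equal factor scores of 1 give a composite score very slightly below 1.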

Conclusion
To improve the quality assurance of dandelion based on the HPLC method and to overcome the shortcomings of existing multi-component determination methods, a characteristic fingerprint method combined with QAMS was established. The similarity values of the 15 batches of dandelion were all greater than or equal to 0.82, which indicates that although dandelion is widely distributed, samples from different regions still show a high degree of agreement. The method is suitable for the determination of six active compounds in dandelion samples. Correlation coefficients greater than 0.998 and RSD% less than 0.05 were obtained for the dandelion contents determined by the single marker method and the traditional ESM. The HPLC-QAMS method gives results as good as those of ESM. The combination of fingerprint and QAMS via MCM (HCA, FA) is a comprehensive and efficient method for the quality analysis and evaluation of dandelion.
Ethics. The study was approved by the College of Chemistry, Chemical Engineering and Resource Utilization, Northeast Forestry University.