Forensic efficiency estimate and phylogenetic analysis for Chinese Kyrgyz ethnic group revealed by a panel of 21 short tandem repeats

Short tandem repeats (STRs), with their high level of polymorphism and convenient detection methods, play an indispensable role in human population and forensic genetics. We genotyped 21 autosomal non-combined DNA index system (non-CODIS) STR loci in a Kyrgyz ethnic group, calculated their forensic parameters and analysed the group's genetic relationships with reference populations from China. In total, 168 alleles were observed at the 21 non-CODIS STRs, with allelic frequencies ranging from 0.0016 to 0.4788. No significant deviations from Hardy–Weinberg equilibrium were observed at these STRs. The cumulative power of discrimination and cumulative probability of exclusion for all 21 non-CODIS STRs were 0.99999999999999999998835 and 0.9999994002, respectively. Furthermore, analyses of phylogenetic trees, genetic distances and interpopulation differentiations demonstrated that the Kyrgyz group had relatively close genetic relationships with the Uygur and Kazak groups. These 21 non-CODIS STRs showed high genetic diversity in the Kyrgyz group and could be applied as a robust tool for individual identification and kinship testing in forensic science.


Introduction
Short tandem repeats (STRs), common genetic markers that are widespread in the human genome, have been broadly applied in DNA profiling for routine casework (especially individual identification and paternity testing) for several decades [1][2][3][4]. Although many commercial autosomal STR kits have been developed [5][6][7], most of them contain the 13 overlapping core loci of the combined DNA index system (CODIS) [8]. In forensic practice, resolving disputed kinship cases, such as duo parentage analysis in which the sample from the father or the mother is missing, usually requires additional non-CODIS STR loci to meet the identification criterion. In addition, the mutation rates of STR loci are relatively high, so the interpretation of a parentage test becomes complicated if even one or two mismatches occur between parent and offspring. As a result, more non-CODIS STR loci are needed as a supplement. However, the kits mentioned above are not suitable for combined use as complements to maximize discriminating power [9]. Therefore, it is worthwhile to select additional STR loci outside the 13 CODIS core loci for forensic applications, especially for complicated kinship cases and missing person investigations.
The Kyrgyz group is one of the 56 ethnic groups in China, with a population of 186 708. Most Kyrgyz live in the Kizilsu Kirghiz Autonomous Prefecture of Xinjiang Uygur Autonomous Region, with smaller proportions in other regions of Xinjiang; only a few remain in Fuyu County, Heilongjiang Province (all data were taken from the Sixth National Population Census of the People's Republic of China) (http://www.stats.gov.cn/tjsj/pcsj/rkpc/6rp/indexch.htm). Their language belongs to the Altaic language family, and the written language they use today is based on the Arabic alphabet.
Quite a few studies have addressed the Uygur, Kazak and other ethnic groups in Xinjiang [10][11][12][13][14], but very few are available for the Kyrgyz from China. To enrich the population genetic data and explore the genetic background of the Kyrgyz, a panel of 21 non-CODIS STR loci was used to analyse individuals of the Chinese Kyrgyz group and to compare them with 11 previously published populations.

Sample collection and DNA extraction
Peripheral blood was collected from 307 unrelated healthy individuals whose families had lived in the Kizilsu Kirghiz Autonomous Prefecture of Xinjiang Uygur Autonomous Region for more than three generations. Written informed consent was obtained from every participant. A small portion of each blood sample was spread on fresh filter paper, dried at room temperature and kept as a bloodstain for long-term preservation; the remainder was frozen for storage. This study was carried out according to the humane and ethical research principles approved by the ethical committee of Xi'an Jiaotong University Health Science Center, China (no. XJTULAC201). DNA was extracted from the bloodstains by the Chelex-100 method [15].
Figure 1. Population structure analysis based on the raw data of the Kyrgyz and the 11 reference groups (K = 6). Population labels: Salar, Tujia, Ningxia Han, Guanzhong Han, Tibetan, Bai, Yi, Russian, Mongolian, Kazak, Uygur, Kyrgyz.

Interpopulation differentiations
The locus-by-locus Fst and p-values between the Kyrgyz group and the 11 reference populations were calculated by the analysis of molecular variance (AMOVA) method using ARLEQUIN software v. 3 [...] was detected at the three loci (D10S1248, D12ATA63 and D1S1627), with significant differences observed between the Kyrgyz and the other 10 groups; in contrast, the lowest ethnic diversity was detected at locus D6S474, with no significant differences observed between the Kyrgyz group and the other reference groups.

Genetic distances and population differentiations
For further study, the pairwise DA and Fst values between the Kyrgyz and the other reference groups were calculated; they are presented in table 4 and shown as a clustered bar chart (electronic supplementary material, figure S2). The two largest DA values were observed between the Kyrgyz and the Yi group (0.0356) and the Russian group (0.0252), whereas the two smallest were found between the Kyrgyz and the Uygur group (0.0083) and the Kazak group (0.0097). Correspondingly, the Fst values ranged from 0.0002 (between the Kyrgyz and the Kazak group) to 0.0350 (between the Kyrgyz and the Russian group), which was broadly in line with the DA values. The parameters in table 4 indicate that the studied Kyrgyz group has close genetic relationships with the two Central Asian populations (the Kazak and Uygur groups), in contrast to its relatively distant relationships with the Yi and Russian groups.
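As a hedged illustration of how a pairwise distance of this kind can be computed from allele frequencies, the sketch below assumes that DA denotes Nei's DA distance, DA = 1 - (1/r) Σ_j Σ_i sqrt(x_ij · y_ij), where r is the number of loci and x_ij and y_ij are the frequencies of allele i at locus j in the two populations. The function name `nei_da` and the example frequencies are hypothetical; they are not taken from table 4 or from the software used in the study.

```python
import math


def nei_da(pop_x, pop_y):
    """Nei's DA distance between two populations.

    pop_x, pop_y: lists with one dict per locus (same locus order),
    each mapping an allele label to its frequency in that population.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("both populations need the same, non-empty set of loci")
    total = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        # Only alleles present in both populations contribute to the sum.
        shared = set(freq_x) & set(freq_y)
        total += sum(math.sqrt(freq_x[a] * freq_y[a]) for a in shared)
    return 1.0 - total / len(pop_x)


# Hypothetical frequencies at two loci (illustration only, not study data).
pop_a = [{"12": 0.5, "13": 0.5}, {"9": 0.2, "10": 0.8}]
pop_b = [{"12": 0.4, "13": 0.6}, {"9": 0.3, "10": 0.7}]
print(round(nei_da(pop_a, pop_b), 4))
```

Under this definition, identical frequency profiles give a distance of 0 and populations sharing no alleles give a distance of 1, so small values such as those reported for the Uygur and Kazak groups correspond to very similar allele frequency distributions.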

Phylogenetic analysis of 12 populations
The NJ tree was constructed by MEGA v. 6 [...] case of using different software based on different data formats, the three groups from Central Asia clustered together in the two NJ trees, which revealed that they were genetically closer to one another than to the other populations from East Asia.

Multidimensional scaling based on the pairwise Fst values
As shown in figure 3, multidimensional scaling (MDS) was performed among the 12 populations based on the pairwise Fst values, with the studied Kyrgyz ethnic group marked in red. The result indicated that the 12 populations could be divided into three parts: the Tibetan, Mongolian, Bai, Tujia, Guanzhong Han and Ningxia Han ethnic groups clustered in the upper quadrant; the Salar, Russian and Yi ethnic groups in the lower right quadrant; and the Kazak, Uygur and studied Kyrgyz ethnic groups in the lower left quadrant. Compared with the nine East Asian ethnic groups, the Kyrgyz ethnic group lay closer to the Kazak and Uygur ethnic groups, which indicated that the Kyrgyz group probably had close genetic relationships with these two ethnic groups from Central Asia. According to historical records, from the Western Han Dynasty to the middle of the Qing Dynasty, the Kyrgyz group, mainly stemming from the Yenisei River to the Tianshan Mountains and Central Asia, experienced five westward migrations that were largely driven by warfare [30]. The Kyrgyz group in this study, residing in the southwestern part of the Xinjiang Uygur Autonomous Region, China, has broadly assimilated the culture of the western regions after living for a long time alongside the Uygurs, Kazaks and other ethnic groups in Xinjiang.
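The MDS step described in this section can be sketched as follows. This is a minimal illustration of classical (Torgerson) MDS applied to a symmetric distance matrix, assuming NumPy is available; the source does not state which MDS variant or software was used, so the procedure may differ from the one behind figure 3. The function name `classical_mds` and the small example matrix are hypothetical, and pairwise Fst values are treated here simply as dissimilarities.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points from a symmetric distance matrix (Torgerson's classical MDS)."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to obtain a Gram-like matrix.
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ (d ** 2) @ centring
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Negative eigenvalues (non-Euclidean input) are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Hypothetical pairwise dissimilarities among three populations (not study data).
example = [[0.000, 0.010, 0.030],
           [0.010, 0.000, 0.025],
           [0.030, 0.025, 0.000]]
coords = classical_mds(example)
print(coords.shape)
```

Each row of `coords` gives the two plotting coordinates of one population. The orientation of the axes is arbitrary, so only the relative positions of the populations are meaningful.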

Conclusion
In short, the 21 non-CODIS STRs were genotyped in 307 individuals from the Kyrgyz ethnic group to evaluate the forensic efficiency of these loci and to explore the genetic background of the group. The results indicated that these non-CODIS loci are well suited to individual identification and kinship testing owing to their high level of genetic polymorphism. The population genetic analyses also demonstrated that, to some extent, the Kyrgyz ethnic group is genetically closer to the Kazak and Uygur groups than to the other reference groups.
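The forensic efficiency of a panel is usually summarised by cumulative values. Assuming independent loci, these follow the standard product rule: cumulative PD = 1 - Π(1 - PD_i), and likewise for the probability of exclusion. The sketch below is a minimal illustration of that rule; the function name `cumulative_power` and the per-locus inputs are hypothetical and are not the study's data. Python's `decimal` module is used because a value so close to 1 (about 20 leading nines for the cumulative power of discrimination reported for this panel) cannot be represented in a 64-bit float.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # enough digits for values extremely close to 1


def cumulative_power(per_locus_values):
    """Combine per-locus PD (or PE) values as 1 - prod(1 - value_i)."""
    remaining = Decimal(1)
    for value in per_locus_values:
        # str() avoids importing binary float rounding error into Decimal.
        remaining *= Decimal(1) - Decimal(str(value))
    return Decimal(1) - remaining


# Hypothetical per-locus power of discrimination values (not study data).
pd_values = ["0.90", "0.85", "0.92"]
print(cumulative_power(pd_values))
```

Passing the per-locus values as strings keeps every digit exact, which matters once the product of the (1 - PD_i) terms becomes very small.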