Population genetic structure analysis and forensic evaluation of Xinjiang Uigur ethnic group on genomic deletion and insertion polymorphisms

Abstract

Background

The Uigur ethnic minority is the largest ethnic group in the Xinjiang Uygur Autonomous Region of China, and a valuable resource for the study of ethnogeny. The objective of this study was to estimate the genetic diversity and forensic parameters of 30 insertion-deletion loci in the Uigur ethnic group from the Xinjiang Uigur Autonomous Region of China and to analyze the genetic relationships between the Xinjiang Uigur group and other previously published groups based on population data for these loci.

Results

All tested loci conformed to Hardy–Weinberg equilibrium after Bonferroni correction. The observed and expected heterozygosity ranged from 0.3750 to 0.5515 and from 0.4057 to 0.5037, respectively. The combined power of discrimination and probability of exclusion in the group were 0.99999999999940 and 0.9963, respectively. We analyzed the \(D_A\) distances, interpopulation differentiation and population structure, conducted principal component analysis and constructed a neighbor-joining tree based on our studied group and 21 reference groups. The results indicated that the studied Xinjiang Uigur group (representing our samples from the whole territory of the Xinjiang Uigur Autonomous Region) had close relationships with the Urumchi Uigur (representing previously reported samples from Urumchi, Xinjiang) and Kazak groups.

Conclusions

The present study may provide novel biological information for the study of population genetics, and can also increase our understanding of the genetic relationships between Xinjiang Uigur group and other groups.

Background

Short tandem repeats (STRs) are commonly used genetic markers in forensic science, and single nucleotide polymorphisms (SNPs) are considered alternative and supplementary markers to STRs (Gill 2001; Kidd et al. 2005; Tan et al. 2015; Ye et al. 2014). SNPs can be captured in smaller amplicons than STRs and produce no stutter in the profile. Insertion-deletion polymorphisms (InDels), as biallelic markers, are considered potentially valuable in forensic applications because they share a number of advantageous properties with the similarly binary SNPs, for example smaller amplicons, lower mutation rates than STRs and wide distribution in the human genome (Phillips et al. 2007; Fondevila et al. 2012; Shi et al. 2015; Romanini et al. 2012). At present, InDels have been applied in forensic genetics for individual identification (Pereira et al. 2009), inference of biogeographic ancestry (Yang et al. 2005) and population genetic studies (Zaumsegel et al. 2013).

The Investigator DIPplex® kit (Qiagen, Hilden, Germany) allows the simultaneous amplification of Amelogenin and 30 autosomal InDels (the chromosomal localization of the 30 InDel loci is shown in Table 1). The allele length variations of the InDels range from 4 to 22 bp and all amplicons are shorter than 160 bp, which makes them more suitable for highly degraded DNA samples in forensic casework. To date, genetic data for several populations have been published based on this kit, e.g. Japanese, Polish and Korean groups (Nunotani et al. 2015; Pepinski et al. 2013; Kim et al. 2014).

The Xinjiang Uigur Autonomous Region is located on the northwest border of China, with a land area of 1.6649 million square kilometers, accounting for one-sixth of China’s total area (Fig. 1). It lies in the heart of the ancient Silk Road, which has historically experienced the migration of many groups of Eastern and Western Eurasians. The Uigur, as the main nationality of the Xinjiang Uigur Autonomous Region, had a population of 10.06 million in 2010 (http://www.stats.gov.cn/tjsj/pcsj/rkpc/6rp/indexch.htm). The Uigurs mainly live in Kashi, which is located south of the Tianshan Mountains, and others are scattered in the Ili and Urumchi areas. Uigurs have their own spoken and written language, which belongs to the Turkic branch of the Altaic language family. The Uigurs’ religion is Islam, which has a great influence on their culture and customs (Shan and Deng 2012). In the present study, we obtained population genetic data and calculated the forensic parameters of 30 InDels in the studied Xinjiang Uigur group. We also collected population data from previously reported groups to analyze their genetic relationships, including Uigurs living in different areas, other groups in China, and Asian, European and Amerindian groups.

Fig. 1

A map showing the geographic location of the Xinjiang Uigur Autonomous Region, China

Methods

Sample collection and DNA extraction

A total of 136 bloodstain samples were collected from the Xinjiang Uigur Autonomous Region. All volunteers came from families that had resided in the Xinjiang Uigur Autonomous Region for more than three generations, and all signed informed consent before being involved in the study. This study was approved by the Institutional Ethics Committee, Xinjiang Medical University, China. Genomic DNA was extracted from the bloodstain samples using the Chelex-100 method according to Walsh et al. (1991).

Amplification and genotyping

Amplification of the 30 InDel loci was performed using the Investigator DIPplex® kit on a GeneAmp PCR System 9700 Thermal Cycler (Applied Biosystems, Foster City, CA, USA) according to the Investigator DIPplex handbook. Amplification products were separated by capillary electrophoresis on an ABI3500 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) according to the manufacturer’s instructions. Control DNA 9948 (Promega, Madison, WI, USA) was analyzed as a positive control. Genotyping results were obtained using GeneMapper v3.2 software (Applied Biosystems, Foster City, CA, USA) by comparison with the allelic ladder.

Reference groups

InDel data from 21 previously published groups were collected for population genetic analysis: 9 Chinese groups, namely Beijing Han, Tibetan, Kazak, Urumchi Uigur (representing previously reported samples from Urumchi, Xinjiang) (Wei et al. 2014), Guangdong Han (Hong et al. 2013), Shanghai Han, She (Wang et al. 2014), Yi (Zhang et al. 2015) and Xibe (Meng et al. 2015); 6 Mexican groups, namely Chihuahua Mexican, Mexico Mexica, Jalisco Mexican, Veracruz Mexican, Yucatan Mexican and Mexican Amerindian (Martínez-Cortés et al. 2015); South Korean (Korea) (Seong et al. 2014); Dane (Denmark) (Friis et al. 2012); two Spanish groups, Basque and Central Spanish (Martín et al. 2013); Uruguayan (Uruguay) (Saiz et al. 2014); and Hungarian (Hungary) (Kis et al. 2012).

Quality control

We strictly followed the International Society for Forensic Genetics (ISFG) recommendations on the analysis of DNA polymorphisms (Schneider 2007).

Statistical analysis

Allele frequencies and forensic parameters including observed heterozygosity (Ho), Hardy–Weinberg equilibrium (HWE), match probability (MP), polymorphic information content (PIC), power of exclusion (PE), discrimination power (PD) and typical paternity index (TPI) were estimated with the modified Powerstat v1.2 spreadsheet (Promega, Madison, WI, USA). Expected heterozygosity (He) was calculated according to the formula \(He = \frac{n}{n - 1}\left(1 - \sum_{i = 1}^{k} p_{i}^{2}\right)\) (Nei 1978), where \(p_i\) is the frequency of allele i, k is the number of alleles and n is the number of samples. The pairwise Fst and p values were calculated with Arlequin v3.5 (Excoffier and Lischer 2010). Principal component analysis (PCA) based on allele frequencies was performed in MATLAB 2007a (MathWorks Inc., USA). Linkage disequilibrium (LD) analysis was performed using SNP Analyzer v2.0 (Istech, South Korea) (Yoo et al. 2008). The \(D_A\) distances were obtained using the DISPAN program (Ota 1993), and a neighbor-joining (NJ) tree was constructed from them. Population structure analysis was conducted with the STRUCTURE program (version 2.2) using the admixture model with the following parameters: burn-in period, 100,000; run length, 100,000 steps in the Markov chain; K values, 2–7; and 15 iterations (Pritchard et al. 2003; Jakobsson and Rosenberg 2007).
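The expected-heterozygosity formula above can be sketched in a few lines of Python. This is only an illustration, not the authors' spreadsheet workflow; the function and argument names are assumptions, and n is taken as the number of samples, as stated in the text.

```python
def expected_heterozygosity(allele_freqs, n_samples):
    """He = n / (n - 1) * (1 - sum of squared allele frequencies).

    allele_freqs: frequencies of the k alleles at one locus (should sum to 1).
    n_samples:    number of samples n, as defined in the text (must be > 1).
    """
    if n_samples <= 1:
        raise ValueError("n_samples must be greater than 1")
    homozygosity = sum(p * p for p in allele_freqs)
    return n_samples / (n_samples - 1) * (1.0 - homozygosity)


if __name__ == "__main__":
    # Biallelic InDel with equal allele frequencies in 136 samples.
    print(round(expected_heterozygosity([0.5, 0.5], 136), 4))
```

For a biallelic locus with both allele frequencies at 0.5 and n = 136, this gives about 0.5037, which matches the upper end of the He range reported in the Results.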

Results and discussion

Forensic parameter analysis

All studied loci were found to be in accordance with HWE in the Xinjiang Uigur group after Bonferroni correction, with the significance level adjusted to 0.0017 (p = 0.05/30). The allele frequencies and forensic parameters of the 30 InDel loci in the Xinjiang Uigur group are shown in Table 1, and the raw genotyping data are shown in Additional file 1: Table S1. The Ho and He ranged from 0.3750 (HLD56 and HLD84) to 0.5515 (HLD83, HLD92 and HLD131) and from 0.4057 (HLD64) to 0.5037 (HLD101), respectively. The PIC, TPI, PD and PE values ranged from 0.3216 to 0.3750, 0.8000 to 1.1148, 0.5563 to 0.6513 and 0.0994 to 0.2366, respectively. The highest and lowest MP were 0.4437 (HLD64) and 0.3487 (HLD125), respectively. The combined power of discrimination (CPD) and combined probability of exclusion (CPE) in the group were 0.99999999999940 and 0.9963, respectively. The high CPD value demonstrates that the panel of 30 InDel loci has potential in forensic individual identification.

Table 1 Allele frequencies and forensic parameters for 30 InDels in Uigur group from Xinjiang Uigur Autonomous Region (n = 136)
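The relationship between observed heterozygosity and the per-locus TPI and PE values, and the way per-locus values are combined across independent loci, can be sketched as follows. The formulas are the heterozygosity-based definitions commonly associated with the Powerstat spreadsheet; that this is exactly what the modified spreadsheet computes is an assumption, and all function names are illustrative.

```python
def typical_paternity_index(ho):
    """TPI = 1 / (2 * homozygosity), with homozygosity = 1 - Ho."""
    return 1.0 / (2.0 * (1.0 - ho))


def power_of_exclusion(ho):
    """PE = h^2 * (1 - 2 * h * H^2), with h = Ho and H = 1 - Ho."""
    h = ho
    homo = 1.0 - ho
    return h * h * (1.0 - 2.0 * h * homo * homo)


def combine(per_locus_values):
    """Combine independent per-locus PD or PE values as 1 - prod(1 - v)."""
    remaining = 1.0
    for value in per_locus_values:
        remaining *= 1.0 - value
    return 1.0 - remaining


if __name__ == "__main__":
    # Lowest and highest Ho in the range above; 75/136 rounds to 0.5515.
    for ho in (0.3750, 75 / 136):
        print(round(typical_paternity_index(ho), 4),
              round(power_of_exclusion(ho), 4))
```

With Ho = 0.3750 these functions give TPI = 0.8000 and PE of about 0.0994, and with 75 heterozygotes among 136 samples (Ho of about 0.5515) they give TPI of about 1.1148 and PE of about 0.2366, in agreement with the range endpoints reported above. The per-locus PD is 1 − MP, so `combine` applied to the 30 PD values corresponds to the CPD.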

Linkage disequilibrium analysis

Linkage disequilibrium was tested for all pairwise combinations of loci. The linkage disequilibrium pattern revealed by the \(r^2\) values between loci is shown in Additional file 2: Table S2. No linkage disequilibrium was observed among the loci, with all \(r^2\) values less than 0.1, which indicated that these genetic markers were relatively independent for the subsequent comparison among the 22 groups.
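As a rough illustration of a pairwise \(r^2\) check, the sketch below computes the squared Pearson correlation between genotype codes (0, 1 or 2 copies of one allele) at two loci. This genotype-based correlation is a simplified proxy and is an assumption here; SNP Analyzer may instead estimate haplotype frequencies, and the function name is hypothetical.

```python
def genotype_r2(locus_a, locus_b):
    """Squared Pearson correlation between two lists of genotype codes.

    Each list holds one value per individual: the number of copies
    (0, 1 or 2) of a chosen allele at that locus.
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("both loci need the same non-zero number of individuals")
    n = len(locus_a)
    mean_a = sum(locus_a) / n
    mean_b = sum(locus_b) / n
    ss_a = sum((a - mean_a) ** 2 for a in locus_a)
    ss_b = sum((b - mean_b) ** 2 for b in locus_b)
    if ss_a == 0 or ss_b == 0:
        return 0.0  # a monomorphic locus carries no LD information
    cross = sum((a - mean_a) * (b - mean_b) for a, b in zip(locus_a, locus_b))
    return cross * cross / (ss_a * ss_b)


if __name__ == "__main__":
    print(genotype_r2([0, 1, 2, 1], [0, 1, 2, 1]))  # identical genotypes
    print(genotype_r2([0, 0, 2, 2], [0, 2, 0, 2]))  # uncorrelated genotypes
```

Under this proxy, a pair of loci would be flagged when the value reaches the 0.1 threshold used above.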

Clustering analysis

Before conducting the comparison, we re-read the references and confirmed that the loci in all reference populations showed no deviation from HWE and no linkage disequilibrium. We analyzed the population structure of the Xinjiang Uigur group (representing our samples from the whole territory of the Xinjiang Uigur Autonomous Region) and the 21 reference groups; the results are shown in Fig. 2. At K = 2, the Asian groups were separated from both the Amerindian and European groups: the 5 European groups and 6 Amerindian groups consisted almost entirely of the green component, whereas the 8 Asian groups consisted of the red component. The Kazak, Urumchi Uigur and Xinjiang Uigur groups displayed an admixture of both green and red components. At K = 4, the Amerindian groups could be clearly separated from the European groups. At K = 6, Uigurs and Kazaks were much better separated from both Europeans and Asians.

Fig. 2

The cluster analysis of 22 groups using the STRUCTURE program based on the genotyping data of 30 InDel loci

Principal component analysis

A PCA was performed to analyze the relationships between the Xinjiang Uigur group and the other 21 groups; the result is shown in Fig. 3. The first and second components accounted for 58.95 % and 23.23 % of the variance, respectively, so the first two principal components together explained 82.18 % of the total variance. In the plot, the 5 European groups and 6 Amerindian groups were located in the left part, the 8 Asian groups in the right part and the 3 Eurasian groups (Kazak, Urumchi Uigur and Xinjiang Uigur) in the central part. The Xinjiang Uigur group lay close to the Urumchi Uigur and Kazak groups in the PCA plot, which indicated that it had closer genetic relationships with those two groups.

Fig. 3

PCA based on population data of 30 InDel loci of Xinjiang Uigur group and 21 reference groups
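To show how the share of variance explained by each principal component can be obtained from a populations-by-loci matrix of allele frequencies, here is a minimal pure-Python sketch. It uses power iteration with deflation on the sample covariance matrix, which is a stand-in for the MATLAB routine used in the study; the function names and the toy data are assumptions.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns (one row per population)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(m)]
            for a in range(m)]


def top_eigenpair(matrix, iterations=1000):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    m = len(matrix)
    v = [float(j + 1) for j in range(m)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:  # no variance left in this matrix
            return 0.0, v
        v = [x / norm for x in w]
    w = [sum(matrix[a][b] * v[b] for b in range(m)) for a in range(m)]
    return sum(v[a] * w[a] for a in range(m)), v


def explained_variance(rows, n_components=2):
    """Fraction of total variance carried by the leading components."""
    cov = covariance_matrix(rows)
    m = len(cov)
    total = sum(cov[j][j] for j in range(m))  # assumes some variance exists
    shares = []
    for _ in range(n_components):
        value, vec = top_eigenpair(cov)
        shares.append(value / total)
        # Deflate: remove the component just found before the next pass.
        cov = [[cov[a][b] - value * vec[a] * vec[b] for b in range(m)]
               for a in range(m)]
    return shares


if __name__ == "__main__":
    toy = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [2.0, 1.0]]  # made-up data
    print([round(s, 4) for s in explained_variance(toy)])
```

The cumulative contribution quoted in the text is simply the sum of the first two shares (58.95 % + 23.23 % = 82.18 %).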

Interpopulation differentiations

We estimated pairwise Fst and p-values between the Uigur group and previously published groups at the 30 InDel loci using the analysis of molecular variance method; the values are given in Additional file 3: Table S3. The smallest differences were found between the Xinjiang Uigur group and the Urumchi Uigur and Kazak groups, with significant differences at one and three loci, respectively, whereas the Xinjiang Uigur group differed significantly from the other groups at 5–20 loci. These results indicated that allele frequency distributions differed among groups. Therefore, InDels would be a useful tool for studying migration patterns, gene flow, admixture and ancestry as more loci become available (Hefke et al. 2015).
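For intuition about what a per-locus pairwise Fst measures, the sketch below uses Wright's heterozygosity-based definition for one biallelic locus and two equally weighted populations. This is a simplified stand-in and not the AMOVA-based estimator or the significance testing performed in Arlequin; the function name is an assumption.

```python
def wright_fst(p1, p2):
    """Wright's Fst = (HT - HS) / HT for one biallelic locus.

    p1, p2: frequency of the same allele in population 1 and population 2.
    HS is the mean within-population expected heterozygosity and HT the
    expected heterozygosity of the pooled (equally weighted) populations.
    """
    hs = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    p_bar = (p1 + p2) / 2.0
    ht = 2.0 * p_bar * (1.0 - p_bar)
    if ht == 0.0:
        return 0.0  # locus fixed for the same allele in both populations
    return (ht - hs) / ht


if __name__ == "__main__":
    print(wright_fst(0.5, 0.5))  # identical frequencies
    print(wright_fst(0.6, 0.4))  # modest differentiation
    print(wright_fst(1.0, 0.0))  # fixed for different alleles
```

Identical allele frequencies give 0 and populations fixed for different alleles give 1; values in between grow with the frequency difference.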

\(D_A\) distance

The \(D_A\) distance was calculated to quantify the genetic distance between groups. The \(D_A\) distances between the Xinjiang Uigur group and the reference groups are shown in Table 2. According to the \(D_A\) distances, the Xinjiang Uigur group was closest to the Urumchi Uigur group (\(D_A\) = 0.0012), followed by the Kazak group (\(D_A\) = 0.0019), both of which belong to the Altaic language family. The greatest distances were detected when comparing the Xinjiang Uigur group with the Yucatan Mexican (\(D_A\) = 0.0353) and Mexican Amerindian (\(D_A\) = 0.0473) groups.

Table 2 The \(D_A\) distances among the 22 groups based on 30 InDel loci
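Nei's \(D_A\) distance is defined as one minus the average over loci of the sum over alleles of \(\sqrt{x_{ij} y_{ij}}\), where \(x_{ij}\) and \(y_{ij}\) are the frequencies of allele i at locus j in the two populations. A minimal sketch for biallelic InDels follows; it illustrates the definition only and does not reproduce DISPAN, and the function name and input layout are assumptions.

```python
import math


def da_distance(freqs_x, freqs_y):
    """Nei's D_A between two populations for biallelic loci.

    freqs_x and freqs_y hold, per locus, the frequency of one allele
    (for example the deletion allele); the other allele's frequency
    is taken as 1 - p.
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("both populations need the same non-zero number of loci")
    total = 0.0
    for p, q in zip(freqs_x, freqs_y):
        total += math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
    return 1.0 - total / len(freqs_x)


if __name__ == "__main__":
    # Made-up frequencies at three loci in two populations.
    print(da_distance([0.50, 0.30, 0.62], [0.48, 0.35, 0.60]))
```

Identical frequency vectors give a distance of 0 and populations fixed for different alleles at every locus give 1, so the small values in Table 2 reflect closely matching allele frequencies.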

Phylogenetic analysis

An NJ tree was constructed based on the \(D_A\) distances (Fig. 4). The NJ tree showed that the Xinjiang Uigur group first clustered with the Urumchi Uigur and Kazak groups, consistent with the STRUCTURE, \(D_A\) distance and PCA results above. According to the relevant historical records, Uigurs are descendants of the ancient Uighur, with a large proportion of Caucasian descent. Uigurs and Kazaks share a common religious belief, which suggests that they likely had the same or a similar origin during their formation and development (Palstra et al. 2015; Xu et al. 2006). Therefore, the genetic distances among them could be relatively small. Yuan et al. (2015) studied the genetic polymorphism of 38 STR loci in a Uigur group from Southern Xinjiang, China; their Fst distance results (21 loci) indicated that the Uigur group was closest to the Kazak group, which is similar to our result.

Fig. 4

The neighbor-joining tree based on population data of 30 InDel loci of Xinjiang Uigur group and 21 referenced groups

Conclusions

In summary, the 30 InDel loci showed relatively high forensic efficacy in the Xinjiang Uigur group and could be used in forensic individual identification, and as a complement to STR loci in forensic paternity testing. The \(D_A\) distance, STRUCTURE, PCA and NJ tree results indicated that the studied Xinjiang Uigur group had a close relationship with the Urumchi Uigur and Kazak groups. This study provides valuable data for the analysis of genetic relationships and for forensic applications.

References

  • Excoffier L, Lischer HE (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour 10:564–567

  • Fondevila M, Phillips C, Santos C et al (2012) Forensic performance of two insertion-deletion marker assays. Int J Legal Med 5:725–737

  • Friis SL, Børsting C, Rockenbauer E et al (2012) Typing of 30 insertion/deletions in Danes using the first commercial indel kit–Mentype® DIPplex. Forensic Sci Int Genet 6:e72–e74

  • Gill P (2001) An assessment of the utility of single nucleotide polymorphisms (SNPs) for forensic purposes. Int J Legal Med 4–5:204–210

  • Hefke G, Davison S, D’Amato ME (2015) Forensic performance of Investigator DIPplex indels genotyping kit in native, immigrant, and admixed populations in South Africa. Electrophoresis 36:3018–3025

  • Hong L, Wang XG, Liu SJ et al (2013) Genetic polymorphisms of 30 Indel loci in Guangdong Han population. J Sun Yat-sen Univ (Med Sci) 34:299–304 (In Chinese)

  • Jakobsson M, Rosenberg NA (2007) CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 15(23):1801–1806

  • Kidd KK, Pakstis AJ, Speed WC et al (2005) Developing a SNP panel for forensic identification of individuals. Forensic Sci Int 1:20–32

  • Kim EH, Lee HY, Yang IS et al (2014) Population data for 30 insertion-deletion markers in a Korean population. Int J Legal Med 128:51–52

  • Kis Z, Zalán A, Völgyi A et al (2012) Genome deletion and insertion polymorphisms (DIPs) in the Hungarian population. Forensic Sci Int Genet 6:e125–e126

  • Martín P, García O, Heinrichs B et al (2013) Population genetic data of 30 autosomal indels in Central Spain and the Basque Country populations. Forensic Sci Int Genet 7:e27–e30

  • Martínez-Cortés G, García-Aceves M, Favela-Mendoza AF et al (2015) Forensic parameters of the Investigator DIPplex kit (Qiagen) in six Mexican populations. Int J Legal Med 130:683–685

  • Meng HT, Zhang YD, Shen CM et al (2015) Genetic polymorphism analyses of 30 InDels in Chinese Xibe ethnic group and its population genetic differentiations with other groups. Sci Rep 5(5):8260

  • Nei M (1978) Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89:583–590

  • Nunotani M, Shiozaki T, Sato N et al (2015) Analysis of 30 insertion-deletion polymorphisms in the Japanese population using the Investigator DIPplex® kit. Leg Med (Tokyo) 17:467–470

  • Ota T (1993) DISPAN: genetic distance and phylogenetic analysis. Pennsylvania State University, University Park

  • Palstra FP, Heyer E, Austerlitz F (2015) Statistical inference on genetic data reveals the complex demographic history of human populations in central Asia. Mol Biol Evol 32:1411–1424

  • Pepinski W, Abreu-Glowacka M, Koralewska-Kordel M et al (2013) Population genetics of 30 INDELs in populations of Poland and Taiwan. Mol Biol Rep 40:4333–4338

  • Pereira R, Phillips C, Alves C et al (2009) A new multiplex for human identification using insertion/deletion polymorphisms. Electrophoresis 21:3682–3690

  • Phillips C, Fang R, Ballard D et al (2007) Evaluation of the Genplex SNP typing system and a 49plex forensic marker panel. Forensic Sci Int Genet 2:180–185

  • Pritchard JK, Wen X, Falush D (2003) Documentation for structure software: Version 2. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.323.9675&rep=rep1&type=pdf

  • Romanini C, Catelli ML, Borosky A et al (2012) Typing short amplicon binary polymorphisms: supplementary SNP and Indel genetic information in the analysis of highly degraded skeletal remains. Forensic Sci Int Genet 6:469–476

  • Saiz M, André F, Pisano N et al (2014) Allelic frequencies and statistical data from 30 INDEL loci in Uruguayan population. Forensic Sci Int Genet 9:e27–e29

  • Schneider PM (2007) Scientific standards for studies in forensic genetics. Forensic Sci Int 165:238–243

  • Seong KM, Park JH, Hyun YS et al (2014) Population genetics of insertion-deletion polymorphisms in South Koreans using Investigator DIPplex kit. Forensic Sci Int Genet 8:80–83

  • Shan RZ, Deng MB (2012) Analysis the formation of xinjiang uygur gens source and religious beliefs. Academic forum 237. (Article in Chinese)

  • Shi M, Liu Y, Bai R et al (2015) Population data of 30 insertion-deletion markers in four Chinese populations. Int J Legal Med 129:53–56

  • Tan GK, Tee SF, Tang PY (2015) Genetic association of single nucleotide polymorphisms in dystrobrevin binding protein 1 gene with schizophrenia in a Malaysian population. Genet Mol Biol 38:138–146

  • Walsh PS, Metzger DA, Higuchi R (1991) Chelex 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material. Biotechniques 10:506–513

  • Wang Z, Zhang S, Zhao S et al (2014) Population genetics of 30 insertion-deletion polymorphisms in two Chinese populations using Qiagen Investigator® DIPplex kit. Forensic Sci Int Genet 11:e12–e14

  • Wei YL, Qin CJ, Dong H et al (2014) A validation study of a multiplex INDEL assay for forensic use in four Chinese populations. Forensic Sci Int Genet 9:e22–e25

  • Xu MY, Hong KX, Ma J et al (2006) Analysis of HLA-B locus gene polymorphism in Sichuan Yi ethnic group and Xinjiang Uygur ethnic group. Yi Chuan 28:913–917

  • Yang N, Li H, Criswell LA et al (2005) Examination of ancestry and ethnic affiliation using highly informative diallelic DNA markers: application to diverse and admixed populations and implications for clinical epidemiology and forensic medicine. Hum Genet 3–4:382–392

  • Ye Y, Luo H, Liao L et al (2014) A case study of SNPSTR efficiency in paternity testing with locus incompatibility. Forensic Sci Int Genet 9:72–75

  • Yoo J, Lee Y, Kim Y et al (2008) SNPAnalyzer 2.0: a web-based integrated workbench for linkage disequilibrium analysis and association analysis. BMC Bioinformatics 23(9):290

  • Yuan L, Liu H, Liao Q et al (2015) Genetics analysis of 38 STR loci in Uygur population from Southern Xinjiang of China. Int J Legal Med 130:687–688

  • Zaumsegel D, Rothschild MA, Schneider PM (2013) A 21 marker insertion deletion polymorphism panel to study biogeographic ancestry. Forensic Sci Int Genet 7:305–312

  • Zhang YD, Shen CM, Jin R et al (2015) Forensic evaluation and population genetic study of 30 insertion/deletion polymorphisms in a Chinese Yi group. Electrophoresis 36:1196–1201

Authors’ contributions

TM and CS wrote the main manuscript text, BZ and LZ designed the study and modified the manuscript. YL, YZ, YG and QD did the sample preparation, HM, XW and JY conducted the data processing. All authors reviewed the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This project was supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2013211A065). The authors would like to thank the scientists who produced relevant baseline work.

Competing interests

All authors approved and agree with the contents of the manuscript and the authors declare that they have no competing interests.

Author information

Corresponding author

Correspondence to Li-Ping Zhang.

Additional information

Ting Mei and Chun-Mei Shen contributed equally to this work

Bo-Feng Zhu and Li-Ping Zhang contributed equally to this work

Additional files

Additional file 1: Table S1. The genotyping results of the 30 InDel loci from Uigur ethnic group living in Xinjiang Uigur Autonomous Region, China (n = 136).

Additional file 2: Table S2. The linkage disequilibrium pattern revealed by \(r^2\) values at 30 InDel loci.

Additional file 3: Table S3. Pairwise Fst and p-values between Xinjiang Uigur group and other groups at 30 InDel loci (n = 136).

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Mei, T., Shen, CM., Liu, YS. et al. Population genetic structure analysis and forensic evaluation of Xinjiang Uigur ethnic group on genomic deletion and insertion polymorphisms. SpringerPlus 5, 1087 (2016). https://doi.org/10.1186/s40064-016-2730-3

  • DOI: https://doi.org/10.1186/s40064-016-2730-3
