Comparative Study of Nei's D with other Genetic Distance Measures between Barak Valley Muslims and other Nations for ABO Locus

Quantification of the genetic distance between populations is essential in many genetic research programs. Several formulae have been proposed for estimating genetic distance between populations from gene frequency data. However, selecting a suitable measure for real-world human populations remains difficult, even though Nei's D is widely used. The present study was undertaken to estimate the genetic distance between Barak Valley Muslims (BVM) and twenty-four other nations using seven different measures with ABO blood group gene frequency data, to estimate the correlation coefficients between the distance measures, and to work out linear regression equations. Seven genetic distance measures, namely Nei's D, Nei's Nm, La, Nei's Da, Dc, Re and Nei's Ne, were estimated between BVM and 24 other nations en route the journey of mankind from Africa that commenced about 200,000 years ago (www.bradshawfoundation.com). Correlation coefficients of Nei's D with the other measures were estimated to find out which measures were closely related to Nei's D. Nei's D showed highly significant (p=0.01) positive correlation with Cavalli-Sforza and Edwards chord distance Dc (0.90), Reynolds Re (0.90), Nei's Da (0.74) and Nei's Ne (0.63) but negative correlation with Nei's Nm and La. Linear regression equations of the other distance measures on Nei's D were estimated as Da = -0.80 + 1.34D, Dc = 1.91 + 4.44D, Re = -0.51 + 0.24D and Ne = -7.60 + 1.30D.


Introduction
Quantification of the genetic distance between populations is instrumental in many genetic research programs. A large number of formulae have been proposed for this purpose. However, the selection of an appropriate measure for assessing genetic distance between real-world human populations that diverged because of mechanisms that are not fully known can be a challenging task (Libiger et al., 2009).
Nei's standard genetic distance D has been the most widely used genetic distance measure between populations. Since several formulae have already been proposed for genetic distance measurement, it is essential to identify which measures show a close relationship with Nei's D.
The Barak Valley zone of Assam in India has a total population of about 3.21 million, including Hindus, Muslims and Christians, with a land area of 6,992 square kilometers. These populations have maintained distinct cultures and lifestyles for centuries despite sharing a few common features. No information is available on the genetic proximity of the Muslims of Barak Valley (BVM) to other nations/populations along the route of migration of humankind that commenced from Africa nearly 200,000 years ago.
The ABO blood grouping system was established by Karl Landsteiner in 1900 on the basis of the presence or absence of two antigens (A and B) on red blood cells, and its Mendelian inheritance pattern was established by Bernstein in 1924 (Crow, 1993). In this system, four blood groups, namely A, B, AB and O, are identified by blood tests. Genetic analysis of the ABO blood group system revealed that three alleles, namely A ($I^A$), B ($I^B$) and O ($i$), determine the blood group phenotype. The A allele produces A antigen, the B allele produces B antigen and the O allele produces neither. Both A and B alleles are mutant forms and show codominance with each other, but both are dominant over the O allele in the diploid condition.
The nations/populations across the globe can be characterized by the distribution of ABO blood groups. These phenotypic data are used to estimate the frequency of the different alleles of the ABO gene using the standard formulae of population and quantitative genetics. The allele frequencies of a gene can be used to estimate the genetic distance between two populations. The Human Genome Project (HGP) has given a draft estimate of 25,000 to 30,000 genes in the human genome. Studying all these genes at once, each with a varying number of alleles, to elucidate the process of molecular evolution is complicated and almost impossible even with the latest state-of-the-art molecular biology techniques (as of October 2011). Hence it is imperative to study one or a few genes at a time to understand the evolutionary process in humankind.
Nei's standard genetic distance (D) between two populations, without bias correction, was estimated according to Nei (1972); Latter's distance (La) according to Latter (1972); Nei's Da distance according to Nei et al. (1983); the Cavalli-Sforza and Edwards chord distance (Dc or CE) according to Cavalli-Sforza and Edwards (1967); and Reynolds genetic distance (Re) according to Reynolds et al. (1983). Nei's geometric distance (Ne) was estimated from genotype frequency data rather than gene frequency data.
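For orientation, the commonly cited textbook forms of four of these measures are sketched below. They are written for $m$ loci with allele frequencies $p_{lu1}$ and $p_{lu2}$ (allele $u$ at locus $l$ in populations 1 and 2). These are the standard forms found in the population genetics literature and are an assumption about, not a transcription of, the exact expressions used in this study. The symbols $J_1$, $J_2$ and $J_{12}$ are introduced here for convenience, and $D_m$ denotes Nei's minimum distance (Nm in the text).

```latex
\begin{align}
J_1 &= \frac{1}{m}\sum_{l=1}^{m}\sum_{u} p_{lu1}^{2}, \qquad
J_2 = \frac{1}{m}\sum_{l=1}^{m}\sum_{u} p_{lu2}^{2}, \qquad
J_{12} = \frac{1}{m}\sum_{l=1}^{m}\sum_{u} p_{lu1}\,p_{lu2} \\
D &= -\ln\!\left(\frac{J_{12}}{\sqrt{J_1 J_2}}\right)
   && \text{(Nei's standard distance)} \\
D_m &= \frac{J_1 + J_2}{2} - J_{12}
   && \text{(Nei's minimum distance)} \\
D_A &= 1 - \frac{1}{m}\sum_{l=1}^{m}\sum_{u}\sqrt{p_{lu1}\,p_{lu2}}
   && \text{(Nei's } D_A\text{)} \\
D_c &= \frac{2}{\pi m}\sum_{l=1}^{m}
       \sqrt{2\left(1 - \sum_{u}\sqrt{p_{lu1}\,p_{lu2}}\right)}
   && \text{(chord distance)}
\end{align}
```

With a single locus such as ABO, $m = 1$ and the outer sums over loci drop out. Latter's and Reynolds' distances involve additional terms and are not reproduced in this sketch.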

Correlation and regression analysis
The correlation coefficient between any two distance measures was calculated according to Harris et al. (2007). The correlation coefficient was tested for significance by the t-test at p=0.01 and 0.05. The linear regression equation of a distance measure (as dependent variable) on Nei's D (as independent variable) was estimated by the method of least squares as per Harris et al. (2007).
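The steps just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of Harris et al. (2007) verbatim: it uses the Pearson product-moment coefficient, the usual statistic t = r·sqrt((n-2)/(1-r²)) with n-2 degrees of freedom, and ordinary least squares. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom.

    Undefined for |r| == 1 (division by zero).
    """
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def least_squares(x, y):
    """Return (A, B) for the fitted line y = A + B*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

In this setting `x` would hold the Nei's D estimates and `y` the estimates of another measure over the same population pairs. The resulting t value is then compared with the tabulated critical value at the chosen significance level. For example, assuming n = 24 pairs, `t_statistic(0.90, 24)` is about 9.68 on 22 degrees of freedom.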

Results and discussion
The Barak Valley Zone, named after the mighty river Barak flowing through it, is located in the southern part of Assam state in North East India. The valley has been inhabited by one of the major endogamous religious groups, the Muslims, for several centuries. Barak Valley has a total population of about 3.21 million, including Hindus, Muslims and Christians, with a land area of 6,992 square kilometers. This region is characterized by undulating topography.
The present study was taken up to estimate the genetic distance between Barak Valley Muslims (BVM) and each of 24 other nations for ABO blood group gene frequency data using seven different genetic distance measures, namely Nei's D, Nei's Nm, Latter's La, Nei's Da, Cavalli-Sforza and Edwards Dc (CE), Reynolds Re and Nei's Ne. The 24 nations lie along the historic route of human migration as proposed by Stephen Oppenheimer (http://www.bradshawfoundation.com). The distances were estimated on the basis of ABO gene frequency to assess the genetic proximity and the evolutionary relationship of the Barak Valley Muslims with other nations.
To identify the distance measure(s) that show a close relationship with Nei's D, correlation analysis was done between the estimates of Nei's D and the other distance measures. Linear regression equations of the different distance measures on Nei's D were worked out to determine the value of a particular distance measure for a given value of Nei's D.

Materials and methods
In this study, ABO blood group distribution data of 24 populations other than Barak Valley Muslims were obtained from the published literature and websites. The ABO blood group distribution data in Barak Valley Muslims were estimated by Chakraborty (2010). The frequencies of the O, A and B alleles of the ABO blood group system were estimated for each population from ABO blood group phenotyping data using the formulae suggested by Hedrick (2005), where:
N = total number of individuals
N 11 + N 13 = individuals having "A" blood group
N 22 + N 23 = individuals having "B" blood group
N 33 = individuals having "O" blood group
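As a hedged illustration of this step, the sketch below uses the classical square-root (Bernstein-type) estimators, which assume Hardy-Weinberg proportions. The exact expressions of Hedrick (2005) are not reproduced in the text, so treating them as equivalent to these estimators is an assumption. The function name and the example counts are hypothetical.

```python
import math


def abo_allele_frequencies(n_a, n_b, n_ab, n_o):
    """Estimate ABO allele frequencies from phenotype counts.

    Assumes Hardy-Weinberg proportions and uses square-root estimators:
        r (O allele) = sqrt(O / N)
        p (A allele) = 1 - sqrt((B + O) / N)
        q (B allele) = 1 - sqrt((A + O) / N)
    Returns (p, q, r). With real data the three estimates need not sum
    exactly to 1, because each is derived from a different phenotype class.
    """
    n = n_a + n_b + n_ab + n_o
    if n <= 0:
        raise ValueError("total number of individuals must be positive")
    r = math.sqrt(n_o / n)
    p = 1.0 - math.sqrt((n_b + n_o) / n)
    q = 1.0 - math.sqrt((n_a + n_o) / n)
    return p, q, r


# Hypothetical counts built from exact Hardy-Weinberg proportions
# for p = 0.2, q = 0.1, r = 0.7 and N = 10000.
p, q, r = abo_allele_frequencies(n_a=3200, n_b=1500, n_ab=400, n_o=4900)
```

In the notation of the text, `n_a` corresponds to N 11 + N 13, `n_b` to N 22 + N 23 and `n_o` to N 33. The AB count enters only through the total N.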

Genetic distance
The ABO gene frequency data (Tab. 1) were used to estimate the genetic distance between Barak Valley Muslims and each of the remaining 24 populations using seven distance measures.
Consider $m$ loci with $v$ alleles per locus, studied in populations 1 and 2 with $n_1$ and $n_2$ individuals, respectively, and let $n$ be the average number of individuals. Let $p_{lu1}$ and $p_{lu2}$ be the frequencies of allele $u$ at locus $l$ in populations 1 and 2, respectively, and let $P_{lu1}$ and $P_{lu2}$ be the number of individuals that carry allele $u$ at locus $l$ in populations 1 and 2, respectively. The seven distance measures can then be estimated from these quantities.
The region has wide plain areas, low-lying waterlogged tracts and hillocks. The climate of Barak Valley is sub-tropical, warm and humid, with an average annual rainfall of 318 cm and 146 rainy days. Nearly 80% of the total population depends on agriculture for its livelihood.

Gene frequency
The frequencies of the O, A and B alleles of the ABO gene in different nations/populations were estimated from the ABO blood group distribution data of each population (Tab. 1). In general, the frequency of the O allele was the highest in all the populations. The B allele was not reported in Australians.

Genetic distance between populations
The estimates of various genetic distance measures between Barak Valley Muslims (BVM) and each of the twenty-four populations were calculated on the basis of ABO gene frequency data (Tab. 2).
Nei's D estimate was the lowest (0.0015) between BVM and India (in general), indicating the lowest genetic distance between these two populations for the ABO gene. On the other hand, the highest Nei's D value (0.0395) was found between BVM and Australia, suggesting the greatest genetic distance between these two populations for the ABO gene out of the 24 combinations. Unlike all the other genetic distance measures, Nei's geometric distance (Ne) was calculated on the basis of genotypic data estimated from ABO gene frequency.
Nei's minimum genetic distance (Nm) ranged from the lowest value of 0.0074 between BVM and West Indonesia to the highest value of 0.0791 between BVM and South China, irrespective of sign. Similarly, Latter's distance (La) ranged from 0.0146 between BVM and West Indonesia to 0.1366 between BVM and South China.
Nei's Da estimate ranged from 0.0009 between BVM and India to 0.1000 between BVM and Australia. Cavalli-Sforza and Edwards chord distance (Dc) ranged from 0.0265 between BVM and India to 0.2847 between BVM and Australia. Reynolds genetic distance (Re) ranged from the lowest estimate of 0.00002 between BVM and Bulgaria to the highest value of 0.0073 between BVM and Australia. Nei's Ne estimate ranged from the lowest value of 0.0169 between BVM and Sudan to the highest value of 0.0808 between BVM and South China. Estimates of genetic distance between BVM and other populations using the seven measures are presented graphically (Fig. 2).
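To make concrete how estimates of this kind follow from allele frequencies, the sketch below computes four of the measures for a single locus from two frequency vectors. It uses the commonly cited textbook forms of these distances, which is an assumption: the exact variants, scaling and sign conventions behind the tabulated values may differ. The function names and the example frequencies are hypothetical and are not taken from Tab. 1.

```python
import math


def _identities(p1, p2):
    """Return (J1, J2, J12): sums of squares and cross-products of frequencies."""
    j1 = sum(a * a for a in p1)
    j2 = sum(b * b for b in p2)
    j12 = sum(a * b for a, b in zip(p1, p2))
    return j1, j2, j12


def nei_standard_distance(p1, p2):
    """Nei's standard distance D = -ln(J12 / sqrt(J1 * J2)), single locus."""
    j1, j2, j12 = _identities(p1, p2)
    return -math.log(j12 / math.sqrt(j1 * j2))


def nei_minimum_distance(p1, p2):
    """Nei's minimum distance (J1 + J2) / 2 - J12, single locus."""
    j1, j2, j12 = _identities(p1, p2)
    return (j1 + j2) / 2.0 - j12


def nei_da(p1, p2):
    """Nei's Da = 1 - sum(sqrt(p1u * p2u)), single locus."""
    return 1.0 - sum(math.sqrt(a * b) for a, b in zip(p1, p2))


def chord_distance(p1, p2):
    """Cavalli-Sforza and Edwards chord distance, single locus."""
    overlap = sum(math.sqrt(a * b) for a, b in zip(p1, p2))
    # Clamp at zero to guard against tiny negative values from rounding.
    return (2.0 / math.pi) * math.sqrt(2.0 * max(0.0, 1.0 - overlap))


# Hypothetical (A, B, O) allele frequencies for two populations;
# the second population lacks the B allele.
pop1 = [0.2, 0.1, 0.7]
pop2 = [0.3, 0.0, 0.7]
distances = {
    "D": nei_standard_distance(pop1, pop2),
    "Nm": nei_minimum_distance(pop1, pop2),
    "Da": nei_da(pop1, pop2),
    "Dc": chord_distance(pop1, pop2),
}
```

All four functions return zero (up to rounding) for identical frequency vectors and are symmetric in their two arguments. An allele absent from one population simply contributes zero to the cross-product terms.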
Several studies have been carried out on genetic distance measurements across different populations. Genetic distance and gene diversity studies by Roy et al. (1990) among 10 endogamous groups in Chattisgarh, India, using the gene frequency data of three genetic loci, revealed that the gene differentiation among these groups is only about 2 per cent. Genetic differentiation studies in Indian populations by Papiha et al. (1982) revealed that genetic differentiation in Indian populations was low (0.26-1.70%). In Assam, genetic variation studies by Das (1979) among three caste populations, namely Brahmin, Kalita and Kaibarta, on the basis of ABO blood groups and other anthropometric characters revealed that the Kaibarta stand apart from the Brahmin and the Kalita, who are similar to each other. A genetic study by Danker-Hopfe et al. (1988) among 13 Assamese populations, including two Muslim groups, for the distribution of anthropometric, anthroposcopic and dermatoglyphic traits revealed that the Muslims in Assam were distinguished between Marias (who seemed to be more closely related to Mongoloid populations) and Sheikhs (whose phenotypic appearance was more like that of Hindu caste groups).
Genetic distance studies by Roychoudhury et al. (1982) between Jews and non-Jews using gene frequency data of nine blood groups and protein loci revealed that the Yemenite Jews have a high degree of genetic affinity to the Israeli Arabs, and the Iranian Jews to the Iranians. Genetic distance studies by Triantaphyllidis et al. (1983) between the inhabitants of nine Mediterranean countries and the three major human races using the gene frequency data of several genetic markers suggested that the Algerians were closer to Negroids while the other Mediterraneans were closer to Caucasoids.
Genetic and taxonomic distance studies by Sokal (1988) among 3466 samples of human populations in Europe, based on 97 allele frequencies and 10 cranial variables, demonstrated that speakers of different language families in Europe differ genetically and that this difference remains even after geographic differentiation is allowed for.

Regression analysis
Nei's D is the most widely used genetic distance measure in research programs. Taking Nei's D as the independent variable and any one of the remaining distance measures (Da, Dc, Re or Ne) as the dependent variable, the linear regression equations of the latter on Nei's D were estimated (Tab. 4; Fig. 1). Since Nei's minimum distance Nm and Latter's distance La did not show significant correlation with Nei's D, they were not used as dependent variables in determining linear regression equations with Nei's D.
These regression equations could be used to estimate the magnitude of a particular genetic distance measure for a given value of Nei's D between two populations. However, the accuracy of an estimate obtained in this way decreases as the correlation coefficient between that measure and Nei's D decreases. In the regression equation y = A+Bx, B is the regression coefficient (slope) and the regression constant A is the y-intercept, i.e. the distance from the origin to the point where the straight line intersects the y-axis.

Correlation analysis
The estimates of correlation coefficients between any two distance measures (Tab. 3) revealed that Nei's D showed highly significant (p=0.01) positive correlation with Cavalli-Sforza and Edwards chord distance Dc (0.90), Reynolds Re (0.90), Nei's Da (0.74) and Nei's Ne (0.63). This indicated great similarity among these four distance measures, so any one of them could be used in genetic analysis instead of all four.
The results of the present study revealed that the Muslims of Barak Valley in Assam showed the highest genetic distance (Nei's D) from Australians and the lowest from Indians in general. Correlation analysis revealed that Nei's D had highly significant (p=0.01) positive correlation with Cavalli-Sforza and Edwards chord distance Dc (0.90), Reynolds Re (0.90), Nei's Da (0.74) and Nei's Ne (0.63), indicating great similarity among these four distance measures. Taking Nei's D as the independent variable and any one of the remaining distance measures (Da, Dc, Re or Ne) as the dependent variable, the linear regression equations of the latter on Nei's D were estimated in the form y = A+Bx. These regression equations were Da = -0.80 + 1.34D, Dc = 1.91 + 4.44D, Re = -0.51 + 0.24D and Ne = -7.60 + 1.30D.

Fig. 1. Linear regression equations showing relationship between Nei's D and other distance measures

Fig. 2. Comparison of various genetic distance measures between BVM and other nations for ABO gene
Tab. 1. Estimates of allele frequencies of ABO gene in Barak Valley Muslims (BVM) and other 24 populations/nations. *Detailed reference in text
Tab. 2. Estimates of seven different genetic distance measures between BVM and other nations for ABO gene
Tab. 3. Correlation coefficients of Nei's D with other genetic distance measures. ** Significant at p=0.01