Genetic Diversity Analysis of Indian Salmon, Eleutheronema tetradactylum from South Asian Countries Based on Mitochondrial COI Gene Sequences

Eleutheronema tetradactylum is an important commercial fish species exposed to intense exploitation both in Southeast Asian countries and in northern parts of Australia. Research on the population structure of E. tetradactylum in these coastal waters is essential to ensure sustainable use and appropriate resource management. In this study, genetic variation, diversity and population structure of E. tetradactylum among four FAO fishing areas along South Asian countries were evaluated using the cytochrome c oxidase subunit I (COI) gene. A total of 30 COI gene sequences were collected from the four FAO fishing areas. Among these 30 individuals, 18 distinct haplotypes were defined. High levels of haplotype diversity (hd = 0.952 ± 0.096) and nucleotide diversity (π = 0.01536 ± 0.00312) were observed in the population within the Bay of Bengal. No haplotype or nucleotide diversity was observed in the South China Sea population. Hierarchical analysis of molecular variance (AMOVA) indicated that 0.81% of the genetic variation occurred within the populations, whereas 7.09% occurred among populations. Significant genealogical branches were recognized on the neighbor-joining tree for the North Australian populations (one clade), the South China Sea populations (one clade), and the Arabian Sea and Bay of Bengal populations (one clade). These results suggest that E. tetradactylum populations in FAO fishing areas 51, 57 and 61 have developed different genetic structures. Tests of neutral evolution and mismatch distributions suggest that population growth of E. tetradactylum may have taken place in these fishing areas.


Introduction
The Indian salmon, E. tetradactylum, is a pelagic-neritic fish species belonging to the family Polynemidae, which is mainly distributed in the Indo-West Pacific region, from the Persian Gulf to Papua New Guinea, Northern Australia and East Asia (Japan, China, Vietnam) (Yamada et al., 1995). E. tetradactylum prefers shallow turbid water and soft substrates, and is found in a variety of near-shore habitats (Horne et al., 2011). However, fisheries of E. tetradactylum have drastically decreased in recent years due to overexploitation and water pollution (Motomura et al., 2002; Newman et al., 2011). E. tetradactylum is a protandrous hermaphrodite that becomes female after 2 years, with a maximum lifespan of approximately 7 years, and reaches more than 1 m in length (Horne et al., 2011). The spawning location of this species is unknown, but both eggs and larvae are pelagic, suggesting a high dispersal potential (Horne et al., 2011). There are no data on pelagic larval duration for this species in the wild, where the larvae reach a maximum length of 30 mm (Motomura, 2004). This species is also a commercially important fish that is harvested on a large scale between Kuwait and Northern Australia (Motomura, 2004), but more knowledge of the stock structure is needed for proper management of this fishery (Welch et al., 2002). Earlier studies show that the dispersal of this species is sufficiently low to make inferences about the levels of ecological connectivity, which are the most relevant for management (Jones et al., 2009; Horne et al., 2011). A number of studies have been carried out on Polynemidae fish stock structures using molecular markers. Zischke et al. (2009) determined the stock structure of blue threadfin E. tetradactylum along the East Queensland Coast using parasites and conventional tagging. Moore et al. (2011) investigated the stock structure of E. tetradactylum across Tropical Northern Australia using stable isotopes in sagittal otolith carbonates.
Molecular markers can be used to effectively estimate genetic variation and population structure in different populations, thereby providing a basis for better management of whole populations and hence sustainable fisheries (Liu et al., 2009; Yue et al., 2009). The COI gene is well characterized and is frequently used for genetic studies in invertebrates and vertebrates (Ward et al., 2005; Spies et al., 2006). In addition, variation in the COI gene sequence has been used for population analysis in fish species such as Pampus argenteus, Coilia ectenes, Nibea albiflora and E. rhadinum (Peng et al., 2009; Ma et al., 2010; Xu et al., 2012; Sun et al., 2013). In the present study, COI gene sequences were used to assess genetic divergence and genetic connectivity among six E. tetradactylum populations in the seas of the Indian Ocean, the South China Sea and the North Australian Seas.

Materials and Methods
Sample collection
Five E. tetradactylum individuals were collected from the Bay of Bengal at Parangipettai, Tamil Nadu. All of the individuals were identified based on morphological characteristics according to the description of Motomura et al. (2002). After collection, muscle samples were preserved in 95% ethanol for DNA extraction. To supplement the COI data generated here, a further two sequences for the Bay of Bengal, three for the Arabian Sea and five each for the South China Sea, North West Australia, North Australia and North East Australian waters were retrieved from NCBI GenBank. A map of the FAO fishing zones and detailed information on the sequences are shown in Fig. 1 and Table 1.

DNA isolation, amplification and sequencing
From the stored tissues, DNA was isolated by the standard Proteinase-K/Phenol-Chloroform-Ethanol method (Sambrook et al., 1989) and the concentration of the isolated DNA was estimated using a UV spectrophotometer. The DNA was diluted in TAE buffer to a final concentration of 100 ng/μl. The COI gene was amplified in a 50 μl PCR mix containing 5 μl of 10X Taq polymerase buffer with MgCl2 (25 mM), 1 μl of each dNTP (0.05 mM), 1 μl of each primer (0.01 mM), 0.6 U of Taq polymerase, 2 μl of genomic DNA and 36 μl of double distilled water. The universal primers FishF1 (5'-TCAACCAACCACAAAGACATTGGCAC-3') and FishR1 (5'-TAGACTTCTGGGTGGCCAAAGAATCA-3') (Ward et al., 2005) were used to amplify the COI gene. The thermal regime consisted of an initial step of 2 min at 95 °C, followed by 35 cycles of 40 sec at 94 °C, 40 sec at 54 °C and 60 sec at 72 °C, and a final extension of 10 min at 72 °C. The PCR products were checked on a 1.5% agarose gel and the most representative bands were selected for sequencing. The cleaned-up PCR products were sequenced by a commercial sequencing facility (Eurofins Genomics, Bangalore, India).
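As a quick arithmetic check on the protocol above, the listed volumes and cycling times can be tallied. The sketch below is illustrative only: the variable names are hypothetical, ramp times between temperature steps are ignored, and attributing the leftover volume to the enzyme is an assumption, since the text gives the polymerase in units rather than microlitres.

```python
# Reaction components in microlitres, as listed in the protocol above.
# The four dNTPs and the two primers are each added at 1 ul.
components_ul = {
    "10X Taq buffer with MgCl2": 5.0,
    "dNTPs (4 x 1 ul)": 4 * 1.0,
    "primers (2 x 1 ul)": 2 * 1.0,
    "genomic DNA": 2.0,
    "double distilled water": 36.0,
}

listed_volume_ul = sum(components_ul.values())
# Assumption: the rest of the 50 ul reaction is taken up by the enzyme.
remainder_ul = 50.0 - listed_volume_ul

# Thermal profile in seconds (ramp times are not included).
initial_denaturation_s = 2 * 60
cycle_s = 40 + 40 + 60          # denaturation + annealing + extension
final_extension_s = 10 * 60
total_s = initial_denaturation_s + 35 * cycle_s + final_extension_s

print(f"listed volume: {listed_volume_ul} ul, remainder: {remainder_ul} ul")
print(f"programmed run time: {total_s / 60:.1f} min")
```

The listed reagents account for all but a small part of the 50 μl reaction, and the programmed holds add up to roughly an hour and a half before ramping is considered.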

Sequence analysis
The partial COI gene sequences of the five individuals were edited using MEGA 4.0 (Tamura et al., 2007) and aligned with Clustal W 1.6, included in the same software. The haplotype definitions have been submitted to NCBI GenBank. Genetic diversity indices such as nucleotide diversity (π) (Lynch and Crease, 1990) and haplotype diversity (hd) (Nei, 1987) were calculated in DnaSP 4.0 (Rozas et al., 2003). Genetic relationships among individuals were reconstructed using the neighbor-joining (NJ) method (Saitou and Nei, 1987). To illustrate the phylogenetic and geographical relationships of the haplotype sequences, a haplotype network was created with the median-joining method in Network 4.1 (Röhl and Mihn, 2003). A hierarchical analysis of molecular variance (AMOVA) was performed to reveal the geographical structure of genetic variation using ARLEQUIN version 3.1 (Excoffier et al., 2008). The significance of the fixation index was tested by 1000 permutations of the data set. The population genetic structure within the six fishing zones was revealed by pairwise F statistics in ARLEQUIN version 3.1 (Excoffier et al., 2008). Tajima's D (Tajima, 1989), Fu and Li's D and Fu's Fs (Fu, 1997) were calculated to test the null hypothesis of selective neutrality of the mtDNA sequences; departures from neutrality would be expected under population expansion. Mismatch distributions (Harpending et al., 1993) were constructed in DnaSP 4.0 (Rozas et al., 2003). The shapes of the mismatch distributions were used to deduce whether a population has undergone a sudden population expansion (Rogers, 1995). Significance was assessed with permutation tests under the null hypothesis of a sudden population expansion.
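The two diversity indices named above can be sketched in a few lines. This is a minimal illustration of Nei's (1987) haplotype diversity, hd = n/(n - 1) × (1 - Σ p_i²), and of nucleotide diversity as the mean proportion of pairwise differences per site, assuming gap-free aligned sequences of equal length. It is not the DnaSP implementation; the function names and toy sequences are hypothetical, and standard errors are not computed.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's (1987) haplotype diversity: n/(n-1) * (1 - sum of p_i squared)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)  # identical sequences collapse into one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total / (n_pairs * length)


# Hypothetical toy alignment: four sequences, three haplotypes.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

A sample in which every sequence is identical gives zero for both indices, which is the pattern reported for a population with a single haplotype.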

Results
Genetic variation
A 614 bp fragment from the 5' end of the E. tetradactylum COI gene was amplified and sequenced from all five individuals collected in the present study, and these sequences were analyzed together with 25 GenBank sequences. The average contents of A, T, G and C were 22.6%, 29.1%, 19.2% and 29.1%, respectively. Among the 30 sequences, 18 haplotypes were defined and no insertions or deletions were found. The number of haplotypes (h), haplotype diversity (hd) and nucleotide diversity (π) within each population are presented in Table 2. The South China Sea population had the lowest number of haplotypes (one) and the lowest genetic diversity (hd = 0.000 ± 0.000, π = 0.00000 ± 0.00000), while the Bay of Bengal population had the highest values for all three measures (h = 6, hd = 0.952 ± 0.096, π = 0.01536 ± 0.00312).

Population genetic structure and genetic distance
A neighbor-joining (NJ) tree for the six E. tetradactylum populations comprising 30 individuals was constructed based on the Kimura-2-parameter (K2P) distance model (Fig. 2). The NJ tree was deep and there were significant genealogical branches corresponding to populations in FAO fishing areas 51, 57 and 61. Furthermore, to illustrate the phylogenetic and geographical relationships between the identified sequences, haplotype networks were constructed using the median-joining method. Only one haplotype was shared by the Northern and North East Australian populations, and it might be the ancestral haplotype for the Australian populations. No other haplotypes were shared between populations (Fig. 3). The K2P genetic distances and FST values between the six populations are given in Table 3. All pairs of populations exhibited significant genetic differences except the Arabian Sea and Bay of Bengal pair and the North Australia and North West Australia pair. The analysis of molecular variance was performed based on haplotype frequencies to test for large-scale patterns of genetic structure (Table 4). The results show that only 0.81% of the genetic variation occurred within the populations, whereas 7.09% occurred among populations.
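For readers unfamiliar with the K2P model used for the tree and for Table 3, the pairwise distance is d = -(1/2) ln[(1 - 2P - Q) √(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The sketch below is a minimal illustration under the assumption of two aligned sequences of equal length; the function name is hypothetical, sites containing gaps or ambiguity codes are simply skipped, and it is not the MEGA implementation.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log / math.sqrt raise ValueError if the pair is too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical pair differing by one transition in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Because transitions and transversions are weighted separately, a single transition and a single transversion over the same number of sites give slightly different distances.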

Tests of neutrality and population expansion estimation
Tests for neutral evolution were performed to look for evidence of purifying or balancing selection. Fu's Fs tests gave negative values for almost all of the populations, although the results were not statistically significant (Table 5). This result suggests that E. tetradactylum populations likely experienced a population expansion. The mismatch distribution pattern of E. tetradactylum among the populations is shown in Fig. 4.
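To make the neutrality and mismatch analyses more concrete, the following sketch computes an observed mismatch distribution and Tajima's (1989) D from an alignment. It is a minimal illustration, assuming gap-free aligned sequences of equal length; the function names and toy data are hypothetical, no significance testing is performed, and Fu's Fs and Fu and Li's D, which were also used in this study, are not implemented here.

```python
import math
from collections import Counter
from itertools import combinations


def pairwise_differences(seqs):
    """Number of differing sites for every pair of aligned sequences."""
    return [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]


def mismatch_distribution(seqs):
    """Observed frequency of each pairwise-difference class."""
    diffs = pairwise_differences(seqs)
    counts = Counter(diffs)
    return {k: counts[k] / len(diffs) for k in sorted(counts)}


def tajimas_d(seqs):
    """Tajima's D; returns nan when there are no segregating sites."""
    n = len(seqs)
    if n < 4:
        raise ValueError("the variance term vanishes for fewer than four sequences")
    seg_sites = sum(len(set(column)) > 1 for column in zip(*seqs))
    if seg_sites == 0:
        return math.nan
    diffs = pairwise_differences(seqs)
    mean_diff = sum(diffs) / len(diffs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_diff - seg_sites / a1) / math.sqrt(variance)


# Hypothetical star-like sample: one central haplotype plus three singletons.
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
print(mismatch_distribution(toy))
print(tajimas_d(toy))
```

In this toy sample every variable site is a singleton, so the mean pairwise difference falls below the value expected from the number of segregating sites and D is negative, the direction usually associated with population expansion or purifying selection.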

Discussion
In the present study, a 614 bp segment of the COI gene was used to assess the genetic diversity of E. tetradactylum populations in FAO fishing areas 51, 57, 61 and 71, covering the Arabian Sea, Bay of Bengal, South China Sea and North Australian Seas. Eighteen haplotypes were defined among 30 sequences. The genetic diversity in E. tetradactylum is comparable with that of E. rhadinum populations, where the mean haplotype (h) and nucleotide (π) diversities were 0.759 ± 0.035 and 0.00198 ± 0.00326 based on COI data (Sun et al., 2013). In Australian E. tetradactylum populations, the haplotype (h) and nucleotide (π) diversities range from 0.00 to 0.83 and from 0.0000 to 0.0024, respectively, based on Cyt b data (Horne et al., 2011). In addition, a remarkable reduction was observed in the genetic diversity of the Zhoushan population (h = 0.595 ± 0.109, π = 0.001 ± 0.00155) compared to the Qidong (h = 0.782 ± 0.058, π = 0.00212 ± 0.0035) and Zhuhai populations (h = 0.780 ± 0.059, π = 0.00222 ± 0.00282). Genetic variation within populations can be reduced through genetic drift or a bottleneck in the particular population (Habib et al., 2011). An earlier study (Chang et al., 2012) reported that overfishing and concomitant habitat loss in this area have had a deleterious effect on population levels and genetic diversity, which might be responsible for the lower genetic variation in the Zhoushan population in the South China Sea.
The high level of haplotype diversity and low π value in the Bay of Bengal and Australian E. tetradactylum populations suggest that this fish could have experienced a population expansion after a period of low effective population size (Grant and Bowen, 1998). This type of genetic structure has been observed in the threadfin E. rhadinum (Sun et al., 2013), the long-tailed hake Macruronus magellanicus (Machado-Schiaffino and Garcia-Vazquez, 2011) and the fat greenling Hexagrammos otakii (Habib et al., 2011). The neutrality tests and mismatch distributions also suggest population expansion in most of the E. tetradactylum populations. Moreover, the haplotype network also supported recent population expansion following a population bottleneck in most of the studied populations. However, positive selection could also result in an excess of low-frequency haplotypes in many populations, making it difficult to distinguish unambiguously between natural selection and demographic population expansion. To distinguish these scenarios, further analysis of several unlinked loci in the genome is necessary, because selection affects only specific loci (Grant et al., 2006).
According to the FST analysis, a significant genetic differentiation was observed between the Arabian Sea and South China Sea populations. Many factors, including historical factors, anthropogenic activity, habitat, and a low rate of mitochondrial evolution, can influence genetic population structure (Avise, 2004; Grant et al., 2006). However, the FST values between the Arabian Sea and Bay of Bengal populations, as well as among the three Australian populations, were low, which suggests genetic similarity between these regions. In general, genetic homogeneity in marine fishes can be attributed to high dispersal potential during the planktonic egg and larval stages coupled with an absence of physical barriers between ocean basins and adjacent continental margins. Previous studies have revealed that the ocean currents in the China Sea facilitate the dispersal of marine larvae among distant populations (Han et al., 2008; Shui et al., 2009; Xiao et al., 2009). Analysis of the COI gene sequences revealed genetic heterogeneity among E. tetradactylum populations in FAO fishing areas 51, 57, 61 and 71. The population structure shows that subdivisions exist between the studied areas.
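The idea behind a pairwise FST, comparing differences within populations to differences between them, can be illustrated with a simple sequence-based estimator. The sketch below uses the form FST = 1 - Hw/Hb (Hudson et al., 1992), where Hw is the mean number of pairwise differences within populations and Hb the mean between them. This is a swapped-in, simpler estimator chosen for illustration and is not the AMOVA-based calculation performed in ARLEQUIN; the function names and toy data are hypothetical.

```python
from itertools import combinations, product


def count_differences(s, t):
    """Number of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(s, t))


def mean_within(pop):
    """Mean pairwise differences among sequences of one population."""
    diffs = [count_differences(s, t) for s, t in combinations(pop, 2)]
    return sum(diffs) / len(diffs)


def mean_between(pop_a, pop_b):
    """Mean pairwise differences between sequences of two populations."""
    diffs = [count_differences(s, t) for s, t in product(pop_a, pop_b)]
    return sum(diffs) / len(diffs)


def hudson_fst(pop_a, pop_b):
    """Pairwise FST as 1 - Hw/Hb for two samples of aligned sequences."""
    if len(pop_a) < 2 or len(pop_b) < 2:
        raise ValueError("each population needs at least two sequences")
    h_within = (mean_within(pop_a) + mean_within(pop_b)) / 2
    h_between = mean_between(pop_a, pop_b)
    if h_between == 0:
        raise ValueError("no differences between populations; FST is undefined")
    return 1 - h_within / h_between


# Hypothetical populations separated by two fixed differences.
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTAA", "TTAT"]
print(hudson_fst(pop_a, pop_b))
```

Values near 1 indicate that almost all differences lie between populations, while values near 0 indicate little differentiation; with very small samples the estimate can even fall below zero.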
The sample size and geographical coverage of the populations were limited in this study. The use of multiple genetic marker systems can increase the resolving power of genetic studies (Gruenthal et al., 2007). Further molecular studies with a larger number of markers, including nuclear markers, are therefore still required to evaluate precisely the genetic structure of E. tetradactylum throughout its range.