Unanticipated population structure of European grayling in its northern distribution: implications for conservation prioritization

Background The European grayling (Thymallus thymallus) is a salmonid fish native to Europe, with a distribution ranging from England and France to the Ural Mountains of north-western Russia. The majority of grayling populations inhabit freshwater rivers and lakes but some populations also occupy brackish water in northern parts of the Baltic Sea. Previous population genetic studies have demonstrated that grayling populations in Finland, Estonia and Russia belong to a single mitochondrial lineage and exhibit high levels of differentiation even at a small geographic scale. As a result, we predicted that grayling populations should not cluster regionally. Despite the extensive amount of genetic research that has been carried out on grayling, comprehensive national-level information on population structure of grayling in Northern Europe is still lacking. Yet this is the level at which populations are currently managed. Results We found unanticipated population structure of grayling clustering into three groups largely corresponding to the northern, Baltic and south-eastern geographic areas of Finland using 13 microsatellite loci. We also found a high level of genetic differentiation among the groups and moderate to high differentiation within the groups. This combined with low variability strongly indicates that genetic drift and limited migration have a major impact on grayling population structure. An allele size permutation test indicated that mutations at microsatellite loci have not significantly contributed to genetic differentiation among the three Finnish groups. However, at the European scale, mutations had significantly contributed to population differentiation. Conclusion This research provides novel genetic information on European grayling in its northern distribution range and has clear implications for supporting country-scale conservation efforts. Specifically, the strong between population divergence observed indicates that single populations should generally be recognized as separate management units. We also introduced an alternative prioritization strategy for population conservation based on the evaluation of the relative roles of different evolutionary forces shaping the gene pools. We envision that the proposed approach to categorize populations for conservation will be a useful tool for wildlife researchers and conservationists working on a diverse range of organisms.


Background
The European grayling, Thymallus thymallus, is a salmonid fish highly appreciated by recreational anglers. It is native to Europe with a distribution from England and France to the Ural Mountains of north-western Russia. It can inhabit stream, riverine or lacustrine habitats and show large phenotypic differences in gill raker number, body size, weight at first spawning and fecundity [1]. While grayling occur mostly in freshwater habitats, some populations also occupy brackish water in the northern parts of the Baltic Sea.
Population genetic structure of the European grayling has been studied rather extensively across its natural distribution, but particularly in central Europe, using both mitochondrial sequences and microsatellite markers [2][3][4][5][6][7]. In central and northern Europe, the population structure is represented by four major lineages with (lineage) contact zones in Germany [8] and northern Sweden and Finland [9]. In northern Europe, two major lineages have been described (3.1% and 1.1% divergences using mtDNA PCR-RFLP and 529-bp of ND5 sequence, respectively; [9]). One lineage that almost exclusively inhabits Sweden and Norway most likely originated from a central European refugium while grayling in Finland, Estonia and north-western Russia belong to a different mitochondrial lineage originating most likely from an eastern European refugium [9]. Northern European grayling have been studied also at a very small geographic scale [10]. In lake Saimaa, Finland, Koskinen et al. [11] found evidence of severely limited gene flow among populations separated by just tens of kilometres.
Despite the quite extensive amount of genetic research that has been carried out on grayling, detailed nationallevel information of grayling population structure in Northern Europe is lacking. Yet, this is the level at which populations are currently managed. Based on the previous work, we expect that grayling populations in Northern Europe exhibit high genetic differentiation even at a small geographic scale (prediction 1) and possibly show just a relatively weak isolation-by-distance signal as genetic drift should be the dominant evolutionary force compared to migration. However, as Finnish grayling populations belong to a single mitochondrial lineage on one hand and show extremely low gene flow on the other, we do not expect that grayling in Finland exhibit further genetic clustering or grouping (prediction 2).
In this study, the major objective was to provide genetic information from a geographically representative set of populations to assist country-scale conservation of grayling in Finland. We analyzed grayling sampled throughout Finland in order to test the predictions based on the knowledge from previous population genetic studies, using a panel of 13 microsatellite markers. For compara-tive purposes, we analyzed grayling populations from Russia, Sweden, Norway and Germany. We also categorized and prioritized populations according to predominant evolutionary forces, providing useful information for the development of a scientifically justified nationalscale conservation strategy of grayling in Finland.

Fish samples
Grayling were sampled from 15 locations within Finland, between 1996 to 2001 (figure 1a). Fin clips were stored in 95% ethanol. DNA was extracted using a salt-based method described by Aljanabi and Martinez [12]. Information on population status, location and geographic coordinates are given in table 1.

Microsatellite data
Initially, seventeen microsatellite markers employed by Koskinen et al. [13] were re-screened using 16 individuals representing a number of populations and two loci were then excluded (BFRO9 and BFRO16) due to low levels of polymorphism. Forward primers were labeled fluorescently with FAM for BFRO4, BFRO5, BFRO7, BFRO10 and BFRO13; VIC for BFRO15, BFRO17, BFRO18 and Ogo2; NED for BFRO12, Cocl23 and Str85INRA; and PET for BFRO11, ONE2, and Str73INRA. Primer concentrations were optimized for co-amplification of 15 loci in a single multiplex PCR and the amplicons were size-measured in a single capillary electrophoresis. The 6.25 μl multiplex PCR reaction consisted of ca. 125 ng of template DNA, 1× multiplex PCR master mix (Qiagen) and 0.015 to 1.915 μM of each primer. Exact concentrations of primers are provided in the supplementary material [see Additional file 1]. Amplifications were carried out in a PTC100 thermal cycler (MJ Research) with an initial heat-activation at 95°C for 15 minutes (min) followed by 37 cycles of denaturation at 94°C for 30 seconds (sec), annealing at 55°C for 90 sec and extension at 72°C for 60 sec. The PCR was terminated after 30 min of final extension at 60°C. PCR products were diluted, denatured and then electrophoresed on an ABI Prism 3130xl genetic analyzer (Applied Biosystems/Hitachi) along with GeneScan 600 LIZ size standard (Applied Biosystems). DNA fragments were genotyped using GeneMapper 4.0 (Applied Biosystems). All genotypes were manually inspected and examples of electropherograms are shown in the supplementary material [see Additional file 2]. Besides the 15 newly genotyped Finnish populations, previously published data [13] from seven populations within Finland and six populations from Russia, Sweden, Norway and Germany were included (table 1). Three of the previously published Finnish populations (102 individuals in total) were re-genotyped in this study to enable calibration of allele sizes.
Grayling sampling locations and analyses of population structure based on 13-microsatellite loci Figure 1 Grayling sampling locations and analyses of population structure based on 13-microsatellite loci. Populations are coded as in table 1. a) A map indicating sampling locations. Three population groups are indicated with different color lines corresponding to the three clusters identified using PCA. Dot sizes are proportioned to allelic richness of each population. Black and green dots stand for indigenous and supplementary stocked populations, respectively. b) Principal Component Analysis of Finnish grayling populations based on microsatellite allele frequencies. c) Individual clustering as inferred by series of hierarchical partitioning using Structure. Each individual is represented by a thin horizontal line pooled into K-colored blocks indicating individual's membership fractioned in K clusters. Black horizontal lines separate individuals from different sampling sites.

Statistical analyses
Microsatellite diversity, Hardy-Weinberg and genotypic linkage equilibrium GenePop 3.4 [14] was initially employed to screen for deviations from Hardy-Weinberg (H-W) equilibrium and two loci (BFRO4 and ONE2) were excluded due to highly significant heterozygote deficiency (P < 0.0001) in several populations, likely indicating the presence of null alleles. Thus, the final dataset comprised 13 loci. Because some of the sampled populations can be affected by stocking with individuals of non-native origin (table 1), Structure 2.2 [15] was used to identify individuals that represented genotypes highly unlikely to be native to the sampled population. In total, 15 individuals belonging to four populations (1-6 individuals per population) were identified with highly unlikely genotypes compared to the rest of the samples and these putatively non-natives were excluded from further analysis. The final dataset consisted of 936 individuals from 25 populations.
Allele number and allelic richness were estimated using FSTAT 2.9.3.2 [16]. Observed and expected heterozygosity were measured using Microsatellite Toolkit 3.1 [17]. Deviations from Hardy-Weinberg equilibrium and genotypic linkage equilibrium were calculated using Genepop 3.4. For the combination of separate tests across loci (H-W test) or locus pairs (linkage equilibrium test), Fisher's procedure was applied [18]. Throughout the study, sequential Bonferroni corrections [19] were performed to correct for multiple testing.

Genetic differentiation and relationships between populations
Inter-population genetic divergence was calculated using the multilocus F ST estimator of Weir & Cockerham [20] with FSTAT 2.9.3.2. Population differentiation was tested as genic differentiation for all population pairs using Genepop 3.4. The genetic relationship among populations was examined by Principal Component Analysis (PCA), based on allele frequencies, using PCAGEN 1.2 [21]. Population clustering was performed with a hierarchical partitioning approach as previously employed by Vähä et al. [22], using Structure 2.2 based on the delta K method [23] calculated from 20 replicates of ln Pr(X|K) under each K. Different run lengths were used (burn-in 20 000 to 80 000 iterations and, after that, data were collected for 20 000 to 80 000 iterations) to achieve stable results.

Analysis of molecular variance
To estimate the amount of molecular variation associated with different sets of population groupings, hierarchical analysis of molecular variance (AMOVA) was performed using Arlequin 3.1 [24]. By this analysis, genetic variation residing within each alternative population clustering was measured. Three alternative scenarios were tested based on the PCA results: grouping 1 (three population groups corresponding to northern, Baltic and south-eastern populations), grouping 2 (northern vs Baltic and south-eastern groups combined) and grouping 3 (Pera and Kru were re-located into the south-eastern group. Other populations were as in grouping 1).

Mantel test
An association between the geographic distance and genetic divergence matrices was examined using the Mantel test implemented in the GenAlEx 6.1 package [25] supported by multiple regression and a correlation extension procedure [26]. The statistical significance of the parameter estimates was obtained via 9 999 permutations. Interpopulation geographic distances were directly calculated from latitude and longitude data using the great circle distance method [27]. F ST /(1-F ST ) was used as a measurement of genetic divergence in the Mantel test while the geographic distance matrix was used as ln transformed as well as raw distance [28]. For detecting IBD within the three population groups with higher statistical power, a socalled "pooled within-stratum" approach was applied to the Mantel test as suggested by Smouse, PE (personal communication). Briefly, matrices from each of three population groups were permuted using GenAlex 6.1 to generate 999 values of the sum of permuted XY (spXY). The spXY from each population group were resampled using PopTool 2.7 [29] to yield 1 000 random combinations of "pooled" spXY [sum(XY) = sumA(XY) + sumB(XY) + sumC(XY); A, B and C stand for each population group]. Later, "pooled" sum(X 2 ) and sum(Y 2 ) were calculated accordingly. One-thousand pooled R XY were subsequently obtained as . The pooled R XY were ranked and the corresponding P-value was calculated.

Population prioritization for conservation
In order to rank populations in terms of their conservation priority, Contrib 1.20 [33,34] was used to calculate both the diversity (allelic richness) and the differentiation (related to Nei's D ST and G ST ) components of each population. These parameters are relative measurements and are evaluated as the contribution of each population to total allelic richness (contribution to total allelic richness; CTR) pooled from all populations. Thus, negative values indicate that the diversity or the differentiation of a population is lower than the mean of the whole dataset. For prioritizing grayling populations within Finland, foreign populations were excluded from the analysis in order to avoid the underestimation of the differentiation component within Finnish populations. Two Finnish populations (Iso and Rau) were also excluded because of their mixed origin, from hatchery stocking of non-native fish [35], which would artificially increase their diversity component compared to indigenous gene pools.

Microsatellite diversity, Hardy-Weinberg and genotypic linkage equilibrium
The average number of alleles per locus varied from 1.6 (Kai) to 5.4 (Tor) while the average allelic richness ranged from 1.6 (Kai) to 4.6 (SweHol; However, as each of these five linkage disequilibrium occurred only in a single population and none involved the same two loci, these loci are likely not in physical linkage. Beside these five locus pairs, no deviation from genotypic linkage equilibrium was observed after the sequential Bonferroni correction (k = 78; P = 0.0015-0.9999). Therefore, for subsequent analyses, all loci were assumed to have independent segregation of alleles.

Level of differentiation and genetic relationships between populations
When all 25 populations were included in the PCA, three populations (GerEge, Norles and SweVat) were very dis- tinct compared to the rest along the first principal component axis (data not shown). This deep divergence corresponds to two mtDNA lineages described by Koskinen et al. [9]. When only 19 Finnish populations were included in the analyses, populations clustered into three separate groups: 'northern', 'Baltic' and 'south-eastern' (figure 1b). The 'northern' group was the most homogeneous and distinct from the latter two groups.  The hierarchical analysis of molecular variance revealed that the proportion of total genetic variance due to differences between groups was highest for grouping 1 (F CT = 0.191). The two alternative hierarchical groupings explained smaller proportions of variation (F CT = 0.183 and 0.160 for grouping 2 and 3, respectively).

Mantel test A significant isolation-by-distance (IBD) signal was observed
when all Finnish populations were included in the analysis (R XY = 0.558; P < 0.001; figure 2). However, non-significant IBD patterns were observed within each of the three Finnish groups (R XY = -0.008, 0.233 and 0.314; all associated P > 0.05; within northern, Baltic and south-eastern groups, respectively). In addition, pooled within-group comparisons did not result in significant IBD signal (R XY = -0.153, P > 0.05). The Mantel test using ln distance provided the same conclusion as employing the raw geographic distance.

Population prioritization and categorization for conservation
By plotting diversity and differentiation components calculated using the Contrib program, the following categories of populations were identified: 1) high diversityhigh differentiation group, 2) high diversity-low differentiation group; 3) low diversity-high differentiation group and 4) low diversity-low differentiation group (figure 4a). Arguably, these categories largely reflect the relative roles of genetic drift and gene flow affecting the grayling populations and therefore can be useful for developing genetically justified conservation strategies for populations. The first category (high diversity-high differentiation group) contains seven populations that have high levels of diversity and exhibit differentiation higher than average. The second category (high diversitylow differentiation group) contains three populations that have a higher amount of diversity than average while the level of differentiation is rather low compared to the other populations. The third category (low diversity-high differentiation group) contains five populations that exhibit a lower amount of diversity than average and at the same time shows an increased differentiation component. The fourth category (low diversity-low differentiation group) contains two populations that exhibit somewhat lower genetic diversity as well as reduced differentiation compared to the average. For comparative purpose, the Contrib results are presented in a more traditional way by combining the diversity and differentiation components ( figure 4b).
Mantel test indicating the observed isolation-by-distance sig-nal driven by among-group comparisons, but not within-group Figure 2 Mantel test indicating the observed isolation-by-distance signal driven by among-group comparisons, but not within-group. Trend line is for all comparisons. Solid and open dots are for the among-and within-group comparisons, respectively.

Unanticipated population structuring
In accordance with other population genetic studies in European grayling [3,8,11,13,36,37] we found a generally high level of differentiation between the majority of populations. However, in contrast to our expectations and previous studies using microsatellites and mitochondrial DNA, we identified unanticipated signals of regional clustering of grayling populations in Finland. Such grouping was evident using both PCA and individual multi-locus genotype based analysis and further supported by analysis of molecular variance (AMOVA). The three population groups identified roughly correspond to separate geographic areas: the northern, Baltic and south-eastern regions. Nevertheless, all these three identified groups belong to a single European grayling mitochondrial lineage [9] and thus could be considered as one evolutionarily significant unit based on criteria sensu Moritz [38]. One of the reasons why the previous studies [9,13] of grayling in Finland did not detect such clustering is probably due to the low number of populations analyzed from the region. We observed high genetic differentiation between three population groups and moderate to high differentiation within the northern, Baltic and south-eastern groups. Significant IBD signal was identified when all Finnish popu-lations were analyzed together while no significant relationship between genetic and geographic distances was found among populations within the three groups and pooled within-group comparisons. Thus, the overall IBD signal most likely derives from the structuring of Finnish populations into three groups rather than from concurrent gene flow between neighboring populations, even though the power to detect significant IBD signal was lower for within group comparisons. This is in accordance with other studies in European grayling demonstrating that inter-population dispersal is extremely limited even among neighboring populations [11,39,40].
Rather surprisingly, several different types of analyses strongly indicated that the Kitkajärvi population (Kit) groups together with the Baltic populations even though this population inhabits an inland lake that is relatively distant from the Baltic Sea (174 km) and, moreover, currently flows into the White Sea basin. However, it is known that Kitkajärvi (lake Kitka) had postglacially bifurcated outflow; the eastward channel drained into the Baltic sea via the system of Livojärvi (lake Livo) until around 8 400 years ago, and the westward channel flowed into the White Sea via Kitkajoki (river Kitka) as it still does [41]. The postglacial colonization of grayling into north-Global F ST , pR ST and R ST estimated among three Finnish groups and among five countries ern Europe has been estimated to have begun ca. 10 500 -13 000 years ago [9] and thus we can hypothesize that the present Kitkajärvi grayling population was established through the historical waterway from the Baltic Sea more than 8 400 years ago. An alternative explanation is that there has been stocking activities of Baltic graying into Kitkajärvi, although there is no documented evidence supporting this scenario.

Genetic diversity
Among salmonid fishes, European grayling is one of the least genetically variable species, exhibiting low diversity even across highly variable markers such as microsatellites. In Atlantic salmon (Salmo salar), brown trout (Salmo trutta), whitefish (Coregonus hoyi) and rainbow trout (Oncorhynchus mykiss), the mean expected heterozygosity measured at microsatellite loci commonly varies from 0.63 to 0.76 (e.g. [42][43][44][45]) while the mean expected heterozygosity in this study was as low as 0.41 and ranged from 0.20 to 0.63. Hence, either the effective population size, level of migration or microsatellite mutation rate in European grayling is smaller than in other salmonid populations. When looking at specific populations, NorLes from Norway had extremely low allelic richness (A r = 1.81) and low heterozygosity (H e = 0.24), which is in accordance with the demographic history of the population as it is known that NorLes was most likely established by a small number of individuals in 1880 [10]. However, we found that the natural population (Kai) in Inarijoki flowing to lake Inari showed even lower levels of diversity (A r = 1.56, H e = 0.20). As most of the study populations are from the area that was completely covered by an ice sheet during the last glacial period, it is possible that the low variability of grayling in northern Europe reflects postglacial colonization and subsequent founder effects [46,47]. However, the southernmost population from central Germany included in our study showed relatively low levels of variability (A r = 3.23, H e = 0.41) and other studies concentrating on the southern distribution range of European grayling have shown low diversity levels [2,7,8,37,48]. This strongly indicates that genetic drift and limited migration have a strong impact on diversity and population structure of grayling. However, as there is no estimator of the microsatellite mutation rate available for grayling one cannot exclude the possibility that the mutation rate in European grayling is lower than in other salmonid fishes, although it seems unlikely.

The role of mutations contributing to population differentiation
Despite the clear separation of the three population groups in Finland there was no evidence that mutations have significantly contributed to the genetic differentiation among these groups. Thus, even though we observed relatively high differentiation between the three population groups, the allele permutation test indicated that patterns of differentiation are mostly driven by genetic drift and low migration rather than accumulation of new mutations. However, the allele size permutation test strongly suggested that mutations significantly contributed to population differentiation on a broader geographic scale.

Population conservation categorization based on the role of different evolutionary forces
The "central dogma of conservation genetics" is that genetic variability is beneficial and therefore it is often assumed that increasing genetic variability enhances population survival [49]. Another important aspect of prioritizing populations for conservation is their genetic uniqueness measured at the molecular genetic or phenotypic level (e.g. evolutionary distinctiveness [50]). Hence for population conservation, it is relevant to evaluate both of these parameters but considerable controversy exists about the relative weights of diversity and uniqueness components (e.g [51]). Based on the strong clustering of Finnish populations, it is recommendable that national conservation efforts include populations from all three genetically distinct groups. In this study, we used an alternative approach that is still based on examining populations according to diversity and differentiation but instead of simply summing the diversity and uniqueness components based on rather subjective weighting, populations were categorized based on the predominant evolutionary forces acting on them. As a result, this approach is expected to be more objective compared to ranking solely on the summation of diversity and differentiation components based on arbitrary weights without losing crucial information about the relative roles of underlying evolutionary forces. For example, when simply combining diversity and differentiation components sensu Petit et al. [33], it is possible to get similar prioritization ranks for two populations with markedly different demographic histories -a population with low diversity-high differentiation can have a similar priority rank as another population with high diversity-low differentiation. It is clear that the first population could be severely affected by e.g. inbreeding depression while the second population might contain high genetic variability necessary for population survival and long-term evolution.
When examining diversity and differentiation components in grayling populations, we identified four categories where the relative importance of different evolutionary forces (namely drift and migration) varies. These categories are high diversity-high differentiation (category 1), high diversity-low differentiation (category 2), low diversity-high differentiation (category 3) and low diversity-low differentiation (category 4). From the conservation perspective, the populations falling into the high diversity-high differentiation category (e.g. Kit, Kem and Pur; from the Baltic, northern and south-eastern groups, respectively) may be less affected by genetic drift and migration as they represent large isolated populations. This high diversity-high differentiation category has the highest likelihood of containing unique genetic material. Notably, all three Baltic members fall into this category and given that grayling populations in the Baltic Sea are likely adapted to the brackish environment, they are hence highly relevant for conservation [52]. Populations that belong to the high diversity-low differentiation category (Tor, KasLat and KasPor) represent relatively large populations where the effect of drift is small, while migration may have some effect on the gene pool. We propose that populations in these two categories represent the top priority for conservation. Populations that belong to the low diversityhigh differentiation category (e.g. Vuo, Lie and Kai) may represent small populations strongly affected by genetic drift. From a conservation genetic perspective, these populations have a lower probability of being able to evolve and survive in the future. Importantly, this does not necessarily mean that these populations cannot adapt to local conditions. For example, it has been shown that even in small grayling populations natural selection can have a predominant role over random genetic drift in affecting the fitness of phenotypic traits [10]. Populations that belong to the low diversity-low differentiation category (Ten and Juu) might be classified as a low conservation priority, as low differentiation reflects their lack of 'uniqueness' compared to other populations and low variability can additionally hinder adaptation in the future. However, it is also important to emphasize that these two populations have both the diversity and differentiation components quite close to the average across all populations. Taken together, the high levels of genetic differentiation observed among grayling populations here, as well as in other studies [3,4,8,11,37] clearly suggests that generally single populations should be the principal unit for conservation and management and thus population intermixing should be avoided. We also recommend using the proposed categorization strategy, taking into account the relative role of different evolutionary forces, as a basis for the conservation of grayling in Finland as well as of other species.

Conclusion
This research provides genetic information on European grayling in its northern distribution range in order to assist country-scale conservation. We found unanticipated population structure of grayling, clustering into three groups largely corresponding to the northern, Baltic and south-eastern geographic areas of Finland and we recommend that these three groups should be used as the starting point for developing a national grayling conservation strategy. However, as the observed clusters extend beyond the borders of Finland, international co-operation for broader scale conservation management is warranted. We also found high levels of genetic differentiation among the groups and moderate to high differentiation within the groups. Such strong divergence indicates that single grayling populations should generally be recognized as separate management units. We also developed an alternative prioritization strategy in the conservation perspective by categorizing populations based on the evaluation of the relative role of various evolutionary forces affecting the indigenous gene pool. We envision that the proposed population categorization approach could be useful for a diverse range of organisms.