Analysing diet of small herbivores: the efficiency of DNA barcoding coupled with high-throughput pyrosequencing for deciphering the composition of complex plant mixtures

Background In order to understand the role of herbivores in trophic webs, it is essential to know what they feed on. Diet analysis is, however, a challenge in many small herbivores with a secretive life style. In this paper, we compare novel (high-throughput pyrosequencing) DNA barcoding technology for plant mixture with traditional microhistological method. We analysed stomach contents of two ecologically important subarctic vole species, Microtus oeconomus and Myodes rufocanus, with the two methods. DNA barcoding was conducted using the P6-loop of the chloroplast trnL (UAA) intron. Results Although the identified plant taxa in the diets matched relatively well between the two methods, DNA barcoding gave by far taxonomically more detailed results. Quantitative comparison of results was difficult, mainly due to low taxonomic resolution of the microhistological method, which also in part explained discrepancies between the methods. Other discrepancies were likely due to biases mostly in the microhistological analysis. Conclusion We conclude that DNA barcoding opens up for new possibilities in the study of plant-herbivore interactions, giving a detailed and relatively unbiased picture of food utilization of herbivores.


Background
Small mammalian herbivores, such as voles and lemmings, play a key role in many boreal ecosystems where they function as the main link between the vegetation and predators [1,2]. Their dramatic population fluctuations and consequent ecosystem implications [3,4] have been subject to intensive studies for decades [5][6][7]. Trophic interactions have been emphasized as a main determinant of these fluctuations [8]. While predator-prey interactions have been much emphasized [6], the interactions between voles and plants have gained new attention lately (see e.g. [9,10]), as new interaction pathways have been identified [11][12][13]. However, plant-herbivore interactions cannot be understood without knowing, among other aspects, herbivore diets in natural settings. Yet most studies on plantvole interactions have used only limited and indirect diet data [14][15][16], such as recording signs of rodent feeding on vegetation [17,18] or cafeteria experiments in artificial settings [19,20]. The most direct information comes from microhistological analysis of stomach content [21][22][23][24]. However, this is a very time-consuming method giving unspecific and context-dependent results [25], due to the small particle size as well as the composition of encountered plant material in rodent stomachs.
DNA barcoding, i.e. taxon identification using a standardized DNA region [26], is now increasingly used in ecological studies (see Valentini et al. [27] for a review), including diet analysis. DNA barcoding is particularly useful in diet determination when the food is not identifiable by morphological criteria, such as in the case of liquid feeders [28], or when the diet cannot be deduced by observing the feeding behavior, (e.g. diatom-feeding krill [29]). Diet analyses based on DNA markers concern so far mostly carnivorous animals (e.g. [30]). A universal approach for herbivorous diet analysis has been developed recently [31]. This approach combines the new highly parallel sequencing systems [32], with the amplification of the P6 loop of the chloroplast trnL (UAA) intron [33].
In this paper, we present the first results on species-level food selection of two ecologically important vole species (Microtus oeconomus and Myodes (formerly Clethrionomys) rufocanus) in sub-arctic ecosystems based on DNA barcoding of ingested plants. We compare the results with the traditional microhistological method, which, for small rodents, has previously not been assessed against other methods.

Vole trapping
The voles were sampled in low arctic tundra at Varanger peninsula (Finnmark, Norway: 70° 20' N, 30° 00' E) in July and September 2007. The sampling was conducted by snap trapping according to standard methods [34], after which the voles were dissected and their stomachs were stored in 70% ethanol for approximately 1/2 year prior to analysis. A sample of 48 individuals was chosen to be analyzed with both methods, based on the available space in DNA analysis batch and on stratifying according to species and season. The stomachs were dissected and the con-tents homogenized. Two subsamples were taken from each stomach for DNA analysis. They were stored in paper filter bags and submerged in silicagel to dry them. Remaining stomach contents were stored in 70% ethanol until the microhistological analysis.

Microhistological analysis
Microhistological recognition of food particles in vole stomachs was based on leaf epiderm morphology. The shape of epidermal cells is taxon-specific, and several additional features, such as trichomes, hairs and characteristics of cells surrounding stomata, can be used for species identification [25,35]. A photography-guide (Soininen & Nielsen, unpublished data) of epidermis of all vascular plants recorded at the sampling area (Ravolainen et al. unpublished data) was prepared using a method modified from Carrière [25]. Dry plant samples were soaked overnight, scraped to reveal the epidermis and bleached with household bleach to clear the tissue of chlorophyll. Hard leaves were first boiled in table vinegar to soften the mesophyll tissue. Microphotographs (40×) were taken of abaxial and adaxial leaf side and leaf edge of all plants. Additional photographs were taken of stems and seeds of certain species of special interest. In addition to the specifically prepared epidermis photographs, photographs and microscopy slides of arctic plant epidermis, received from C. Hübner and E. Bjørkevoll, were used to aid identification.
A method modified from Hansson [36] was used for microscopy analysis. After taking subsamples for DNA analysis, stomach contents were filtered to > 0.16 mm and > 0.56 mm fractions. These were bleached with approximately 2 mL of household bleach for approximately 1/2 hour. One sample per fraction was analyzed, mounting a droplet of it on a microscopy slide. The frequency of occurrence of food items was recorded by light microscope (40×), by counting 25 hits on identifiable material along a measure grid. When approximately 95% of fragments were unidentifiable, the slide was discarded. For four individuals, no slide with adequate amount of identifiable material could be made. Therefore, they were discarded from the microhistological analysis. . The mixture was denatured at 95°C for 10 min, followed by 35 cycles of 30 s at 95°C, and 30 s at 55°C; the elongation was removed in order to reduce the +A artifact [37,38]. Samples were amplified with using the universal primers g and h described by Taberlet et al. [33]. The addition of a specific tag on the 5' end allowed an assignement of sequences to the respective samples. After amplification all samples were pooled for the pyrosequencing run. Each sample was recognized by a specific five bases long tag with at least two differences between tags for a better assignation of sequences to samples during bioinformatic segregation of sequences.
PCR products were purified using the MinElute PCR purification kit (Qiagen GmbH, Hilden, Germany). DNA quantification was carried out using the BioAnalyzer (Agilent Technologies, Inc., Santa Clara, CA). Taking these concentrations into account PCR products were pooled leading to equal amounts per sample. Large-scale pyrosequencing was carried out using GS FLX sequencer (Roche, Basel, Switzerland) following the manufacturer's instructions.
The first step of analyzing the output of the pyrosequencing consisted of sorting the different sequences according to the tag present on the 5' end of the primers. Thus, for each sample (each stomach content), a new file was generated, containing all the sequences having the relevant tag. Then, these sequences were analyzed to determine the diet. To limit the influence of sequence errors [39], only sequences that were present more than three times were considered in the subsequent analyses.
The sequences were compared to a database of 842 species representing all widespread and/or ecologically important taxa of the arctic flora (GenBank accession number GQ244527 to GQ245667) (Sønstebø et al: A minimalist DNA barcoding approach for reconstructing past Arctic vegetation and climate, submitted). It was developed by sequencing the whole chloroplast trnL (UAA) intron of these species using primer pair designed by Taberlet et al. [40], and following the protocol described and evaluated in Taberlet et al. [33]. In the database a total of 33,5% of species and 77,1% of genera could be identified by the P6 loop. All families were unambiguously identified (Sønstebø et al: A minimalist DNA barcoding approach for reconstructing past Arctic vegetation and climate, submitted). When sequences were not fully identified using the arctic plant database, they were compared with sequences retrieved from GenBank, using ecoPCR [33]; http:// www.grenoble.prabi.fr/trac/ecoPCR. The taxon was assigned to each sequence in a dataset by similarity assessment with a reference database using FASTA [41] algorithm. A FASTA alignment was retrieved if there was at least 98% of identity between query and database sequences and 100% of query coverage. If two or more taxa could be assigned with the same score for a given sequence, we assigned this sequence to the higher taxonomic level that included both taxa. This method resulted in some sequenced taxa being assigned to the rank of genus or family.
Chimeric sequences are a well know problem when amplifying a mixture of homologous genes, and it is impossible to avoid their formation [42]. But if two unrelated taxa compose the chimeric sequence the resulting sequence is not taken into account because for taxon identification the FASTA alignments is retrieved only if the sequence have 100% of query coverage with the reference sequence. If two related taxa compose the chimeric sequence, this sequence is assigned to the higher taxonomic level that included both taxa (e.g. genus, family, order, etc.). Figure 1 Taxonomic resolution of vole diets. Taxonomic resolution of the diets of Microtus oeconomus and Myodes rufocanus according to two methods of diet analysis (microhistology and DNA barcoding). Proportions are taken from total number of hits on identifiable material for microhistology and total number of identified sequences for DNA barcoding. Category "other" includes items not assigned to any taxonomic level by microhistology, but to morphologic groups (e.g. seed, root).  .08 (± .12)

Data analysis
Results of the two methods were compared in two ways. First, the taxonomic resolution obtained by the two methods was examined by comparing the relative frequencies of hits (microscopy) and sequences (DNA) at different taxonomic levels. For this comparison, the four individuals with no microscopy data were excluded also from DNA dataset.
Second, the relative frequencies of food items in diets were compared between the methods. Taxonomic adjustments were first made to make the results from the two methods comparable. Nomenclature from species to family level follows Lid & Lid [43] and Elven [44], and for higher taxonomy Judd et al. [45]. Several genera are represented only by one species in the study area and sequences assigned to these were therefore attributed to the respective species (e.g. Arctous alpinus, Bistorta vivipara, Chamaepericlymenum suecicum, Rumex acetosa). Similar adjustments were done at other taxonomic levels (e.g. Salicaceae to Salix and Ranunculales to Ranunculaceae). Then, proportions of food items estimated at the level of individual voles were averaged (mean ± standard deviation) across species and sampling season for both methods.

Results
Using DNA barcoding, 75% of all sequences were identified at least to the genus level (Figure 1), whereas with the microhistological method, less than 20% of the identified fragments could be specified at this level. Consequently, more plant species and genera were identified in vole diets with the DNA barcoding than microhistology (Table 1 and 2). For the M. oeconomus diet, DNA barcoding identified 13 species and 9 genera compared with 9 and 5 with microscopy. Corresponding numbers for M. rufocanus were 17 and 8 (DNA barcoding) compared with 11 and 7 (microscopy).
Differences in taxonomic resolution between the methods made diet comparison complicated. For instance using microhistology, a large number of fragments could only be assigned to high taxonomic levels such as eudicotyledons and monocotyledons. Still, most of the species or genera that according to DNA barcoding were found to be quantitatively important in the diet, were also found to be so in the microhistological data, after differences in taxonomic accuracy were taken into account (Table 1 and 2). For example, for M. rufocanus both methods suggested that Vaccinium spp. are important food resources during both seasons, but especially in autumn (Table 2). Similarly, there was an agreement between the methods with respect to the importance of the family Polygonaceae (including Rumex acetosa and other species within this family) for both vole species in summer and that family Caryophyllaceae (with the species Stellaria nemorum and Cerastium spp.) is prevalent in M. oeconomus diet in the same season (Table 1 and 2). On the contrary, the substantial amounts of graminoids and horsetails (Equisetum spp.) found in the diet of M. oeconomus with microscopy were not evident from DNA analysis. Moreover, the prevalence of non-plant food items (fungi) and various plant structures (bark, root, seed) identified with microhistology could not be identified by DNA barcoding.
Both methods showed large variation in diets between individuals; the mean proportion of food items was often smaller than its standard deviation (Table 1 and 2).

Discussion
Although there was an agreement between the methods with respect to the importance of the main plant groups, DNA barcoding gave by far a taxonomically more detailed picture of the diet of the two vole species than did the microhistological analysis. Indeed, much of the discrepancy in the results derived from the two methods could be explained by differences in taxonomic resolution. Other Proportion of identified food items in the diet of Microtus oeconomus using microhistological (mic) and DNA barcoding (dna) methods, summer and autumn, (mean ± standard deviation). At each taxonomic level, also the proportions from lower levels are included, except for Euphyllophytes where only the items which could not be assigned to a lower taxa are included. NA = Food item not identifiable with the method in question.  discrepancies must be attributed to particular biases in the methods. Because a substantial fraction of the stomach content is left unidentified by the microhistological diet analysis, the proportion of the different plant groups in the diets will be biased towards the most easily identified groups by this method. One example might be the fairly large discrepancy in the abundance of monocotyledons between the methods. Another bias is due to varying epiderm/mesophyll ratios between taxa. Finally, a general problem is introduced by a great deal of subjectivity in the microhistological identification processes so that experience of the observer will have a major impact on the outcome of the analysis.
However, quantitative comparison between the methods should be done cautiously. Quantitative interpretation of DNA barcoding results is not straightforward; partly because it is based on chloroplast DNA. Thus, species from which chloroplast-rich tissues are eaten, are likely to be overrepresented compared to species mostly represented by e.g. seed or root in the diet [31,46]. In addition, even if the trnL approach using g-h primers is universal for angiosperms and gymnosperms [33], the horsetails, mosses and fungi identified with microhistological method are mostly omitted by the DNA analysis. Together with easy microhistological identification this explains discrepancy between methods in the amount of Equisetum. Although DNA barcoding using the trnL approach also has its shortcomings, we conclude that it is superior to traditional methods for establishing diet analysis based on stomach content in small rodents. This novel method is less prone to the biases and context-dependencies that hamper microhistological analyses and yields a vastly improved taxonomic resolution. Information about diets at the level of individual plant species opens for testing more precise hypotheses on interactions between voles and plants. Furthermore, the new DNA-based technology makes it possible to study vole-plant interaction by nondestructive sampling of faeces in the natural habitats of voles. In faeces, DNA is highly degraded and only small fragments remain [47]. The short sequences obtained using the trnL approach (10-143 base pairs) allow applying it on faecal samples [31]. In this case, the first step will be to identify the rodent species using a mitochondrial DNA marker, and the second step will be the diet analysis. The analysis can even be more specific, by performing individual and sex identification using microsatellite polymorphism and Y-chromosome amplification [48,49]. Thus, diet comparisons among species, individuals and sexes can be carried out, even without observing the animals (e.g. [31]).
For increasing the resolution in genera where sequences do not vary among the species (e.g. Carex and Salix), the trnL approach using the g-h primers can be complemented by one or several additional systems, specially designed Proportion of identified food items in the diet of Myodes rufocanus using microhistological (mic) and DNA-barcoding (dna) methods, summer and autumn, (mean ± standard deviation). At each taxonomic level, also the proportions from lower levels are included, except for Euphyllophytes where only the items which could not be assigned to a lower taxa are included. NA = Food item not identifiable with the method in question. for amplifying a short and variable region in these genera (as suggested in Valentini et al. [31]). Another strategy would be to amplify the whole trnL intron using universal primers such as c and d designed by Taberlet et al. [40] (254-767 bp), which strongly increases the resolution [33]. However, even if this second strategy can be easily implemented for the analysis of stomach content (using the Titanium upgrade of the 454 FLX), it will not be suitable for faeces analysis because of the shortness of the degraded plant DNA fragments.