How to describe a cryptic species? Practical challenges of molecular taxonomy

Background Molecular methods of species delineation are developing rapidly and are widely considered fast and efficient means to discover species and to address the 'taxonomic impediment' in times of biodiversity crisis. So far, however, this form of DNA taxonomy frequently remains incomplete, lacking the final step of formal species description, thus enhancing rather than reducing impediments in taxonomy. DNA sequence information contributes valuable diagnostic characters and, at least for cryptic species, could even serve as the backbone of a taxonomic description. To this end, solutions for a number of practical problems must be found, including a way in which molecular data can be presented to fulfill the formal requirements every description must meet. Multi-gene barcoding and a combined molecular species delineation approach recently revealed a radiation of at least 12 more or less cryptic species in the marine meiofaunal slug genus Pontohedyle (Acochlidia, Heterobranchia). All identified candidate species are well delimited by a consensus across different methods based on mitochondrial and nuclear markers.

Results The detailed microanatomical redescription of Pontohedyle verrucosa provided in the present paper does not reveal reliable characters for diagnosing even the two major clades identified within the genus on molecular data. We thus characterize three previously valid Pontohedyle species based on four genetic markers (mitochondrial cytochrome c oxidase subunit I, 16S rRNA, nuclear 28S and 18S rRNA) and formally describe nine cryptic new species (P. kepii sp. nov., P. joni sp. nov., P. neridae sp. nov., P. liliae sp. nov., P. wiggi sp. nov., P. wenzli sp. nov., P. peteryalli sp. nov., P. martynovi sp. nov., P. yurihookeri sp. nov.) applying molecular taxonomy, based on diagnostic nucleotides in DNA sequences of the four markers. Due to the minute size of the animals, entire specimens were used for extraction; consequently, the holotype is a voucher of extracted DNA ('DNA-type'). We used the Characteristic Attribute Organization System (CAOS) to determine diagnostic nucleotides, explore the dependence on input data and data processing, and aim for maximum traceability in our diagnoses for future research. Challenges, pitfalls and necessary considerations for applied DNA taxonomy are critically evaluated.

Conclusions To describe cryptic species, traditional lines of evidence in taxonomy need to be modified. DNA sequence information, for example, could even serve as the backbone of a taxonomic description. The present contribution demonstrates that few adaptations are needed to integrate novel diagnoses based on molecular data into traditional taxonomy. The taxonomic community is encouraged to join the discussion and develop a quality standard for molecular taxonomy, ideally in the form of an automated final step in molecular species delineation procedures.



Background
Species boundaries are frequently hard to delimit based on morphology alone, a fact which has called for integrative taxonomy, including additional sources of information such as molecular data, biogeography, behavior and ecology [1,2]. Founding a species description on a variety of characters from different, independent datasets is generally regarded as best practice [3]. When species are considered as independently evolving lineages [4], different lines of evidence (e.g., from morphology, molecules, ecology or distribution) are additive to each other; no line is necessarily exclusive, nor must different lines obligatorily be used in combination [3,5]. Taxonomists are urged to discriminate characters according to their quality and suitability for species delineation, rather than to just add more and more data [5]. The specifics of the taxon in question will guide the way to the respective set(s) of characters that will provide the best backbone for the diagnosis. In cases of pseudo-cryptic species (among which morphological differences can be detected upon re-examining lineages separated e.g. on molecular data) or of fully cryptic species (that morphology fails to delimit), the traditional lines of evidence have to be modified by using, e.g., molecular information to break out of the 'taxonomic circle' [6,7].
Cryptic species are a common phenomenon throughout the metazoan taxa, and can be found in all sorts of habitats and biogeographic zones [8][9][10]. Groups characterized by poor dispersal abilities (e.g., most meiofaunal organisms or animals inhabiting special regions where direct developers predominate, such as Antarctica), are especially prone to cryptic speciation [11,12]. Uncovering these cryptic species is fundamental for the understanding of evolutionary processes, historical biogeography, ecology, and also to conservation approaches, as distribution ranges that are smaller than initially assumed mean a higher risk of local extinction [8,10]. The lack of morphological characters to distinguish cryptic species should not lead to considerable parts of biological diversity remaining unaddressed.
The utility of DNA barcoding and molecular species delineation approaches to uncover cryptic lineages has been demonstrated by numerous studies (e.g., [11,13-19]). Unfortunately, inconsistencies in terminology associated with the interface between sequence data and taxonomy have led to confusion and various criticisms [6,20]. First of all, one needs to distinguish between species identification via molecular data (DNA barcoding in its strict sense) and species discovery [6,21,22]. While species identification is primarily a technical application, species delimitation requires means of molecular species delineation that are either distance, tree or character based [6,23]. Under ideal circumstances, sufficient material is collected from different populations across the entire distribution area of a putative group of cryptic species. Using population genetics, the distribution of haplotypes can be analyzed and different, genetically isolated lineages can be detected [24]. Population genetic approaches are, however, not always feasible with animals that are rare or hard to collect, which might actually be a common phenomenon across faunas of most marine ecosystems [25-28]. Derived from barcoding initiatives, threshold based species delimitation became the method of choice, aiming for the detection of a 'barcoding gap' between intra- and interspecific variation [29-31]. This approach has been criticized, however, due to its sensitivity to the degree of sampling, the general arbitrariness of fixed or relative thresholds, and frequent overlap between intra- and interspecific variation [6,32,33]. In the recently developed Automatic Barcode Gap Discovery (ABGD) [34], progress has been made in avoiding the dependence on a priori defined species hypotheses in threshold based approaches, but reservations remain concerning the concept of a barcoding gap [25]. Several independent delineation tools exist, e.g. using haplotype networks based on statistical parsimony [35], maximum likelihood approaches applying the General Mixed Yule-Coalescent model [36,37], or Bayesian species delineation [38,39]. Empirical research currently compares the powers of these different tools on real datasets [25,32,40]. The effect of the inclusion of singletons in analyses is considered as most problematic [25]. At the present stage of knowledge, independent approaches allowing cross-validation between the different methods of molecular species delineation and other sources of information (morphology, biogeography, behavioral traits) seem the most reliable way of delimiting cryptic species [25].
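The idea of a 'barcoding gap' between intra- and interspecific variation can be illustrated with a small sketch. It is not ABGD or any of the tools cited above; it only computes uncorrected p-distances on a toy alignment and compares the largest within-species distance to the smallest between-species distance. The function names, the toy sequences and the decision to skip columns containing gaps or ambiguity codes are assumptions made for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected p-distance: proportion of differing sites.

    Columns with a gap, 'N' or '?' in either sequence are skipped
    (an assumption of this sketch, not a rule taken from the paper).
    """
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N?" or y in "-N?":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def barcoding_gap(aligned, species_of):
    """Return (max intraspecific, min interspecific, gap) for an alignment.

    aligned    : dict mapping specimen id -> aligned sequence
    species_of : dict mapping specimen id -> species hypothesis
    A value is None when no pair of that kind exists, e.g. when every
    species is represented by a single specimen (singleton).
    """
    intra, inter = [], []
    for (id1, s1), (id2, s2) in combinations(aligned.items(), 2):
        d = p_distance(s1, s2)
        (intra if species_of[id1] == species_of[id2] else inter).append(d)
    max_intra = max(intra) if intra else None
    min_inter = min(inter) if inter else None
    if max_intra is None or min_inter is None:
        return max_intra, min_inter, None
    return max_intra, min_inter, min_inter - max_intra


# Hypothetical toy data: two species hypotheses, two specimens each.
barcode_alignment = {
    "a1": "ACGTACGTAC",
    "a2": "ACGTACGTAT",
    "b1": "ACGAACCTGC",
    "b2": "ACGAACCTGC",
}
barcode_species = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

max_intra, min_inter, gap = barcoding_gap(barcode_alignment, barcode_species)
```

A positive gap only shows that the two distance ranges do not overlap in this particular sample. A species represented by a singleton contributes no intraspecific distance at all, which is one way to see why sampling density and singletons affect threshold based approaches.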
The second inconsistency in terminology concerns usages of 'DNA taxonomy'. Originally, DNA taxonomy was proposed to revolutionize taxonomy by generally founding descriptions on sequence data and overthrowing the Linnaean binominal system [41]. Alternatively, it was suggested as a concept of clustering DNA barcodes into MOTUs [42]. Since then, however, it has been applied as an umbrella term for barcoding, molecular species delineation, and including molecular data in species descriptions (see e.g., [13,14,20,36,43,44]). In a strict sense, one cannot speak of molecular taxonomy if the process of species discovery is not followed by formal species description (i.e., a taxonomic process has two steps: species discovery (delimitation), and providing the discovered species with formal diagnoses and names). Taxonomy remains incomplete if species hypotheses new to science are flagged as merely putative by provisional rather than fully established scientific names. For practical reasons and journal requirements, most studies on molecular species delineation postpone formal descriptions of the discovered species (e.g., [13,14,25,33,36,40,43-46]), and then rarely carry them out later. DNA barcoding and molecular species delineation are promoted as fast and efficient ways to face the 'taxonomic impediment', i.e. the shortage of time and personnel capable of working through the undescribed species richness in the middle of a biodiversity crisis [7,47,48]. However, keeping discovered entities formally unrecognized does not solve the taxonomic challenges but adds to them by creating parallel worlds populated by numbered MOTUs, OTUs or candidate species. In many cases the discovered taxa remain inapplicable to future research, thus denying the scientific community this taxonomic service, e.g. for species inventories or conservation attempts. Without formal description or a testable hypothesis, i.e. a differential diagnosis, 1) the discovered species might not be properly documented or vouchered by specimens deposited at Natural History Museums; and 2) their reproducibility can be hindered and confusion caused by different numbering systems. A cautionary example of the proliferation of informal epithets circulating as 'nomina nuda' (i.e. species which lack formal diagnoses and deposited vouchers) in the literature is given by the 'ten species in one' Astraptes fulgerator complex [31,49]. Thus, we consider it all but indispensable for DNA taxonomy to take the final step and formalize the successfully discovered molecular lineages.
The transition from species delimitation to species description is the major task to achieve. Nearly ten years after the original proposal of DNA taxonomy [41], revolutionizing traditional taxonomy has found little acceptance in the taxonomic community, as most authors agree that there is no need for overthrowing the Linnaean System. Consequently, the challenge is to integrate DNA sequence information in the current taxonomic system. Several studies have attempted to include DNA data in taxonomic descriptions, albeit in various non-standardized ways; see the review by Goldstein and DeSalle ([21]; box 3): In some cases, DNA sequence information is simply added to the taxonomic description (in the form of GenBank numbers or pure sequence data), without evaluating and reporting diagnostic features [21]. Others rely on sequence information for the description, either reporting results of species delineation approaches, e.g. raw distance measurements or model based assumptions, or extracting diagnostic characters from their molecular datasets. There still is a consensus that species descriptions should be character based [50] (but see the Discussion below for attempts at model based taxonomy), and that tree or distance based methods fail to extract diagnostic characters [6]. Character based approaches, like the Characteristic Attribute Organization System (CAOS), are suggested as an efficient and reliable way of defining species barcodes based on discrete nucleotide substitution, and these established diagnostics from DNA sequences can be used directly for species descriptions as molecular taxonomic characters [51,52]. Yet, the application of CAOS or similar tools requires an evaluation of how to select and present molecular synapomorphies and how to formalize procedures to create a 'best practice' linking DNA sequence information to existing taxonomy [20].
In the present study, we formally describe the candidate species of minute mesopsammic sea slugs in the genus Pontohedyle Golikov & Starobogatov (Acochlidia, Heterobranchia) discovered by Jörger et al. [25]. This cryptic radiation was uncovered in a global sampling approach with multi-gene and multiple-method molecular species delineation [25]. The initially identified 12 MOTUs, nine of which do not correspond to described species, are considered as species [following 4] resulting from a conservative minimum consensus approach applying different methods of molecular species delineation [25]. The authors demonstrated that traditional taxonomic characters (external morphology, spicules and radula features) are insufficient to delineate cryptic Pontohedyle species [25]. To evaluate the power of more advanced histological and microanatomical data, we first provide a detailed computer based 3D redescription of the anatomy of Pontohedyle verrucosa (Challis, 1970) and additional histological semithin sections of P. kepii sp. nov. In the absence of reliable diagnostic characters from morphology and microanatomy, we then rely on DNA sequence data as the backbone for our species descriptions. For the three previously valid Pontohedyle species we extract diagnostic characters using the Characteristic Attribute Organization System (CAOS) based on four standard markers (mitochondrial cytochrome c oxidase subunit I, 16S rRNA, and nuclear 18S rRNA and 28S rRNA). In addition, nine new species are formally described on molecular characteristics and evidence from other data sources. Various approaches to the practical challenges for molecular-driven taxonomy, such as critical consideration of the quality of the alignment, detection of diagnostic nucleotides and their presentation aiming for maximum traceability in future studies, are tested and critically evaluated.

Evaluation of putative morphological characters
The diversity within Pontohedyle revealed by molecular data cannot be distinguished externally: the body shows the typical subdivision into the anterior head-foot complex and the posterior visceral hump. Bodies are whitish-translucent, digestive glands are frequently bright green to olive green. Rhinophores are lacking, labial tentacles are bow-shaped and tapered towards the ends (see Figures 1 and 2). Monaxone rodlet-like spicules distributed all over the body and frequently found in an accumulation between the oral tentacles are characteristic for Pontohedyle. These spicules can be confirmed for P. wenzli sp. nov., for P. yurihookeri sp. nov., P. milaschewitchii and P. brasilensis (Rankin, 1979), and, in contrast to the original description [53], also in P. verrucosa. No spicules could be detected in P. peteryalli sp. nov. from Ghana. The absence of spicules is insufficient, however, to delineate microhedylid species, since their presence can vary under environmental influence [54].
The radulae of eight species were investigated using SEM (see Figures 1 and 2). Radulae of P. neridae sp. nov., P. martynovi sp. nov. and P. yurihookeri sp. nov. were not recovered whole from molecular preparations, and thus were unavailable for further examination [25]. The radula of P. wiggi sp. nov. could only be observed under the light microscope, but could not be successfully transferred to a SEM stub. All radulae are hook-shaped with a longer dorsal and a shorter ventral ramus, typical for Acochlidia. Radula formulas are 38-53 × 1.1.1, lateral plates are curved rectangular, and the rhachidian tooth is triangular and bears a central cusp and typically three smaller lateral denticles. Most radulae bear one pointed denticle centrally on the anterior margin of each lateral plate and a corresponding notch on the posterior side. Only the radulae of P. kepii sp. nov. and P. verrucosa can be clearly distinguished from the others by the absence of this denticle and the more curved lateral teeth (see Figure 1A and [25], Figure 1D,E). Uniquely, P. verrucosa bears five lateral denticles next to the central cusp of the rhachidian tooth [25]; in P. liliae sp. nov. a tiny fourth denticle borders the central cusp (see * in Figure 1C).
[Figure caption fragment, panel C: P. brasilensis (living animal from WA-3 (Belize), radula from WA-10 (Brazil), see [25]). cc = central cusp of rhachidian tooth, llp = left lateral plate, rlp = right lateral plate, rt = rhachidian tooth.]
Previous phylogenetic analyses [25] recovered a deep split into two Pontohedyle clades: the P. milaschewitchii clade and the P. verrucosa clade. This is supported by novel analyses in a larger phylogenetic framework and additionally including a second nuclear marker (18S rRNA) (own unpublished data). Since no detailed histological account exists of any representative from the large P. verrucosa clade, we redescribe P. verrucosa (based on ZSM Mol-20071833, 20071837 and 20100548), supplementing the original description with detailed information on the previously undescribed nervous and reproductive systems. The central nervous system (cns) of P. verrucosa lies prepharyngeal and shows an epiathroid condition. It consists of paired rhinophoral, cerebral, pleural, pedal and buccal ganglia and three unpaired ganglia on the visceral nerve cord, tentatively identified as left parietal ganglion, median fused visceral and subintestinal ganglion and right fused parietal and supraintestinal ganglion (Figure 3A). Neither an osphradial ganglion nor gastro-oesophageal ganglia were detected. Anterior and lateral to the cerebral ganglia are masses of accessory ganglia. Due to the retracted condition of all examined specimens, tissues are highly condensed and no separation into different complexes of accessory ganglia could be detected. Attached to the pedal ganglia are large monostatolith statocysts. Oval, unpigmented globules are located in an antero-ventral position of the cerebral ganglia, interpreted as the remainder of eyes (see Figure 3B). P. verrucosa is a gonochoristic species. The three sectioned specimens include two males and one female.
The male reproductive system comprises gonad, ampulla, postampullary sperm duct, prostatic vas deferens, ciliated (non-glandular) vas deferens, genital opening and a small ciliated 'subepidermal' duct leading to a second genital opening anterodorsally of the mouth opening (Figure 3C). The sac-like gonad is relatively small and bears few irregularly distributed spermatozoa. The large tubular ampulla emerges from the gonad without a detectable preampullary sperm duct; it is loosely filled with irregularly distributed spermatozoa (Figure 3D). The ampulla leads into a short, narrow ciliated postampullary duct widening into the large tubular prostatic vas deferens (staining pink in methylene-blue sections, Figure 3D). Close to the male genital opening, the duct loses its glandular appearance and bears cilia. The primary genital opening is located on the right side of the body at the visceral hump and close to the transition with the head-foot complex. Next to the genital opening, the anterior vas deferens splits off as an inconspicuous subepithelial ciliated duct that leads anteriorly on the right side of the head-foot complex. It terminates in a second genital opening between the oral tentacles anterodorsally from the mouth opening. The female reproductive system consists of gonad, nidamental glands and oviduct (Figure 3E) and a genital opening located on the right side, in the posterior part of the visceral hump (not visible in Figure 3E, due to the retracted state of the individual). The gonad is sac-like and bears one large vitellogenic egg (see Figure 3F) and several developing oocytes. Three histologically differentiated tube-like nidamental glands could be detected with a supposedly continuous lumen and with an epithelium bearing cilia.
From proximal to distal these glands are identified as albumen gland (cells filled with dark blue stained granules), membrane gland (pinkish, vacuolated secretory cells) and winding mucus gland (secretory cells stained pink-purple). In its proximal part the distal oviduct shows a histology similar to that of the mucus gland, but then loses its glandular appearance. The epithelium of the distal oviduct bears long, densely arranged cilia.
Additional notable histological features are numerous dark-blue-stained epidermal gland cells (see e.g., arrowhead in Figure 3D) and refracting fusiform structures in the digestive gland (see Figure 3B). An additional series of histological semi-thin sections of Pontohedyle kepii sp. nov. was sectioned and brief investigation revealed no variation in the major organization of the organ systems in Pontohedyle as described herein and in previous studies [55,56].

Remarks on the presentation of molecular characters
Diagnostic characters for each species of Pontohedyle were extracted using the 'Characteristic Attribute Organization System' (CAOS) [51,57,58]. We define diagnostic characters as single pure characters, i.e. unique character states that occur in all investigated specimens of a single Pontohedyle species but in none of the specimens of its congeners. As additional information, single heterogeneous pure characters (i.e., different character states present within the species but absent from the congeners) are reported (for further details on the chosen approach see the Material and methods and Discussion sections). Positions refer to the position of the diagnostic nucleotide within the respective alignment (see Additional files 1, 2, 3, 4, 5 and 6). Where alignment positions differ from those in the deposited sequences, positions within the sequence of the holotype or in another reference sequence are also provided.

Phylogenetic analyses of the genus Pontohedyle [25] confirmed earlier assumptions that the three genera established by Rankin [62] (see above) represent junior synonyms of Pontohedyle.
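The definitions of single pure and single heterogeneous pure characters given in these remarks can be made concrete with a short sketch. This is not the CAOS software and does not reproduce its tree-guided procedure; it only scans the columns of a toy alignment for states that satisfy the two definitions. The function name, the toy sequences and the conservative choice to ignore any column containing a gap or ambiguity code are assumptions for illustration.

```python
def pure_characters(alignment, species_of, target):
    """Find diagnostic columns for one species in an alignment.

    alignment  : dict mapping specimen id -> aligned sequence
    species_of : dict mapping specimen id -> species name
    target     : species to diagnose

    Returns (single_pure, heterogeneous_pure):
      single_pure        {column: state}   one state in all target specimens,
                                           absent from all congeners
      heterogeneous_pure {column: states}  several states within the target,
                                           none of them present in congeners
    Columns are 1-based alignment positions.
    """
    ingroup = [s.upper() for i, s in alignment.items() if species_of[i] == target]
    others = [s.upper() for i, s in alignment.items() if species_of[i] != target]
    if not ingroup or not others:
        raise ValueError("need at least one target and one congener sequence")
    length = len(ingroup[0])
    if any(len(s) != length for s in ingroup + others):
        raise ValueError("sequences are not aligned to equal length")

    nucleotides = set("ACGT")
    single_pure, heterogeneous_pure = {}, {}
    for col in range(length):
        in_states = {s[col] for s in ingroup}
        out_states = {s[col] for s in others}
        # Assumption of this sketch: skip columns with gaps or ambiguity codes.
        if not (in_states | out_states) <= nucleotides:
            continue
        if in_states & out_states:
            continue  # a state is shared with a congener: not diagnostic
        if len(in_states) == 1:
            single_pure[col + 1] = next(iter(in_states))
        else:
            heterogeneous_pure[col + 1] = sorted(in_states)
    return single_pure, heterogeneous_pure


# Hypothetical toy alignment with three species hypotheses.
diag_alignment = {
    "sp1_a": "ACGTAC",
    "sp1_b": "ACGTAT",
    "sp2_a": "ATGAAG",
    "sp2_b": "ATGCAG",
    "sp3_a": "AGGGAG",
}
diag_species = {
    "sp1_a": "sp1", "sp1_b": "sp1",
    "sp2_a": "sp2", "sp2_b": "sp2",
    "sp3_a": "sp3",
}

single, heterogeneous = pure_characters(diag_alignment, diag_species, "sp1")
```

In this toy case, columns 2 and 4 carry single pure characters for sp1 (C and T), while column 6 is heterogeneous pure (C or T within sp1, G in all congeners). Adding a further congener sequence can remove characters from either set, which illustrates the dependence of such diagnoses on the input data.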

Taxonomy of Pontohedyle
Morphological characteristics of genus Pontohedyle: Minute (0.7-6 mm) marine interstitial microhedylacean acochlid. Body divided into anterior head-foot complex and posterior visceral hump. In case of disturbance head-foot complex can be entirely retracted into visceral hump. Body whitish translucent. Foot with short rounded free posterior end. Head bears one pair of bow-shaped dorso-ventrally flattened oral tentacles. Rhinophores lacking. Monaxone, calcareous spicules irregularly distributed over head-foot complex and visceral hump. Radula hook-shaped band (lateral view), formula 1-1-1, lateral plates curved or with one pointed denticle, rhachidian tooth triangular with one central cusp and 2-4 lateral cusps on each side. Nervous system with accessory ganglia at cerebral nerves anterior to the cns. Sexes separate, male reproductive system aphallic, sperm transferred via spermatophores.
Molecular diagnosis of the genus Pontohedyle, based on the sequences analyzed herein (Table 1) and on sequences from a set of outgroups including all acochlidian genera for which data are available [63,64]. Positions refer to the alignments in Additional files 1 and 2, and to the reference sequences of P. milaschewitchii, ZSM Mol 20080054 (GenBank HQ168435 and JF828043) from Croatia, Mediterranean Sea (confirmed to be conspecific with material collected at the type locality in molecular species delineation approaches [25]). Molecular diagnosis is given in Table 2.
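Because diagnostic positions are reported both for the alignment and for a reference sequence, the two numbering schemes have to be translated into each other. The following minimal sketch shows one way to do this for a gapped reference row; the function name and the toy sequence are hypothetical, and it assumes that the deposited reference sequence begins at the first aligned residue (no trimmed leading bases).

```python
def alignment_to_reference_position(aligned_reference, column):
    """Translate a 1-based alignment column into the 1-based position
    within the ungapped reference sequence.

    Returns None when the reference has a gap ('-') in that column,
    because the column then has no counterpart in the reference.
    """
    if not 1 <= column <= len(aligned_reference):
        raise ValueError("column outside the alignment")
    if aligned_reference[column - 1] == "-":
        return None
    # Subtract the gaps that precede the column in the reference row.
    return column - aligned_reference[: column - 1].count("-")


# Hypothetical aligned reference row; ungapped it reads "ACGTA".
toy_reference = "AC--GT-A"
position = alignment_to_reference_position(toy_reference, 5)
```

Reporting both numbers lets a later reader check a diagnostic nucleotide directly in the deposited sequence, even if a new alignment with additional taxa shifts the column numbering.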
Pontohedyle milaschewitchii
Type material: To our knowledge no type material remains. Nevertheless, we refrain from designating a neotype, as there is no taxonomic need, i.e. no possibility of confusion in the species' area of distribution.
Molecular diagnosis is given in Table 3.
Type material: According to Challis [53], the type material is in the Natural History Museum, London, and the Dominion Museum, Wellington, New Zealand. Our own investigations revealed that the type material of Challis never arrived at the Natural History Museum, London, and when visiting the Museum of New Zealand Te Papa Tongarewa (former Dominion Museum), we were unable to locate any of her types. Thus, at the current stage of knowledge, type material might only remain in her private collection. We refrain from designating a neotype because we were unable to recollect at the type locality (see below).
Distribution and habitat: Reported from Indonesia and the Solomon Islands [25,53]; marine, interstitial, intertidal, coarse sand.
Sequenced material: On a collecting trip to the Solomon Islands, we were unfortunately unable to recollect at the type locality (Maraunibina Island, East Guadalcanal), but successfully recollected in Komimbo Bay (West Guadalcanal), a locality for which the describing author noted similar ecological parameters and recorded several meiofaunal slug species occurring at both sites [53,70]. Additional material was collected at different collecting sites in Indonesia (see Figure 4).
Type material: No type material remaining in Marcus' collection (pers. comm. Luiz Simone). We nevertheless refrain from designating a neotype, since we lack material from the type locality.
Distribution and habitat: Caribbean Sea to southern Brazil [25,72]; marine, interstitial, intertidal to subtidal, coarse sand and shell gravel.
Sequenced material: Despite a series of recollecting attempts at the type locality and its vicinity in the past five years, we were unable to recollect any specimen of Pontohedyle in Southern Brazil. Our reference sequence refers to the southern-most specimen of a Western Atlantic Pontohedyle clade (see Figure 4), herein assigned to P. brasilensis (see Discussion). Additional material was collected at different collecting sites in the Caribbean (see Figure 4 for collecting sites and Figure 2C for photograph of a living specimen and SEM of radula).
Distribution and habitat: Currently known from type locality only; marine, interstitial, subtidal 5-6 m, coarse coral sand.
Molecular diagnosis is given in Table 6.
Positions of the diagnostic characters refer to the sequence of the holotype. Diagnostic characters in nuclear 18S rRNA were determined based on GenBank KC984290, in 28S rRNA based on GenBank JQ410967, in mitochondrial 16S rRNA based on GenBank JQ410966, and in mitochondrial COI based on GenBank JQ410912.
ZooBank registration: urn:lsid:zoobank.org:act:73AAC79D-5A43-40E4-B0D6-0329CAAA2AA0
Etymology: Named after Dr. Jon Norenburg to honor his efforts and enthusiasm for meiofaunal research and to thank him for his support for uncovering the largely unknown Caribbean meiofauna.
Distribution and habitat: Currently known from the Caribbean Sea (St. Vincent and Belize), type locality subtidal, 2-3 m depth, sand patches between seagrass, coarse sand. Additional material also subtidal, 14-15 m, sand patches between corals, coarse sand.
Distribution and habitat: Known from type locality only; subtidal 3-4 m, fine to medium coral sand.
Description: Morphologically with diagnostic characters of the genus Pontohedyle. Radula characteristics unknown.
Molecular diagnosis is given in Table 8.
The sequences retrieved from the holotype serve as reference sequences. Diagnostic characters in nuclear 28S rRNA were determined based on AM C. 476062.001 (GenBank JQ410986), in mitochondrial 16S rRNA based on AM C. 476062.001 (GenBank JQ410985), and in mitochondrial COI based on AM C. 476062.001 (GenBank JQ410922).
Distribution and habitat: Known from type locality only; subtidal 20 m, relatively fine coral sand.
Description: Morphologically with diagnostic characters of the genus Pontohedyle. Radula formula 45 × 1-1-1, rhachidian tooth with three (to four) lateral cusps, lateral plate with one pointed denticle (Figure 1C). Eyes clearly visible externally, monaxone spicules in accumulation between oral tentacles and irregularly distributed all over the body.
Distribution and habitat: Known from the type locality only; marine, interstitial between sand grains, relatively fine coral sand, subtidal 6-7 m depth, sandy slope among patches of corals.
Molecular diagnosis is given in Table 10. Note: Most species delineation approaches suggested ZSM 20100592, and some also AM C. 476051.001, as an independently evolving lineage [25]. Due to the conservative consensus approach, these specimens were included in the described species. Future analyses might show that their separation as independent species is warranted.
Distribution and habitat: Known from Indonesia, with putative distribution across the Indo-Pacific and Central Pacific; marine, subtidal (3-22 m), interstitial, coarse sand and shell grit.
Description: Morphologically with diagnostic characters of the genus Pontohedyle, eyes clearly visible externally (see Figure 2B, picture of living holotype). Radula 43 × 1-1-1, rhachidian tooth with three lateral cusps, lateral plate with pointed denticle (as in P. milaschewitchii).
Additional material: six specimens in 75% ethanol collected at Nzema Cape, Ghana, Africa, Gulf of Guinea, East Atlantic Ocean; conspecificity still needs to be confirmed via barcoding.
Distribution and habitat: Currently only known from the Ghana West Coast around MiaMia, marine, interstitial, subtidal 2-3 m, fine sand.
Molecular diagnosis is given in Table 12.
The sequences retrieved from the holotype (ZSM Mol 20071133) serve as reference sequences. Diagnostic characters in nuclear 18S rRNA were determined based on GenBank KC984298, in mitochondrial 16S rRNA based on GenBank JQ410930 and in mitochondrial COI based on GenBank JQ410899.
Types: Holotype: DNA voucher (extracted DNA in buffer) AM C. 476054.001 (DNA bank accession number at ZSM AB34402062). Paratype: one specimen fixed in 5% formalin embedded in epoxy resin (AM C. 476053.001), collected together with the holotype.
Distribution and habitat: Known from type locality only; marine, interstitial, subtidal 18-20 m, coarse sand, shell grit and rubble.
Description: Morphologically with diagnostic characters of the genus Pontohedyle. Radula characteristics unknown.
Molecular diagnosis is given in Table 13.
The sequences retrieved from the holotype (AM C. 476054.001) serve as reference sequences. Diagnostic characters in nuclear 28S rRNA were determined based on GenBank JQ410984, and in mitochondrial 16S rRNA based on GenBank JQ410983.
Description: Morphologically with diagnostic characters of the genus Pontohedyle. Radula characteristics unknown.
Molecular diagnosis is given in Table 14.
The sequences retrieved from the holotype (ZSM Mol 20080565) serve as reference sequences. Diagnostic characters in nuclear 18S rRNA were determined based on GenBank KC984299, and in nuclear 28S rRNA based on GenBank JQ410987.

Cryptic species challenging traditional taxonomy
Largely due to the development of molecular methods, research on cryptic species has increased over the past two decades [8,9], demonstrating their commonness across metazoan taxa, although whether they are distributed randomly or non-randomly among taxa and biomes remains to be investigated [9,10]. Several recent studies have underlined that there is a large deficit in alpha taxonomy and that the diversity of marine invertebrates, and especially of meiofaunal animals, might be much higher than expected, partly because of high proportions of cryptic species (e.g., [11,13,14,25,73-75]). Rather than global, amphi-Oceanic, circum-tropical or otherwise wide ranging, the distribution areas of the biological meiofaunal species involved may be regional and their ecology more specialized [12,25,76]. At an initial stage of molecular and ecological exploration, cryptic meiofauna is potentially threatened by global change and cannot effectively be included in conservation approaches.
In traditional taxonomy, most species descriptions are based on morphological and anatomical characters. Morphological species delineation, however, can fail to    [1-3]. Previous authors have argued that 'integrative taxonomy' does not necessarily call for a maximum of different character sets, but rather requires the taxonomist to select character sets adequate for species delineation in the particular group of taxa [3,5]. Thus, there should be no obligation in taxonomic practice to stick to morphology as the primary source [77], and there are no official requirements by the International Code of Zoological Nomenclature to do so [78,79]. The results of Jörger et al. [25] indicate that the members of Pontohedyle slug lineages are so extremely uniform that conventional taxonomic characters (i.e. external morphology, radula characteristics, spicules) fail to delineate species. A series of studies have demonstrated the generally high potential of advanced 3D-microanatomy for character mining in Acochlidia (e.g., [80-82]). However, the exclusively mesopsammic microhedylacean Acochlidia form an exception, as they show reduced complexity in all organ systems and uniformity that leaves few anatomical features for species delineation even on higher taxonomic levels [83]. Based on previous histological comparisons, Jörger et al. [56] were unable to find any morphological characters justifying discrimination between the closely related western Atlantic P. brasilensis and its Mediterranean congener, P. milaschewitchii. Here, we provided a detailed histological (re-)description using 3D-reconstruction based on serial semi-thin sections of P. verrucosa, to evaluate whether advanced 3D-microanatomy provides distinguishing morphological characters for the two generally accepted species, P. milaschewitchii and P. verrucosa, as representatives of the two major Pontohedyle clades (see [25], Figure 1).
Indeed, we revealed some putative distinguishing features in the reproductive and digestive systems (see Table 15). However, the encountered (minor) morphological differences are problematic to evaluate in the absence of data on ontogenetic and intraspecific variation, and on potential overlap with interspecific differences. For example, slight differences in the reproductive system could be due to different ontogenetic stages; therefore, they cannot presently be used to discriminate species. Comparatively investigated serial semi-thin sections of Pontohedyle kepii sp. nov. also confirmed the similarity in all major organ systems reported previously [55,56]. We thus conclude that in Pontohedyle even advanced microanatomy is inefficient or even inadequate for species diagnoses. Molecular character sets currently offer the only chance for unambiguous discrimination between the different evolutionary lineages. Proponents of morphology based alpha taxonomy [84] might argue that we have not attempted a fully integrative approach since we have not performed 3D-microanatomy on all proposed new species, including enough material for intra-specific comparisons, ultrastructural data on, e.g., cilia, sperm morphology or specific gland types, to reveal whether these forms indeed represent cryptic species. However, in light of the biodiversity crisis and the corresponding challenges to taxonomy, we consider it ineffective to dedicate several years of a taxonomist's life to the search for morphological characters when little can be expected, while molecular characters enable straightforward species delineation. This is not a plea to speed up description processes at the expense of accuracy and quality, or by allowing ignorance of morphology, but for a change in taxonomic practice to give molecular characters similar weight as morphological ones, in cases in which this is more informative or practical.
Still debated is how the traditional Linnaean System needs to be adapted to incorporate different character sets, first and foremost the growing amount of molecular data. Probably the most radical way ignores the character-based requirements of the International Code of Zoological Nomenclature [78,79] and proposes to base descriptions of new species directly on support  [85,86]. Aside from the paradigm shift this would bring, far away from long-standing taxonomic practice, opponents criticize that unambiguous allocation of newly collected material is impossible in the absence of definitions and descriptors and requires repetition of the species delineation approach applied [50]. As a method of species delineation, coalescent based approaches are objective and grounded on evolutionary history and population genetics [86,87]; thus it is indeed tempting to use results derived from molecular species delineation approaches directly as species descriptions ('model-based species descriptions' [87]). This would clearly facilitate descriptions, and thus reduce the taxonomic impediment and the risk of an endless number of discovered but undescribed candidate species. Every species description should aim for differentiation from previously described species; therefore, diagnostic characters are usually derived from comparisons to other, closely related species. Nevertheless, the species description itself has to be self-explanatory and should not rely on comparative measurements which are only valid in comparison to a special set of other species used for a certain analysis, i.e. on a complex construct that may not be reproducible when new data are added. In contrast to Fujita & Leaché [87], we believe that each species, i.e. separately evolving lineage [4], will present, in the current snap-shot of evolutionary processes, fixed diagnostic characters of some sort (e.g., from morphology, DNA sequence information, behavior, karyology…), and we consider it the task of modern taxonomy to detect the most reliable and efficient set of characters on which to found species descriptions. The Characteristic Attribute Organization System (CAOS) [51,57,58] is a character based method proposed for uniting species discovery and description [88]. As an approach to species delineation, we consider it inferior to coalescent based approaches (e.g., GMYC and BP&P); CAOS successfully determines putative diagnostic nucleotides, but is not predictive, i.e. lacks objective criteria with which to delimit a threshold number of distinguishing nucleotides that would indicate a species boundary. One has to distinguish between diagnosability of entities and the delimitation of species. Diagnostic characters of whatever sort can be found for all levels in the hierarchical classification, but there is no objective criterion for determining a number of characters needed to characterize a (new) species, e.g. versus a population. Nevertheless, for the purpose of species description, we think that character based approaches like CAOS are highly valuable and should complement molecular species delineation procedures, thus enabling the transition from species discovery to description.

Requirements of molecular taxonomy
While calls for replacing the Linnaean system by a DNA sequence based one [41] have trailed away, we still lack a common procedure on how to include molecular data into the Linnaean system [21]. Like any other source of data, molecular data are not explicitly treated by the International Code of Zoological Nomenclature; there are no provisions dictating the choice of characters [78,79]. Currently, molecular data are included in species descriptions in various mutually inconsistent ways [21]. If DNA sequence data are only used as an addition to, e.g., morphology based species descriptions or molecular species delineation approaches to confirm pre-identified entities, the addition is straightforward and requires no specific considerations. But if molecular sequence information is to be used as the partial or even sole content of a species description, a discussion of the corresponding best practice is needed.

Type material for species based on molecular data
Previous authors highlighted the need for voucher material in molecular studies [89]. Ideally, DNA is extracted from (a subsample of) a name-bearing type specimen (holotype, syntype, lectotype or neotype); if no such specimen is available for molecular studies, an attempt should be made to collect fresh material at the type locality. If parts of larger animals belonging to putative new species are used for DNA extraction, DNA and remaining specimen can both become part of the type material under nomenclatural rules. However, where the members of a putatively new species, e.g. of meiofauna, are so small that molecular extraction from only part of an individual is impossible, taxonomists may be confronted with the critical decision to either have DNA without a morphological type specimen or a type without DNA. In taxonomically unproblematic groups one can add new material or use paratypes for DNA (or other) analyses, relying on specimens to be conspecific if they were collected from 'the same population', i.e. from a place (and time) close enough to the type locality to assume gene flow. But what if, as has been shown for Pontohedyle slugs [25], there is a possibility of cryptic species occurring sympatrically and at the same time? Would it be better (A) to sacrifice a (single available) type specimen to obtain molecular data for species delineation or (B) to save the type and use a secondary specimen, taking the risk that the latter might not be conspecific with the former? In a group like our Pontohedyle slugs in which DNA sequence data are much more promising for species delineation than morphological approaches, and considering the wealth of potential DNA sequence characters, we prefer to sacrifice even single specimens to DNA extraction. In the absence of a term referring to vouchers exclusively consisting of extracted DNA, we term this type material 'DNA types'.
However, prior to this, researchers should attempt an optimization of microscopical documentation (for details see [90]) and recovery of hard parts (e.g. radulae) from the spin columns used for extraction [91]. In the case of DNA aliquots serving as type material, natural history collections are urged to create long term DNA storage facilities [41,42] like the DNA bank network (http://www.dnabank-network.org/), and should apply the same caution and requirements (i.e. documentation of collection details) as for any morphological type.

Risk of two parallel taxonomies?
Old type material often does not allow molecular analyses [84,92], and searching for fresh material at a type locality can be unsuccessful. Future technical advances are likely to enable DNA acquisition from some old type material, as there has been considerable progress in dealing with degraded DNA [93]. Nevertheless, there is a potential risk that two parallel taxonomic systems could develop, and that the one based on molecular characters could duplicate, under separate names, some taxa already established on morphological grounds [77]. Similar concerns have arisen previously when the taxonomy of certain taxa was based on a character set other than morphology (e.g. cytotaxonomy based on data from chromosomes) and the investigation of one character set hindered the exploration of the other. It clearly remains the duty of taxonomists to carefully check type material of closely related taxa before describing new species [77]. To keep molecule driven taxonomy 'workable' [94] and connected to traditional morphology based taxonomy, authors should include a brief morphological diagnosis of the (cryptic) species [77], even in the absence of species-diagnostic characters, in order to make the species recognizable as belonging to a certain group of (cryptic) species.

Trouble with names
Any specimen identified from molecular data only can belong to a previously established species or to one new to science. If unambiguous identification with a single existing species name is possible then, of course, the latter should be used. In our cases in Pontohedyle, we call those Indo-Pacific specimens collected near the type locality of P. verrucosa (Challis, 1970) on the Solomon Islands by this single available name for Indo-Pacific Pontohedyle. Concerning Atlantic Pontohedyle, the name P. brasilensis (Rankin, 1979), proposed for Brazilian specimens, was treated as a junior synonym of the older name, P. milaschewitchii. Since we have shown that P. milaschewitchii refers to Mediterranean and Black Sea specimens only [25], we resurrected the name P. brasilensis for Western Atlantic Pontohedyle, and now apply it to the only one of two cryptic species that has been collected from Brazil. In doing so we accept the risk resulting from the fact that these specimens were collected at some distance from the type locality of P. brasilensis (see Figure 4), as the latter has not yielded any Pontohedyle specimens for more than the last 50 years, despite considerable and repeated collecting efforts, including our own. These assignments of previously established species names left at least nine additional, clearly separate Pontohedyle species for which available names did not exist. In cases of microscopic animals such as Pontohedyle, molecular taxonomy thus may benefit from morphology based taxonomy having missed them in the past.

Species descriptions based on singletons
Species descriptions based on singleton specimens cannot reflect intraspecific variation, and Dayrat [1] even proposed a guideline to restrict species descriptions to well-sampled taxa. However, there is no objective way to determine any sample size at which intraspecific variation would be covered sufficiently. Moreover, excluding taxa described from singletons would lead to considerably lower, and effectively false, estimates of the scientifically known biodiversity [5,26-28]. The present study on Pontohedyle includes five species descriptions based on DNA sequence information from one individual only. Usually, this is done when such a singleton presents a combination of characters so discrete that it is considered highly unlikely to fall within the variational range of another species [28]. In a complex molecular species delineation approach Jörger et al. [25] recognized our five singletons as independently evolving lineages. Approximations with molecular clock analyses estimate the diversification of these species from their respective sister groups to have occurred 54-83 mya (own unpublished data), which indicates significant timespans of genetic isolation. In light of our general revision of the genus Pontohedyle, we consider it less productive to keep these entities on the formally unrecognized level of candidate species than to run the risk that our species hypotheses may have to be modified due to future additional material. Nevertheless, we are well aware of the fact that taxon sampling and data acquisition (i.e. incomplete molecular data sets) are not yet ideal for some of our newly described species (e.g., P. martynovi sp. nov., P. yurihookeri sp. nov.).

What is a diagnostic character in molecular taxonomy?
In character based taxonomy, descriptions of new taxa are, or should be, based on diagnostic differences from previously known taxa. In a phenetic framework (key systematics), similarity based distinction relies on sufficient sampling and detectable degrees of difference, whereas phylogenetic taxonomy additionally presumes knowledge of character homologies and sister group relationships. In an ideal phylogenetic framework diagnoses are based on apomorphic (i.e. derived) versus homologous but plesiomorphic (ancestral) states of a given character. In molecular taxonomy, the detection of homologies and apomorphic conditions among the four character states (bases) is handicapped by the high chance of convergent multiple transformations causing homoplasy. Reconstruction of ancestral sequences to support homology and differentiate between apomorphic and plesiomorphic character states for each node is possible [95]. However, unfortunately, robust phylogenetic hypotheses with strong support values for all sister group relationships are the exception rather than the rule. Since the evaluation of a state as apomorphic highly depends on the topology, and reconstruction of ancestral nucleotides is constrained by sampling coverage, we suggest more conservative approaches for cases of unclear phylogenetic relationships, as in our study. We use diagnostic nucleotides as unique character attributes (which may be apomorphic or plesiomorphic or convergent) within a certain entity, i.e. a monophylum with strong support values. This is clearly a trade off between the number and phylogenetic significance of diagnostic characters and the degree of dependence of these characters on a certain topology, as with increasing size and diversity of the selected entity, the likelihood of homoplasy also rises [96]. 
To enhance the stability of our molecular taxonomic characters we chose to determine diagnostic characters of each Pontohedyle species in relation to all its congeners, rather than just to the respective sister taxon as is the default in CAOS. Equal character states in non-Pontohedyle outgroups are left unconsidered, however, due to the larger evolutionary distances and the correspondingly increased risk of homoplasies. It will be one of the major challenges for molecule driven taxonomy to select the appropriate monophylum in which all included taxa are evaluated against each other. Rach et al. [88] addressed homoplasy within the selected ingroup by applying an 80% rule to so-called single private characters (see below). Pontohedyle species recognized here offered enough single pure diagnostic bases to avoid using single private characters and some further, more equivocal attributes provided by CAOS.
The Characteristic Attribute Organization System (CAOS) [51,57,58] can be used to identify diagnostic nucleotides for pre-defined taxonomic units [51]. The program offers discrimination between four types of 'character attributes' (CAs): simple (single nucleotide position) vs. compound (set of character states) and pure vs. private [51]. Pure CAs are nucleotides present in all members of a clade and absent from members of other clades; private CAs are only present in some members of the clade, but absent from others [51]. We consider only single pure CAs as eligible for diagnostic characters in DNA taxonomy, i.e. as supporting new species proposals. In our diagnoses of the new Pontohedyle species we emphasize those single pure CAs, which in protein coding genes code for a different amino acid. The probability of single pure CAs referring to fixed genetic differences increases exponentially with their number [88]. In our dataset, all Pontohedyle species have between 12 and 36 single pure CAs on independently evolving markers, which supports their treatment as genetically isolated lineages. Additionally, the CAOS program distinguishes between homogeneous pure CAs (shared by all members of the taxon under study, and not present in the outgroups) and heterogeneous pure CAs (with two or three different characters present in the taxon but absent from the outgroups). The latter characters can be treated as diagnostic, but are problematic as they may refer to convergently evolved character states. Therefore, we report them as additional information. In contrast, compound CAs can be unique for certain species, but they may have evolved from several independent mutation events. 
Consequently, compound CAs as an entity have low probabilities of homology; in analogy to morphoanatomical key systematics, these compound CAs can serve for re-identification of well-sampled species, but they are not diagnostic characters in a phylogenetic sense and thus should be avoided in DNA taxonomy.
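The distinction between homogeneous and heterogeneous single pure CAs can be illustrated with a minimal Python sketch. It is not the CAOS implementation and does not reproduce its tree-based workflow; it only applies the definitions given above to a toy alignment. The function name `single_pure_cas`, the sequence identifiers and the species labels are hypothetical, and skipping columns that contain gaps or ambiguity codes is an assumption of this sketch.

```python
def single_pure_cas(alignment, species_of, focal):
    """Return {column: (states, kind)} for single pure CAs of `focal`.

    alignment  : dict of sequence id -> aligned sequence (all of equal length)
    species_of : dict of sequence id -> species label
    focal      : species to be diagnosed against all other species present
    Columns are reported 1-based.
    """
    ingroup = [seq for sid, seq in alignment.items() if species_of[sid] == focal]
    others = [seq for sid, seq in alignment.items() if species_of[sid] != focal]
    if not ingroup or not others:
        raise ValueError("need sequences for the focal species and for others")

    result = {}
    for col in range(len(ingroup[0])):
        in_states = {seq[col].upper() for seq in ingroup}
        out_states = {seq[col].upper() for seq in others}
        # Assumption of this sketch: ignore columns with gaps or ambiguity codes.
        if not (in_states | out_states) <= set("ACGT"):
            continue
        # Pure: no state found in the focal species occurs in any other species.
        if in_states & out_states:
            continue
        kind = "homogeneous" if len(in_states) == 1 else "heterogeneous"
        result[col + 1] = ("".join(sorted(in_states)), kind)
    return result


# Toy data (hypothetical, not taken from the Pontohedyle dataset).
toy_alignment = {
    "a1": "ACGTAC",
    "a2": "ACGTAT",
    "b1": "ATGTGG",
    "b2": "ATGTGG",
    "c1": "ATGAGG",
}
toy_species = {"a1": "sp_A", "a2": "sp_A", "b1": "sp_B", "b2": "sp_B", "c1": "sp_C"}

cas_a = single_pure_cas(toy_alignment, toy_species, "sp_A")
cas_c = single_pure_cas(toy_alignment, toy_species, "sp_C")
```

In the toy data, sp_A has homogeneous pure CAs at columns 2 and 5 and a heterogeneous one at column 6 (C or T, neither of which occurs elsewhere), while the singleton sp_C has a single homogeneous pure CA at column 4.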
CAOS identifies discrete nucleotide substitutions at every node of a given tree and has been complemented to find diagnostic bases in a 'phylogenetic-free context' [97], referring to the difference between CAs and true apomorphies. This notion can be misleading, however, as the results provided by CAOS are one hundred percent topology dependent in only comparing sister pairs at each node. To overcome this topology dependence, we ran several analyses placing each species at the root of the ingroup, which we defined as the most inclusive secure and taxonomically relevant monophylum, in our case the genus Pontohedyle (see Material and Methods). This procedure of a manually iterative, exhaustive intrageneric comparison of base conditions makes the recognized single pure CAs less numerous but more rigorous than with CAOS default parameters, i.e. by decreasing the chances of homoplasy and increasing the chances of single pure CAs representing apomorphies in our wider taxon comparison.
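Why a genus-wide comparison yields fewer but more rigorous diagnostic positions than a sister-pair comparison can be shown with a small Python sketch. It replaces the manual CAOS runs described above with a direct set comparison, which is an assumption of this example rather than the procedure used in the study; the function names and the toy sequences are hypothetical.

```python
def pure_columns(focal_seqs, other_seqs):
    """1-based columns where no state of the focal species occurs in `other_seqs`."""
    columns = set()
    for col, focal_states in enumerate(zip(*focal_seqs), start=1):
        other_states = {seq[col - 1] for seq in other_seqs}
        if not set(focal_states) & other_states:
            columns.add(col)
    return columns


def sister_vs_genus_wide(seqs_by_species, focal, sister):
    """Compare diagnostic columns against the sister only and against all congeners."""
    focal_seqs = seqs_by_species[focal]
    sister_only = pure_columns(focal_seqs, seqs_by_species[sister])
    congeners = [seq for species, seqs in seqs_by_species.items()
                 if species != focal for seq in seqs]
    genus_wide = pure_columns(focal_seqs, congeners)
    return sister_only, genus_wide


# Toy data (hypothetical): sp_B is treated as the sister of sp_A.
toy = {
    "sp_A": ["ACGTA", "ACGTA"],
    "sp_B": ["ATGCA"],
    "sp_C": ["GCGAA"],
}
sister_only, genus_wide = sister_vs_genus_wide(toy, "sp_A", "sp_B")
```

Here column 2 distinguishes sp_A from its sister but is shared with sp_C, so it drops out of the genus-wide set; only column 4 remains. The genus-wide set is always a subset of the sister-only set, because adding comparison sequences can only remove candidate positions.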

Towards a 'best practice' in molecular taxonomy
Considering stability and traceability in future research, the presentation of the identified diagnostic nucleotides is not trivial. Some recent studies just reported the number of differing nucleotides without specifying the position and character state (e.g., [98]). This is equivalent to a morphological species description that would merely refer to, e.g., 'diagnostic differences in the reproductive system' without offering any descriptive details. Other studies present part of an alignment without identifying positions, and underline putative diagnostic nucleotides (e.g., [99]) without explaining what determined these bases as diagnostic. This practice leaves it to future researchers to identify the proposed bases, which is highly time consuming and error-prone, especially when the original alignment is not deposited in a public database. Reporting the positions within the alignment is a step towards reproducibility and traceability of molecular diagnostic characters (e.g., [94,100-102]), but when new material is added that was generated with different primers or includes insertions or deletions, the critical positions are still difficult to trace. Yassin et al. [103] included the positions within a reference genome, which probably provides the greatest clarity for future research. Unfortunately, for non-model taxa closely related reference genomes which allow for unambiguous alignment of even fast evolving markers are usually unavailable. We thus suggest the following procedure for reporting positions in an alignment. (1) Clearly report primers and alignment programs, and clarify what determined position 1 (e.g., first base after the primer sequence); (2) deposit alignments in public databases or as additional material accompanying the publication's online edition.
To make a diagnostic position in a sequence traceable independently from a specific alignment, we additionally recommend to (3) report the corresponding position in a deposited reference sequence (ideally generated from type material). Technically, the necessary values are easily retrievable from sequence editing programs such as Geneious [104]. To evaluate intraspecific variation, sequences from all specimens assigned to a certain species were included in our analyses of diagnostic characters. In new species descriptions the provided reference sequences should be generated from type material. In cases where the molecular data retrieved from the type are incomplete, however, we see little problem in additionally including data from other specimens, provided that conspecificity is justified (e.g. via other molecular markers). If future research rejects conspecificity, the respective characters can be easily excluded from the original description. We refrain from adopting the term 'genetype', however, as a label for sequence data from type material [105], as it might be easily misunderstood: sequences themselves are not types but amplified copies of certain parts of type material.
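Step (3) amounts to converting an alignment column into a position in the ungapped reference sequence. The following Python sketch shows one way to do this outside a sequence editor; it is an illustration, not the tool used in this study. It assumes that the aligned reference row contains the complete deposited sequence (so that counting starts at its first base), and the function names and toy alignment row are hypothetical.

```python
def alignment_to_reference_position(aligned_reference, column, gap_chars="-"):
    """Map a 1-based alignment column to the 1-based position in the
    ungapped reference sequence; return None if the reference has a gap there."""
    if not 1 <= column <= len(aligned_reference):
        raise IndexError("column outside the alignment")
    if aligned_reference[column - 1] in gap_chars:
        return None
    # Count the non-gap characters up to and including the requested column.
    return sum(1 for base in aligned_reference[:column] if base not in gap_chars)


def report_positions(aligned_reference, diagnostic_columns):
    """List (alignment column, reference position, reference base) per column."""
    return [
        (col,
         alignment_to_reference_position(aligned_reference, col),
         aligned_reference[col - 1])
        for col in sorted(diagnostic_columns)
    ]


# Toy aligned reference row (hypothetical); '-' marks alignment gaps.
aligned_ref = "AC--GT-A"
table = report_positions(aligned_ref, {2, 5, 8})
```

Reporting both coordinates side by side keeps a diagnosis traceable even if later alignments, built with additional sequences, shift the column numbers.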
Since an alignment presents the positional homology assumptions that are crucial for the determination of diagnostic nucleotides, we consider the quality of the alignment as essential for the success of molecular taxonomy. Therefore, we sincerely recommend to critically compare the output of different alignment programs, as in the present study. While coding mitochondrial markers (such as COI) can be checked via reading frames and translation into amino acids, and are generally less problematic, non-coding fast evolving markers (e.g. 16S rRNA) can be difficult to align even among closely related species. Obviously, undetected misalignments can result in tremendous overestimation of diagnostic characters. For example, a misalignment occurred in the ClustalW approach to our 28S rRNA dataset, which increased the number of characters diagnostic for a sister clade within Pontohedyle wenzli sp. nov. on this marker from 0 to 34 compared to the MUSCLE [106] alignment. And even without obvious misalignments, the use of different alignment programs can result in a differing number of diagnostic nucleotides (e.g. 9 vs. 13 diagnostic nucleotides in P. milaschewitchii comparing the MUSCLE and ClustalW alignment). By removing ambiguous parts of the alignment, one reduces the number of diagnostic characters considerably (e.g. from 19 to 13 diagnostic nucleotides on 16S rRNA in P. milaschewitchii when masking ClustalW alignments with Gblocks [107]). However, those diagnostic characters that remain can be considered as more stable and reliable for species identification. Based on our comparative analyses, we decided to choose the most conservative approach (alignment conducted with MUSCLE [106] and masked with GBlocks [107]), and based on the above mentioned examples stress the need to dedicate time to alignment issues when performing molecular taxonomy.
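Comparing the diagnostic nucleotides obtained from two alignment programs requires a common coordinate system, because column numbers differ between alignments. A minimal Python sketch of such a comparison is given below; it expresses both sets of columns as positions in the same ungapped reference sequence and then intersects them. This is an illustrative approach under stated assumptions, not the procedure applied in the present study, and all names and toy values are hypothetical.

```python
def to_reference_coords(columns, aligned_reference, gap="-"):
    """Convert 1-based alignment columns to 1-based reference positions,
    dropping columns where the reference row has a gap."""
    coords = set()
    for col in columns:
        if aligned_reference[col - 1] != gap:
            coords.add(col - aligned_reference[:col].count(gap))
    return coords


def compare_alignments(cols_first, ref_first, cols_second, ref_second):
    """Split diagnostic positions into those supported by both alignments
    and those found with only one of them."""
    first = to_reference_coords(cols_first, ref_first)
    second = to_reference_coords(cols_second, ref_second)
    return {
        "stable": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }


# Toy example (hypothetical): the same reference sequence as aligned by two programs.
ref_in_first = "ACG-TAC"
ref_in_second = "A-CGTAC"
comparison = compare_alignments({2, 5, 7}, ref_in_first, {3, 5, 6}, ref_in_second)
```

Positions in the "stable" set are recovered regardless of the alignment program and are therefore the more conservative candidates for a diagnosis.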
Several potential sources of error unique to taxonomy from molecular data have been pointed out [23]: (1) contamination and chimeric sequences, (2) faulty alignments resulting in comparisons of non-homologous nucleotides, and (3) the risk of dealing with paralogs. Authors of species descriptions based on molecular data should bear these pitfalls in mind. The risk of chimeric sequences can be reduced by carefully conducting BLAST searches [108] for each amplified fragment; misidentifications of diagnostic characters due to non-homologous alignments can be avoided by applying the considerations discussed above. The quality and stability of molecular taxonomic results considerably increase when several independent loci support the species delineation. To avoid idiosyncrasies of individual markers, misidentifications due to sequencing errors, or the pitfalls of paralogs, we strongly recommend not to base molecular species delineation and subsequent species description on single markers. Otherwise, if subsequent results negate the diagnostic value of nucleotides on that marker, the species description loses its entire foundation. Furthermore, the use of single pure CAs rather than of other types of CAs, and especially the use of genus-level compared CAs as discussed above, increases the chances of establishing and diagnosing new species on apomorphies rather than on homoplasies.
We acknowledge the risk that species descriptions based on molecular data might contain errors in the form of incorrectly assumed apomorphies, especially when working in sparsely sampled groups. Moreover, putative molecular apomorphies of described species may have to be reconsidered as plesiomorphies when new species with the same characteristics are added, or they may vanish in intraspecific variation. The more potentially apomorphic nucleotides are found across independently evolving markers, the higher the chances that at least some of them truly refer to unique mutations accumulated due to the absence of gene exchange. But in all this, molecular characters do not differ from morphological or other sets of characters. Species descriptions are complex hypotheses on several levels: novelty of taxon, placement within systematic context, and hypothesis of homology applying descriptive terms [5,109,110]. Species descriptions based on molecular characters are founded on the well-established hypothesis that character differences reflect lineage independence [50] and that mutations accumulate in the absence of gene exchange. It is the task of the taxonomist to evaluate whether the observed differences in character states can be explained by a historical process causing lineage divergence [3]. According to rough time estimations by molecular clock analyses, the radiation of Pontohedyle species included in the present study took place 100-25 mya (own unpublished data). Therefore we are confident that many of the bases recognized as diagnostic within our sampling truly refer to evolutionary novelties and unique attributes of species-level entities. However, even in cases of more recent divergences it should be possible to detect at least some diagnostic bases. Regardless of which character set a species description is based on, species descriptions are hypotheses, which means that they need to be re-evaluated, i.e. confirmed, falsified or modified when new data, material or methods of analysis become available.

Conclusions
This contribution issues a plea to follow up discoveries of cryptic species by molecular species delineation with the steps necessary to establish formal scientific names for these species. This can be achieved by selection of diagnostic characters, e.g., via the CAOS software. Depending on the robustness of the underlying phylogenetic hypothesis, taxonomists need to evaluate the optimal balance between the number of diagnostic bases and their stability subject to the topology. In general, pure diagnostic bases rather than private or combined ones should be selected, and such single pure CAs should be compared against all the potentially closely related lineages, not only against the direct sister in a predefined tree entered in CAOS as is the default procedure. We also wish to highlight the following considerations. 1) When basing a species description on molecular data the same rules as in traditional taxonomy should be applied considering deposition and accessibility of data; DNA aliquots and additional type material should be deposited in long term storage facilities, and sequences in public databases (GenBank). As with morphological type specimens, special attention should be given to the storage and availability of molecular types. 2) Due to the underlying homology assumption, we consider the quality of the alignment as critical to determining and extracting diagnostic bases. Thus, we recommend exploring changes to the alignment and, thus, the identified diagnostic characters by applying different alignment programs and masking options. 3) Alignments may change when new data is added, especially concerning non-coding markers. For better traceability, we regard it as beneficial to report not only the alignment position but also refer to a closely related reference genome (if applicable) and report the position in a deposited reference sequence (ideally generated from type material). 
In its current stage of development, the extraction of diagnostic characters for molecular taxonomy is not yet ready for inclusion in automated species delimitation procedures, as it still requires time-consuming manual steps. However, little adaptation of existing programs would be needed to make them serve molecular taxonomy in its entirety, to overcome the current gap between species discovery and species description.

Type localities and collecting sites
The collecting sites of material included in the present study are shown in Figure 4 (modified after Jörger et al. [25]). Of the three valid species, we were able to recollect P. milaschewitchii from its type locality. P. verrucosa was collected in the vicinity of the type locality on Guadalcanal, Solomon Islands. Despite several attempts, we were unsuccessful in recollecting P. brasilensis at the type locality (see Discussion for assignment of specimens to this species).

Morphology and microanatomy
Jörger et al. [25] analyzed the radulae of most of the species described above. Unfortunately, for Pontohedyle neridae sp. nov., P. martynovi sp. nov. and P. yurihookeri sp. nov. radulae could not be recovered from the specimens used for DNA extraction. The radula of P. wiggi sp. nov. could only be studied under the light microscope, as it was lost during the attempt to transfer it to an SEM stub.
Phylogenetic analyses by Jörger et al. [25] revealed two major clades within Pontohedyle. One includes P. milaschewitchii, for which detailed microanatomical and ultrastructural data are available [55,111]. The other clade is morphologically poorly characterized, since the original description of P. verrucosa lacks details on major organ systems like the reproductive system and the nervous system. For detailed histological comparison of the two major Pontohedyle clades, glutaraldehyde-fixed specimens of P. verrucosa (from near the type locality, WP-3 and WP-2; see [25]) were post-fixed in buffered 1% osmium tetroxide, decalcified using ascorbic acid and embedded in Spurr low-viscosity epoxy resin [112] or Epon epoxy resin (for detailed protocols see [113,114]). Serial semithin sections (1 and 1.5 μm) of three specimens were prepared using a diamond knife (Histo Jumbo, Diatome, Switzerland) with contact cement on the lower cutting edge to form ribbons [115]. Ribbons were stained using methylene-blue azur II [116] and sealed with Araldit resin under cover slips. Sectioned series are deposited at the Bavarian State Collection of Zoology, Mollusca section (ZSM Mol-20071833, 20071837 and 20100548). Additionally, histological series of Pontohedyle kepii sp. nov. were sectioned as described above.
Digital photographs of each section were taken using a ProgRes C3 camera (Jenoptik, Germany) mounted on a Leica DMB-RBE microscope (Leica Microsystems, Germany). Subsequently, photographs were edited (i.e., grey-scale converted, contrast enhanced and reduced in size) using standard imaging software, then loaded into AMIRA 5.2 (Visage Imaging Software, Germany) for 3D reconstruction of the major organ systems. Alignment, labeling of the organ systems and surface rendering followed in principle the method described by Ruthensteiner [115].

Acquisition of molecular data
This study aims to characterize the genus Pontohedyle (Acochlidia, Microhedylacea) based on molecular standard markers, i.e., nuclear 18S and 28S rRNA and mitochondrial COI and 16S rRNA. We included the three previously valid Pontohedyle species (for taxonomy see [69,83]): P. milaschewitchii (Kowalewsky, 1901), P. verrucosa (Challis, 1970) and the recently re-established P. brasilensis (Rankin, 1979) [25]. The nine additional species earlier identified as candidates in the genus Pontohedyle [25] are subject to molecular taxonomy. 28S rRNA, 16S rRNA and COI sequences analyzed by Jörger et al. [25] were retrieved from GenBank (see Table 1 for accession numbers). Additionally, we amplified nuclear 18S rRNA (approx. 1800 bp) for at least one individual per species. 18S rRNA was amplified in three parts using the primers for euthyneuran gastropods by Vonnemann et al. Polymerase chain reactions were conducted using Phire polymerase (New England Biolabs) following this protocol: 98°C 30 sec, 30-35x (98°C 5 sec, 55-65°C 5 sec, 72°C 20-25 sec), 72°C 60 sec. Successful PCR products were cleaned up with ExoSap IT. Cycle sequencing and the sequencing reactions were performed by the Genomic Service Unit (GSU) of the Department of Biology, Ludwig-Maximilians-University Munich, using the Big Dye 3.1 kit and an ABI 3730 capillary sequencer. Sequences were edited (forward and reverse strands), concatenated and checked for potential contamination via BLAST searches [108] against the GenBank database via Geneious 5.5.2 [104].

Detection of diagnostic molecular characters
We used the Characteristic Attribute Organization System (CAOS) [51,57,58] to detect discrete nucleotide substitutions in our previously determined candidate species [25]. The program distinguishes single (single nucleotide) vs. compound (set of nucleotides) 'character attributes' (CA) [51]. Both single and compound CAs can be further divided into pure (present in all members of a clade but absent from all members of another clade) and private CAs (only present in some members of the clade, but absent in members of other clades) [51]. For taxonomic purposes at this stage we consider only 'single pure characters' (sPu) as diagnostic characters for species descriptions (see Discussion). Since some sister group relationships among Pontohedyle species are not well supported (see [25], Figure 1), we chose our diagnostic molecular characters in the sense of unique within the genus Pontohedyle, rather than assigning plesiomorphic or apomorphic polarity to character states of one species in relation to its direct sister species.
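The notion of a single pure character that is unique within the genus can be illustrated with a minimal Python sketch. This is not the CAOS implementation; it is a simplified stand-in that assumes an alignment held as a dictionary of equal-length strings. The function name `single_pure_characters`, the sequence labels and the toy sequences are hypothetical, and the handling of gaps and ambiguity codes (skipped in the focal species, ignored in the others) is an assumption made for brevity.

```python
# Simplified illustration of 'single pure' diagnostic positions (not CAOS itself).
# Assumption: all sequences are aligned and of equal length.

GAP_OR_AMBIGUOUS = set("-?N")


def single_pure_characters(alignment, species_of, focal):
    """Return {1-based column: base} for columns in which every sequence of the
    focal species carries the same base and no other sequence carries it."""
    focal_seqs = [s for name, s in alignment.items() if species_of[name] == focal]
    other_seqs = [s for name, s in alignment.items() if species_of[name] != focal]
    if not focal_seqs or not other_seqs:
        raise ValueError("need at least one focal and one non-focal sequence")

    diagnostic = {}
    for col in range(len(focal_seqs[0])):
        focal_states = {s[col].upper() for s in focal_seqs}
        if len(focal_states) != 1:
            continue  # not fixed within the focal species
        base = next(iter(focal_states))
        if base in GAP_OR_AMBIGUOUS:
            continue  # gaps and ambiguity codes are not treated as diagnostic
        other_states = {s[col].upper() for s in other_seqs}
        if base not in other_states:
            diagnostic[col + 1] = base
    return diagnostic


# Toy data, invented for illustration only.
alignment = {
    "sp1_a": "ACGTA",
    "sp1_b": "ACGTC",
    "sp2_a": "ATGTA",
    "sp2_b": "ATGTA",
    "sp3_a": "ACGCA",
}
species_of = {
    "sp1_a": "sp1", "sp1_b": "sp1",
    "sp2_a": "sp2", "sp2_b": "sp2",
    "sp3_a": "sp3",
}

for species in ("sp1", "sp2", "sp3"):
    print(species, single_pure_characters(alignment, species_of, species))
```

Because each species is compared against all remaining sequences rather than against one sister taxon, the result does not depend on the tree topology, which mirrors the rationale given above.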
As discussed above, the homology assumption presented in the alignment is crucial for the correct detection of diagnostic characters. For quality control, we performed data input into CAOS with alignments derived from three commonly applied alignment programs and critically compared the resulting differences concerning amounts and positions of the sPus. Alignments were generated for each marker individually using MUSCLE [106], Mafft [118,119] and CLUSTAL W [120]. The COI alignment was checked manually, supported by translation into amino acids. Due to difficulties in aligning highly variable parts of rRNA markers, we removed ambiguous parts of the alignment with two different masking programs, Aliscore [121] and GBlocks [107], and compared the respective effects on character selection. After comparison of the various results we chose MUSCLE [106] in combination with GBlocks [107] as the most conservative approach that results in fewer but more reliable diagnostic characters than the other approaches.
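The comparison across alignment and masking approaches can be sketched in a few lines of Python. This is a hedged illustration, not the procedure used here: it assumes that the diagnostic characters from each approach have already been translated into a common coordinate system (here, position in an ungapped reference sequence plus base), so that sets are directly comparable. The function name `compare_pipelines`, the pipeline labels and all values are hypothetical.

```python
# Sketch: compare diagnostic character sets obtained from different
# alignment/masking pipelines. Assumption: each character is given as
# (position in an ungapped reference sequence, base).

def compare_pipelines(results):
    """Return the number of characters per pipeline and the characters
    recovered by every pipeline."""
    counts = {name: len(chars) for name, chars in results.items()}
    shared = set.intersection(*results.values()) if results else set()
    return counts, shared


# Invented toy values, for illustration only.
results = {
    "pipeline_a": {(12, "T"), (40, "G")},
    "pipeline_b": {(12, "T"), (40, "G"), (77, "A")},
    "pipeline_c": {(12, "T"), (40, "G"), (90, "C")},
}

counts, shared = compare_pipelines(results)
print(counts)
print(sorted(shared))
```

Characters recovered by every pipeline are the least sensitive to alignment choices; characters found by only one pipeline deserve closer inspection before being used in a diagnosis.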
Alignments were analyzed and converted between different formats using Geneious 5.6 (Biomatters) [104]. We performed a phylogenetic analysis under a maximum-likelihood approach with RAxML 7.2.8 on each individual marker, applying the 'easy and fast way' described in the RAxML 7.0.4 manual to obtain an input tree. For our present study the phylogenetic hypothesis on sister group relationships of the different Pontohedyle species, however, is not relevant: We manipulated the resulting trees in Mesquite [122], generating a single starting file for CAOS for each species and for each marker, with each of the analyzed species successively being sister to all remaining Pontohedyle species. This iterative procedure retrieves diagnostic characters for the node that compares each single species to all its congeners.
The single gene alignments which formed the basis for the selection of diagnostic nucleotides are available in fasta format as Additional material 3-6. Diagnostic nucleotides are reported with positions in the reference alignment. Position 1 of each alignment refers to position 1 after the primer region, which was removed in the alignment. For better traceability, and in the absence of a closely related reference genome, we additionally report the positions within a reference sequence for each species (deposited in GenBank; see Table 1). In the descriptions of our new species, these reference sequences are retrieved from the holotype. Diagnostic molecular characters of the genus Pontohedyle in 18S and 28S rRNA are diagnosed based on alignments including all available Pontohedyle sequences (Table 1) and representatives of all other acochlidian genera currently available in public databases (see Additional files 1 and 2 for the original alignments in fasta format).
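Translating an alignment column into a position within a reference sequence amounts to counting the non-gap characters of the aligned reference up to that column. The following Python sketch shows this under stated assumptions: gaps are written as "-", positions are 1-based, and the aligned reference starts at the same base as the deposited sequence. The function name `alignment_to_reference_position` and the toy sequence are hypothetical.

```python
# Sketch: map a 1-based alignment column to the 1-based position in the
# ungapped reference sequence. Assumption: gaps are encoded as "-".

def alignment_to_reference_position(aligned_reference, column):
    """Return the position in the ungapped reference that corresponds to the
    given alignment column, or None if the reference has a gap there."""
    if not 1 <= column <= len(aligned_reference):
        raise ValueError("column outside the alignment")
    if aligned_reference[column - 1] == "-":
        return None
    # Subtract the gaps that precede (or sit at) the column.
    return column - aligned_reference[:column].count("-")


# Toy aligned reference, invented for illustration; ungapped it reads ACGTA.
aligned = "AC--GT-A"
print(alignment_to_reference_position(aligned, 5))  # G
print(alignment_to_reference_position(aligned, 8))  # final A
print(alignment_to_reference_position(aligned, 3))  # gap in the reference
```

Reporting both coordinates keeps a diagnosis traceable even if the alignment changes when new sequences are added, since the position in the deposited reference sequence stays fixed.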