Molecular identification of temperate Cricetidae and Muridae rodent species using fecal samples collected in a natural habitat

Molecular species identification from biological material collected at field sites has become an established ecological tool. However, extracting and amplifying DNA from degraded field samples, such as prey remains and feces that have been exposed to the elements, remains challenging and costly. We collected 115 fecal samples of unknown small mammals, resembling fecal droppings of voles and mice (i.e., Cricetidae and Muridae), from a salt marsh in The Netherlands. We modified a previously published protocol into a relatively low-cost method with a PCR success of 95%. We demonstrate that species identification is possible for both Cricetidae and Muridae species using fecal samples of unknown age deposited in the field. For 90 samples, sequences of the variable control region in the mitochondrial genome were obtained and compared to published DNA sequences of small mammals occurring in north European salt marshes. A single sample, probably environmentally contaminated, appeared as Sus scrofa (n = 1). We positively identified house mouse Mus musculus, being the positive control (n = 1), and common vole Microtus arvalis (n = 88). In 81 sequences of 251 nt without ambiguous bases, ten haplotypes were present. These haplotypes, representing the central lineage of the western subspecies M. arvalis arvalis, were separated by 20 mutations from published control region haplotypes of the western European lineages sampled in France. Unlike earlier studies of cytochrome b variation in coastal European populations, we did not find indications of recent purging of genetic variation in our study area.


Introduction
Identification of taxa by molecular analysis of a variety of biological samples found in natural environments has become a well-established replacement for, or addition to, collecting, trapping, or other invasive sampling (Höss et al. 1992; Beja-Pereira et al. 2009). Species identification of many taxa is made possible by extensive, publicly available databases such as GenBank (Benson et al. 2009) and BOLD (Ratnasingham and Hebert 2007), which contain reference DNA sequences for generically used genetic markers (e.g., genes in the mitochondrial genome). Noninvasive sampling has successfully served a large range of study purposes in wildlife studies (Taberlet et al. 1999; Valentini et al. 2009), one of which is species identification from DNA retrieved from pollen, feathers, hair, or feces collected in the field. Studies using fecal analyses have led to insights into, for example, predator-prey food webs (Sheppard and Harwood 2005) and population size and structure (Hedges et al. 2013).
This study aims to identify the vole and mouse species (Cricetidae and Muridae, superfamily: Muroidea) inhabiting a salt marsh in The Netherlands, from feces collected in natural habitats using molecular tools with a relatively low-cost DNA extraction method. Voles and mice deposit fecal droppings throughout their territories (Delattre et al. 1996; Wheeler 2008). The outer layer of feces is covered with intestinal mucus cells from the host and thus contains host DNA (Maudet et al. 2004). Prior to collection, however, DNA in fecal droppings of wild free-roaming animals has been exposed to digestive enzymes, solar radiation, rain, flooding, and possibly DNA of other species. It is therefore unknown whether the quality of DNA of droppings collected in the field, as opposed to droppings collected from caged animals, is sufficient to allow species identification (Taberlet et al. 1999).
We applied a published protocol for species identification of voles developed for fresh fecal samples from caged or trapped animals; this protocol, which targets the mitochondrial control region of arvicolid species, worked for 95% of the freshly collected fecal samples and could accurately differentiate vole species (Alasaad et al. 2011). The amplified product was relatively small (~300 nt), providing a marker suited to field samples with possibly degraded DNA while still allowing screening for genetic variation within the population. The PCR primers Pro+ (Haring et al. 2000) and MicoMico (Alasaad et al. 2011), although developed for voles, can be expected to be conserved in other Muroidea species (Alasaad et al. 2011).
We successfully demonstrated that the technique for molecular species identification of voles developed by Alasaad et al. (2011) can be applied to feces of voles and mice living in natural temperate habitats, and discovered ten unpublished haplotypes from the western subspecies of common vole Microtus arvalis arvalis.

Collection of field samples
This study was conducted in Noord-Friesland Buitendijks (53°20′N, 5°43′E), a conservation area in The Netherlands. A relatively large area consists of salt marsh (> 20 km²). Feces were collected from the high marsh (the low marsh is too wet for animals to persist even during summer) at two sites approximately 2.5 km apart, in September 2015. We surveyed a total of 660 circular plots of 2 m², 10-20 m apart, along 75 transects (van Klink et al. 2016). However, 63 circular plots were too wet to be examined. Droppings were collected separately in a sterile 1.5-ml vial each, using gloves to avoid DNA contamination. No storage buffer was added. We collected 115 individual fecal droppings, one dropping per pile per circular plot. Vials were stored at −20°C.

Reference database and PCR primers
We compiled a local reference sequence database of nucleotide sequences of the control region in the mitochondrial genome (mtDNA) of small mammal species of the superfamily Muroidea, including Cricetidae voles, and mice and rats of the Muridae family, known to occur in the north of The Netherlands, regardless of whether species inhabit salt marsh or not (to account for unexpected species). The sequences were downloaded from GenBank (Benson et al. 2009); accession numbers are given in Table 1. Sequences were aligned in Geneious 8.1.3 (Kearse et al. 2012), to identify the match between published PCR primers Pro+ (Haring et al. 2000) and MicoMico (Alasaad et al. 2011), and the mtDNA sequences of all target species.
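The primer-to-reference comparison described above was done in Geneious; as a rough illustration of the idea only, the following minimal Python sketch slides a primer along a reference sequence and reports the window with the fewest mismatches. The sequences are hypothetical toy strings, not the real Pro+ or MicoMico primers or any GenBank record, and the sketch ignores degenerate bases and the reverse-complement orientation that a reverse primer would need.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Count positions where primer and candidate priming site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def best_primer_site(primer: str, target: str) -> tuple[int, int]:
    """Slide the primer along the target and return
    (0-based start index, number of mismatches) of the best window."""
    n = len(primer)
    if len(target) < n:
        raise ValueError("target is shorter than the primer")
    scores = [
        (count_mismatches(primer, target[i:i + n]), i)
        for i in range(len(target) - n + 1)
    ]
    mismatches, start = min(scores)  # fewest mismatches, then leftmost
    return start, mismatches


# Hypothetical toy data for illustration only.
toy_primer = "ACCATCAGCA"
toy_targets = {
    "species_A": "GGTTACCATCAGCATTGA",  # contains an exact copy of the primer
    "species_B": "GGTTACCATTAGCATTGA",  # same site with one substitution
}

for name, seq in toy_targets.items():
    start, mm = best_primer_site(toy_primer, seq)
    print(f"{name}: best site starts at {start} with {mm} mismatch(es)")
```

In practice, a primer pair would be considered a candidate only if both priming sites show few or no mismatches across all target species; the acceptable number of mismatches is a judgment call not specified here.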

DNA extractions
House mouse (Mus musculus) was used as a positive control in all experiments. A tail tip of a surplus humanely euthanized house mouse was collected under the ethical approval of the Animal Experiments Committee of the University of Groningen, The Netherlands (reference number surplus-DEC 6768A). DNA of M. musculus was extracted from 1 cm tail tissue using the DNeasy Blood & Tissue Kit supplied by QIAGEN, following the tissue protocol and manufacturer's instructions. In addition, fresh fecal droppings of M. musculus were used as positive controls. Host DNA extractions of whole droppings were initially done with (1) the DNeasy method following the tissue protocol and (2) the much cheaper ammonium-acetate method (Richardson et al. 2001).
To develop a low-cost DNA extraction protocol, we subsequently modified the ammonium-acetate method (Richardson et al. 2001) by including a two-step lysis to increase the yield of host DNA. First, each dropping was soaked in 100 μl Qiagen lysis buffer and 10 μl Proteinase K in a sterile 1.5-ml tube. The dropping was removed after 60 s (n = 39) or 10 min (n = 76) and the solution was incubated for 1.5 h at 55°C. Then, 250 μl Digsol lysis buffer and 10 μl Proteinase K were mixed into each sample, which was incubated for 2 h at 55°C, with regular vortexing. Next, 250 μl 4 M ammonium acetate (AmAc) was added, followed by 15-min incubation at room temperature with regular vortexing. Samples were centrifuged for 10 min at maximum speed and 500 μl supernatant was collected in a clean tube and cleaned by ethanol precipitation. DNA was eluted in 50 μl TE buffer and stored at −20°C.

Sequencing
Amplified samples were prepared for sequencing as follows: 25 μl PCR product mixed with 4.0 μl loading dye was loaded on a 2% MP agarose gel, allowing separation of target DNA and nonspecific bands by electrophoresis. To prepare samples for sequencing, we used a gel extraction method: Wizard® SV Gel and PCR Clean-Up System, following the manufacturer's protocol. This was used instead of the more regularly used chemical cleaning before sequencing because of the added benefit of removing non-target DNA fragments, while not adding to the regular costs of sequencing. Final concentrations of PCR products were estimated by gel electrophoresis of 5 μl cleaned PCR product using a 100-bp ladder; samples were diluted to concentrations of 20-80 ng/μl. Samples were sequenced with the forward primer Pro+ (5 pmol/μl) on a Sanger ABI 3730x capillary sequencer.

Data analyses
Sequences were processed in Geneious 8.1.3. Primer sequences were trimmed. For species identification and to exclude the possibility that our database was incomplete, obtained sequences were searched against the GenBank nr database using standard Megablast algorithms with a maximum hit number of 10 and a maximum E-value of 0.1. Sequences identified as the same species were aligned and sequences with ambiguous base pairs were removed. The final alignment of 251 nt was exported to DnaSP (Librado and Rozas 2009) to identify unique haplotypes and calculate haplotype (H) and nucleotide (π) diversity. Haplotype diversity is the probability that two alleles randomly sampled from a population are different. Nucleotide diversity is the average number of nucleotide differences per site between any two DNA sequences chosen randomly from the sample population (Nei and Li 1979).
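The two diversity measures defined above were calculated in DnaSP; the following minimal Python sketch only illustrates the definitions on toy data. It assumes the common unbiased form of haplotype diversity, H = n/(n − 1) · (1 − Σ p_i²), which equals the probability that two sequences drawn without replacement differ, and computes π as the mean number of pairwise differences divided by the alignment length. The four 4-nt sequences are hypothetical and unrelated to the study data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs: list[str]) -> float:
    """Probability that two sequences sampled without replacement differ:
    H = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Average number of differences per site over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, equal length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment (no gaps, no ambiguous bases).
toy_alignment = ["ACGT", "ACGT", "ACGA", "TCGA"]

H = haplotype_diversity(toy_alignment)
pi = nucleotide_diversity(toy_alignment)
print(f"H = {H:.4f}, pi = {pi:.4f}")
```

For the toy alignment, five of the six sequence pairs differ, so H is 5/6, and the pairs differ at seven sites in total over 6 × 4 compared positions, so π is 7/24. DnaSP may treat gaps and missing data differently, which this sketch does not attempt to reproduce.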

PCR primers
Using the sequence alignment from our local reference database, we identified the mitochondrial control primers Pro+ (Haring et al. 2000) and MicoMico (Alasaad et al. 2011) as the most suitable candidate PCR primers. This primer set matched with all species of the Cricetidae and Muridae families known to occur in the north of The Netherlands (Fig. S1, Table 1), and was successfully applied in an earlier study by Alasaad et al. (2011).

Molecular analysis of feces
Using the primer set Pro+ and MicoMico, PCR amplification failed for field droppings extracted with either the DNeasy method or the unmodified ammonium-acetate method. This indicated that these DNA extraction methods yielded insufficient DNA from field droppings for reliable PCR amplification, even though they yielded sufficient DNA from fresh M. musculus droppings. The proportion of successful PCRs increased to 51% with the modified ammonium-acetate method, and to 95% when the pre-lysis soaking time of feces was increased from 60 s to 10 min: of 76 field samples treated with 10-min pre-lysis soaking, only four PCRs failed (Table 2). The sequencing success rate was 91% when using a 10-min pre-lysis soaking time. DNA from M. musculus tail and droppings, and the field fecal samples yielded PCR products of the expected size of just over 300 nt (Table 1). In one sample, the PCR product was much longer, closer to 500 nt. We successfully obtained mtDNA sequences of 89 field samples and one positive control.
Table 1 Reference list of small mammal species of the superfamily Muroidea, including voles of the Cricetidae family and mice and rats of the Muridae family, known to occur in the north of The Netherlands. One species of each genus was included in the alignment in Fig. S1. a These published sequences did not include the priming site for Pro+; therefore, the total length is inferred and given between brackets. b No mitochondrial control region (CR) sequences were available for Arvicola amphibius and therefore, Arvicola sapidus was used instead.
Sequences of PCR product obtained with Pro+ confirmed that the primer combination successfully amplified host DNA from feces, confirming M. musculus as the positive control (n = 1). In the 89 field samples, two species were identified: common vole M. arvalis and Sus scrofa (wild boar or domestic pig). The Megablast search returned an average match of 294 nt with published M. arvalis sequences (n = 88, with pairwise identities > 95%) and a match with 359 nt of published Sus scrofa sequences (n = 1; pairwise identity = 98.6%).
We trimmed the alignment to 251 nt to obtain a dataset without ambiguous bases, leaving 81 sequences of the 88 sequences identified as M. arvalis in the dataset. These 81 sequences contained 16 variable sites and ten different haplotypes (Fig. 1). Nine of the ten haplotypes were confirmed with two or more fecal samples. The base variations at all 16 variable sites (see Fig. 1) were confirmed by six individual samples that were repeated and no errors were found. Haplotype VIII was found only once. This sequence had a T instead of a C at position 120. Unfortunately in the set of repeated samples, the sample representing haplotype VIII was not included; however, because the electropherogram was very clean, there was no reason to discard the variable base defining haplotype VIII. We therefore consider this one haplotype a singleton. The four common haplotypes (haplotypes II-V; Fig. 1) were represented by more than ten samples each. The haplotypes in our study area in The Netherlands roughly fell in two groups with two and three common haplotypes each separated by eight mutations (between haplotypes VI and X) and five minor haplotypes (Fig. 2). The haplotype diversity (H) was 0.85 and the nucleotide diversity (π) was 0.023.
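As a hedged illustration of how variable sites, haplotypes, and singletons can be read off an ungapped alignment, the minimal Python sketch below works on hypothetical toy sequences; it is not the DnaSP procedure used in the study. Positions are reported 0-based here, whereas positions in the text (e.g., position 120) are presumably 1-based.

```python
from collections import Counter


def variable_sites(seqs: list[str]) -> list[int]:
    """0-based alignment columns at which more than one base occurs."""
    if not seqs or any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def haplotype_counts(seqs: list[str]) -> Counter:
    """Collapse identical sequences into haplotypes with sample counts."""
    return Counter(seqs)


def singleton_haplotypes(seqs: list[str]) -> list[str]:
    """Haplotypes supported by exactly one sample."""
    return sorted(h for h, c in haplotype_counts(seqs).items() if c == 1)


# Hypothetical toy alignment without ambiguous bases.
toy_alignment = ["ACGT", "ACGT", "ACGA", "TCGA", "TCGA"]

print("variable sites:", variable_sites(toy_alignment))
print("haplotypes:", dict(haplotype_counts(toy_alignment)))
print("singletons:", singleton_haplotypes(toy_alignment))
```

A singleton found this way deserves the kind of scrutiny described above (repeat sequencing or inspection of the electropherogram), because a single base-calling error is enough to create a spurious haplotype.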

Discussion
In conclusion, we show that a high success rate in species identification of voles and mice can be obtained at relatively low cost, while avoiding invasive methods of data collection. We successfully identified species of two rodent families, i.e., Cricetidae (voles) and Muridae (mice), using DNA extracted from feces, and we demonstrated that species can be identified from fecal samples of unknown age collected in the field. The added benefit of using the mtDNA control region was that the local haplotype variation in voles could be described. Because we compiled a local sequence database with control region sequences of the expected small mammal species, we had a priori knowledge that the marker was also suitable to distinguish species. A drawback of using the control region is that although it provides information on presence of species and haplotypes, it gives only conservative information about abundance; to identify individuals, nuclear markers such as microsatellites should be applied (Taberlet et al. 1999).

Cost and time efficiency
We processed field samples at a very low cost (DNA extraction and PCR for €0.35 per sample), while not compromising on PCR amplification and sequencing success. Our sequencing success of 91% is comparable to the success rate of 95% reported by Alasaad et al. (2011) and higher than the 85% reported by Barbosa et al. (2013); note that this last study used the mitochondrial gene cytochrome b instead of the control region. Both studies used commercial DNA extraction kits which, depending on the manufacturer's rates, cost €2-4 per sample and are thus more expensive than the ammonium-acetate method we used at a rate of €0.35 per sample including PCR.
Species identification through molecular analyses of feces may also be time-efficient since trapping must take place over several nights, with regular trap visits. Collection of feces can be a one-off exercise per season or year, with several days of laboratory work before results are known. Also, using molecular tools to identify species present in ecosystems is a good alternative to trapping, especially when trapping is prohibited by law or permits are difficult to obtain. In addition, molecular tools may be preferred over more invasive methods to identify species. For example, fecal samples are relatively easy to collect, without risk of harming animals during trapping or handling (e.g., through stress or flooding of traps). Some species are known to be particularly “trap-happy” (easy to trap, e.g., wood mouse (Apodemus sylvaticus)), which creates a sampling bias (Bekker et al. 2015). In contrast, some species are “trap-shy” and may therefore be missed by trapping, so collecting samples such as fecal droppings can be more thorough.
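To make the cost comparison concrete, the short Python sketch below scales the per-sample figures quoted above to the 115 field samples of this study. It is a back-of-the-envelope illustration only: the function name is hypothetical, amounts are held in euro cents to avoid floating-point rounding, and the kit figure covers extraction alone whereas the €0.35 figure includes PCR.

```python
N_SAMPLES = 115  # field droppings collected in this study

# Per-sample costs in euro cents, taken from the figures quoted in the text.
AMAC_WITH_PCR_CENTS = 35    # modified ammonium-acetate extraction plus PCR
KIT_LOW_CENTS = 200         # commercial kit, lower bound (extraction only)
KIT_HIGH_CENTS = 400        # commercial kit, upper bound (extraction only)


def total_cost_cents(n_samples: int, cents_per_sample: int) -> int:
    """Total cost in euro cents for a batch of samples."""
    if n_samples < 0 or cents_per_sample < 0:
        raise ValueError("inputs must be non-negative")
    return n_samples * cents_per_sample


amac = total_cost_cents(N_SAMPLES, AMAC_WITH_PCR_CENTS)
kit_low = total_cost_cents(N_SAMPLES, KIT_LOW_CENTS)
kit_high = total_cost_cents(N_SAMPLES, KIT_HIGH_CENTS)

print(f"ammonium-acetate + PCR: EUR {amac / 100:.2f}")
print(f"commercial kit (extraction only): "
      f"EUR {kit_low / 100:.2f} to {kit_high / 100:.2f}")
```

For 115 samples this gives roughly €40 for the modified method against €230-460 for kit-based extraction alone, before labor and sequencing costs, which are not quantified in the text.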

Methodological challenges
The targeted PCR fragment in this study was just over 300 nt long (Table 1). This may be at the limit of what can be amplified from feces, because DNA in feces can be degraded. This could explain why four samples failed to amplify even after we increased soaking time (Table 2). Alternatively, these samples may have been from other small mammal species, for example Soricidae (shrews) or Muridae species. We confirmed that the primer pair can amplify Muridae species. However, most shrew species would indeed be difficult to amplify from degraded fecal DNA because Crocidura and Sorex shrews and Eurasian water shrew (Neomys fodiens) have repeated sequences in the control region and therefore a much longer fragment between the two primers (Fumagalli et al. 1996; Liu et al. 2015). In a test (not shown), we confirmed that the primer combination Pro+ and MicoMico is able to amplify DNA isolated from tails of common shrew (S. araneus). However, and most importantly, feces of insectivorous shrews have a very different consistency and it is unlikely that we collected shrew feces (Marten Sikkema & Leo Bruinzeel, pers. comm.). One sample yielded DNA of domestic pig or wild boar (Sus scrofa). Pig manure is regularly used on Dutch farmland and may be present on the salt marsh (while pigs and wild boar do not occur in the study area) and therefore, we assume that this sample was contaminated with environmental DNA.
[Figure caption, beginning missing] ... Tougard et al. (2008) and presumably of the western lineages (assessed through https://doi.org/10.1371/journal.pone.0003532.s001). Note the difference in scale between main network and inset; the maximum frequency of the most common haplotype is comparable at 17 and 14 sequences. Solid lines, links between haplotypes; small segments on links, number of mutations. Alternative links are not displayed. The connection between haplotype network of the western lineages and the Dutch network is depicted by a solid line from haplotype VII

Haplotypes of Microtus arvalis in The Netherlands
Dutch common voles belong to the western subspecies M. arvalis arvalis as opposed to the eastern obscurus taxon, which is sometimes regarded as a different species, M. obscurus (Jaarola et al. 2004). Currently, five evolutionary lineages are recognized in M. arvalis arvalis: the Eastern, Italian, and Central lineages, and two western lineages, the Western-North and Western-South lineages, which in some studies are lumped (Tougard et al. 2008; Martínková et al. 2013). Based on cytochrome b haplotypes, Dutch M. a. arvalis have been assigned to the Central lineage (Tougard et al. 2008; Martínková et al. 2013).
The Dutch control region haplotypes detected in this study are separated by 20 mutations from control region haplotypes of the two western European lineages sampled in France (Fig. 2) (Tougard et al. 2008). We did not find a star-like topology of many minor haplotypes surrounding a few common haplotypes, as observed before in the Central lineage (cytochrome b haplotypes) and the western lineages (cytochrome b and control region, see Fig. 2) (Tougard et al. 2008). Martínková et al. (2013) discovered that the continental Western-North lineage has a star-like phylogeny because variation at cytochrome b seems to have been recently purged in the continental populations, possibly because the sampling sites were susceptible to massive population declines and subsequent expansions. Our observation of five common haplotypes and very few minor haplotypes could mean that the variation at the control region is not purged as much by population fluctuations as cytochrome b. Also, the vole population in our relatively small natural sampling area may not have experienced recent population fluctuations. To our knowledge, the ten detected mitochondrial control region haplotypes have not been published earlier for M. arvalis and represent new knowledge regarding the distribution of mtDNA variation in this species.