A checklist of the flowering plants of Komi Republic (northeast of European Russia) and their representation in BOLD and GenBank databases

We present an updated list of flowering plants (Angiosperms) of the Komi Republic that comprises 1211 taxa (including subspecies), 401 genera, and 80 families. The checklist is based on the authors' field collection data, materials from the Scientific Herbarium of the Institute of Biology of the Komi Scientific Center of the Ural Branch of the Russian Academy of Sciences, published data and open-access databases. For each taxon, we provide a presence-absence checklist of the nucleotide sequences (rbcL, matK, ITS2 and trnH-psbA) that are available in the BOLD and GenBank databases of DNA barcode data. The dataset will help to identify species that are potentially new for molecular taxonomy (including endemic taxa) and to add new sequences to the global BOLD Systems database, using the regional flora as a model object.


Introduction
The Komi Republic is located in the northeast of European Russia (Fig. 1). It comprises several environmental and climatic zones (southern, middle and northern taiga, forest-tundra and tundra), as well as five physical and geographical regions (Vychegodsky-Mezen Plain, Timan Ridge, Pechora Plain, Bolshezemelskaya tundra and the western macroslope of the Ural Range, including the Northern, Subpolar and Polar Urals). For historical reasons, the region is distinguished by the replacement of European floristic elements with Asian ones in the latitudinal (west to east) direction, and by the replacement of forest-steppe and nemoral species complexes with representatives of the northern Arctic-Alpine, Hypoarctic and Arctic complexes in the longitudinal (south to north) direction (Flora of Northeast of the European part of the USSR 1974, 1976a, 1976b, 1977). All these factors shape the originality of the regional flora and have influenced speciation, species distribution and divergence in the course of evolution. The regional flowering plants stored in the herbarium of the Institute of Biology of the Komi Scientific Center of the Ural Branch of the Russian Academy of Sciences include endemic species of the northeast of European Russia, the Ural Mountains and the Arctic, as well as species represented by isolated regional populations. Some parts of the Komi Republic, especially the ridges of the western macroslope of the Urals, are difficult to access, and information about plant diversity and distribution there cannot be considered exhaustive (Natural heritage of the Urals 2012, Degteva et al. 2016, The Red Data Book of the Komi Republic 2019). The composition and distribution ranges of many taxa remain unclear for this territory, whereas the available data are partially outdated (Flora of Northeast of the European part of the USSR 1974, 1976a, 1976b, 1977) or incomplete (Lavrenko et al. 1995, Martynenko and Degteva 2003, Martynenko and Gruzdev 2008, Florae 2016) and require updating with current information. Botanical collections are an indispensable tool in the field of biodiversity conservation. Revisions of the regional flora and herbarium collections are crucial for understanding regional biodiversity and the distribution of native plant species, and for monitoring floristic changes.
The Herbarium of the Institute of Biology of the Komi Scientific Center of the Ural Branch of the Russian Academy of Sciences (SYKO) was founded in 1941 by the prominent Russian botanist A.I. Tolmachev. Currently, its holdings include more than 220,000 specimens of vascular plants. The main part of the collection consists of field collections from the northeast of the European part of Russia (Komi Republic, Nenets Autonomous Okrug, Arkhangelsk, Vologda and Kirov Oblasts). The plant communities of the zonal subdivisions of the European North, from the polar deserts to the southern taiga, and of the high-altitude belts of the western macroslope of the Urals are represented in the Herbarium.
At present, owing to intensive biodiversity studies, there is an urgent need to create reference libraries of DNA markers, which will become the basis for the molecular identification (DNA barcoding) of regional plants (Kress 2017, Hosein et al. 2017). The first step in creating a library of molecular markers for the identification of regional taxa is the revision of the regional floristic list. The next step is to screen the taxa for the presence/absence, in the available global genetic databases, of the molecular markers that are used for species identification. Currently, the chloroplast genome regions rbcL, matK and trnH-psbA, along with the nuclear marker ITS2, are successfully used for the investigation of phylogenetic patterns and the DNA certification of flowering plants (Saarela et al. 2012, Bolson et al. 2015, Wattoo et al. 2016). In general, a DNA-based diagnostic system can be used effectively only if the marker sequences of all the species to be identified are present in the databases. Otherwise, it is only possible to determine how a given marker sequence differs from those already present in the databases (Abramson 2009). Finding and subsequently filling gaps in molecular taxonomy and DNA barcoding, even for a regional flora, can serve as a source of new global information. This information will expand our understanding of genetic taxonomy and allow a more comprehensive analysis of the phylogenetic relationships within specific taxonomic groups.
The first stage of our research was to revise the nomenclature and update the checklist of flowering plants (Angiosperms) of the Komi Republic, and to screen this list for the presence/absence of markers (matK, rbcL, ITS2, trnH-psbA) in the BOLD Systems and GenBank databases of DNA barcode data.

Material and methods
The study area is located in the southern, middle and northern taiga zones, forest-tundra and tundra of the northeast of the European part of Russia, in the Komi Republic (Fig. 1). The Komi Republic has a total area of 415,900 km² (coordinates: 59°12'-68°25'N and 45°25'-66°15'E). Based on SYKO herbarium materials, literature data (Flora of Northeast of the European part of the USSR 1974, 1976a, 1976b, 1977, Lavrenko et al. 1995, Martynenko and Degteva 2003, Martynenko and Gruzdev 2008, Florae 2016, Degteva et al. 2016, The Red Data Book 2019) and field collections of V.A. Kanev from the late 1990s, we compiled a list of flowering plants growing in the Komi Republic. We also analyzed the checklist of protected plants of the Komi Republic (The Red Data Book 2019). Morphological identification of the herbarium specimens was not carried out.
In our research, we used the modern classification of angiosperms APG IV (The Angiosperm Phylogeny Group 2016) and arranged taxa within the families in alphabetical order. The taxonomy of all species was validated using the open-access databases The Plant List (2020) and World Flora Online (2020), because the lack of a commonly accepted name causes technical difficulties when nucleotide sequences are registered in the NCBI (GenBank) and Barcode of Life Data databases (NCBI 2020, BOLD Systems 2020). The representation of plastid markers (the matK gene, the rbcL gene and the trnH-psbA intergenic spacer region) and a nuclear marker (ITS2) in the BOLD Systems and GenBank databases was analyzed for all flowering plant species in the regional list, based on the data available by the end of 2019.
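The presence-absence screening described above can be pictured as one record per taxon with a flag for each database and marker. The following is only a minimal sketch of such a record in Python; the class name `TaxonRecord`, its methods and the placeholder taxon are assumptions made for illustration and are not taken from the authors' workflow or from Suppl. material 1.

```python
from dataclasses import dataclass, field
from typing import Optional

# Markers and databases named in the text.
MARKERS = ("rbcL", "matK", "ITS2", "trnH-psbA")
DATABASES = ("BOLD", "GenBank")


@dataclass
class TaxonRecord:
    """One checklist row: a taxon plus marker availability per database (hypothetical layout)."""
    family: str
    name: str
    # (database, marker) pairs for which at least one sequence was found
    found: set = field(default_factory=set)

    def mark(self, database: str, marker: str) -> None:
        """Record that a sequence of `marker` exists in `database`."""
        if database not in DATABASES or marker not in MARKERS:
            raise ValueError(f"unknown database/marker: {database}/{marker}")
        self.found.add((database, marker))

    def has(self, marker: str, database: Optional[str] = None) -> bool:
        """True if the marker is present in the given database, or in any database if None."""
        dbs = DATABASES if database is None else (database,)
        return any((db, marker) in self.found for db in dbs)

    def row(self) -> dict:
        """Flatten to a presence-absence row: 1 = present, 0 = absent."""
        return {f"{db}:{m}": int((db, m) in self.found)
                for db in DATABASES for m in MARKERS}


# Placeholder taxon and flags, invented purely for illustration.
example = TaxonRecord(family="Exampleaceae", name="Example taxon")
example.mark("BOLD", "rbcL")
example.mark("GenBank", "rbcL")
example.mark("GenBank", "ITS2")
print(example.row())
```

Keeping the flags as (database, marker) pairs makes it possible to ask either database-specific questions or whether a marker is available anywhere, which are the two views used in the screening.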

Results and discussion
According to our data, the list includes 1211 taxa (including subspecies and varieties) of flowering plants comprising 80 families and 401 genera (Suppl. material 1: Table 1). The families with the highest species diversity are Asteraceae (144 taxa), Poaceae (127) and Cyperaceae (101), which are the most typical for the regional flora (Fig. 2). The other large families are Caryophyllaceae (71 taxa), Brassicaceae (66), Rosaceae (28), Juncaceae (28) and Lamiaceae (28). A total of 18 families are represented in the region by only one species. Of the taxa in the checklist, 213 are included in the lists of rare protected plants (The Red Data Book 2019). This checklist is subject to periodic revision based on new information from local checklists, nomenclatural changes and species discoveries.
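Family-level figures of the kind reported above (taxa per family, families with a single species) can be derived from a flat checklist. The sketch below assumes a list of (family, taxon) pairs; the function name `family_summary` and the placeholder entries are hypothetical and do not reproduce the actual checklist.

```python
from collections import Counter


def family_summary(checklist):
    """Return (families ranked by number of taxa, families with exactly one taxon).

    checklist: iterable of (family, taxon_name) pairs.
    """
    counts = Counter(family for family, _ in checklist)
    ranked = counts.most_common()  # sorted by count, descending
    singletons = sorted(f for f, n in counts.items() if n == 1)
    return ranked, singletons


# Placeholder entries, invented purely for illustration.
toy_checklist = [
    ("Family A", "Taxon 1"),
    ("Family A", "Taxon 2"),
    ("Family A", "Taxon 3"),
    ("Family B", "Taxon 4"),
    ("Family B", "Taxon 5"),
    ("Family C", "Taxon 6"),
]

ranked, singletons = family_summary(toy_checklist)
print(ranked)
print(singletons)
```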
The regional flowering plant species in the list were analyzed for the presence/absence of the DNA markers most often used for the molecular identification of species (rbcL, matK, ITS2 and trnH-psbA) in two global genetic databases, BOLD Systems and GenBank (Suppl. material 1: Table 1). Because the GenBank database lacks primary sequence information and reference sample metadata, the likelihood of species misidentification is high (Dormontt et al. 2018). Therefore, below we present the results of the analysis of the BOLD Systems database. Screening the list of flowering plant species of the Komi flora for the presence/absence of the molecular markers made it possible to establish the representation of each marker, which was 77, 73, 68 and 14% for the rbcL, matK, ITS2 and trnH-psbA sequences, respectively.
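The representation of a marker, as used above, is the share of checklist taxa for which at least one sequence of that marker is present. A minimal sketch of the calculation is given below, assuming each taxon is a dictionary of marker flags; the function name `marker_representation` and the toy rows are assumptions for illustration and are unrelated to the reported values.

```python
MARKERS = ("rbcL", "matK", "ITS2", "trnH-psbA")


def marker_representation(rows):
    """Percentage of taxa with each marker present.

    rows: list of dicts mapping marker name -> bool (missing keys count as absent).
    """
    total = len(rows)
    if total == 0:
        raise ValueError("empty checklist")
    return {
        m: 100.0 * sum(1 for r in rows if r.get(m, False)) / total
        for m in MARKERS
    }


# Four placeholder taxa, invented purely for illustration.
toy_rows = [
    {"rbcL": True, "matK": True, "ITS2": True, "trnH-psbA": False},
    {"rbcL": True, "matK": True, "ITS2": False, "trnH-psbA": False},
    {"rbcL": True, "matK": False, "ITS2": True, "trnH-psbA": True},
    {"rbcL": False, "matK": False, "ITS2": False, "trnH-psbA": False},
]

representation = marker_representation(toy_rows)
for marker, percent in representation.items():
    print(f"{marker}: {percent:.0f}%")
```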
Apart from the screening, we provide the list of flowering plant species (113 taxa) whose nucleotide sequences are absent from the international genetic databases (Table 1). These are potentially new species for the molecular taxonomy of the regional and world flora. They are endemic vascular plant species of the northeast of European Russia (e.g. Agrostis korczaginii) and of the Cis-Urals and Urals (e.g. Festuca pohleana, Lagotis uralensis, Astragalus gorodkovii, Linum komarovii subsp. boreale, and Trollius apertus), and rare species represented by isolated regional populations (e.g. Scorzonera glabra), which are endangered or vulnerable according to The Red Data Book of the Komi Republic (2019). Thus, using the flowering plants of the Komi Republic, we were able to determine the species missing from the molecular taxonomy of the regional and global flora.

Table 1. Species of flowering plants (Angiosperms) of the Komi Republic whose nucleotide sequences are not present in the BOLD Systems and GenBank databases.

Family
Species*

Alchemilla sibirica Zamelis
Alchemilla substrigosa Juz.
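A list of taxa without any sequences, such as the one in Table 1, corresponds to the checklist rows in which every database and marker flag is absent. The following sketch shows one way to select such rows; the function name `taxa_without_sequences`, the input layout and the placeholder taxa are assumptions for illustration only.

```python
def taxa_without_sequences(presence):
    """Return the sorted names of taxa with no marker present in any database.

    presence: dict mapping taxon name -> dict of (database, marker) -> bool.
    A taxon with an empty flag dict is treated as having no sequences.
    """
    return sorted(name for name, flags in presence.items()
                  if not any(flags.values()))


# Placeholder taxa and flags, invented purely for illustration.
toy_presence = {
    "Taxon 1": {("BOLD", "rbcL"): True, ("GenBank", "rbcL"): True},
    "Taxon 2": {("BOLD", "rbcL"): False, ("GenBank", "matK"): False},
    "Taxon 3": {("BOLD", "ITS2"): False, ("GenBank", "ITS2"): True},
    "Taxon 4": {},
}

missing = taxa_without_sequences(toy_presence)
print(missing)
```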

Conclusion
We provide an updated checklist of flowering plant species of the Komi Republic. The list is of high value for monitoring floristic changes in the northeast of European Russia. Our data can be integrated into large-scale floristic analyses. We also report the representation of the genetic markers most commonly used for species identification (rbcL, matK, ITS2 and trnH-psbA) in the BOLD Systems and GenBank genetic databases for the regional flowering plants. We present a checklist of the plant species that are not registered in the global genetic databases BOLD Systems and GenBank and are thus potentially new for DNA barcoding (especially rare endemic taxa) and for molecular taxonomy. This checklist outlines the species that should be targeted for molecular identification and for contributing new information to genetic databases.
Our future research will include a large-scale and coordinated analysis of the herbarium collections of the Institute of Biology, as well as the generation and curation of reference data. We believe that taxonomic identification and plant DNA barcoding will contribute to research on taxonomic and species diversity in the northeast of European Russia and the Ural Mountains.