5.4 Environmental Genomics, DNA Barcoding and Ecosystem Modelling

Charting the diversity of Earth has been exceedingly slow because it has been so time-consuming and expensive to identify and describe organisms.

More than 250 years after the start of taxonomy, only 10-20% of the planet's macroscopic diversity is known. The study of microbial diversity is largely restricted to organisms that can be grown in the lab. Among multicellular organisms, precious little is known about most decomposers and parasites, including the majority of fungi and many insects. As a result, biologists have so far been limited to the study of isolated bits and pieces in the complex fabric of natural ecosystems.

New DNA sequencing techniques are now changing all of this. About a decade ago, microbiologists began sequencing random DNA from environmental samples, such as sea water, leading to the development of the field now known as environmental genomics or metagenomics. About the same time, macrobiologists started systematically building reference libraries of selected DNA markers, “DNA barcodes”, that could be used for genetic identification of species. A large number of scientists are now contributing to the established global reference library for genetic identification of species, the Barcode of Life Database (BOLD). Web platforms for taxonomic annotation of DNA sequences from environmental samples, such as the UNITE system developed by mycologists, bridge environmental genomics and DNA barcoding.

Environmental genomics and large-scale DNA barcoding are likely to result in several important breakthroughs in coming years, most importantly in biomonitoring and ecosystem research, and Swedish scientists are well poised to play a prominent role in the field. The field relies heavily on e-Infrastructure, including access to medium- to large-scale storage for local databases or mirrors of international databases, and bioinformatics resources (software and hardware) for data processing and analysis. Many future projects will also develop web platforms allowing humans or machines to access, analyse, annotate, and contribute to the available data, and would benefit from an open and flexible environment.

Figure 5.3: Workflow for genetic species identification of biological samples. Image: Johan Bodegård.
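The matching step in such a workflow can be sketched in a few lines of Python. This is a minimal illustration, not the method used by BOLD or UNITE: the reference sequences, species labels, function names, and the 97% identity threshold are all hypothetical, and the sequences are assumed to be pre-aligned and of equal length. Real pipelines rely on alignment or search tools run against curated reference libraries.

```python
# Toy reference library: species label -> barcode sequence.
# The sequences are invented and far shorter than real barcodes.
REFERENCE_LIBRARY = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTTCGTACGAACGTACGA",
    "Species C": "TTGTACGGACGTACCTACGT",
}


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def identify(query: str, library=REFERENCE_LIBRARY, threshold: float = 97.0):
    """Return (species, identity) for the best match.

    The species is None when the best identity falls below the
    (assumed) threshold, i.e. the query is left unidentified.
    """
    best_species, best_identity = None, 0.0
    for species, reference in library.items():
        identity = percent_identity(query, reference)
        if identity > best_identity:
            best_species, best_identity = species, identity
    if best_identity < threshold:
        return None, best_identity
    return best_species, best_identity
```

Calling `identify("ACGTACGTACGTACGTACGT")` returns the exact match in the toy library, while a query that differs at one of its 20 positions falls below the assumed threshold and is reported as unidentified.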

5.4.1 Biomonitoring

Society spends large sums on biomonitoring for purposes such as assessment of environmental change, or detection of invasive species. Genetic identification has been extensively tested as an alternative to the traditional morphology-based approach used today. Despite requiring very little training, genetic identification has been found to be cheaper, faster, more reliable, and vastly more accurate than traditional methods. In addition, it typically allows a much higher proportion of the specimens to be identified to species. For instance, immature life stages typically make up the bulk of benthic biomonitoring samples. While morphology-based identification of immature organisms often stops at the family or genus level, genetic techniques allow species-level identification of the entire material. An important breakthrough will occur when next-generation sequencing of bulk samples, such as soil, water or trap samples, can reliably replace traditional identification methods.

5.4.2 Community ecology and “Biomics”

Genetic identification opens up a whole array of new research possibilities for ecologists. Key interactions among species, which used to take hours of painstaking observation to establish, can now be identified simply by examining a specimen for DNA traces of other organisms. These DNA traces are then matched against a complete DNA reference library for the local flora and fauna to identify the salient interactions. Recent research shows that genetic species-level identification can be applied to gut contents even after the food has been fully degraded and processed, making it possible to identify host species of parasitoids, prey items of carnivores, or host plants of herbivores even when the life history of the species is completely unknown. The entire food web around the spruce budworm, a major forest pest in North America, has been inferred in this way. The availability of material collected by the internationally unique Swedish Taxonomy Initiative (Svenska artprojektet) makes it possible to establish a DNA reference library for the complete Swedish flora and fauna of some 60,000 species relatively easily. Together with our strong ecological research tradition, this could make Swedish researchers world-leading in studies of whole ecosystems, or “biomics”.
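Inferring interactions from DNA traces amounts to building a directed graph from consumer to resource. The sketch below assumes the traces have already been identified to species against a reference library; the record layout, the placeholder species names, and the function name are hypothetical and serve only as illustration.

```python
from collections import defaultdict

# Each record: (species of the examined specimen,
#               species identified from DNA traces found in or on it).
# The names are placeholders, not real observations.
TRACE_RECORDS = [
    ("parasitoid wasp X", ["moth Y"]),
    ("moth Y", ["spruce", "fir"]),
    ("spider Z", ["moth Y", "parasitoid wasp X"]),
    ("moth Y", ["spruce"]),
]


def build_food_web(records):
    """Map each consumer to its resources, counting supporting specimens."""
    web = defaultdict(lambda: defaultdict(int))
    for consumer, traces in records:
        for resource in set(traces):   # count a resource once per specimen
            if resource != consumer:   # skip the specimen's own DNA
                web[consumer][resource] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {consumer: dict(resources) for consumer, resources in web.items()}
```

Here `build_food_web(TRACE_RECORDS)["moth Y"]` gives `{"spruce": 2, "fir": 1}` (in some key order), so each link carries the number of specimens supporting it, which is one simple way to separate recurrent interactions from one-off detections.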

5.4.3 Ecosystem modelling

Ecosystem modelling has traditionally been based on small datasets collected through time-consuming manual efforts and analysed separately.

During the last decade, there have been substantial international efforts to standardize data collection protocols, push the datasets onto the web, and develop and implement the controlled vocabularies, ontologies, and analytical tools needed to allow analyses across datasets. The Global Biodiversity Information Facility (GBIF) represents the largest and oldest of many national, regional or international initiatives pushing for these changes. Several recent trends are now accelerating this development. For instance, biodiversity researchers are becoming increasingly sophisticated in making use of “citizen scientists” in data collection efforts. Biodiversity observations by citizen scientists now account for the majority of the 400 million biodiversity occurrence records provided by GBIF to the research community.
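To illustrate why shared vocabularies matter for analyses across datasets, the sketch below maps two differently structured datasets onto common field names and then summarizes the pooled records. It assumes Darwin Core term names (`scientificName`, `decimalLatitude`, `decimalLongitude`, `basisOfRecord`), a standard that this text does not itself name; the two source datasets, their column names, and the function names are invented for the example.

```python
# Hypothetical per-dataset mappings from local column names to shared terms.
FIELD_MAPS = {
    "citizen_app": {
        "species": "scientificName",
        "lat": "decimalLatitude",
        "lon": "decimalLongitude",
    },
    "museum_db": {
        "taxon_name": "scientificName",
        "latitude_deg": "decimalLatitude",
        "longitude_deg": "decimalLongitude",
    },
}

# Assumed record type for each hypothetical source.
BASIS_OF_RECORD = {
    "citizen_app": "HumanObservation",
    "museum_db": "PreservedSpecimen",
}


def harmonize(source: str, record: dict) -> dict:
    """Rename a record's fields to the shared terms and tag its record type."""
    mapping = FIELD_MAPS[source]
    out = {shared: record[local] for local, shared in mapping.items()}
    out["basisOfRecord"] = BASIS_OF_RECORD[source]
    return out


def share_by_basis(records: list) -> dict:
    """Fraction of harmonized records per basisOfRecord value."""
    counts = {}
    for rec in records:
        basis = rec["basisOfRecord"]
        counts[basis] = counts.get(basis, 0) + 1
    return {basis: n / len(records) for basis, n in counts.items()}
```

Once records from both sources share the same field names, a question such as "what share of the occurrence records comes from human observation?" becomes a single pass over the pooled list rather than a per-dataset exercise.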

New sensor technology and camera systems have facilitated the collection of a large number of biodiversity observations, for example tracking data documenting the movement of individual animals. Swedish scientists are internationally prominent, for example, in the field of animal movement and in continental-scale modelling of vegetation, carbon cycle and greenhouse gas exchange [Smi+11]. Significant improvements are expected in the ability to model ecosystems and their resilience, understand past trends, and predict future changes, as is novel research that combines biotic and abiotic data. The e-Infrastructure required includes computing resources, storage, and web services to assist Swedish scientists in contributing data to international data portals and in mining existing data for their own research projects.

5.4.4 Potential breakthroughs

Methods to characterize large ecosystems and improved mapping of biodiversity. Ecosystem modelling of larger and more complex systems.