Despite the important role that microorganisms play in environmental processes, the small number of cultured microbes has limited our knowledge of their ecological strategies. However, the development of high-throughput sequencing has generated, without the need for culturing, a huge amount of genomic and metagenomic data that can be used to address ecological questions. Metagenomics is a means of acquiring and analyzing nucleic acid sequencing data derived from an environment rather than from a specific organism. Metagenomes can be analyzed using different approaches and tools. One of the most important distinctions lies in how taxonomic and functional assignment is performed: either by using assembly algorithms or by directly analyzing raw sequence reads through homology searching, k-mer analysis or detection of marker genes. Many instances of each approach can be found in the literature, but to the best of our knowledge no evaluation of their relative performance has been carried out. Among the wide variety of environments where microorganisms can be found, the oceans constitute the largest ecosystem and one of the least known. The activities of marine microorganisms drive the biogeochemical cycles and have a strong impact on the regulation of climate.
This thesis has three main objectives: (i) estimating the functional capabilities, genome sizes and 16S copy number of different taxa in relation to their ubiquity and their environmental preferences; (ii) evaluating the performance of different approaches for the analysis of metagenomes from three different environments (human gut, marine, and thermal); and (iii) studying particular aspects of the diversity, structure, and functional capabilities of marine bacterial communities.
To achieve the first goal, we compiled data on the presence of each prokaryotic genus in different environments. Then, genomic characteristics such as genome size, 16S rRNA gene copy number and functional content of the genomes were related to the ubiquity and environmental preferences of the corresponding taxa. The results showed clear correlations between genomic characteristics and environmental conditions.
To compare different approaches for functional and taxonomic annotation of metagenomes, we analyzed several real and mock metagenomes using different methodologies and tools and compared the resulting taxonomic and functional profiles. Our results show that database completeness (the representation of diverse organisms and taxa in it) is the main factor determining the performance of methods that rely on direct read assignment, whether by homology, k-mer composition or similarity to marker genes. Methods that rely on assembly and assignment of predicted genes are more influenced by metagenome size, which in turn determines the completeness of the assembly (the percentage of reads that were assembled).
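As a minimal sketch of the two quantities mentioned above, the following Python snippet computes assembly completeness as the percentage of reads assembled and compares two taxonomic profiles. The abstract does not state which metric was used to compare profiles; Bray-Curtis dissimilarity is used here purely as an assumption, and the function names and input values are hypothetical.

```python
def assembly_completeness(assembled_reads, total_reads):
    """Percentage of reads that were incorporated into the assembly."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * assembled_reads / total_reads


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Profiles are dicts mapping a taxon (or function) name to its abundance.
    This metric is an assumption chosen for illustration, not necessarily
    the one used in the thesis.
    """
    names = set(profile_a) | set(profile_b)
    diff = sum(abs(profile_a.get(n, 0.0) - profile_b.get(n, 0.0)) for n in names)
    total = sum(profile_a.get(n, 0.0) + profile_b.get(n, 0.0) for n in names)
    return diff / total if total else 0.0


# Hypothetical values, for illustration only.
completeness = assembly_completeness(assembled_reads=750, total_reads=1000)
same = bray_curtis({"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5})
disjoint = bray_curtis({"A": 1.0}, {"B": 1.0})
```

A dissimilarity of 0 means the two methods produced identical profiles, and 1 means they share no taxa at all.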
To gain insight into the diversity, structure, and functional capabilities of marine bacterial communities, we analyzed metagenomes from Malaspina 2010 Expedition samples. We selected the most abundant taxa by studying miTags. For the selected taxa, we then calculated the average copy number per genome of genes from surface and deep populations and selected the most variable genes. Our results show clear differences in the abundance of genes related to nutrient transport.
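The copy-number step can be sketched as follows. The abstract does not describe the normalization that was used; this sketch assumes that gene coverage is divided by the median coverage of single-copy marker genes, which is one common way to estimate copies per genome. The function names, gene labels and coverage values are hypothetical.

```python
from statistics import median


def copies_per_genome(gene_coverage, marker_coverage):
    """Estimate average copies per genome for each gene.

    gene_coverage: dict mapping gene name to its coverage in a population.
    marker_coverage: coverages of single-copy marker genes in the same
    population (assumed normalization, not taken from the thesis).
    """
    reference = median(marker_coverage)
    if reference <= 0:
        raise ValueError("marker coverage must be positive")
    return {gene: cov / reference for gene, cov in gene_coverage.items()}


def most_variable(surface, deep, top=2):
    """Rank genes by absolute difference in copy number between depths."""
    shared = surface.keys() & deep.keys()
    diffs = {g: abs(surface[g] - deep[g]) for g in shared}
    # Sort by decreasing difference, breaking ties by gene name.
    return sorted(diffs, key=lambda g: (-diffs[g], g))[:top]


# Hypothetical coverages for one taxon, for illustration only.
surface = copies_per_genome({"pstS": 20.0, "geneX": 10.0, "feoB": 5.0}, [10.0, 9.0, 11.0])
deep = copies_per_genome({"pstS": 4.0, "geneX": 8.0, "feoB": 8.0}, [8.0, 8.0, 8.0])
ranked = most_variable(surface, deep)
```

With these made-up inputs, the gene whose copy number changes most between the surface and deep populations is ranked first.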
Thus, we can conclude that ubiquity and adaptation were linked to genome size, while 16S copy number was not directly related to ubiquity. We observed that different combinations of genome size and 16S copy number delineate the different environments. In addition, the analysis of functional classes showed some clear signatures linked to particular environments.
In relation to metagenomic analysis, although differences exist, taxonomic profiles are rather similar between raw-read assignment and assembly-based assignment methods, while they are more divergent for methods based on k-mers and marker genes. Regarding functional annotation, analysis of raw reads retrieves more functions, but it also makes a substantial number of over-predictions. Assembly methods become more advantageous as the size of the metagenome increases.
Finally, the study of Malaspina 2010 samples revealed that high-affinity transporters for phosphorus (P) and iron (Fe) are only present in populations of oligotrophic and free-living microorganisms where these nutrients are scarce, whereas particle-attached microorganisms do not show this variation. In addition, copiotrophic microorganisms present a wider variety of transporters than oligotrophs, which is reflected in their larger genome size and allows them to obtain P and Fe from different sources.