Starting a Metagenomics Project
- Shotgun metagenomics can be used for taxonomic and functional studies.
- Metabarcoding can be used for taxonomic studies.
- Collecting metadata beforehand is fundamental for downstream analysis.
- We will use data from a Cuatro Ciénegas project to learn about shotgun metagenomics.
Assessing Read Quality
- It is important to know the quality of our data to make decisions in the subsequent steps.
- FastQC is a program that allows us to know the quality of FASTQ files.
- for loops let you perform the same operations on multiple files with a single command.
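To make "quality of FASTQ files" concrete, here is a minimal Python sketch of how the quality line of a FASTQ record maps to Phred scores and error probabilities. It assumes the common Phred+33 encoding; the function names and the toy quality string are illustrative and are not part of FastQC.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores.

    Assumes Phred+33 encoding unless another offset is given.
    """
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line, offset=33):
    """Average Phred score of one read (0.0 for an empty read)."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores) if scores else 0.0


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# Toy quality string (made up): four good bases followed by two poor ones.
example = "IIII##"
print(phred_scores(example))
print(round(mean_quality(example), 2))
print(error_probability(20))
```

A Phred score of 20 corresponds to a 1 in 100 chance of a wrong base call, which is why thresholds around 20 are often used when deciding what to trim.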
Trimming and Filtering
- The options you set for the command-line tools you use are important!
- Data cleaning is essential at the beginning of metagenomics workflows.
- Use Trimmomatic to get rid of adapters and low-quality bases or reads.
- Carefully fill in the parameters and options required to call a program from the bash shell.
- Automate repetitive workflows using for loops.
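The idea behind quality trimming can be sketched in a few lines of Python. This is a simplified illustration of a sliding-window approach, not Trimmomatic's actual implementation: the window size, quality threshold, and minimum length below are assumed values, and the function names are hypothetical.

```python
def sliding_window_trim(scores, window=4, min_mean=20):
    """Return how many leading bases to keep.

    Scans from the 5' end and cuts at the start of the first window
    whose mean quality falls below min_mean. Reads shorter than the
    window are kept whole in this simplified version.
    """
    for start in range(len(scores) - window + 1):
        chunk = scores[start:start + window]
        if sum(chunk) / window < min_mean:
            return start
    return len(scores)


def trim_read(sequence, quality_line, window=4, min_mean=20,
              min_len=3, offset=33):
    """Trim one read; return None if it ends up shorter than min_len."""
    scores = [ord(ch) - offset for ch in quality_line]  # Phred+33 assumed
    keep = sliding_window_trim(scores, window, min_mean)
    if keep < min_len:
        return None  # read is discarded, similar in spirit to a length filter
    return sequence[:keep], quality_line[:keep]


# Toy read (made up): six high-quality bases, then four low-quality ones.
print(trim_read("ACGTACGTAC", "IIIIII++++"))
print(trim_read("ACGT", "++++"))
```

Changing window, min_mean, or min_len changes which reads survive, which is why the options passed to a trimming tool deserve attention.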
Metagenome Assembly
- Assembly groups reads into contigs.
- De Bruijn graphs use k-mers to assemble cleaned reads.
- The program screen lets you keep remote sessions open.
- MetaSPAdes is a metagenome assembler.
- Assemblers take FASTQ files as input and produce a FASTA file as output.
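How k-mers turn reads into contigs can be illustrated with a toy De Bruijn graph in Python. This is a teaching sketch under strong assumptions (error-free reads, no reverse complements, no coverage information) and says nothing about how MetaSPAdes is implemented; the reads and function names are made up for illustration.

```python
from collections import defaultdict


def kmers(read, k):
    """All overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn(reads, k):
    """Build a graph whose nodes are (k-1)-mers and whose edges are k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])  # prefix -> suffix
    return graph


def walk_unbranched(graph, start):
    """Follow the path from start while each node has exactly one successor."""
    contig, node, visited = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:  # stop on cycles
            break
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig


# Two toy reads that overlap by "GGC".
graph = de_bruijn(["ATGGC", "GGCGT"], k=3)
print(walk_unbranched(graph, "AT"))
```

The two reads are merged into a single contig because they share k-mers; real metagenomes add branching paths from sequencing errors and closely related strains, which is what dedicated assemblers are built to resolve.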
Metagenome Binning
- Metagenome-Assembled Genomes (MAGs) are sometimes obtained from curated contigs grouped into bins.
- Use MaxBin to assign the contigs to bins of different taxa.
- Use CheckM to evaluate the quality of each Metagenome-Assembled Genome (MAG).
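Once completeness and contamination estimates are available for each bin, deciding which bins to keep is a simple filtering step. The Python sketch below shows one way to do it; the thresholds are assumptions loosely based on commonly used cut-offs, and the bin names and values are hypothetical, not real CheckM output.

```python
def classify_bin(completeness, contamination):
    """Label a bin from its completeness and contamination percentages.

    Thresholds are assumed for illustration; adjust them to your project.
    """
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    return "low"


# Hypothetical bins: name -> (completeness %, contamination %).
bins = {
    "bin.001": (95.2, 1.3),
    "bin.002": (62.0, 7.5),
    "bin.003": (30.1, 0.4),
    "bin.004": (98.0, 25.0),
}

for name, (comp, cont) in bins.items():
    print(name, classify_bin(comp, cont))

# Keep only the bins worth curating further.
kept = [name for name, (comp, cont) in bins.items()
        if classify_bin(comp, cont) != "low"]
print(kept)
```

Note that a bin can be nearly complete and still be labeled low quality when its contamination is high, as with the hypothetical bin.004.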
Taxonomic Assignment
- A database with previously gathered knowledge (genomes) is needed for taxonomic assignment.
- Taxonomic assignment can be done using Kraken.
- Krona and Pavian are web-based tools to visualize the assigned taxa.
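Before loading results into a visualization tool, it can help to inspect the assignment report directly. The following Python sketch parses a tab-separated report, assuming the six-column layout used by Kraken 2 reports (percentage, reads in clade, reads assigned directly, rank code, NCBI taxonomy ID, indented name). The report text is a made-up toy example, and the function names are hypothetical.

```python
def parse_report(text):
    """Parse report lines into dictionaries; skip malformed lines."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split("\t")
        if len(fields) != 6:
            continue
        pct, clade_reads, direct_reads, rank, taxid, name = fields
        rows.append({
            "percent": float(pct),
            "clade_reads": int(clade_reads),
            "direct_reads": int(direct_reads),
            "rank": rank.strip(),
            "taxid": int(taxid),
            "name": name.strip(),  # drop the indentation that encodes depth
        })
    return rows


def by_rank(rows, rank_code):
    """Select rows of one taxonomic rank, most abundant first."""
    selected = [r for r in rows if r["rank"] == rank_code]
    return sorted(selected, key=lambda r: r["clade_reads"], reverse=True)


# Toy report with invented read counts.
toy_report = (
    " 20.00\t200\t200\tU\t0\tunclassified\n"
    " 80.00\t800\t10\tR\t1\troot\n"
    " 30.00\t300\t3\tP\t1239\t    Firmicutes\n"
    " 50.00\t500\t5\tP\t1224\t    Proteobacteria\n"
)

rows = parse_report(toy_report)
for row in by_rank(rows, "P"):  # "P" is the rank code for phylum
    print(row["name"], row["clade_reads"])
```

The "unclassified" row is a reminder that reads with no match in the reference database cannot be assigned, which is why the choice of database matters.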
Exploring Taxonomy with R
- kraken-biom formats the Kraken output files of several samples into a single .biom file that serves as phyloseq input.
- The library phyloseq manages metagenomics objects and computes analyses.
- A phyloseq object stores a table with the taxonomic information of each OTU and a table with the abundance of each OTU.
Diversity Tackled With R
- Alpha diversity measures the intra-sample diversity.
- Beta diversity measures the inter-sample diversity.
- Phyloseq includes diversity analyses such as alpha and beta diversity calculation.
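To show what these two measures compute, here is a small Python sketch of one alpha diversity index (Shannon) and one beta diversity measure (Bray-Curtis dissimilarity). The choice of these two particular measures is an assumption made for illustration, the abundance vectors are made up, and in the lesson itself these calculations are done in R with phyloseq rather than with this code.

```python
import math


def shannon(counts):
    """Shannon index (natural log) of one sample's OTU counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples.

    Both vectors must list the same OTUs in the same order.
    0 means identical composition, 1 means no shared OTUs.
    """
    if len(a) != len(b):
        raise ValueError("samples must cover the same OTUs")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Toy abundance vectors for three OTUs in two samples.
sample_a = [10, 10, 0]
sample_b = [0, 10, 10]

print(round(shannon(sample_a), 4))                # intra-sample diversity
print(round(bray_curtis(sample_a, sample_b), 4))  # inter-sample diversity
```

Alpha diversity takes a single sample as input, while beta diversity always compares a pair of samples, which matches the intra-sample and inter-sample distinction above.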
Taxonomic Analysis with R
- Depths and abundances can be visualized using phyloseq.
- The library phyloseq lets you manipulate metagenomic data from a taxon-specific perspective.
Other Resources
- Enjoy metagenomics.