Starting a Metagenomics Project


  • Shotgun metagenomics can be used for taxonomic and functional studies.
  • Metabarcoding can be used for taxonomic studies.
  • Collecting metadata beforehand is fundamental for downstream analysis.
  • We will use data from a Cuatro Ciénegas project to learn about shotgun metagenomics.

Assessing Read Quality


  • It is important to know the quality of our data to make decisions in the subsequent steps.
  • FastQC is a program that reports the quality of the reads in FASTQ files.
  • for loops let you perform the same operations on multiple files with a single command.
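
As a minimal sketch of the loop pattern, the following creates two tiny FASTQ files in a temporary directory and reports the number of reads in each one with a single loop. The file contents are toy data invented for this illustration, and count_reads is a hypothetical helper that assumes the common four-lines-per-record FASTQ layout. With real data, the loop body would typically call a tool such as FastQC instead.

```shell
# Toy data: two small FASTQ files in a temporary directory (made up for this sketch).
workdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n' > "$workdir/sampleA.fastq"
printf '@r1\nTTAA\n+\nIIII\n' > "$workdir/sampleB.fastq"

# Hypothetical helper: assuming four lines per record, reads = lines / 4.
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# One loop applies the same operation to every matching file.
summary=""
for fastq in "$workdir"/*.fastq; do
    name=$(basename "$fastq" .fastq)
    reads=$(count_reads "$fastq")
    echo "$name: $reads reads"
    summary="$summary$name=$reads;"
done
```

The loop variable takes the value of each matching path in turn, so adding a third file to the directory needs no change to the command.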

Trimming and Filtering


  • The options you set for the command-line tools you use are important!
  • Data cleaning is essential at the beginning of metagenomics workflows.
  • Use Trimmomatic to get rid of adapters and low-quality bases or reads.
  • Carefully fill in the parameters and options required to run a command in the bash shell.
  • Automate repetitive workflows using for loops.
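
One way to automate trimming over several paired-end samples is sketched below as a dry run: the loop builds one Trimmomatic command per sample and prints it rather than executing it. The sample file names, the adapter file TruSeq3-PE.fa, the thresholds, and the trimmomatic wrapper name are all assumptions made for illustration; adjust them to your installation and data.

```shell
# Dry run: build one Trimmomatic paired-end command per sample and print it
# instead of executing it. File names and thresholds below are assumptions.
workdir=$(mktemp -d)
touch "$workdir/sampleA_1.fastq.gz" "$workdir/sampleA_2.fastq.gz" \
      "$workdir/sampleB_1.fastq.gz" "$workdir/sampleB_2.fastq.gz"

commands=""
for forward in "$workdir"/*_1.fastq.gz; do
    # Strip the directory and the _1.fastq.gz suffix to get the sample name.
    base=$(basename "$forward" _1.fastq.gz)
    # PE mode takes two inputs, then paired/unpaired outputs for each input,
    # then the trimming steps, which run in the order listed.
    cmd="trimmomatic PE ${base}_1.fastq.gz ${base}_2.fastq.gz \
${base}_1.trim.fastq.gz ${base}_1un.trim.fastq.gz \
${base}_2.trim.fastq.gz ${base}_2un.trim.fastq.gz \
ILLUMINACLIP:TruSeq3-PE.fa:2:40:15 SLIDINGWINDOW:4:20 MINLEN:35"
    echo "$cmd"
    commands="$commands$base "
done
```

Printing the commands first makes it easy to check that every input and output name is what you expect before any file is written.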

Metagenome Assembly


  • Assembly groups reads into contigs.
  • De Bruijn graphs use k-mers to assemble cleaned reads.
  • The program screen lets you keep remote sessions open.
  • MetaSPAdes is a metagenome assembler.
  • Assemblers take FASTQ files as input and produce a FASTA file as output.
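
To make the k-mer idea concrete, here is a small sketch that lists the k-mers of a sequence together with the de Bruijn graph edge each one defines, running from the k-mer's first k-1 bases to its last k-1 bases. The helper kmer_edges and the example sequence are hypothetical. Real assemblers do much more than this, for example error correction and graph simplification.

```shell
# List the k-mers of a sequence and the de Bruijn edge each one defines:
# an edge from the k-mer's (k-1)-base prefix to its (k-1)-base suffix.
# kmer_edges is a hypothetical helper written for this illustration.
kmer_edges() {
    # $1 = sequence, $2 = k
    awk -v seq="$1" -v k="$2" 'BEGIN {
        for (i = 1; i + k - 1 <= length(seq); i++) {
            kmer = substr(seq, i, k)
            print kmer, substr(kmer, 1, k - 1) "->" substr(kmer, 2)
        }
    }'
}

edges=$(kmer_edges "ATGGCGT" 3)
echo "$edges"
```

A sequence of length n yields n - k + 1 k-mers, so the 7-base example produces five 3-mers. Following the edges from AT through TG, GG, GC and CG to GT spells the original sequence back out.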

Metagenome Binning


  • Metagenome-Assembled Genomes (MAGs) are sometimes obtained from curated contigs grouped into bins.
  • Use MaxBin to assign the contigs to bins of different taxa.
  • Use CheckM to evaluate the quality of each Metagenome-Assembled Genome.

Taxonomic Assignment


  • A database with previously gathered knowledge (genomes) is needed for taxonomic assignment.
  • Taxonomic assignment can be done using Kraken.
  • Krona and Pavian are web-based tools to visualize the assigned taxa.
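
Before turning to a visualization tool, a report can be inspected directly in the shell. The sketch below assumes the six-column, tab-separated layout of a Kraken 2 report (percentage, reads in the clade, reads assigned directly, rank code, NCBI taxonomy ID, indented name) and pulls out the phylum-level rows. The report contents are toy values invented for this illustration, not real results.

```shell
# Toy report in the assumed Kraken 2 layout (counts are made up).
report=$(mktemp)
printf '20.00\t20\t20\tU\t0\tunclassified\n' > "$report"
printf '80.00\t80\t0\tR\t1\troot\n' >> "$report"
printf '80.00\t80\t0\tD\t2\t    Bacteria\n' >> "$report"
printf '50.00\t50\t50\tP\t1224\t      Proteobacteria\n' >> "$report"
printf '30.00\t30\t30\tP\t201174\t      Actinobacteria\n' >> "$report"

# Keep rows whose rank code (column 4) is P for phylum, strip the indentation
# from the name (column 6), and print the name with its clade read count.
phyla=$(awk -F '\t' '$4 == "P" { gsub(/^ +/, "", $6); print $6 "\t" $2 }' "$report")
echo "$phyla"
```

Changing the rank code in the filter (for example G for genus) selects a different taxonomic level from the same report.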

Exploring Taxonomy with R


  • kraken-biom combines the Kraken output files of several samples into a single .biom file that serves as input for phyloseq.
  • The phyloseq library manages metagenomics objects and runs analyses on them.
  • A phyloseq object stores a table with the taxonomic information of each OTU and a table with the abundance of each OTU.

Diversity Tackled With R


  • Alpha diversity measures the intra-sample diversity.
  • Beta diversity measures the inter-sample diversity.
  • phyloseq includes functions to calculate alpha and beta diversity.
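
To show what these two measures compute, the sketch below implements one common index of each kind with awk: the Shannon index for alpha diversity and the Bray-Curtis dissimilarity for beta diversity. The helper names shannon and bray_curtis and the counts are hypothetical, chosen so the results are easy to check by hand. This is an illustration of the formulas, not a description of how phyloseq computes them internally.

```shell
# Shannon index H = -sum(p_i * ln(p_i)) for one sample's taxon counts.
shannon() {
    # arguments: one count per taxon; zero counts are skipped
    printf '%s\n' "$@" | awk '
        $1 > 0 { n[++m] = $1; total += $1 }
        END {
            for (i = 1; i <= m; i++) { p = n[i] / total; h -= p * log(p) }
            printf "%.4f\n", h
        }'
}

# Bray-Curtis dissimilarity between two samples:
# 1 - 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b)).
bray_curtis() {
    # $1 and $2: space-separated counts, same taxa in the same order
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 1; i <= NF; i++) { a[i] = $i + 0; total += $i } }
        NR == 2 { for (i = 1; i <= NF; i++) {
                      b = $i + 0
                      shared += (b < a[i] ? b : a[i])
                      total += b
                  } }
        END { printf "%.4f\n", 1 - 2 * shared / total }'
}

shannon 10 10 10 10                  # four equally abundant taxa, so H = ln(4)
bray_curtis "10 0 5 5" "5 5 5 5"     # 15 shared counts out of 40 in total
```

With these inputs the first call prints 1.3863 and the second prints 0.2500. A Bray-Curtis value of 0 means two samples have identical counts, and 1 means they share no taxa.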

Taxonomic Analysis with R


  • Depths and abundances can be visualized using phyloseq.
  • The phyloseq library lets you manipulate metagenomic data from a taxon-specific perspective.

Other Resources


  • Enjoy metagenomics.