Summary and Schedule
Welcome to this lesson on the principles of pangenomics, a rapidly advancing field in bioinformatics. In this course, you will study the theory that underpins the analysis of pangenomes. Using command-line software, you will gain hands-on experience in downloading and annotating public bacterial genomes, two essential genomic analysis skills.
One of the key highlights of this course is the opportunity to work with specialized programs designed for pangenomics analysis. You will learn to cluster genes into families and to construct interactive pangenome graphs and plots, visualization tools for studying the overall structure of a pangenome and the gene families that compose it. Finally, you will explore how to apply Topological Data Analysis to the study of pangenomes.
The analyses presented here were selected to equip you with the tools needed to build a first pangenomics pipeline. By practicing your bioinformatics skills, you will gain confidence and be well prepared to explore further resources (see Other Resources). From there, you can develop a personalized workflow tailored to the specific objectives of your pangenomics research.
Get ready to embark on this exciting journey into the world of Pangenomics, where you will unlock new insights and unravel the complexities of genomic variation!
Pre-requisites
Before starting this lesson on pangenomics, you need a working understanding of the Bash shell and the Python language. If you are not already familiar with them, we recommend completing the Introduction to the Command Line for Pangenomics and Introduction to Python for Pangenomics lessons before starting this one.
Additionally, some familiarity with biological concepts is assumed for this lesson. A basic understanding of prokaryotes, genomes, genes, and orthology is beneficial. If you are new to these concepts, we encourage you to review relevant materials so that you have a solid foundation for this course.
Throughout this lesson, we will use data hosted on an Amazon Machine Image (AMI). Workshop participants will receive information on how to log in to the AMI during the workshop. If you are studying independently, you must set up your own AMI or install the necessary programs on your computer. Detailed instructions on setting up an AMI and accessing the required data can be found on the Pangenomics Workshop Setup page.
If you are taking this workshop at UNAM-CCM, you will access the shell, Python, and all the bioinformatics programs through a JupyterHub server.
This lesson is the third part of the Pangenomics Workshop, which also includes Introduction to the Command Line for Pangenomics and Introduction to Python for Pangenomics.
| | Setup Instructions | Download files required for the lesson |
| Duration: 00h 00m | 1. Introduction to Pangenomics | What is a pangenome? What are the components of a pangenome? |
| Duration: 00h 25m | 2. Downloading Genomic Data | How to download public genomes by using the command line? |
| Duration: 01h 10m | 3. Annotating Genomic Data | How can I identify the genes in a genome? |
| Duration: 02h 15m | 4. Measuring Sequence Similarity | How can we measure differences in gene sequences? |
| Duration: 02h 55m | 5. Clustering with BLAST Results | How can we use the blast results to form families? |
| Duration: 03h 30m | 6. Clustering Protein Sequences | Can I cluster my sequences automatically? |
| Duration: 04h 10m | 7. Exploring Pangenome Graphs | How can I build a pangenome of thousands of genomes? How can I visualize the spatial relationship between gene families? |
| Duration: 04h 50m | 8. Interactive Pangenome Plots | How can I obtain an interactive pangenome plot? How can I measure the homogeneity of the gene families? How to obtain an enrichment analysis of the gene families? How to compute the ANI values between the genomes of the pangenome? |
| Duration: 05h 20m | 9. Other Resources | What can I do after I have built a pangenome? What bioinformatic tools are available for downstream analysis of pangenomes? |
| Duration: 05h 40m | Finish | |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
For setup instructions, please go to the Pangenomics Workshop Overview Setup page.