Scientific reproducibility: What is it for?
- Reproducible research is key for scientific advancement.
- RStudio can help you to organize, have better control over, and produce reproducible research.
Good Practices for Managing Projects in RStudio
- Use best practices for file and folder organization. This includes using relative file paths rather than absolute file paths.
- Make sure that all data are backed up on multiple devices and that you treat raw data as read-only.
- We can use Git and GitHub to keep track of what we’ve done in the past and what we plan to do in the future.
- Rproj files are pivotal to keeping everything bundled and organized.
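The path and raw-data advice above can be sketched in R as follows. This is a minimal illustration rather than a prescribed layout: the project folder, the data/raw subfolder, and the file measurements.csv are all hypothetical, and a temporary directory stands in for a real project root.

```r
# Hypothetical project root; in practice this is the folder holding your .Rproj file.
project_dir <- tempfile("my-project-")
dir.create(file.path(project_dir, "data", "raw"), recursive = TRUE)

# Opening an .Rproj file sets the working directory to the project root;
# setwd() is used here only to imitate that.
old_wd <- setwd(project_dir)

# Relative path, built with file.path() so it works on any operating system.
raw_file <- file.path("data", "raw", "measurements.csv")
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)),
          raw_file, row.names = FALSE)

# Mark the raw file read-only so scripts cannot overwrite it by accident.
Sys.chmod(raw_file, mode = "0444")

# Analysis code reads the raw data and writes its results elsewhere.
measurements <- read.csv(raw_file)

# Restore the original working directory.
setwd(old_wd)
```

Because the path is relative to the project root, the same script should run unchanged on a collaborator's machine once they open the project.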
Navigating RStudio and Quarto Documents
- RStudio has four panels to organize your code and environment.
- Manage packages in RStudio using specific functions.
- Quarto documents combine text and code.
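As one possible way to manage packages from code, the sketch below installs a package only when it is missing and then attaches it. The helper name use_package is hypothetical, not part of R or RStudio; the functions it calls (requireNamespace, install.packages, library, packageVersion) are standard.

```r
# Hypothetical helper: install a package only if it is missing, then attach it.
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  # Return the installed version so it can be recorded alongside the analysis.
  invisible(as.character(packageVersion(pkg)))
}

# "tools" ships with R, so this call never triggers an installation.
tools_version <- use_package("tools")
```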
Working with projects in RStudio
Introduction to Working with Quarto documents
- Quarto lets you create reproducible documents.
- A qmd file consists of a YAML header, markdown-formatted text, and code chunks.
- The render function converts the file into the chosen output format.
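To make the three parts of a qmd file concrete, the sketch below writes a very small document from R. The title, the file name example-report.qmd, and the chunk contents are illustrative assumptions. The render call is left commented out because it depends on the Quarto CLI and the quarto R package being installed.

```r
fence <- strrep("`", 3)  # three backticks open and close a code chunk

qmd_lines <- c(
  "---",                          # YAML header starts
  "title: \"Example report\"",
  "format: html",
  "---",                          # YAML header ends
  "",
  "## Summary",
  "",
  "This paragraph is plain markdown text.",
  "",
  paste0(fence, "{r}"),           # code chunk starts
  "summary(cars)",
  fence                           # code chunk ends
)

qmd_path <- file.path(tempdir(), "example-report.qmd")
writeLines(qmd_lines, qmd_path)

# Rendering converts the file into the format named in the YAML header.
# quarto::quarto_render(qmd_path)
```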
Writing and Styling Qmd Documents
- The visual editor has made formatting much easier.
- You can apply Qmd styling without prior Quarto knowledge.
- You can include inline code in your narrative for basic calculations and dynamic information.
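A small sketch of what inline code does: the value is computed in R and inserted into the sentence at render time, so the text stays in sync with the data. The scores vector and the sentence are made up for illustration.

```r
# Made-up data for illustration.
scores <- c(72, 85, 91, 64)
mean_score <- round(mean(scores), 1)

# In the qmd narrative you would write:
#   The mean score was `r mean_score`.
# At render time the expression is replaced by its value, much like this:
sentence <- paste0("The mean score was ", mean_score, ".")
print(sentence)
```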
Adding Code to Quarto Documents
- Knitr will render your code and markdown-formatted text and output your document format of choice.
- Code chunks are runnable pieces of R code.
- Setting your working directory at the project level can effectively mitigate path-related challenges encountered while working on Quarto documents.
Rendering & Customizing Code Outputs
- Each time you render/knit the document, calculations and plots will run and be displayed.
- Options for code chunks can be set at the document level.
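One way to set chunk options for the whole document is to call knitr::opts_chunk$set() in an early chunk. The sketch below assumes the knitr engine is in use; the particular option values are illustrative, not recommendations. In Quarto, similar defaults can also be declared under the execute key of the YAML header.

```r
# Guarded so the snippet also runs where knitr is not installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    echo = FALSE,      # hide source code in the rendered output
    warning = FALSE,   # suppress warnings
    message = FALSE,   # suppress package start-up messages
    fig.width = 6      # default figure width in inches
  )
}
```

Options set on an individual chunk still override these document-level defaults.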
Advanced Code Chunk Options
- Learn how to source external code with source().
- Learn how to modularize your code to make it more reproducible.
- Use a chunk at the beginning of your document to load libraries and data to make your document more efficient.
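The idea of sourcing external code can be sketched as follows. The script name helpers.R and the function celsius_to_fahrenheit are hypothetical; in a real project the script would live in the project folder rather than being written to a temporary directory.

```r
# Write a small helper script; normally this file already exists in your project.
helper_path <- file.path(tempdir(), "helpers.R")
writeLines(c(
  "# helpers.R: reusable functions kept out of the main document",
  "celsius_to_fahrenheit <- function(celsius) {",
  "  celsius * 9 / 5 + 32",
  "}"
), helper_path)

# In the setup chunk of the document, load the helpers once...
source(helper_path)

# ...and reuse them in any later chunk.
temps_f <- celsius_to_fahrenheit(c(0, 37, 100))
```

Keeping functions in a separate script means several documents can share the same tested code instead of each carrying its own copy.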
Bibliography, Citations & Cross-Referencing
- RStudio supports different lookup strategies to make the citation process easier.
- RStudio supports different citation styles.
- The YAML can be adjusted to display uncited items in the reference list.
- Use bookdown to cross-reference content.
Using Git in RStudio
- RStudio integrates with Git to track and commit changes performed locally.
- You may use .gitignore to determine which files Git should ignore when tracking changes.
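As an illustration, a .gitignore file is plain text with one pattern per line, so it can be created from R. The first three entries are files RStudio and R commonly generate; the last two patterns are assumptions about what a project might want to exclude. A temporary folder stands in for the project root.

```r
# Temporary folder standing in for the project root.
project_dir <- tempfile("my-project-")
dir.create(project_dir)

ignore_rules <- c(
  ".Rproj.user",   # RStudio's per-user project state
  ".Rhistory",     # console command history
  ".RData",        # saved workspace image
  "*_cache/",      # assumption: cached chunk results
  "*.html"         # assumption: rendered outputs that can be regenerated
)

gitignore_path <- file.path(project_dir, ".gitignore")
writeLines(ignore_rules, gitignore_path)
```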
Collaborating via GitHub
- Set up RStudio to authenticate with GitHub using a Personal Access Token (PAT).
- Setting the Git remote origin in your RStudio project lets you push and pull between your local copy of the repository and the repository on GitHub.
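One possible way to carry out both steps from the R console uses the usethis and gitcreds packages. That choice is an assumption; the same setup can be done through the RStudio interface or the command line. The repository URL is a placeholder. The calls are interactive, so they are guarded and do nothing in a non-interactive session.

```r
# Placeholder repository URL; replace with your own.
remote_url <- "https://github.com/your-username/your-repo.git"

if (interactive() &&
    requireNamespace("usethis", quietly = TRUE) &&
    requireNamespace("gitcreds", quietly = TRUE)) {
  # Opens GitHub in the browser so you can generate a PAT.
  usethis::create_github_token()

  # Prompts for the PAT and saves it in the Git credential store.
  gitcreds::gitcreds_set()

  # Run from inside the project: points the remote named "origin" at GitHub.
  usethis::use_git_remote("origin", url = remote_url)
}
```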
Managing Dependencies in R/RStudio
- Run the sessionInfo() function to take a snapshot of your computational environment.
- Groundhog is a handy tool for capturing your project’s package dependencies.
- Make your R scripts reproducible by replacing library(pkg) with groundhog.library(pkg, date).
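A minimal sketch of both ideas follows. The output file name session-info.txt, the package dplyr, and the date are illustrative assumptions. The groundhog call is guarded because it may download and install packages.

```r
# Capture the current R version, platform, and attached packages.
env_snapshot <- sessionInfo()

# Save the snapshot as text so it can be stored alongside the analysis.
snapshot_path <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(print(env_snapshot)), snapshot_path)

# Instead of library(dplyr), load the version that was current on a fixed date.
if (interactive() && requireNamespace("groundhog", quietly = TRUE)) {
  groundhog::groundhog.library("dplyr", "2024-01-15")  # assumed package and date
}
```

Fixing the date in the script means a later rerun should load the same package versions, even if newer releases have appeared since.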
Publishing your project
- You may choose to share and publish your data project before publishing its associated manuscript.
- Sharing the code, data, and documentation is necessary to allow for inspection and research reproducibility.
- The quarto-journals GitHub organization has journal article formats available for use.
Creating and sharing reproducible environments with renv
- renv is a handy tool for capturing your project’s package dependencies.
- renv creates a JSON lock file that documents dependencies and lets users restore the original package versions used for a particular project.
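The typical renv workflow can be sketched with three calls. This is an outline under the assumption that renv is installed and the code is run from the project root. It is guarded because each call modifies the project or installs packages.

```r
renv_available <- requireNamespace("renv", quietly = TRUE)

if (interactive() && renv_available) {
  # Once per project: create a project-specific library and an renv.lock file.
  renv::init()

  # After installing or upgrading packages: record the versions in renv.lock.
  renv::snapshot()

  # On another machine, or later: reinstall the versions listed in renv.lock.
  renv::restore()
}
```

Committing renv.lock to the repository is what lets collaborators run the restore step.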