Interactive and Reproducible Workflows for Exploring and Modeling RNA‐seq Data with pcaExplorer, Ideal, and GeneTonic

A Ludt, A Ustjanzew, H Binder, K Strauch, F Marini
Current Protocols, 2022 - Wiley Online Library
Abstract
The generation and interpretation of results from transcriptome profiling experiments via RNA sequencing (RNA‐seq) can be a complex task. While raw data quality control, alignment, and quantification can be streamlined via efficient algorithms that can deliver the preprocessed expression matrix, a common bottleneck in the analysis of such large datasets is the subsequent in‐depth, iterative processes of data exploration, statistical testing, visualization, and interpretation. Specific tools for these workflow steps are available but require a level of technical expertise which might be prohibitive for life and clinical scientists, who are left with essential pieces of information distributed among different tabular and list formats.
Our protocols are centered on the joint use of our Bioconductor packages (pcaExplorer, ideal, GeneTonic) for interactive and reproducible workflows. All our packages provide an interactive and accessible experience via Shiny web applications, while still documenting the steps performed with RMarkdown as a framework to guarantee the reproducibility of the analyses, reducing the overall time to generate insights from the data at hand.
These protocols guide readers through the essential steps of Exploratory Data Analysis, statistical testing, and functional enrichment analyses, followed by integration and contextualization of results. In our packages, the core elements are linked together in interactive widgets that make drill‐down tasks efficient by letting users view the data at increasing levels of detail. Thanks to their interoperability with essential classes and gold‐standard pipelines implemented in the open‐source Bioconductor project and community, these protocols support complex tasks in RNA‐seq data analysis, combining interactivity and reproducibility in line with modern best scientific practices and helping to streamline the discovery process for transcriptome data. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.
Basic Protocol 1: Exploratory Data Analysis with pcaExplorer
Basic Protocol 2: Differential Expression Analysis with ideal
Basic Protocol 3: Interpretation of RNA‐seq results with GeneTonic
Support Protocol: Downloading and installing pcaExplorer, ideal, and GeneTonic
Alternate Protocol: Using functions from pcaExplorer, ideal, and GeneTonic in custom analyses