README for pRSEM demo


<a name="Introduction"></a> Introduction

Prior-enhanced RSEM (pRSEM) is an RNA-seq quantification method that utilizes external data for the task of transcript abundance estimation. The workflow of pRSEM is illustrated in the following figure.

[Figure: pRSEM workflow]

This repository is a mini-example for running pRSEM. It contains all the required software packages, input files, and scripts. More details are given below. The demo runs with 4 threads. Installation and running together take about 20 to 30 minutes on a machine with four 2.4 GHz cores, depending on which R/Bioconductor libraries are already installed.
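Because the demo uses 4 threads and depends on R/Bioconductor, it can help to check the machine before starting. The following is a minimal pre-flight sketch, not part of the demo's own scripts; the variable names and messages are illustrative, and it only assumes that `getconf` and `command -v` are available in the shell.

```shell
#!/usr/bin/env bash
# Pre-flight sketch (illustrative, not shipped with the demo):
# report the number of online cores and whether Rscript is on PATH.

required_threads=4   # thread count stated for this demo

# Ask the system for the number of online processors; fall back to 1.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Guard against empty or non-numeric answers such as "undefined".
case "$cores" in
  ''|*[!0-9]*) cores=1 ;;
esac

if [ "$cores" -ge "$required_threads" ]; then
  echo "OK: $cores cores available for $required_threads threads"
else
  echo "NOTE: only $cores cores; the demo may run longer than 20 to 30 minutes"
fi

# R is needed for the R/Bioconductor libraries mentioned above.
if command -v Rscript >/dev/null 2>&1; then
  echo "OK: Rscript found at $(command -v Rscript)"
else
  echo "NOTE: Rscript not found on PATH"
fi
```

The script only prints notes and never aborts, so it is safe to run on any machine.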

<a name="Download"></a> Download

There are two ways to download this demo and all three required submodules

<a name="SystemRequirements"></a> System Requirements

<a name="Submodules"></a> Submodules

This demo requires three submodules:

<a name="Usage"></a> Usage

Go to this demo's folder and run:

./run_pRSEM_demo.sh

This script will carry out the following tasks:

  1. Install Bowtie, STAR, RSEM, and all required libraries not yet installed by users
  2. Prepare genome references for Bowtie, STAR, and RSEM
  3. Derive prior parameters from RNA Polymerase II ChIP-seq data and use them to quantify RNA-seq data
  4. Derive prior parameters from a combination of four histone modification ChIP-seq data sets and use them to quantify RNA-seq data
  5. Perform a testing procedure using a combination of four histone modification ChIP-seq data sets as the external data
  6. Perform a testing procedure using RNA Polymerase II ChIP-seq peaks as the external data
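Since the script runs several installation and quantification steps in sequence, keeping a log makes it easier to see which step failed. The helper below is a hypothetical sketch in bash: `run_with_log` and the log file name are not part of the demo, and the only demo file it refers to is `run_pRSEM_demo.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the demo): run a script, save everything
# it prints to a log file, and report its exit status and elapsed time.
run_with_log() {
  local script="$1" log="$2"
  if [ ! -x "$script" ]; then
    echo "not executable: $script" >&2
    return 1
  fi
  local start=$SECONDS
  # Merge stderr into stdout so errors also end up in the log.
  "$script" 2>&1 | tee "$log"
  # PIPESTATUS[0] is the exit status of the script, not of tee.
  local rc=${PIPESTATUS[0]}
  echo "exit status $rc after $((SECONDS - start)) s" | tee -a "$log"
  return "$rc"
}

# Usage, from the demo's folder (log name is an arbitrary choice):
# run_with_log ./run_pRSEM_demo.sh run_pRSEM_demo.log
```

A non-zero exit status on the last log line indicates that one of the steps did not complete.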

<a name="Input"></a> Input

All of the following data sets are under the folder input/. The RNA-seq and Pol II ChIP-seq data were derived from the ENCODE2 mouse Mel cell line. Although they come from a cell line rather than from tissue, we named them with the keyword mmliver to be consistent with the examples given in RSEM's documentation. The four histone modification ChIP-seq data sets were derived from Lara-Astiaso and Weiner et al. Science 2014 345:943.

<a name="Output"></a> Output

All output files will be stored in the following four folders under output/:

Please note that, to keep the running time as short as possible, the input ChIP-seq and RNA-seq files were prepared in extremely small (and unrealistic) sizes, and Gibbs sampling was set to run for just 100 steps instead of the default 1000. As a result, the final quantification results may vary from run to run.

<a name="Contact"></a> Contact

Got a question? Please post it at RSEM's GitHub Issues page with @pliu55 mentioned.

<a name="License"></a> License

This demo is licensed under the GNU General Public License v3.