preprocessing_RNAseq.smk
Snakemake workflow for preprocessing paired-end and single-end bulk RNA-seq data. The layout parameter in the config file controls the mode.
Note
Please make sure that Singularity and Snakemake are installed on your system and that you have cloned the SnakeNgs repository.
Workflow
Paired-end (layout: "paired")
The rulegraph was created by snakevision.
- Quality control using fastp with the default parameters.
- Alignment using STAR with the parameter `--outFilterMultimapNmax 1`.
- Convert the SAM file to BAM file and sort using samtools.
- Collect metrics using Picard `CollectRnaSeqMetrics` and `CollectInsertSizeMetrics`.
- Make bigWig files using deepTools `bamCoverage` with the parameter `--binSize 1`.
- Make summary statistics using MultiQC.
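As a rough illustration of how the paired-end steps chain together, the sketch below assembles one command line per step in Python. The helper name `paired_end_commands`, the sample name, the file names, and the index directory are all hypothetical, and the Picard and MultiQC steps are left out. Only `--outFilterMultimapNmax 1` and `--binSize 1` come from the list above; the actual rules in the workflow may pass different or additional options.

```python
import shlex


def paired_end_commands(sample, star_index, threads=4):
    """Return one argument list per step for a paired-end sample (illustrative only)."""
    # Hypothetical file names; the real workflow may name things differently.
    r1, r2 = f"{sample}_1.fastq.gz", f"{sample}_2.fastq.gz"
    t1, t2 = f"{sample}_1.trimmed.fastq.gz", f"{sample}_2.trimmed.fastq.gz"
    sam = f"{sample}.Aligned.out.sam"  # STAR's default SAM name for the prefix used below
    bam = f"{sample}.sorted.bam"
    return [
        # Quality control with fastp, default parameters
        ["fastp", "-i", r1, "-I", r2, "-o", t1, "-O", t2],
        # Alignment with STAR, keeping only uniquely mapped reads
        ["STAR", "--runThreadN", str(threads), "--genomeDir", star_index,
         "--readFilesIn", t1, t2, "--readFilesCommand", "zcat",
         "--outFilterMultimapNmax", "1", "--outFileNamePrefix", f"{sample}."],
        # SAM -> sorted BAM (output format is inferred from the .bam extension)
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # bamCoverage needs an indexed BAM file
        ["samtools", "index", bam],
        # bigWig track at single-base resolution
        ["bamCoverage", "-b", bam, "-o", f"{sample}.bw", "--binSize", "1"],
    ]


# Print the commands instead of running them.
for cmd in paired_end_commands("sampleA", "path/to/star_index"):
    print(shlex.join(cmd))
```

Building argument lists rather than shell strings keeps quoting explicit and makes it easy to check each step before anything runs.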
Single-end (layout: "single")
The rulegraph was created by snakevision.
- Quality control using fastp with the default parameters.
- Alignment using STAR with the parameter `--outFilterMultimapNmax 1`.
- Convert the SAM file to BAM file and sort using samtools.
- Collect metrics using Picard `CollectRnaSeqMetrics`.
- Make bigWig files using deepTools `bamCoverage`.
- Make summary statistics using MultiQC.
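Comparing the two step lists, the main layout-dependent difference is that insert size metrics are collected only for paired-end data. A minimal sketch of that switch, using a hypothetical lookup table and helper name that are not taken from the workflow itself:

```python
# Picard tools per layout, as described in the two step lists above.
PICARD_TOOLS = {
    "paired": ["CollectRnaSeqMetrics", "CollectInsertSizeMetrics"],
    "single": ["CollectRnaSeqMetrics"],
}

# Number of FASTQ files each sample contributes for a given layout.
READS_PER_SAMPLE = {"paired": 2, "single": 1}


def plan_for_layout(layout):
    """Return the layout-dependent parts of the plan; raises KeyError for unknown layouts."""
    return {
        "reads_per_sample": READS_PER_SAMPLE[layout],
        "picard_tools": list(PICARD_TOOLS[layout]),
    }


print(plan_for_layout("paired"))
print(plan_for_layout("single"))
```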
Usage
config.yaml should contain the following information:
Paired-end
`path/to/output` should contain a `fastq` directory with the following structure:
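A quick pre-flight check can catch a missing `fastq` directory before the workflow starts. The sketch below only verifies that the directory exists inside the output path and lists what it contains; it makes no assumption about how the FASTQ files are named, and the helper name `list_fastq_dir` is hypothetical.

```python
import tempfile
from pathlib import Path


def list_fastq_dir(output_dir):
    """Return the sorted entry names in <output_dir>/fastq, or raise if it is missing."""
    fastq_dir = Path(output_dir) / "fastq"
    if not fastq_dir.is_dir():
        raise FileNotFoundError(f"expected a 'fastq' directory inside {output_dir}")
    return sorted(p.name for p in fastq_dir.iterdir())


# Demonstration on a throwaway directory with a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "fastq").mkdir()
    (Path(tmp) / "fastq" / "example.fastq.gz").touch()
    print(list_fastq_dir(tmp))
```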
Single-end
`path/to/output` should contain a `fastq` directory with the following structure:
Common settings
`/path/to/reference_transcriptome.gtf` is the reference transcriptome in GTF format (e.g. `Homo_sapiens.GRCh38.106.gtf` for human transcriptome).
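Before launching the workflow, it can help to sanity-check the loaded configuration. The following sketch assumes the config has already been parsed into a Python dictionary. Only the `layout` key and its two values come from this page; the `gtf` key name and the helper `check_config` are hypothetical stand-ins, so adapt them to the keys your `config.yaml` actually uses.

```python
def check_config(config):
    """Return a list of human-readable problems found in a parsed config dictionary."""
    problems = []
    # `layout` selects the mode and must be one of the two documented values.
    if config.get("layout") not in ("paired", "single"):
        problems.append('layout must be "paired" or "single"')
    # Hypothetical key name for the reference transcriptome annotation.
    gtf = str(config.get("gtf", ""))
    if not gtf.endswith(".gtf"):
        problems.append("gtf should point to a file in GTF format (*.gtf)")
    return problems


example = {"layout": "paired", "gtf": "/path/to/reference_transcriptome.gtf"}
print(check_config(example))
print(check_config({"layout": "both"}))
```

Returning a list of problems, rather than raising on the first one, lets every issue be reported in a single pass.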
Please refer to the tutorial for more information.