
preprocessing_CLIPseq.smk

Snakemake workflow for preprocessing CLIP-seq data (HITS-CLIP, iCLIP-seq, etc.).

Note

Make sure that Singularity and Snakemake are installed on your system and that you have cloned the SnakeNgs repository.

Workflow

  1. Quality control using fastp with the parameters specified in the config.yaml (fastp_args).
  2. Alignment using STAR with the parameter --outFilterMultimapNmax specified in the config.yaml.
  3. Convert the SAM file to a BAM file and sort it using samtools.
  4. Remove duplicates using Picard MarkDuplicates with the parameter --REMOVE_DUPLICATES true.
  5. Make bigWig files using deepTools bamCoverage with the parameter --binSize 1. Optionally, normalize using CPM by setting normalize_bigwig: true in the config.yaml.
  6. Make summary statistics using MultiQC.
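
The steps above can be sketched as the per-sample command lines they roughly correspond to. This is an illustrative sketch, not the workflow's actual rules: it assumes single-end reads, the intermediate paths (trimmed/, star/, bam/, bigwig/) and the function name build_commands are hypothetical, and the samtools index step is an assumption added because bamCoverage needs an indexed BAM. The MultiQC step runs once over all samples and is omitted here.

```python
from pathlib import Path


def build_commands(sample, workdir, star_index, fastp_args,
                   multimap_nmax, normalize_bigwig=False):
    """Return illustrative command lines for one sample, in workflow order."""
    wd = Path(workdir)
    raw = wd / "fastq" / f"{sample}.fastq.gz"

    # Hypothetical intermediate paths; the real workflow layout may differ.
    trimmed = wd / "trimmed" / f"{sample}.fastq.gz"
    star_prefix = wd / "star" / f"{sample}."
    sam = f"{star_prefix}Aligned.out.sam"  # STAR's default output name
    sorted_bam = wd / "bam" / f"{sample}.sorted.bam"
    dedup_bam = wd / "bam" / f"{sample}.dedup.bam"
    metrics = wd / "bam" / f"{sample}.dedup_metrics.txt"
    bigwig = wd / "bigwig" / f"{sample}.bw"

    # Step 5: CPM normalization is only added when normalize_bigwig is true.
    bam_coverage = f"bamCoverage -b {dedup_bam} -o {bigwig} --binSize 1"
    if normalize_bigwig:
        bam_coverage += " --normalizeUsing CPM"

    return [
        # Step 1: quality control and trimming.
        f"fastp -i {raw} -o {trimmed} {fastp_args}",
        # Step 2: alignment (reads are gzipped, hence zcat).
        f"STAR --genomeDir {star_index} --readFilesIn {trimmed} "
        f"--readFilesCommand zcat --outFilterMultimapNmax {multimap_nmax} "
        f"--outFileNamePrefix {star_prefix}",
        # Step 3: samtools sort reads SAM and writes sorted BAM in one go.
        f"samtools sort -o {sorted_bam} {sam}",
        # Step 4: duplicate removal.
        f"picard MarkDuplicates --INPUT {sorted_bam} --OUTPUT {dedup_bam} "
        f"--METRICS_FILE {metrics} --REMOVE_DUPLICATES true",
        # Assumption: index the BAM so bamCoverage can read it.
        f"samtools index {dedup_bam}",
        bam_coverage,
    ]


for cmd in build_commands("SRRXXXXXX", "/path/to/output",
                          "/path/to/star_index",
                          "-l 20 -3 --trim_front1 5", 100):
    print(cmd)
```

Printing the commands rather than running them makes it easy to see how fastp_args, outFilterMultimapNmax, and normalize_bigwig from the config.yaml end up on the command lines.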

Usage

snakemake -s /path/to/SnakeNgs/snakefile/preprocessing_CLIPseq.smk \
--configfile /path/to/config.yaml \
--cores <int> \
--use-singularity \
--rerun-incomplete
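
If you launch the workflow from a script, the same invocation can be assembled as an argument list. The following is a minimal sketch under that assumption; the helper name snakemake_argv and the dry_run switch are hypothetical, and the dry run relies on Snakemake's --dry-run option to list the planned jobs without executing them.

```python
import shlex


def snakemake_argv(snakefile, configfile, cores, dry_run=False):
    """Build the argument list for the snakemake call shown above."""
    argv = [
        "snakemake",
        "-s", snakefile,
        "--configfile", configfile,
        "--cores", str(cores),
        "--use-singularity",
        "--rerun-incomplete",
    ]
    if dry_run:
        # Only show what would be run; nothing is executed.
        argv.append("--dry-run")
    return argv


argv = snakemake_argv(
    "/path/to/SnakeNgs/snakefile/preprocessing_CLIPseq.smk",
    "/path/to/config.yaml",
    cores=8,
    dry_run=True,
)
# Print a shell-safe command line; pass argv to subprocess.run to execute it.
print(shlex.join(argv))
```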

config.yaml should contain the following information:

workdir: /path/to/output
samples: ["SRRXXXXXX", "SRRYYYYYY", "SRRZZZZZZ"]
star_index: /path/to/star_index
fastp_args: "-l 20 -3 --trim_front1 5"
outFilterMultimapNmax: 100
normalize_bigwig: false
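
Before starting a run, it can help to check that the config has all six keys with plausible types. The sketch below works on an already-parsed config (a plain dict mirroring the YAML above); the expected types and the helper name config_problems are assumptions made for illustration, not checks performed by the workflow itself.

```python
# Assumed types for each key, inferred from the example config above.
EXPECTED_TYPES = {
    "workdir": str,
    "samples": list,
    "star_index": str,
    "fastp_args": str,
    "outFilterMultimapNmax": int,
    "normalize_bigwig": bool,
}


def config_problems(config):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append(f"missing key: {key}")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it for int fields.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            problems.append(
                f"{key}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    if isinstance(config.get("samples"), list) and not config["samples"]:
        problems.append("samples: list is empty")
    return problems


example = {
    "workdir": "/path/to/output",
    "samples": ["SRRXXXXXX", "SRRYYYYYY", "SRRZZZZZZ"],
    "star_index": "/path/to/star_index",
    "fastp_args": "-l 20 -3 --trim_front1 5",
    "outFilterMultimapNmax": 100,
    "normalize_bigwig": False,
}
print(config_problems(example))
```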

Config parameters

| Parameter | Description | Example |
| --- | --- | --- |
| workdir | Path to the output directory | /path/to/output |
| samples | List of sample names | ["SRRXXXXXX", "SRRYYYYYY"] |
| star_index | Path to the STAR index directory | /path/to/star_index |
| fastp_args | Additional arguments for fastp | "-l 20 -3 --trim_front1 5" (HITS-CLIP) or "--trim_front1 9 --max_len1 35 --length_required 25" (iCLIP-seq) |
| outFilterMultimapNmax | Maximum number of loci a read is allowed to map to | 100 (HITS-CLIP) or 1 (iCLIP-seq) |
| normalize_bigwig | Whether to normalize bigWig files using CPM | false (HITS-CLIP) or true (iCLIP-seq) |
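
The protocol-specific example values in the table can be collected into presets and merged with the per-project paths. This is a sketch only: the PRESETS dictionary and make_config helper are hypothetical, and the values are simply the examples from the table, not tuned recommendations. The result is printed as JSON, which Snakemake also accepts for --configfile.

```python
import json

# Example values taken from the table above, keyed by protocol.
PRESETS = {
    "HITS-CLIP": {
        "fastp_args": "-l 20 -3 --trim_front1 5",
        "outFilterMultimapNmax": 100,
        "normalize_bigwig": False,
    },
    "iCLIP-seq": {
        "fastp_args": "--trim_front1 9 --max_len1 35 --length_required 25",
        "outFilterMultimapNmax": 1,
        "normalize_bigwig": True,
    },
}


def make_config(protocol, workdir, samples, star_index):
    """Combine project paths with the example settings for a protocol."""
    if protocol not in PRESETS:
        raise ValueError(f"unknown protocol: {protocol!r}")
    config = {
        "workdir": workdir,
        "samples": list(samples),
        "star_index": star_index,
    }
    config.update(PRESETS[protocol])
    return config


config = make_config("iCLIP-seq", "/path/to/output",
                     ["SRRXXXXXX", "SRRYYYYYY"], "/path/to/star_index")
print(json.dumps(config, indent=2))
```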
  • /path/to/output (the workdir) should contain a fastq directory with the following structure:
output/
└── fastq
    ├── SRRXXXXXX.fastq.gz
    ├── SRRYYYYYY.fastq.gz
    └── SRRZZZZZZ.fastq.gz
  • path/to/star_index is the directory containing the STAR index.
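
A quick check that every sample in the config has a matching file under workdir/fastq can catch typos in sample names before the workflow starts. The sketch below assumes the <sample>.fastq.gz naming shown in the tree above; the helper name missing_fastqs is hypothetical, and the temporary directory only stands in for a real output directory.

```python
import tempfile
from pathlib import Path


def missing_fastqs(workdir, samples):
    """Return the samples that have no <workdir>/fastq/<sample>.fastq.gz."""
    fastq_dir = Path(workdir) / "fastq"
    return [
        sample for sample in samples
        if not (fastq_dir / f"{sample}.fastq.gz").is_file()
    ]


# Demonstration on a throwaway directory with one of two files present.
with tempfile.TemporaryDirectory() as tmp:
    fastq_dir = Path(tmp) / "fastq"
    fastq_dir.mkdir()
    (fastq_dir / "SRRXXXXXX.fastq.gz").touch()
    print(missing_fastqs(tmp, ["SRRXXXXXX", "SRRYYYYYY"]))
```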

Docker image used in the workflow