LeafCutter.smk
Snakemake workflow for differential RNA splicing analysis using LeafCutter.
Note
Please make sure that Singularity and Snakemake are installed on your system and that you have cloned the SnakeNgs repository.
Workflow
The rulegraph was created by snakevision.
- Extract junction reads using regtools junctions extract.
- Cluster introns using LeafCutter leafcutter_cluster_regtools.py.
- Extract exon information using LeafCutter gtf_to_exons.R.
- Differential splicing analysis using LeafCutter leafcutter_ds.R.
- Plot splice junctions using LeafCutter ds_plots.R.
- Make annotation codes using LeafCutter gtf2leafcutter.pl.
- Prepare results for visualization in LeafViz using LeafCutter prepare_results.R.
- Classify clusters using LeafCutter classify_clusters.R.
Usage
config.yaml should contain the following information:
- /path/to/output is the directory where the output files will be saved.
- /path/to/experiment_table.tsv is a tab-separated file in the same format as the one used in Shiba.
Each value in the group column must be either Ref or Alt. The workflow performs the differential splicing analysis between these two groups.
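The Ref/Alt requirement can be checked before launching the workflow. The following is a minimal sketch, not part of the workflow itself: it assumes the table has a header row with a group column, and the sample and bam columns in the example are hypothetical placeholders for whatever other columns the table carries.

```python
import csv
import io

ALLOWED_GROUPS = {"Ref", "Alt"}


def check_groups(tsv_text):
    """Count samples per group; raise ValueError if the table is not usable."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if reader.fieldnames is None or "group" not in reader.fieldnames:
        raise ValueError("experiment table has no 'group' column")
    counts = {"Ref": 0, "Alt": 0}
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        group = row["group"]
        if group not in ALLOWED_GROUPS:
            raise ValueError(f"line {line_no}: group must be Ref or Alt, got {group!r}")
        counts[group] += 1
    # A comparison between two groups needs at least one sample in each.
    empty = [name for name, n in counts.items() if n == 0]
    if empty:
        raise ValueError("no samples in group(s): " + ", ".join(empty))
    return counts


# Hypothetical table: only the 'group' column is described in this document.
EXAMPLE_TABLE = (
    "sample\tbam\tgroup\n"
    "ref_1\t/path/to/ref_1.bam\tRef\n"
    "ref_2\t/path/to/ref_2.bam\tRef\n"
    "alt_1\t/path/to/alt_1.bam\tAlt\n"
    "alt_2\t/path/to/alt_2.bam\tAlt\n"
)

print(check_groups(EXAMPLE_TABLE))  # {'Ref': 2, 'Alt': 2}
```

To check a real table, read the file with open(path, newline="") and pass its contents to check_groups.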
- /path/to/reference_transcriptome.gtf is the reference transcriptome in GTF format (e.g. Homo_sapiens.GRCh38.106.gtf for human transcriptome).
- minimum_anchor_length is the minimum anchor length for the junction reads.
- minimum_intron_length is the minimum intron length for the junction reads.
- maximum_intron_length is the maximum intron length for the junction reads.
- strand is the strand information in the BAM file, where XS is used for unstranded data. Please refer to the regtools documentation for more information.
- minimum_reads is the minimum number of reads required to cluster the introns.
- min_coverage is the minimum coverage required for the intron to be considered.
- min_samples_per_intron is the minimum number of samples required for the intron to be considered.
- min_samples_per_group is the minimum number of samples required for the group to be considered.
- FDR is the false discovery rate for the differential splicing analysis.
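To illustrate how the junction-extraction parameters map onto a regtools call, here is a minimal sketch that builds a regtools junctions extract command line from a parameter dictionary. The -a, -m, -M, -s and -o options are regtools options; the dictionary layout, the file names and the numeric values are assumptions for illustration, and the workflow's own rule may assemble the command differently.

```python
import shlex


def regtools_extract_cmd(bam, junc, params):
    """Build a 'regtools junctions extract' command line as a single string."""
    args = [
        "regtools", "junctions", "extract",
        "-a", str(params["minimum_anchor_length"]),  # minimum anchor length
        "-m", str(params["minimum_intron_length"]),  # minimum intron length
        "-M", str(params["maximum_intron_length"]),  # maximum intron length
        "-s", str(params["strand"]),                 # strand setting, XS for unstranded data
        "-o", junc,                                  # output junction file
        bam,
    ]
    # shlex.join quotes any argument that needs it (e.g. paths containing spaces).
    return shlex.join(args)


# Illustrative values only; choose settings appropriate for your data.
example_params = {
    "minimum_anchor_length": 8,
    "minimum_intron_length": 50,
    "maximum_intron_length": 500000,
    "strand": "XS",
}

print(regtools_extract_cmd("sample1.bam", "sample1.junc", example_params))
# regtools junctions extract -a 8 -m 50 -M 500000 -s XS -o sample1.junc sample1.bam
```

The remaining parameters (minimum_reads, min_coverage, min_samples_per_intron, min_samples_per_group and FDR) apply to the later clustering and differential splicing steps and are not covered by this sketch.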