Shao-Group/beaver-test

Overview

This repository compares the performance of our new assembler, Beaver, against three leading meta-assemblers (Aletsch, TransMeta, and PsiCLASS) and two popular single-sample assemblers (StringTie2 and Scallop2). It provides instructions for downloading the necessary tools, preparing the datasets, running the tools and pipelines, and reproducing the results presented in the Beaver paper.

Step 1: Download and Link Tools

Our experiments involve the following tools:

| Tool       | Version | Description                    |
|------------|---------|--------------------------------|
| Beaver     | v1.0.0  | Cell-specific Assembler        |
| Aletsch    | v1.1.0  | Meta Assembler                 |
| TransMeta  | v.1.0   | Meta Assembler                 |
| PsiCLASS   | v1.0.3  | Meta Assembler                 |
| StringTie2 | v2.2.1  | Single-sample Assembler        |
| Scallop2   | v1.1.2  | Single-sample Assembler        |
| STAR       | v2.7.11 | RNA-seq Aligner                |
| GffCompare | v0.11.2 | Evaluate assembled transcripts |

Step 1.1: Download Tools

  • Access the homepages of the respective tools using the links provided above.
  • Follow the download and compilation instructions on each tool's homepage.

Step 1.2: Link or Copy Executables

  • For tools with available executable files, link or copy them to the programs directory. This includes beaver, aletsch, scallop2, stringtie, STAR and gffcompare.
  • For tools without standalone executables (TransMeta and PsiCLASS), link the entire directory to programs.

Ensure the tools are accessible via the following paths:

your/path/to/programs/beaver
your/path/to/programs/aletsch
your/path/to/programs/TransMeta/TransMeta
your/path/to/programs/psiclass/psiclass
your/path/to/programs/stringtie
your/path/to/programs/scallop2
your/path/to/programs/STAR
your/path/to/programs/gffcompare

You may need to rename some of the executable files to match the paths listed above.
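Before running anything, it can help to confirm that every tool sits where the scripts expect it. The following is a minimal sketch, not part of the repository: the helper name `find_missing_tools` is hypothetical, and the relative paths are copied from the list above, with the last entry spelled `gffcompare` to match the executable named in Step 1.2.

```python
import os

# Relative paths under the programs directory, taken from the list above.
EXPECTED_TOOLS = [
    "beaver",
    "aletsch",
    "TransMeta/TransMeta",
    "psiclass/psiclass",
    "stringtie",
    "scallop2",
    "STAR",
    "gffcompare",
]


def find_missing_tools(programs_dir, tools=EXPECTED_TOOLS):
    """Return the relative paths that are absent or not executable.

    Hypothetical helper for illustration; symlinks are followed, so both
    linked and copied executables are accepted.
    """
    missing = []
    for rel in tools:
        path = os.path.join(programs_dir, rel)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(rel)
    return missing
```

Calling `find_missing_tools("your/path/to/programs")` returns an empty list when everything is in place; any name it returns still needs to be linked, copied, renamed, or made executable.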

Step 2: Download Datasets and Align

We evaluate the six methods on four datasets: the two real datasets listed below and a simulated counterpart of each. Each dataset is identified by a unique prefix (used throughout this repository) and its accession ID.

| Dataset          | # Cells | Protocol   | Accession ID |
|------------------|---------|------------|--------------|
| HEK293T          | 192     | Smart-seq3 | E-MTAB-8735  |
| Mouse-Fibroblast | 369     | Smart-seq3 | E-MTAB-8735  |

Align the reads of each sample/cell with STAR. For each dataset, compile a list of all BAM file paths, as required by the meta-assemblers. The simulated counterparts are generated with RSEM.
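The alignment and list-building steps could be scripted roughly as follows. This is a sketch under stated assumptions: the function names, the directory layout, and the thread count are hypothetical, and the STAR options shown are only a common baseline, since the exact options used for the paper are not given here. The list file is written with one BAM path per line; some meta-assemblers expect additional columns, so check each tool's documentation.

```python
from pathlib import Path


def star_command(star, genome_dir, fastq_files, out_prefix, threads=4):
    """Build a STAR argument list for one cell (returned, not executed).

    Assumption: coordinate-sorted BAM output with otherwise default
    settings; adjust to the settings your assemblers require.
    """
    return [
        str(star),
        "--runThreadN", str(threads),
        "--genomeDir", str(genome_dir),
        "--readFilesIn", *[str(f) for f in fastq_files],
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", str(out_prefix),
    ]


def write_bam_list(bam_dir, list_path, pattern="*.bam"):
    """Write the absolute paths of all BAM files in bam_dir, one per line.

    Returns the sorted list of paths that was written.
    """
    bams = sorted(str(p.resolve()) for p in Path(bam_dir).glob(pattern))
    Path(list_path).write_text("".join(b + "\n" for b in bams))
    return bams
```

The command list returned by `star_command` can be passed to `subprocess.run(..., check=True)` once per cell, after which `write_bam_list` collects the resulting BAM files for that dataset.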

Step 3: Run All Tools

Execute the provided scripts in the results directory to run the simulator and assemblers for the four datasets:

./simulate.HEK293T.sh
./simulate.Mouse-Fibroblast.sh
./run.HEK293T.sh
./run.Mouse-Fibroblast.sh
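To launch all four scripts in one go, a small driver such as the one below may be convenient. It is a sketch, not part of the repository: `script_order` and `run_all` are hypothetical names, and running every `simulate.*` script before any `run.*` script simply follows the order in which the scripts are listed above.

```python
import subprocess

# Dataset prefixes as used in the script names above.
DATASETS = ["HEK293T", "Mouse-Fibroblast"]


def script_order(datasets=DATASETS):
    """Return the script names, all simulate.* scripts before all run.* scripts."""
    return [
        f"./{stage}.{name}.sh"
        for stage in ("simulate", "run")
        for name in datasets
    ]


def run_all(results_dir="results", datasets=DATASETS, dry_run=False):
    """Run each script from inside results_dir, stopping at the first failure.

    With dry_run=True the scripts are only listed, not executed.
    """
    scripts = script_order(datasets)
    for script in scripts:
        print("running", script)
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run([script], cwd=results_dir, check=True)
    return scripts
```

Calling `run_all(dry_run=True)` first is a cheap way to confirm the order before committing to the full run; the scripts must be executable for the real run to succeed.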

Step 4: Train Beaver Scoring Model and Evaluate

Execute the provided scripts in the train directory to run the evaluation pipeline described in the manuscript. The Beaver_General and Beaver_Specific models are trained on chromosomes 1-9 and tested on the remaining chromosomes.

./train_test_real.py
./train_test_sim.py
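The chromosome-based split described above can be illustrated with a short sketch. It is not taken from `train_test_real.py` or `train_test_sim.py`: the function names, the `chrom` record key, and the handling of an optional `chr` prefix are assumptions made for the example.

```python
# Chromosomes 1-9 form the training set; everything else is held out.
TRAIN_CHROMS = {str(i) for i in range(1, 10)}


def normalize_chrom(name):
    """Strip an optional 'chr' prefix (any case), so 'chr1' and 'Chr1' become '1'."""
    name = str(name)
    return name[3:] if name.lower().startswith("chr") else name


def split_by_chromosome(records, chrom_key="chrom"):
    """Split records (dicts) into (train, test) lists by chromosome name.

    Hypothetical helper: each record is assumed to carry its chromosome
    under chrom_key.
    """
    train, test = [], []
    for rec in records:
        if normalize_chrom(rec[chrom_key]) in TRAIN_CHROMS:
            train.append(rec)
        else:
            test.append(rec)
    return train, test
```

Splitting by chromosome rather than at random keeps transcripts from the same locus out of both sets at once, which is presumably why the evaluation is set up this way.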
