linearPOA: A parallel, memory-efficient framework for Partial Order Alignment with linear space complexity

linearPOA is a library/program written in C++17 that applies the Hirschberg algorithm to Partial Order Alignment (POA).
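To illustrate the idea behind the Hirschberg algorithm, here is a minimal sketch of its divide-and-conquer scheme on two plain strings. It is not linearPOA's implementation: it aligns sequences rather than partial order graphs, uses a linear gap cost instead of affine gaps, and the constants `kMismatch` and `kGap` as well as the function names `last_row`, `hirschberg`, and `alignment_cost` are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical costs for this sketch (lower is better); not linearPOA's scoring model.
constexpr int kMismatch = 2;
constexpr int kGap = 3;  // linear cost per gapped column
static_assert(kMismatch <= 2 * kGap, "base case assumes a mismatch never costs more than two gaps");

using Alignment = std::pair<std::string, std::string>;

// Last row of the cost matrix for a vs. b, keeping only two rows in memory.
std::vector<int> last_row(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j) * kGap;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i) * kGap;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : kMismatch);
            cur[j] = std::min({sub, prev[j] + kGap, cur[j - 1] + kGap});
        }
        std::swap(prev, cur);
    }
    return prev;
}

// Optimal global alignment of a and b without storing the full matrix.
Alignment hirschberg(const std::string& a, const std::string& b) {
    if (a.empty()) return {std::string(b.size(), '-'), b};
    if (b.empty()) return {a, std::string(a.size(), '-')};
    if (a.size() == 1) {
        // Base case: pair the single character with a matching position in b,
        // or with position 0 if there is no match.
        std::size_t k = b.find(a[0]);
        if (k == std::string::npos) k = 0;
        std::string top(b.size(), '-');
        top[k] = a[0];
        return {top, b};
    }
    // Split a in the middle and find the best matching split point of b.
    const std::size_t mid = a.size() / 2;
    const std::string a_left = a.substr(0, mid), a_right = a.substr(mid);
    const std::vector<int> fwd = last_row(a_left, b);
    const std::string a_right_rev(a_right.rbegin(), a_right.rend());
    const std::string b_rev(b.rbegin(), b.rend());
    const std::vector<int> bwd = last_row(a_right_rev, b_rev);
    std::size_t split = 0;
    for (std::size_t j = 1; j <= b.size(); ++j) {
        if (fwd[j] + bwd[b.size() - j] < fwd[split] + bwd[b.size() - split]) split = j;
    }
    // Solve both halves independently and concatenate the results.
    const Alignment left = hirschberg(a_left, b.substr(0, split));
    const Alignment right = hirschberg(a_right, b.substr(split));
    return {left.first + right.first, left.second + right.second};
}

// Cost of a finished alignment under the same hypothetical costs.
int alignment_cost(const Alignment& aln) {
    int cost = 0;
    for (std::size_t i = 0; i < aln.first.size(); ++i) {
        const char x = aln.first[i], y = aln.second[i];
        if (x == '-' || y == '-') cost += kGap;
        else if (x != y) cost += kMismatch;
    }
    return cost;
}
```

Calling `hirschberg(a, b)` returns two gapped strings of equal length. Each recursion level only needs the two rows held by `last_row`, which is why the working memory stays linear in the sequence length instead of quadratic.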

Installation

Linux/WSL (Windows Subsystem for Linux) - from the source code (Recommended)

  1. Download and compile the source code (requires gcc >= 11 or clang >= 13.0.0).
#1 Download
git clone https://github.com/malabz/linearPOA.git --recursive
cd linearPOA
# Make sure submodules are correctly cloned
git submodule init
git submodule update
git submodule foreach git submodule init
git submodule foreach git submodule update

#2 Open the source folder
cd src

#3 Compile and install
make THREADS=16 PREFIX=/path/you/wish all

#4 Test
./linearpoa --help

Windows - from Visual Studio 2022 (Not Recommended)

  1. First, install git and download the whole project with submodules:
git clone https://github.com/malabz/linearPOA.git --recursive
cd linearPOA
# Make sure submodules are correctly cloned
git submodule init
git submodule update
git submodule foreach git submodule init
git submodule foreach git submodule update
  2. Build the program with Visual Studio 2022:
  • Download and install Visual Studio 2022.
  • Open POA_affine.sln in Visual Studio 2022 and switch to Release mode (select Properties, choose Configuration Properties, click the Configuration Manager... button, and change Active solution configuration to Release), then choose Build -> Build Solution.
  • Open the x64\Release folder, which contains the built POA_affine.exe.

Usage

Available options:
        --in       -i  FILE      sequence file name (Required)
        --out      -o  FILE      output file name (Required)
        --threads  -t  N         use N threads (N >= 1, default: 1)
        --open     -O  O         gap open penalty (default: 3)
        --ext      -E  E         gap extension penalty (default: 1)
        --match    -M  M         match score (default: 0)
        --mismatch -X  X         mismatch score (default: 2)
        --nolinear -N            do not use linear method (default: disabled)
        --fast     -f            use faster method (default: disabled)
        --genmode  -g  M         generate mode, 1: generate MSA, 2: generate consensus, 3: generate MSA+consensus (default: 1)
        --help     -h            print help message
        --version  -v            show program version
Example:
        ./linearpoa --in seq.fasta --out seq_out.fasta
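To make the scoring options concrete, the following sketch computes the cost of one pair of aligned rows using the default values listed above. It rests on two assumptions that the option list does not confirm: that the values act as costs to be minimized (match 0, mismatch 2), and that a gap of length k costs open + k * ext. The `Scoring` struct and `pairwise_cost` function are hypothetical names, not part of linearPOA.

```cpp
#include <cstddef>
#include <string>

// Defaults copied from the option list; treating them as costs is an assumption.
struct Scoring {
    int open = 3;      // --open
    int ext = 1;       // --ext
    int match = 0;     // --match
    int mismatch = 2;  // --mismatch
};

// Cost of two gapped rows of equal length.
// Assumed convention: a gap of length k costs open + k * ext.
int pairwise_cost(const std::string& x, const std::string& y,
                  const Scoring& s = Scoring{}) {
    int cost = 0;
    bool gap_in_x = false, gap_in_y = false;  // is a gap currently open in x or y?
    for (std::size_t i = 0; i < x.size() && i < y.size(); ++i) {
        if (x[i] == '-' && y[i] == '-') continue;  // column with gaps in both rows
        if (x[i] == '-') {
            cost += (gap_in_x ? 0 : s.open) + s.ext;
            gap_in_x = true;
            gap_in_y = false;
        } else if (y[i] == '-') {
            cost += (gap_in_y ? 0 : s.open) + s.ext;
            gap_in_y = true;
            gap_in_x = false;
        } else {
            cost += (x[i] == y[i]) ? s.match : s.mismatch;
            gap_in_x = gap_in_y = false;
        }
    }
    return cost;
}
```

Under these assumptions, one long gap is cheaper than several short gaps of the same total length, because the open penalty is paid only once per gap.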

Test dataset and compiled program

The test datasets and compiled programs are available at https://doi.org/10.5281/zenodo.15637837. You can use the data for testing the program.

Similarity between generated sequence and reference sequence

We use the error_measure program provided by FORAlign to measure the similarity between the generated sequence and the reference sequence.

Additionally, we modified several programs for comparison with our method. The modifications are described below:

Modified TSTA

We modified TSTA to better control the output behavior. This repository is available here.

Modified PBSIM2

We modified PBSIM2 so that it generates only positive-strand sequences when producing simulated datasets. This repository is available here.

Modified Racon

We modified Racon so that it calls POA methods to generate consensus sequences while ignoring the window information that Racon provides. This repository is available here.

Citation

Contacts

If you find a bug, please contact us via the issues page or by email.

More tools and information are available on our GitHub.
