linearPOA: A parallel, memory-efficient framework for Partial Order Alignment with linear space complexity
linearPOA is a library/program written in C++17 that applies the Hirschberg algorithm to Partial Order Alignment (POA).
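To illustrate the linear-space idea behind the Hirschberg algorithm, here is a minimal sketch for two plain sequences. It is not linearPOA's implementation: linearPOA works on partial order graphs with affine gap penalties, whereas this sketch assumes a simple linear gap cost, and the names `last_row`, `hirschberg`, `alignment_cost`, `kMismatch`, and `kGap` are hypothetical. Costs are minimized, so lower is better.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative costs (assumptions for this sketch, lower is better).
constexpr int kMismatch = 2;
constexpr int kGap = 2;  // linear gap cost, NOT linearPOA's affine model

// Last row of the alignment-cost DP between a and b, using O(|b|) memory.
std::vector<int> last_row(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j) * kGap;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i) * kGap;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : kMismatch);
            cur[j] = std::min({sub, prev[j] + kGap, cur[j - 1] + kGap});
        }
        std::swap(prev, cur);
    }
    return prev;
}

// Hirschberg recursion: returns the two gapped rows of an optimal alignment.
std::pair<std::string, std::string> hirschberg(const std::string& a, const std::string& b) {
    if (a.empty()) return {std::string(b.size(), '-'), b};
    if (b.empty()) return {a, std::string(a.size(), '-')};
    if (a.size() == 1) {
        // Base case: pair the single character with its cheapest partner in b.
        std::size_t k = b.find(a[0]);
        int sub = (k == std::string::npos) ? kMismatch : 0;
        if (k == std::string::npos) k = 0;
        if (sub <= 2 * kGap) {
            std::string top(b.size(), '-');
            top[k] = a[0];
            return {top, b};
        }
        return {a + std::string(b.size(), '-'), "-" + b};
    }
    // Split a in half, score the top half forwards and the bottom half backwards.
    std::size_t mid = a.size() / 2;
    std::string a_top = a.substr(0, mid), a_bot = a.substr(mid);
    std::vector<int> fwd = last_row(a_top, b);
    std::vector<int> bwd = last_row(std::string(a_bot.rbegin(), a_bot.rend()),
                                    std::string(b.rbegin(), b.rend()));
    // Pick the column of b where the two halves meet at minimal total cost.
    std::size_t split = 0;
    for (std::size_t j = 1; j <= b.size(); ++j)
        if (fwd[j] + bwd[b.size() - j] < fwd[split] + bwd[b.size() - split]) split = j;
    auto left = hirschberg(a_top, b.substr(0, split));
    auto right = hirschberg(a_bot, b.substr(split));
    return {left.first + right.first, left.second + right.second};
}

// Cost of a finished alignment (rows must have equal length); handy for checking.
int alignment_cost(const std::string& top, const std::string& bottom) {
    int cost = 0;
    for (std::size_t i = 0; i < top.size(); ++i) {
        if (top[i] == '-' || bottom[i] == '-') cost += kGap;
        else if (top[i] != bottom[i]) cost += kMismatch;
    }
    return cost;
}
```

Calling `hirschberg("ACGT", "AGT")` returns a pair of equal-length gapped strings whose `alignment_cost` equals the last entry of `last_row("ACGT", "AGT")`. Only two DP rows are held at any time, which is where the linear space bound comes from.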
- Download and compile the source code (requires gcc >= 11 or clang >= 13.0.0).
#1 Download
git clone https://github.com/malabz/linearPOA.git --recursive
# Make sure submodules are correctly cloned
git submodule init
git submodule update
git submodule foreach git submodule init
git submodule foreach git submodule update
#2 Open the folder
cd demem/src
#3 Compile and install
make THREADS=16 PREFIX=/path/you/wish all
#4 Test
./linearpoa --help
- First, install git and download the whole project with submodules:
git clone https://github.com/malabz/linearPOA.git --recursive
# Make sure submodules are correctly cloned
git submodule init
git submodule update
git submodule foreach git submodule init
git submodule foreach git submodule update
- Build the program with Visual Studio 2022:
  - First, download and install Visual Studio 2022.
  - Open POA_affine.sln in Visual Studio 2022, then switch it to Release mode (select Properties, choose Configuration Properties, press the Configuration Manager... button, and change Active Solution configuration to Release), then choose Build -> Build Solution.
  - Open the x64\Release folder, where you will find POA_affine.exe.
Available options:
--in        -i  FILE  sequence file name (Required)
--out       -o  FILE  output file name (Required)
--threads   -t  N     use N threads (N >= 1, default: 1)
--open      -O  O     gap open penalty (default: 3)
--ext       -E  E     gap extension penalty (default: 1)
--match     -M  M     match score (default: 0)
--mismatch  -X  X     mismatch score (default: 2)
--nolinear  -N        do not use linear method (default: disabled)
--fast      -f        use faster method (default: disabled)
--genmode   -g  M     generate mode, 1: generate MSA, 2: generate consensus, 3: generate MSA+consensus (default: 1)
--help      -h        print help message
--version   -v        show program version
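As a rough illustration of how the scoring options interact, the sketch below computes the cost of one gapped pairwise alignment using the default values listed above (match 0, mismatch 2, gap open 3, gap extension 1), where lower is better. The `Scoring` struct and `affine_cost` function are hypothetical names, not part of linearPOA, and the convention that a gap run of length k costs open + k * ext is an assumption; the program may count the first gap position differently.

```cpp
#include <cstddef>
#include <string>

// Default values mirror the option list; the struct itself is hypothetical.
struct Scoring {
    int match = 0;
    int mismatch = 2;
    int open = 3;
    int ext = 1;
};

// Cost of one gapped pairwise alignment given as two equal-length rows.
// ASSUMPTION: a gap run of length k costs open + k * ext.
int affine_cost(const std::string& top, const std::string& bottom,
                const Scoring& s = Scoring{}) {
    int cost = 0;
    char gap_in = 0;  // 0: not in a gap, 't': gap run in top, 'b': gap run in bottom
    for (std::size_t i = 0; i < top.size() && i < bottom.size(); ++i) {
        char now = top[i] == '-' ? 't' : (bottom[i] == '-' ? 'b' : 0);
        if (now) {
            if (now != gap_in) cost += s.open;  // a new gap run starts here
            cost += s.ext;                      // every gap column pays the extension
        } else {
            cost += (top[i] == bottom[i]) ? s.match : s.mismatch;
        }
        gap_in = now;
    }
    return cost;
}
```

Under this assumption, aligning ACGT against A--T costs 3 + 2 * 1 = 5, while two separate single-column gaps cost 2 * (3 + 1) = 8, which is why affine penalties favour one long gap over several short ones.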
Example:
./linearpoa --in seq.fasta --out seq_out.fasta
The test datasets and compiled programs are available at https://doi.org/10.5281/zenodo.15637837. You can use these data to test the program.
We use the error_measure program provided by FORAlign to measure the similarity between the generated sequence and the reference sequence.
Additionally, we modified several programs for comparison with our method. The modifications are as follows:
We modified TSTA to better control its output behavior. This repository is available here.
We modified PBSIM2 so that the simulated datasets it generates contain only positive-strand sequences. This repository is available here.
We modified Racon to call POA methods for consensus generation while ignoring the window information provided by Racon. This repository is available here.
If you find a bug, please contact us through the issues page or by email.
For more tools and information, visit our GitHub.