allpaths-lg
Software Description
ALLPATHS-LG is a whole-genome shotgun assembler that can generate high-quality genome assemblies using short reads (~100 bp) such as those produced by the new generation of sequencers. The significant difference between ALLPATHS and traditional assemblers such as Arachne is that ALLPATHS assemblies are not necessarily linear, but instead are presented in the form of a graph. This graph representation retains ambiguities, such as those arising from polymorphism, uncorrected read errors, and unresolved repeats, thereby providing information that has been absent from previous genome assemblies.
Info
Module Name
allpathslg
Last Updated On
08/29/2023
Support Level
Secondary Support
Software Access Level
Open Access
Home Page
Documentation
General Linux
To run this software interactively in a Linux environment, run the following commands:
module load allpathslg
PrepareAllPathsInputs.pl DATA_DIR=/path/to/data
RunAllPathsLG PRE=$pre DATA_SUBDIR=$data RUN=$run REFERENCE_NAME=$ref
Note:
The PrepareAllPathsInputs.pl script requires one parameter, the path to the directory containing the input data. $pre is the root directory ALLPATHS-LG will use. $data is the subdirectory containing the input data. $run is the directory used for assembly pre-processing. $ref is the organism or reference genome name.
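As a minimal sketch of how these four values fit together, the lines below set them as shell variables and create the input data directory. The sketch assumes the nesting used by the example Slurm script on this page, where the data directory is $pre/$ref/$data. All names and paths (test.genome, data, run, the current directory as root) are placeholders, not required values.

```shell
# Placeholder values; substitute your own paths and names.
pre="$PWD"            # root directory ALLPATHS-LG will use
ref="test.genome"     # organism or reference genome name
data="data"           # subdirectory containing the input data
run="run"             # directory used for this assembly run

# Assumed layout: the data directory lives at $pre/$ref/$data.
data_dir="$pre/$ref/$data"
mkdir -p "$data_dir"
echo "DATA_DIR=$data_dir"
```

With the variables defined this way, the value printed for DATA_DIR is the path that would be passed to PrepareAllPathsInputs.pl, and $pre, $data, $run, and $ref are the values passed to RunAllPathsLG.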
ALLPATHS-LG is composed of a number of modules, each of which performs a step in the assembly process. While each module can be run individually, ALLPATHS-LG provides a module that controls the entire assembly pipeline, called RunAllPathsLG. In addition, before ALLPATHS-LG can be used, data must be converted using the Perl script PrepareAllPathsInputs.pl.
ALLPATHS-LG has a specific requirement for paired-end read libraries: the two reads of each pair must be interleaved.
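If a library arrives as two separate FASTQ files, one way to interleave them is with the standard paste and awk utilities. This is a minimal sketch and not part of ALLPATHS-LG. The function name and file names are hypothetical, and the sketch assumes that both files list mates in the same order, that every record is exactly four lines, and that no line contains a tab character.

```shell
# Interleave two FASTQ files so that each read is followed by its mate.
# Assumes 4-line records, identical read order in both files, and no tabs.
interleave_fastq() {
    # Step 1: pair up line N of file 1 with line N of file 2 (tab-separated).
    # Step 2: join every 4 paired lines into one row of 8 fields.
    # Step 3: print the 4 fields of read 1, then the 4 fields of read 2.
    paste "$1" "$2" \
        | paste - - - - \
        | awk -F'\t' -v OFS='\n' '{ print $1, $3, $5, $7, $2, $4, $6, $8 }'
}

# Usage (hypothetical file names):
#   interleave_fastq reads_1.fastq reads_2.fastq > reads_interleaved.fastq
```

The interleaved file should have exactly twice as many lines as either input, which is a quick check that no records were dropped.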
A more detailed discussion of each of these directories, as well as a list of other command-line arguments, is available in the user manual. Other ALLPATHS-LG utilities may be found in the directory:
/common/software/install/migrated/allpathslg/VER/bin
where VER is the version of ALLPATHS-LG you are using. An example Slurm script for submitting ALLPATHS-LG jobs to the queue is shown below:
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --mem=1gb
#SBATCH --time=4:00:00
#SBATCH --partition=msismall
module load allpathslg
# Prepare input data
mkdir -p test.genome/data
PrepareAllPathsInputs.pl DATA_DIR=$PWD/test.genome/data
# Assemble data
RunAllPathsLG \
PRE=$PWD \
DATA_SUBDIR=data \
RUN=run \
REFERENCE_NAME=test.genome
Agate Modules
Default
52488
Other Modules
42557, 52488