Dialign

Server Infrastructure for Cloud Computing available all over the World — © Universität Bielefeld

Authors
B. Morgenstern, S. Abdeddaim

DIALIGN is a software program for multiple sequence alignment developed by Burkhard Morgenstern et al.

While standard alignment methods rely on comparing single residues and imposing gap penalties, DIALIGN constructs pairwise and multiple alignments by comparing entire segments of the sequences. No gap penalty is used. This approach can be used for both global and local alignment, but it is particularly successful in situations where sequences share only local homologies.

The latest version of the program, DIALIGN-TX, is described in Subramanian et al. (2008), Algorithms Mol. Biol. 3:6. A web server for this program is available on the Goettingen Bioinformatics Compute Server (GOBICS). A web server for multiple alignment with user-defined constraints (anchor points), as described by Morgenstern et al. (2006), Algorithms Mol. Biol. 1:6, is also available through GOBICS.

Over the last few years, many researchers have used DIALIGN to align genomic sequences, and some breakthrough discoveries have been based on these alignments. We set up a web server at GOBICS for multiple alignment of genomic sequences using Mike Brudno's program CHAOS together with DIALIGN; see Brudno et al. (2004), Nucleic Acids Res. 32:W41-W44.

Users of Dialign are requested to cite:

Morgenstern, Burkhard
DIALIGN: Multiple DNA and Protein Sequence Alignment at BiBiServ, Nucleic Acids Research, 2004

Download

Choose a file for download

Name	Description
Dialign source package	Dialign source code. It should compile on any Unix-like operating system (Linux, Solaris, OSX, Win+Cygwin...)
Dialign Windows (32 Bit) binary package	Dialign binary package for Windows (supports Win7 or newer)
Dialign OSX binary package	Dialign binary package for OSX (10.5 or newer) running on X86/PPC-based architectures

Dialign is a command-line tool. It has no graphical user interface, so you have to run it from the command line.

1. Download the archive that fits your system.
2. Unzip the archive.
3. Open a command line and change into the unpacked directory.
4. Set the DIALIGN2_DIR environment variable:
- Windows : set DIALIGN2_DIR=dialign2_dir
- OSX : export DIALIGN2_DIR=dialign2_dir
5. Run Dialign by typing dialign2-2.exe (Windows) or ./dialign2-2 (OSX) to get a usage message.
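
The steps above can also be scripted. The following Python sketch assumes that the unpacked directory contains both the executable and a folder named dialign2_dir; the helper names dialign_env and run_dialign are illustrative and not part of the Dialign package. No command-line options are assumed: any arguments are passed through unchanged.

```python
import os
import subprocess
from pathlib import Path


def dialign_env(data_dir, base_env=None):
    """Return a copy of the environment with DIALIGN2_DIR set to data_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["DIALIGN2_DIR"] = str(data_dir)
    return env


def run_dialign(unpacked_dir, args=(), windows=False):
    """Run the Dialign binary found in unpacked_dir (assumed layout).

    Called without arguments, the program prints its usage help.
    """
    unpacked_dir = Path(unpacked_dir).resolve()
    exe = unpacked_dir / ("dialign2-2.exe" if windows else "dialign2-2")
    # Assumption: the dialign2_dir folder sits next to the executable.
    env = dialign_env(unpacked_dir / "dialign2_dir")
    return subprocess.run(
        [str(exe), *args],
        env=env,
        cwd=unpacked_dir,
        capture_output=True,
        text=True,
    )
```

Calling run_dialign("path/to/unpacked") with no further arguments corresponds to the last step of the list; the returned object holds the usage text in its stdout or stderr attribute.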

The BiBiServ team does not provide any support for compiling or using a tool from the download section. Please contact the author directly in case of any problem.

Manual

This manual describes all parameters, input data and output data supported by this Dialign online service. The number and kind of parameters, the parameter ranges and the input/output types can differ from those of the standalone program.

AA alignment

DNA alignment

RNA alignment

In-/Output values

INPUT :: aa_sequences
A set of amino acid sequences to be aligned.

INPUT :: dna_sequences
Deoxyribonucleic acid sequences to be aligned.

INPUT :: rna_sequences
Ribonucleic acid sequences to be aligned.

OUTPUT :: AA alignment
Multiple amino acid alignment.

OUTPUT :: DNA alignment
Multiple deoxyribonucleic acid alignment.

OUTPUT :: RNA alignment
Multiple ribonucleic acid alignment.

Parameter Name

Description

threshold

As described in our papers, DIALIGN constructs alignments from gapfree pairs of segments of the sequences. Such segment pairs are referred to as diagonals.

Every possible diagonal is given a so-called weight reflecting the degree of similarity between the two segments involved. The overall score of an alignment is then defined as the sum of the weights of the diagonals it consists of, and the program tries to find an alignment with maximum score. In other words, the program tries to find a consistent collection of diagonals with maximum sum of weights. This novel scoring scheme for alignments is the basic difference between DIALIGN and other global or local alignment methods. Note that DIALIGN does not employ any kind of gap penalty.
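
To make the terminology concrete, here is a minimal Python sketch of diagonals, consistency and the alignment score for two sequences. It is not DIALIGN's implementation: the weight function toy_weight (a count of identical positions) is a placeholder assumed for illustration, the names Diagonal, all_diagonals, is_consistent and alignment_score are hypothetical, and the consistency check covers only the pairwise case.

```python
from typing import List, NamedTuple


class Diagonal(NamedTuple):
    i: int         # start position in sequence 1
    j: int         # start position in sequence 2
    length: int    # both segments have the same length (no gaps)
    weight: float


def toy_weight(seg1: str, seg2: str) -> float:
    """Placeholder weight (assumption): number of identical positions."""
    return float(sum(a == b for a, b in zip(seg1, seg2)))


def all_diagonals(s1: str, s2: str, max_len: int = 10) -> List[Diagonal]:
    """Enumerate every gap-free segment pair of length 1..max_len."""
    diags = []
    for i in range(len(s1)):
        for j in range(len(s2)):
            longest = min(max_len, len(s1) - i, len(s2) - j)
            for n in range(1, longest + 1):
                w = toy_weight(s1[i:i + n], s2[j:j + n])
                diags.append(Diagonal(i, j, n, w))
    return diags


def is_consistent(chain: List[Diagonal]) -> bool:
    """Pairwise case: diagonals must neither overlap nor cross."""
    ordered = sorted(chain)
    return all(a.i + a.length <= b.i and a.j + a.length <= b.j
               for a, b in zip(ordered, ordered[1:]))


def alignment_score(chain: List[Diagonal]) -> float:
    """Score of an alignment: the sum of the weights of its diagonals."""
    return sum(d.weight for d in chain)
```

In this sketch an alignment is simply a list of Diagonal objects for which is_consistent returns True, and the search problem described above is to pick such a list with the largest alignment_score.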

It is possible to use a threshold T for the quality of the diagonals. In this case, diagonals are considered only if their weights exceed this threshold, and regions of lower similarity are ignored.
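
The effect of the threshold can be sketched in a few lines of Python. The function name apply_threshold and the (label, weight) pairs are hypothetical, and the strict comparison reflects one reading of "exceed" in the text above; this is an illustration, not the program's code.

```python
def apply_threshold(weighted_diagonals, T=0.0):
    """Keep only the diagonals whose weight exceeds the threshold T.

    weighted_diagonals is a list of (label, weight) pairs (made-up format).
    """
    return [(label, w) for label, w in weighted_diagonals if w > T]


# Hypothetical weights, chosen only to show the filtering.
candidates = [("d1", 0.5), ("d2", 3.2), ("d3", 7.0)]
kept = apply_threshold(candidates, T=2.0)
```

Here kept contains only d2 and d3, so the low-similarity diagonal d1 would not be considered for the alignment.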

In the first version of the program (DIALIGN 1), this threshold was in many situations absolutely necessary to obtain meaningful alignments. By contrast, DIALIGN 2 should produce reasonable alignments without a threshold, i.e. with T = 0. This is the most important difference between DIALIGN 2 and the first version of the program.

Nevertheless, it is still possible to use a threshold T, so it is up to the user to experiment with this option.

nucleic acid sequence handling

If (possibly) coding nucleic acid sequences are to be aligned, DIALIGN optionally translates the compared "nucleic acid segments" to "peptide segments" according to the genetic code. It does not (necessarily) presuppose any of the three possible reading frames, so all three of them are checked for significant similarity. In this case, the similarity between segments is assessed on the "peptide level" rather than on the "nucleic acid level". We strongly recommend this option if the nucleic acid sequences are expected to contain protein-coding regions, as it will significantly increase the sensitivity of the alignment procedure in such cases.
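
What translating a nucleic acid segment in each of the three reading frames means can be shown with a short Python sketch. It uses the standard genetic code; the function names translate and three_frames are hypothetical, and the sketch illustrates the idea only, not how DIALIGN performs the translation or scores the resulting peptide segments.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second and third base in BASES.
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(segment: str, frame: int = 0) -> str:
    """Translate a nucleic acid segment, starting at offset frame (0, 1 or 2).

    RNA input is accepted (U is read as T); unknown codons become 'X'
    and an incomplete trailing codon is ignored.
    """
    seg = segment.upper().replace("U", "T")
    codons = (seg[p:p + 3] for p in range(frame, len(seg) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(segment: str):
    """Return the peptide segments for all three reading frames."""
    return [translate(segment, frame) for frame in range(3)]
```

For example, three_frames("ATGGCC") yields the peptide segments "MA", "W" and "G", one per reading frame.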

References

Morgenstern, B. DIALIGN: Multiple DNA and Protein Sequence Alignment at BiBiServ. Nucleic Acids Research, 2004.

Subramanian, A.R., Kaufmann, M. and Morgenstern, B. DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. Algorithms for Molecular Biology, 2008.

Morgenstern, B., Prohaska, S.J., Poehler, D. and Stadler, P.F. Multiple sequence alignment with user-defined anchor points. Algorithms for Molecular Biology, 2006.

Poehler, D., Werner, N., Steinkamp, R. and Morgenstern, B. Multiple Alignment of Genomic Sequences using CHAOS, DIALIGN and ABC. Nucleic Acids Research, 2005.

Schmollinger, M., Nieselt, K., Kaufmann, M. and Morgenstern, B. DIALIGN P: Fast pair-wise and multiple sequence alignment using parallel processors. BMC Bioinformatics, 2004.