Nonbinary Cycle Killer

A program for constructing a phylogenetic network from two (not necessarily binary) phylogenetic (gene) trees. The constructed phylogenetic network will display (refinements of) both trees. The program aims to minimize the reticulation number of the constructed network. For worst-case performance guarantees, see below. This program extends our original Cycle Killer program, which could only handle binary trees.


Reference

Leo van Iersel, Steven Kelk, Nela Lekic and Celine Scornavacca, A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees. BMC Bioinformatics 15:127 (2014). Preliminary version in proc. of WABI 2012, LNCS 7534, pp. 430-440.


Installation

Download the source file NonbinaryCycleKiller.java.

Install CPLEX, which is free for academic use, see IBM academic initiative.

(Optional) Install Dendroscope, to view the constructed phylogenetic networks.

(Optional) Download rSPR, compile it and put its executable in the directory of NonbinaryCycleKiller. This is only needed if you want to use the –rspr or –rsprapp options.

To compile the source code, download and install a Java Compiler.

Open a terminal window and navigate to the folder of NonbinaryCycleKiller.

Type “javac -classpath jarpath NonbinaryCycleKiller.java” where “jarpath” is the location of cplex.jar on your computer.


Running the program

Type “java -classpath jarpath;. -Djava.library.path=libpath NonbinaryCycleKiller test.tree”

“test.tree” text file containing two trees in newick format, one per line
“libpath” path to CPLEX shared library on your computer (the folder containing the CPLEX binaries). Setting the libpath might not be necessary.
“jarpath” path to cplex.jar on your computer (e.g. “C:\Program Files\IBM\ILOG\CPLEX_Studio125\cplex\lib\cplex.jar”).

The constructed phylogenetic network is saved in e-Newick format to a file “test-network.tree”. This file can be opened with Dendroscope.


Options

–maf use the FPT algorithm of MAF to compute an intial agreement forest (default)
–mafapp use the approximation algorithm of MAF to compute an intial agreement forest
–rspr use the FPT algorithm of rSPR to compute an intial agreement forest
–rsprapp use the approximation algorithm of rSPR to compute an intial agreement forest
–given af.txt use the initial agreement forest given in the file af.txt

Note that the options –rspr and –rsprapp can currently only be used when at least one of the two input trees is binary. They use the program rSPR by Chris Whidden. The approximation algorithms are much faster than the FPT algorithms but their performance might not be as good (see below). The file “af.txt” should be as in this example file.


Example

An input file: test.tree

and an agreement forest file: testaf.txt.

test
The two input trees.

If you run Cycle Killer for these trees (with the default option), the program will construct this phylogenetic network, which can be viewed using Dendroscope. It looks as follows:

test-network
The constructed phylogenetic network.

If you specify the initial agreement forest testaf.txt, the program will construct a phylogenetic network with three reticulations (the optimum that can be achieved by cutting this agreement forest).

A bigger input file: bigcaterpillar.tree.


Worst-case performance

Cycle killer aims at constructing a phylogenetic network with a smallest possible reticulation number. The worst-case approximation ratio of the algorithm depends on whether the input trees are binary and which option is used, as specified in the following table:

 

binary input trees one binary one nonbinary nonbinary input trees
–maf 2-approximation 4-approximation 4-approximation
–mafapp 5-approximation 7-approximation 7-approximation
–rspr 2-approximation 4-approximation N/A
–rsprapp 4-approximation 6-approximation N/A

 

Note that in practice the approximation ratios are much smaller for (almost?) all data.


Additional functionality

Cycle Killer can also be used to transform a given acyclic agreement forest into a network. Simply run the program with “java NonbinaryCycleKiller trees.tree –given aaf.txt” (plus the necessary path information) and it will output a phylogenetic network displaying the two trees in “trees.tree” and corresponding to the acyclic agreement forest specified in “aaf.txt”.

Cycle Killer can also be used to find a directed feedback vertex set of a given digraph. Run the program with “java NonbinaryCycleKiller outlists.txt –dfvs”. The vertices of the graph need to be numbered 0,1,…,n-1 and line i of the input file should contain the outneighbours of vertex i-1 separated by spaces.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s