FITCH -- Fitch-Margoliash and Least-Squares Distance Methods
© Copyright 1986-2002 by the University of Washington. Written by Joseph Felsenstein. Permission is granted to copy this document provided that no fee is charged for it and that this copyright notice is not removed.
This program carries out Fitch-Margoliash, Least Squares, and a number of similar methods as described in the documentation file for distance methods.
The options for FITCH are selected through the menu, which looks like this:
Most of the input options (U, P, -, O, L, R, S, J, and M) are as given in the documentation page for distance matrix programs, and their input format is the same as given there. The U (User Tree) option has one additional feature when the N (Lengths) option is used. This menu option will appear only if the U (User Tree) option is selected. If N (Lengths) is set to "Yes" then if any branch in the user tree has a branch length, that branch will not have its length iterated. Thus you can prevent all branches from having their lengths changed by giving them all lengths in the user tree, or hold only one length unchanged by giving only that branch a length (such as, for example, 0.00). You may find program RETREE useful for adding and removing branch lengths from a tree. This option can also be used to compute the Average Percent Standard Deviation for a tree obtained from NEIGHBOR, for comparison with trees obtained by FITCH or KITSCH.
The D (methods) option allows choice between the Fitch-Margoliash criterion and the Minimum Evolution method (Kidd and Sgaramella-Zonta, 1971; Rzhetsky and Nei, 1993). Minimum Evolution (not to be confused with parsimony) uses the Fitch-Margoliash criterion to fit branch lengths to each topology, but then chooses topologies based on their total branch length (rather than the goodness of fit sum of squares). There is no constraint on negative branch lengths in the Minimum Evolution method; it sometimes gives rather strange results, as it can like solutions that have large negative branch lengths, as these reduce the total sum of branch lengths!
Another input option available in FITCH that is not available in KITSCH or NEIGHBOR is the G (Global) option. G is the Global search option. This causes, after the last species is added to the tree, each possible group to be removed and re-added. This improves the result, since the position of every species is reconsidered. It approximately triples the run-time of the program. It is not an option in KITSCH because it is the default and is always in force there. The O (Outgroup) option is described in the main documentation file of this package. The O option has no effect if the tree is a user-defined tree (if the U option is in effect). The U (User Tree) option requires an unrooted tree; that is, it require that the tree have a trifurcation at its base:
The output consists of an unrooted tree and the lengths of the interior segments. The sum of squares is printed out, and if P = 2.0 Fitch and Margoliash's "average percent standard deviation" is also computed and printed out. This is the sum of squares, divided by N-2, and then square-rooted and then multiplied by 100 (n is the number of species on the tree):
APSD = ( SSQ / (N-2) )1/2 x 100.
where N is the total number of off-diagonal distance measurements that are in the (square) distance matrix. If the S (subreplication) option is in force it is instead the sum of the numbers of replicates in all the non-diagonal cells of the distance matrix. But if the L or R option is also in effect, so that the distance matrix read in is lower- or upper-triangular, then the sum of replicates is only over those cells actually read in. If S is not in force, the number of replicates in each cell is assumed to be 1, so that N is n(n-1), where n is the number of species. The APSD gives an indication of the average percentage error. The number of trees examined is also printed out.
The constants available for modification at the beginning of the program are: "smoothings", which gives the number of passes through the algorithm which adjusts the lengths of the segments of the tree so as to minimize the sum of squares, "delta", which controls the size of improvement in sum of squares that is used to control the number of iterations improving branch lengths, and "epsilonf", which defines a small quantity needed in some of the calculations. There is no feature saving multiply trees tied for best, partly because we do not expect exact ties except in cases where the branch lengths make the nature of the tie obvious, as when a branch is of zero length.
The algorithm can be slow. As the number of species rises, so does the number of distances from each species to the others. The speed of this algorithm will thus rise as the fourth power of the number of species, rather than as the third power as do most of the others. Hence it is expected to get very slow as the number of species is made larger.
TEST DATA SET
OUTPUT FROM TEST DATA SET (with all numerical options on)