University of Connecticut

Software

The software offered below was written by Paul O. Lewis unless otherwise indicated. Most are useful for teaching concepts in statistics or phylogenetics, and all are free for downloading.

Note that unless otherwise indicated these programs are offered AS IS with absolutely NO WARRANTY of any kind. In fact, I find it very hard to find time to keep most of these in a working state given the pace at which Java, iOS, and operating systems in general evolve. Please feel free to write to me (paul.lewis@uconn.edu) if you find something is broken, and I’ll do my best to fix it as soon as I can.


Educational

MCMC Robot

This is a free app for iPad/iPhone that illustrates the basic principles of Markov chain Monte Carlo using the metaphor of a robot following simple rules to walk around on a landscape and, in the process, learning about the topography of the landscape. For the iOS version, P. Lewis acknowledges the support provided by NSF grant DEB-1036448 (GrAToL).

A D3 JavaScript version of MCMC Robot that can be run inside any modern web browser is being developed. Feel free to try it here, but keep in mind that, because it is so new, you are more likely to encounter a bug in this version than in the more mature iOS or Windows versions.

Note: an older Windows version of this app is available below or from the MCMC Robot home page.
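The robot metaphor can be sketched in a few lines of Python. This is only an illustration of the general Metropolis idea, not the code used in any version of MCMC Robot: the single-hill landscape, the starting point, and the names log_density and robot_walk are all assumptions made for this example.

```python
import random
import math

def log_density(x, y):
    """Log elevation of an assumed landscape: one standard bivariate normal hill."""
    return -0.5 * (x * x + y * y)

def robot_walk(n_steps, step_size=1.0, seed=1):
    """Metropolis random walk. Uphill steps are always taken; downhill steps
    are taken with probability equal to the ratio of new to old elevation."""
    rng = random.Random(seed)
    x, y = 3.0, 3.0  # arbitrary starting point on the flank of the hill
    visited = []
    for _ in range(n_steps):
        # Propose a step in a random direction
        new_x = x + rng.gauss(0.0, step_size)
        new_y = y + rng.gauss(0.0, step_size)
        log_ratio = log_density(new_x, new_y) - log_density(x, y)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x, y = new_x, new_y  # accept the proposed step
        visited.append((x, y))   # a rejected step records the current spot again
    return visited

walk = robot_walk(20000)
mean_x = sum(p[0] for p in walk) / len(walk)
print(f"average x position over the walk: {mean_x:.2f}")
```

Because the robot spends time at each spot in proportion to the elevation there, a summary of the visited points (such as the average position) describes the landscape itself.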

C++ Bayesian Phylogenetics Tutorial (version 2)

This is a tutorial showing how to create a functioning Bayesian phylogenetics application in C++. It is designed for graduate students who need a base program that they understand and which can be easily modified to implement new models or methods.


Data Analysis

Galax

Program by Paul O. Lewis that estimates topological information content from a tree file (or list of tree files) representing a sample from the posterior distribution generated by a Bayesian phylogenetic analysis. The analysis performed by Galax is described in the following manuscript:

Lewis, P. O., M.-H. Chen, L. Kuo, L. A. Lewis, K. Fucikova, S. Neupane, Y.-B. Wang, D. Shi. Estimating Bayesian phylogenetic information content. Accepted in Systematic Biology. Download the advance access version.

Important! The methods outlined in the above paper and implemented in the Galax software are most useful for problems involving fewer than 12 taxa. Please give Table 2 in the paper your full attention before using the software on your own data, especially if you suspect information content is low. We are working on a more general solution that will more accurately measure information content.

Download Galax v1.0.0

Phycas

Program by Paul O. Lewis, Mark T. Holder and David L. Swofford that performs Bayesian phylogenetic analyses. It specializes in marginal likelihood estimation and model selection, supports data partitioning, and allows a tree space that includes unresolved (polytomous) tree topologies. Phycas is free and open-source; it is written primarily in C++ and has a Python 2.x interface. Versions are available for Windows and MacOS, and it can be compiled for Linux.

Hickory

Program by Kent E. Holsinger and Paul O. Lewis that implements the Bayesian method described in Holsinger (1999) for estimating F-statistics from co-dominant marker data and the method described in Holsinger et al. (2002) for estimating F-statistics from dominant marker data. It also includes routines to allow posterior comparisons as described in Holsinger and Wallace (2004). Hickory is free and open-source, written in C++, using the wxWindows library for cross-platform compatibility.

References:

Holsinger, K. E. 1999. Analysis of genetic diversity in geographically structured populations: a Bayesian perspective. Hereditas 130:245-255.

Holsinger, K. E., P. O. Lewis, and D. K. Dey. 2002. A Bayesian approach to inferring population structure from dominant markers. Molecular Ecology 11(7):1157-1164. [pdf]

Holsinger, K. E., and L. E. Wallace. 2004. Bayesian approaches for the analysis of population structure: an example from Platanthera leucophaea (Orchidaceae). Molecular Ecology 13:887-894. [pdf]

GDA

Program by Paul O. Lewis and Dmitri Zaykin designed to accompany the book “Genetic Data Analysis” by Bruce S. Weir (1996, Sinauer Associates). Computes linkage and Hardy-Weinberg disequilibrium, some genetic distances, and provides method-of-moments estimators for hierarchical F-statistics. On 11 January 2008 I changed the download format from a self-extracting zip archive to a simple zip archive. Let me know if this causes problems. The new zip file contains an additional example data file (fbi99.nex) included at the request of Bruce Weir to accompany his forthcoming review paper.

GDA has a graphical user interface (GUI) that works under Windows only, but Chris Basten has compiled a command-line-only version of GDA that runs under Mac OS 10.2.8 and 10.3 (Jaguar and Panther). This version can be downloaded here. After downloading, you should open a terminal window, navigate to the folder containing the file, and type “chmod +x gda1.1” to make GDA executable.

Reference:

Weir, B. S. 1996. Genetic Data Analysis. 2nd ed. Sinauer Associates, Sunderland, Massachusetts. 376 pages.


Source Code Libraries

NCL

NCL is a C++ class library for reading data files formatted in the NEXUS file format common to several phylogenetic analysis software applications. The original version was written by me in the 1990s, but the current version has been nearly completely rewritten (and improved tremendously in the process) by Mark T. Holder. Mark’s version is now used in several phylogenetic analysis programs, including Garli, RevBayes and Phycas.

Reference:
Lewis, P. O. 2003. NCL: a C++ class library for interpreting data files in NEXUS format. Bioinformatics 19(17):2330-2331.

Beagle-lib

Although I played a very small part in developing this fantastic resource, I include it here to advertise its availability. This library allows one to write software for maximum likelihood or Bayesian phylogenetics in either C or C++ without needing to write code to compute the likelihood! Better yet, it allows your software to make use of graphics processing units (GPUs), if available, to parallelize the computation of the likelihood.

Reference:
Ayres, D. L., A. Darling, D. J. Zwickl, P. Beerli, M. T. Holder, P. O. Lewis, J. P. Huelsenbeck, F. Ronquist, D. L. Swofford, M. P. Cummings, A. Rambaut and M. A. Suchard. 2011. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics. Systematic Biology 61(1):170-173. [pdf]


Java Applets

These Java applets are now quite old, having been written back in the days when there were only one or two books on Java in the local (now extinct) Borders book store. Some of them (e.g. Osmosis and Diffusion) are quite silly, so don’t try to make more of them than is intended.

CLT

Illustrates the fact that sums of just about anything are approximately normally distributed (for those of us who really want to believe the Central Limit Theorem, but who need to see it to believe it).
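The same idea can be checked numerically. The sketch below is not taken from the applet; it assumes exponential(1) draws as the "just about anything" being summed, and the name standardized_sums is made up for this example.

```python
import random
import statistics

def standardized_sums(n_terms, n_sums, seed=1):
    """Sum n_terms draws from a skewed exponential(1) distribution, n_sums times,
    then standardize each sum using its known mean and variance (both n_terms)."""
    rng = random.Random(seed)
    sums = [sum(rng.expovariate(1.0) for _ in range(n_terms)) for _ in range(n_sums)]
    return [(s - n_terms) / n_terms ** 0.5 for s in sums]

z = standardized_sums(n_terms=50, n_sums=10000)
within_one_sd = sum(abs(v) < 1.0 for v in z) / len(z)
# For a standard normal the three values would be 0, 1 and about 0.683
print(f"mean={statistics.fmean(z):.3f}, sd={statistics.pstdev(z):.3f}, "
      f"fraction within 1 sd={within_one_sd:.3f}")
```

Lowering n_terms to 1 or 2 shows how far from normal the individual draws are before any summing takes place.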

 

LBA

Animates the long branch attraction example in Joe Felsenstein’s classic 1978 paper entitled “Cases in which parsimony or compatibility methods will be positively misleading”.

Brownian

Illustrates the Brownian motion model used by Joe Felsenstein in his classic 1985 paper “Phylogenies and the comparative method” (the paper that introduced his Phylogenetically Independent Contrasts).

Diffusion

Illustrates simple diffusion. Begins with a container full of perfume molecules (for example) undergoing random movements, bouncing off the walls of the container. When the Remove Lid button is pressed, the lid disappears and the perfume molecules are allowed to diffuse into the room.

Osmosis

Illustrates the concept of a semipermeable membrane (smaller particles diffuse across the barrier but larger ones cannot).

Crossover

Simulates crossing over between homologous chromatids and is designed to aid in understanding the notion of the recombination fraction and its use in constructing linkage maps.

Bulbs

Illustrates a simple failure model using a simulated bank of light bulbs. The applet begins with a bank of bulbs, all burning (yellow). A plot at the right shows the proportion of bulbs that have burned out in the simulation compared with the theoretical expectation.


Microsoft® Windows® Applications

These are also quite old, but still useful if you use Windows.

Adares

Teaching application useful for illustrating the adaptive rejection sampling approach used for Gibbs sampling in Bayesian statistics.

ChisCalc

Chi-square significance probability calculator.

Bayesian Coin Tosser

Illustrates Bayesian concepts of prior and posterior densities for a simple coin flipping example.
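The coin flipping example has a well-known closed form when the prior is a Beta distribution, and a few lines of Python show it. This is a sketch under that assumption only: the priors actually offered by Bayesian Coin Tosser are not described here, and the function names are made up for illustration.

```python
import math

def beta_posterior(prior_a, prior_b, heads, tails):
    """A Beta(a, b) prior on the probability of heads, updated with observed
    flips, gives a Beta(a + heads, b + tails) posterior."""
    return prior_a + heads, prior_b + tails

def beta_mean(a, b):
    """Mean of a Beta(a, b) density."""
    return a / (a + b)

def beta_pdf(p, a, b):
    """Beta(a, b) density at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

# Flat Beta(1, 1) prior, then 7 heads and 3 tails are observed
a, b = beta_posterior(1, 1, heads=7, tails=3)
print(f"posterior is Beta({a}, {b}) with mean {beta_mean(a, b):.3f}")
print(f"posterior density at p = 0.5: {beta_pdf(0.5, a, b):.3f}")
```

Evaluating beta_pdf over a grid of p values for both the prior and the posterior parameters gives the two curves that a prior-versus-posterior plot compares.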

FSim

Graphically illustrates the concept of inbreeding coefficient.

MCRobot

Illustrates the basic principles of Markov chain Monte Carlo simulation, using arbitrary, user-defined landscapes composed of one or more bivariate normal densities. Note: this is the Windows version; for the newer iPad/iPhone version, see the MCMC Robot web site.

MLCalc

A calculator useful in demonstrating the estimation of evolutionary distances using the method of maximum likelihood under four simple substitution models.
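As an illustration of what such a distance calculation looks like, the sketch below uses the Jukes-Cantor (JC69) model, for which the maximum likelihood distance has a closed form. That JC69 is one of the four models in MLCalc is an assumption (the models are not named here), and the function name jc69_distance is made up for this example.

```python
import math

def jc69_distance(seq1, seq2):
    """Maximum likelihood distance (expected substitutions per site) under JC69:
    d = -(3/4) * ln(1 - (4/3) * p), where p is the proportion of differing sites."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    p = differences / len(seq1)
    if p >= 0.75:
        return math.inf  # the sequences look saturated; the estimate is unbounded
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

d = jc69_distance("ACGTACGTAC", "ACGTACGTAA")  # 1 difference in 10 sites
print(f"p = 0.1, JC69 distance = {d:.4f}")
```

The estimated distance is always at least as large as the observed proportion of differences, because the model allows for multiple substitutions at the same site.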

ThetaSim

Graphically illustrates the concept of population differentiation due to isolation and genetic drift.