University of Connecticut

Phylogenetic Software Development Tutorial (version 2)

This tutorial teaches you how to create C++ software that performs Bayesian phylogenetic analyses. It was written primarily to help students in my laboratory develop software that they can later modify for their own purposes, but I hope it is more broadly useful.

This is the second version of the tutorial. It is being developed throughout Fall 2017, so you may find that pages change weekly to reflect corrections and additions made as we work out problems and add features. The first version is still available here, but I strongly recommend using version 2 (it covers the same material as version 1 and yet is less complicated).

This second version differs substantially from the first in several ways.

  • It does away with most of the templates that made the first version difficult to read and understand.
  • It uses C++11 features such as built-in shared pointers, regular expressions, and more concise for loops, which reduce reliance on the Boost C++ libraries.
  • It uses the Nexus Class Library (NCL) to read data files. The original tutorial used regular expressions to parse data files, but that approach assumes a regularity in how data files are put together that doesn’t really exist in the real world. Mark Holder has turned the NCL into a really amazing library that, in turn, will allow us to write programs that are quite robust in terms of data file input formats.
  • It places emphasis on allowing both rooted and unrooted as well as binary and multifurcating trees. In particular, the MCMC analyses will allow for priors whose domain includes trees with polytomies.

The tutorial assumes you have some experience with C++ programming

All tutorials must make some assumptions about the student’s background. This tutorial is designed for those with some background in C++ programming and thus does not explain in detail how C++ works. If you have never written anything in C++, you will still end up with a working program and may, given sufficient motivation, find it useful as a vehicle for learning C++. The tutorial uses features of C++11.

The tutorial also assumes that you have some experience with Bayesian phylogenetics

This tutorial will be most useful to a biologist who has experience with Bayesian phylogenetics software such as MrBayes/RevBayes or BEAST and is interested in developing new phylogenetic methods/models not yet implemented in these programs.

License

The software that you will create falls under the permissive open-source MIT License.

Funding

This tutorial was developed as a broader impact project associated with National Science Foundation grant DEB-1354146 (Estimating the Bayesian phylogenetic information content of systematic data, PI Paul O. Lewis).

Get started!

The menu in the sidebar on the right will appear on every page of the tutorial and list each step in order. Use the sidebar menu to navigate through the tutorial, starting with Create a C++ project, which will help you set up a development environment on either Windows or Mac.