Adiantum (maidenhair fern) pinnule with inlaid phylogeny

Phylogenetics (EEB 5349)

This is a graduate-level course in phylogenetics, emphasizing primarily maximum likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory so that by the end of the course students should be able to understand much of the primary literature on modern phylogenetic methods and know how to intelligently apply these methods to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, IQ-TREE, RAxML, MrBayes, RevBayes, and others) and introduces students to the use of remote high performance computing resources to perform phylogenetic analyses.

EEB 5349 is being taught Spring Semester 2020:

Lecture: Tuesday/Thursday 11-12:15  (lecture instructor: Paul O. Lewis)
Lab: Friday 1:25-3:20 (laboratory instructor: Katie Taylor; office hour TBA)
Room: Torrey Life Science (TLS) 181, Storrs Campus
Text: none required; registered students will receive PDF copies of a textbook I am currently writing (see the list of optional texts below)


Important! The syllabus below is currently mostly as it was when the course finished in Spring 2018. Modifications will occur as needed as the semester proceeds.

Date Lecture topics Lab/Homework
Jan. 21
The jargon of phylogenetics (edges, vertices, leaves, degree, split, polytomy, taxon, clade); types of genealogies; rooted vs. unrooted trees; newick descriptions; monophyletic, paraphyletic, and polyphyletic groups [slides (1/page)] [slides (4/page)]
Homework 1: Trees From Splits (due in lecture Tuesday Jan 28)
Jan. 23
Optimality criteria, search strategies, consensus trees
Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms; consensus trees [slides 1/page][slides 4/page]
Friday, Jan. 24 Lab: Using the Xanadu cluster; Introduction to PAUP*; NEXUS format
Jan. 28
The parsimony criterion
Strict, semi-strict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony [slides (1/page)] [slides (4/page)]
Homework 2: Parsimony
Jan. 30
Distance methods
Distance methods: least squares criterion, minimum evolution criterion, neighbor-joining [slides (1/page)] [slides (4/page)]
Friday, Jan. 31 Lab: Searching
Feb. 4
Substitution models
Instantaneous rates, expected number of substitutions, equilibrium frequencies, JC69 model. [slides (1/page)][slides (4/page)]
Homework 3: Least squares distances (working through the Python Primer first will make this homework much easier)
Feb. 6
Maximum likelihood criterion
JC distance formula; common substitution models: K2P, F81, F84, HKY85, and GTR; likelihood: the probability of data given a model, likelihood of a “tree” with just one vertex and no edges, why likelihoods are always on the log scale, likelihood ratio tests. (Transition Probability Applet)[slides (1/page)][slides 4/page]
Friday, Feb. 7 Lab: Likelihood
Feb. 11
Maximum likelihood (cont.)
Likelihood of a tree with 2 vertices connected by one edge, transition probabilities, maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree.[slides (1/page)][slides 4/page]
Homework 4: Site likelihoods
Feb. 13
Bootstrapping, Rate heterogeneity
Non-parametric bootstrapping, ultrafast bootstrapping. [slides (1/page)][slides (4/page)]
Rate heterogeneity, I model, G model, site-specific rates, mixture models. [slides (1/page)][slides (4/page)]
Friday, Feb. 14 Lab: IQ-TREE tutorial
Feb. 18
How to simulate nucleotide sequence data, and why it’s done
[slides (1/page)][slides 4/page]
Homework 5: Rate heterogeneity (python program to modify)
Feb. 20
Long branch attraction
Statistical consistency, long branch attraction.
Topology tests
KH, SH, and AU tests.
[slides (1/page)][slides 4/page]
Friday, Feb. 21 Lab: Simulating sequence data using PAUP*
Feb. 25
Codon and secondary structure models
Nonsynonymous vs. synonymous rates, codon models; RNA stem/loop structure, compensatory substitutions, stem models.

Amino acid models
Empirical amino acid rate matrices (PAM, JTT, WAG, LE, etc.).
[slides (1/page)][slides (4/page)]
Homework 6: Simulation
Feb. 27
Calculating expected substitutions/site; using eigenvectors and eigenvalues to turn rate matrices into transition probability matrices.

Bayes’ Rule
Joint, conditional, and marginal probabilities, and how they interact to create Bayes’ Rule.
[slides (1/page)][slides (4/page)](Eigenvector/eigenvalue applet.)

Friday, Feb. 28 Lab: Using HyPhy to test hypotheses
Mar. 3
Bayesian statistics
Probability vs. probability density. (archery priors applet)
Markov chain Monte Carlo (MCMC)
MCMC “robot” metaphor, Metropolis-Hastings algorithm, mixing, burn-in, and trace plots. (MCMC Robot applet)[slides (1/page)][slides (4/page)]
Homework 7: MCMC
Mar. 5
MCMCMC, topology proposals

Metropolis-coupled MCMC (i.e. “heated chains”), algorithms (a.k.a. updaters, moves, operators, proposals) for updating parameters and trees during MCMC.
[slides (1/page)][slides (4/page)]

Friday, Mar. 6 Lab: Using R to explore probability distributions and plot trees
Mar. 10
Prior distributions used in phylogenetics

Discrete Uniform (topology), Gamma (kappa, omega), Beta (pinvar), Dirichlet (base frequencies, GTR exchangeabilities); Tree length prior; induced split prior.
[slides (1/page)][slides (4/page)]

No homework assigned this week
Mar. 12
Quiz 1

20 point quiz on first half of course. Expect some general brief-essay questions about what we’ve covered, and you will be given some choice in which topics to answer.

Friday, Mar. 13 Lab: MrBayes
Mar. 17
Mar. 19
Mar. 24
Bayesian phylogenetics (continued)

Dirichlet process priors, credible vs. confidence intervals. [CI applet] [Stick-breaking applet]

See HuskyCT for video mini-lectures.

Homework: no homework this week
Mar. 26
Bayes factors and Bayesian model selection

Bayes factors, steppingstone estimation of marginal likelihood.

See HuskyCT for video mini-lectures and links to the slides.

Friday, Mar. 27 Lab: Morphology and partitioning in MrBayes
Mar. 31
Discrete morphological models

Dirichlet process prior models revisited; introduction to discrete morphological models; Mk model; conditioning on variability.
[lecture videos and slide pdfs are posted in HuskyCT]

Homework 8: read and summarize the main points in the Maddison & Fitzjohn (2015) paper (please see the assignment linked here before starting)
Apr. 2
Testing for evolutionary dependence

BIC; Pagel’s (1994) test for correlated evolution among discrete traits; reversible-jump MCMC. [lecture videos and slide pdfs are posted in HuskyCT]

Friday, Apr. 3 Lab: BayesTraits
Apr. 7
Stochastic character mapping

An alternative to Pagel’s (1994) test for assessing whether correlation among characters goes beyond what is expected from inheritance alone. [one 22 min lecture video and PDFs of slides have been posted in the HuskyCT course]


 Homework 9: Summarize what you plan to do for your course project. What data will you use? Why are these data of interest? What is at least one question you plan to address using these data?
Apr. 9
Evolutionary Correlation: Continuous Traits

Independent Contrasts and Phylogenetic Generalized Least Squares (PGLS). [a 13.5 min lecture on PIC, a 21 min lecture on PGLS, and PDFs of slides have been posted in the HuskyCT course]


Friday, Apr. 10 Lab: Using sMap to perform stochastic mapping analyses
Apr. 14
PGLS (cont.)
Estimating ancestral states in PGLS. Ornstein-Uhlenbeck model vs. Brownian motion.
Homework: no homework this week
Apr. 16
Phylogenetic signal in comparative data
Measuring the amount of phylogenetic information in continuous traits (Pagel’s lambda, Blomberg’s K).

Introduction to the coalescent
Just enough coalescent theory to understand the multispecies coalescent used to estimate species trees given possibly conflicting gene trees. Two videos (13.5 min and 14 min) are on the HuskyCT course site. There is also an applet that I forgot to mention in the “signal” video.
Friday, Apr. 17 Lab: APE
Apr. 21
Species Tree Estimation (cont.)

Deep coalescence, incomplete lineage sorting, gene tree discordance due to ILS, estimating species trees using the multispecies coalescent. The SVDQuartets and ASTRAL species tree methods.

 Homework 10: TBA
Apr. 23
Divergence time estimation

Strict vs. relaxed clocks, correlated vs. uncorrelated relaxed clocks, calibrating the clock using fossils.

 Friday, Apr. 24 Lab: Divergence time estimation with RevBayes
Apr. 28
Diversification rate evolution

State-dependent diversification models (BiSSE and its descendants); BAMM: estimating the number of shifts in diversification regime and where these occur on the tree.


No homework (work on projects)
Apr. 30
Quiz 2

Open book, due 8pm Wednesday of finals week (i.e. the end time of our scheduled final exam).

Optional lecture: Estimating phylogenetic information

A talk I gave at the American Museum of Natural History in January.

(Quiz 2 and the lecture are available in HuskyCT.)

Friday, May 1: Project presentations: please time your presentation to last no longer than 15 minutes, leaving 5 minutes for questions/comments (6 presentations × 20 minutes = 2 hours total).
Finals week Quiz 2 due Wednesday by 8pm
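
For readers who want a concrete feel for the Feb. 4 to Feb. 11 lecture topics (JC69 model, JC distance formula, likelihood of a tree with 2 vertices connected by one edge), the following is a minimal Python sketch. It is not part of the course materials: the function names and the toy sequences in the usage note are illustrative assumptions, and the code assumes equal base frequencies and an edge length measured in expected substitutions per site.

```python
import math

def jc69_transition_probs(v):
    """Return (P_same, P_diff) under JC69 for an edge of length v,
    where v is the expected number of substitutions per site."""
    e = math.exp(-4.0 * v / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e

def jc69_distance(p):
    """JC69 distance computed from p, the observed proportion of sites
    that differ between two aligned sequences."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in the interval [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def pairwise_log_likelihood(seq1, seq2, v):
    """Log-likelihood of two aligned sequences joined by a single edge
    of length v (must be > 0 if the sequences differ anywhere)."""
    p_same, p_diff = jc69_transition_probs(v)
    log_like = 0.0
    for a, b in zip(seq1, seq2):
        # 0.25 is the equilibrium frequency of the state at one end of the edge
        log_like += math.log(0.25) + math.log(p_same if a == b else p_diff)
    return log_like
```

As a usage example, for the hypothetical sequences `"ACGTACGTAC"` and `"ACGTACGTAT"` (1 difference in 10 sites), `jc69_distance(0.1)` gives the edge length at which `pairwise_log_likelihood` is maximized, which illustrates why the JC distance is the maximum likelihood estimate of the edge length in the two-sequence case.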

Books on phylogenetics

This is a list of books that you should know about, but none are required texts for this course. Listed in reverse chronological order.

Harmon, L. 2019. Phylogenetic comparative methods. (Version 1.4, released 15 March 2019). Published online by the author.

Yang, Z. 2014. Molecular evolution: a statistical approach. Oxford University Press.

Garamszegi, L. Z. 2014. Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer-Verlag, Berlin. (Well-written chapters by current leaders in phylogenetic comparative methods.)

Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)

Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)

Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)

Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)

Page, R., and Holmes, E. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)

Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapters 11 (“Phylogenetic inference”) and 12 (“Applications of molecular systematics”). (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)