Adiantum (maidenhair fern) pinnule with inlaid phylogeny

Phylogenetics (EEB 5349)

This is a graduate-level course in phylogenetics, emphasizing primarily maximum likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory so that by the end of the course students should be able to understand much of the primary literature on modern phylogenetic methods and know how to intelligently apply these methods to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, IQ-TREE, RAxML, MrBayes, RevBayes, and others) and introduces students to the use of remote high performance computing resources to perform phylogenetic analyses.

EEB 5349 is being taught Spring Semester 2020:

Lecture: Tuesday/Thursday 11-12:15 (lecture instructor: Paul O. Lewis)
Lab: Friday 1:25-3:20 (laboratory instructor: Katie Taylor; office hour TBA)
Room: Torrey Life Science (TLS) 181, Storrs Campus
Text: none required. Registered students will receive PDF copies of a textbook I am currently writing (see the list of optional texts below).


Important! The syllabus below is largely unchanged from when the course finished in Spring 2018. It will be modified as needed as the semester proceeds.

Date Lecture topics Lab/Homework
Jan. 21
The jargon of phylogenetics (edges, vertices, leaves, degree, split, polytomy, taxon, clade); types of genealogies; rooted vs. unrooted trees; newick descriptions; monophyletic, paraphyletic, and polyphyletic groups [slides (1/page)] [slides (4/page)]
Homework 1: Trees From Splits (due in lecture Tuesday Jan 28)
Jan. 23
Optimality criteria, search strategies, consensus trees
Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms; consensus trees [slides 1/page][slides 4/page]
Friday, Jan. 24 Lab: Using the Xanadu cluster; Introduction to PAUP*; NEXUS format
Jan. 28
The parsimony criterion
Strict, semi-strict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony [slides (1/page)] [slides (4/page)]
Homework 2: Parsimony
Jan. 30
Distance methods
Distance methods: least squares criterion, minimum evolution criterion, neighbor-joining [slides (1/page)] [slides (4/page)]
Friday, Jan. 31 Lab: Searching
Feb. 4
Substitution models
Instantaneous rates, expected number of substitutions, equilibrium frequencies, JC69 model. [slides (1/page)][slides (4/page)]
Homework 3: Least squares distances (working through the Python Primer first will make this homework much easier)
Feb. 6
Maximum likelihood criterion
JC distance formula; common substitution models: K2P, F81, F84, HKY85, and GTR; likelihood: the probability of data given a model, likelihood of a “tree” with just one vertex and no edges, why likelihoods are always on the log scale, likelihood ratio tests. (Transition Probability Applet)[slides (1/page)][slides 4/page]
Friday, Feb. 7 Lab: Likelihood
Feb. 11
Maximum likelihood (cont.)
Likelihood of a tree with 2 vertices connected by one edge, transition probabilities, maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree.[slides (1/page)][slides 4/page]
Homework 4: Site likelihoods
Feb. 13
Bootstrapping, Rate heterogeneity
Non-parametric bootstrapping, ultrafast bootstrapping. [slides (1/page)][slides 4/page]
Rate heterogeneity, I model, G model, site-specific rates, mixture models. [slides (1/page)][slides 4/page]
Friday, Feb. 14 Lab: IQ-TREE tutorial
Feb. 18
How to simulate nucleotide sequence data, and why it’s done
[slides (1/page)][slides 4/page]
Homework 5: Rate heterogeneity (Python program to modify)
Feb. 20
Long branch attraction
Statistical consistency, long branch attraction.

Topology tests
KH, SH, and AU tests.
[slides (1/page)][slides 4/page]
Friday, Feb. 21 Lab: Simulating sequence data using PAUP*
Feb. 25
Codon and secondary structure models
Nonsynonymous vs. synonymous rates, codon models; RNA stem/loop structure, compensatory substitutions, stem models.

Amino acid models
Empirical amino acid rate matrices (PAM, JTT, WAG, LE, etc.).
[slides (1/page)][slides (4/page)]
Homework 6: Simulation
Feb. 27
Calculating expected substitutions/site; using eigenvectors and eigenvalues to turn rate matrices into transition probability matrices.

Bayes’ Rule
Joint, conditional, and marginal probabilities, and how they interact to create Bayes’ Rule.
[slides (1/page)][slides (4/page)](Eigenvector/eigenvalue applet.)

Friday, Feb. 28 Lab: Using HyPhy to test hypotheses
Mar. 3
Bayesian statistics
Probability vs. probability density. (archery priors applet)
Markov chain Monte Carlo (MCMC)
MCMC “robot” metaphor, Metropolis-Hastings algorithm, mixing, burn-in, and trace plots. (MCMC Robot applet)[slides (1/page)][slides (4/page)]
Homework 7: MCMC
Mar. 5
MCMCMC, topology proposals

Metropolis-coupled MCMC (i.e. “heated chains”), algorithms (a.k.a. updaters, moves, operators, proposals) for updating parameters and trees during MCMC.
[slides (1/page)][slides (4/page)]

Friday, Mar. 6 Lab: Using R to explore probability distributions and plot trees
Mar. 10
Prior distributions used in phylogenetics

Discrete Uniform (topology), Gamma (kappa, omega), Beta (pinvar), Dirichlet (base frequencies, GTR exchangeabilities); Tree length prior; induced split prior.
[slides (1/page)][slides (4/page)]

No homework assigned this week
Mar. 12
Quiz 1

20-point quiz on the first half of the course. Expect some general brief-essay questions about what we’ve covered; you will be given some choice in which topics to answer.

Friday, Mar. 13 Lab: MrBayes
Mar. 17
Mar. 19
Mar. 24
Bayesian phylogenetics (continued)

Dirichlet process priors, credible vs. confidence intervals. [CI applet] [Stick-breaking applet]

See HuskyCT for video mini-lectures.

Homework: no homework this week
Mar. 26
Bayes factors and Bayesian model selection

Bayes factors, steppingstone estimation of marginal likelihood.

See HuskyCT for video mini-lectures and links to the slides.

Friday, Mar. 27 Lab: Morphology and partitioning in MrBayes
Mar. 31
Morphology models, Correlation in Discrete Traits

Conditioning on variability; Pagel’s (1994) test for correlated evolution among discrete traits; reversible-jump MCMC; the “No Common Mechanism” (NCM) model.

Homework 8: TBA
Apr. 2
Evolutionary Correlation: Continuous Traits

Independent Contrasts and Phylogenetic Generalized Least Squares (PGLS).

Friday, Apr. 3 Lab: BayesTraits
Apr. 7
Trait evolution (cont.)

PGLS (cont.), estimating ancestral states (PGLS slides)


Homework 9: Independent contrasts (Independent Contrasts slides)
Apr. 9
Estimating ancestral states

Ancestral state estimation for discrete traits (we also spent time discussing the Maddison and FitzJohn paper and Jack’s ancestral state estimation problem).

Friday, Apr. 10 Lab: APE
Apr. 14
Mixture models

rjMCMC (polytomies, heterotachy), covarion models, Dirichlet process mixture models

Homework 10: TBA
Apr. 16
Species trees vs. gene trees

The coalescent, deep coalescence, incomplete lineage sorting (ILS)

Friday, Apr. 17 Lab: ggtree
Apr. 21
Species Tree Estimation

Gene tree discordance due to ILS, estimating species trees using the multispecies coalescent

Homework 11: Simulate a coalescent tree
Apr. 23
Species Tree Estimation (cont.)

The SVDQuartets and ASTRAL species tree methods.

Friday, Apr. 24 Lab: SVDQuartets, ASTRAL
Apr. 28
Divergence time estimation

Strict vs. relaxed clocks, correlated vs. uncorrelated relaxed clocks, calibrating the clock using fossils

No homework assignment this week
Apr. 30
Divergence times (cont.)

Fossilized birth-death model, information content

Friday, May 1 Lab: Divergence time estimation using BEAST2
Finals week  Final exam
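
As a small illustration of the kind of calculation covered in the Feb. 4 and Feb. 6 lectures (JC69 model, JC distance formula), here is a minimal Python sketch. It is not taken from the course materials: the function names are illustrative assumptions, and it assumes the standard JC69 model (equal base frequencies, equal rates for all substitutions), with edge length measured in expected substitutions per site.

```python
import math


def jc69_transition_prob(same: bool, d: float) -> float:
    """JC69 probability of ending in a particular state after an edge of
    length d (expected substitutions per site).

    same=True  -> probability the end state equals the start state
    same=False -> probability of one particular different end state
    """
    decay = math.exp(-4.0 * d / 3.0)
    if same:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


def jc69_distance(p: float) -> float:
    """JC69 distance estimated from p, the observed proportion of sites
    that differ between two sequences."""
    if not 0.0 <= p < 0.75:
        # At p >= 0.75 the sequences look saturated and the formula is undefined.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    print(jc69_distance(0.1))
```

For example, an observed proportion of differing sites of 0.1 gives a distance of about 0.107, slightly larger than 0.1, because the correction accounts for sites that have changed more than once.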

Books on phylogenetics

This is a list of books that you should know about, but none are required texts for this course. Listed in reverse chronological order.

Yang, Z. 2014. Molecular evolution: a statistical approach. Oxford University Press.

Garamszegi, L. Z. 2014. Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer-Verlag, Berlin. (Well-written chapters by current leaders in phylogenetic comparative methods.)

Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)

Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)

Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)

Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)

Page, R., and Holmes, E. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)

Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapters 11 (“Phylogenetic inference”) and 12 (“Applications of molecular systematics”). (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)