Adiantum (maidenhair fern) pinnule with inlaid phylogeny

Phylogenetics (EEB 5349)

This is a graduate-level course in phylogenetics, emphasizing primarily maximum likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory so that by the end of the course students should be able to understand much of the primary literature on modern phylogenetic methods and know how to intelligently apply these methods to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, GARLI, RAxML, MrBayes, RevBayes, BEAST) and introduces students to the use of remote high performance computing resources to perform phylogenetic analyses.

EEB 5349 is being taught Spring Semester 2018:

Lecture: Tuesday/Thursday 11-12:15 (lecture instructor: Paul O. Lewis)
Lab: Friday 1:25-3:20 (laboratory instructor: Kevin Keegan; office hour Friday 10-11)
Room: Torrey Life Science (TLS) 181, Storrs Campus
Text: none required. Registered students will receive PDF copies of a textbook I am currently writing (see the list of optional texts below).


Important! The syllabus below is partially empty because the course is being taught in a different way this semester and the topics are not aligned exactly as they were in the 2016 version of the course.

Date Lecture topics Lab/Homework
Jan. 16
The jargon of phylogenetic trees (edges, vertices, leaves, cherries, degree, split, polytomy, taxon, clade); types of genealogies; rooted vs. unrooted trees; newick descriptions; monophyletic, paraphyletic, and polyphyletic groups; why are phylogenies useful?
Homework 1: Trees From Splits (due in lecture Tuesday Jan 23)
Jan. 18
Optimality criteria, search strategies
Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms
Friday, Jan. 19 Lab: Using the UConn Bioinformatics Facility cluster; Introduction to PAUP*; NEXUS format
Jan. 23
Consensus trees, the parsimony criterion
Strict, semi-strict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony
Homework 2: Parsimony (due in class Tuesday, Jan. 30)
Jan. 25
Bootstrapping, distance methods
Bootstrapping; Distance methods: split decomposition, quartet puzzling, neighbor-joining, least squares criterion, minimum evolution criterion
Friday, Jan. 26 Lab: Searching
Jan. 30 (individual meetings this week)
Substitution models
Instantaneous rates, expected number of substitutions, equilibrium frequencies, JC69 model, K2P model, F81 model, F84 model, HKY85 model, GTR model
Homework 3: Work through the Python Primer (nothing to hand in)
Feb. 1 (individual meetings this week)
Maximum likelihood criterion
Likelihood: the probability of data given a model, likelihood of a “tree” with just one vertex and no edges, why likelihoods are always on the log scale, likelihood ratio tests.
Friday, Feb. 2 Lab: Likelihood
Feb. 6
Maximum likelihood (cont.)
Likelihood of a tree with 2 vertices connected by one edge, transition probabilities, maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree. (Transition probability applet)
Homework 4: Site likelihood (due Tuesday, Feb. 13)
Feb. 8
Rate heterogeneity
Proportion of invariable sites, discrete gamma, site-specific rates.
Friday, Feb. 9 Lab: ML analyses of large data sets using RAxML and GARLI
Feb. 13
How to simulate nucleotide sequence data, and why it’s done
Long branch attraction
Statistical consistency, long branch attraction
Homework 5: Rate heterogeneity (due Feb. 27)
Feb. 15
Codon and secondary structure models
Nonsynonymous vs. synonymous rates, codon models; RNA stem/loop structure, compensatory substitutions, stem models.
Friday, Feb. 16 Lab: IQ-TREE
Feb. 20 (individual meetings this week)
Amino acid models

Empirical amino acid rate matrices (PAM, JTT, WAG, LE, etc.); using eigenvectors and eigenvalues to turn rate matrices into transition probability matrices. (Eigenvector/eigenvalue applet.)

Model selection
Likelihood ratio tests (LRTs) revisited; testing the molecular clock.

Continue working on Homework 5: Rate heterogeneity (due Feb. 27)
Feb. 22 (individual meetings this week)
Model selection (cont.)

Akaike Information criterion (AIC); Bayesian information criterion (BIC).

Topology tests
KH, SH, and AU tests.

Friday, Feb. 23 Lab: Simulating sequence data
Feb. 27
Bayesian statistics

Bayes’ Rule, prior and posterior probability distributions, marginal probability of the data, probability vs. probability density. (archery priors applet)

Homework 6: Simulation
Mar. 1
Markov chain Monte Carlo (MCMC)

MCMC “robot” metaphor, Metropolis-Hastings algorithm, mixing, burn-in, and trace plots. (MCMC Robot applet)

Friday, Mar. 2 Lab: Using R to explore probability distributions
Mar. 6
MCMCMC, MCMC “moves”

Metropolis-coupled MCMC (i.e. “heated chains”), algorithms (a.k.a. updaters, moves, operators, proposals) for updating parameters and trees during MCMC. (applet showing slider proposal is indeed symmetric.)

Homework: no homework this week
Mar. 8
Prior distributions used in phylogenetics

Gamma/Exponential/Lognormal distributions for edge lengths and rate ratios, the Beta distribution for proportions, and the Dirichlet distribution for state frequencies and GTR exchangeabilities.

Friday, Mar. 9 Lab: MrBayes
Mar. 13
Mar. 15
Mar. 20
Bayesian phylogenetics (continued)

Tree length vs. edge length prior, credible vs. confidence intervals, hierarchical models vs. empirical Bayes

Homework 7: MCMC
Mar. 22
Bayes factors and Bayesian model selection

Bayes factors, steppingstone estimation of marginal likelihood.

Friday, Mar. 23 Lab: Morphology and partitioning in MrBayes
Mar. 27
Morphology models, Correlation in Discrete Traits

Conditioning on variability; Pagel’s (1994) test for correlated evolution among discrete traits; reversible-jump MCMC; the “No Common Mechanism” (NCM) model. (individual meetings this week)

Homework 8: Read Maddison and Fitzjohn (2015)
Mar. 29
Evolutionary Correlation: Continuous Traits

Independent Contrasts and Phylogenetic Generalized Least Squares (PGLS). (individual meetings this week)

Friday, Mar. 30 Lab: BayesTraits
Apr. 3
Trait evolution (cont.)

PGLS (cont.), estimating ancestral states (PGLS slides)


Homework 9: Independent contrasts (Independent Contrasts slides)
Apr. 5
Estimating ancestral states

Ancestral state estimation for discrete traits (we also spent time discussing the Maddison and Fitzjohn paper and Jack’s ancestral state estimation problem.)

Friday, Apr. 6 Lab: APE
Apr. 10
Mixture models

rjMCMC (polytomies, heterotachy), covarion models, Dirichlet process mixture models

Homework 10: Read Degnan and Rosenberg (2009)
Apr. 12
Species trees vs. gene trees

The coalescent, deep coalescence, incomplete lineage sorting (ILS)

Friday, Apr. 13 Lab: ggtree
Apr. 17
Species Tree Estimation

Gene tree discordance due to ILS, estimating species trees using the multispecies coalescent

Homework 11: Simulate a coalescent tree
Apr. 19
Species Tree Estimation (cont.)

The SVDQuartets and ASTRAL species tree methods. Guest lecturer: David Swofford. (slides).

Friday, Apr. 20 Lab: SVDQuartets, ASTRAL
Apr. 24
Divergence time estimation

Strict vs. relaxed clocks, correlated vs. uncorrelated relaxed clocks, calibrating the clock using fossils

No homework assignment this week
Apr. 26
Divergence times (cont.)

Fossilized birth-death model, information content

Friday, Apr. 27 Lab: Divergence time estimation using BEAST2
Finals week (individual meetings this week)
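
For a flavor of the calculations behind the substitution model and maximum likelihood lectures (Jan. 30 through Feb. 6), here is a minimal Python sketch. It is an illustration only, not course material or code from any of the packages used in lab. The function names (jc69_prob, jc69_pair_lnl, jc69_mle_distance) are hypothetical, and the sketch assumes two gap-free, unambiguous, aligned nucleotide sequences joined by a single edge under the JC69 model.

```python
import math

def jc69_prob(same: bool, v: float) -> float:
    """JC69 transition probability across an edge of length v
    (v = expected number of substitutions per site)."""
    e = math.exp(-4.0 * v / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def jc69_pair_lnl(seq1: str, seq2: str, v: float) -> float:
    """Log-likelihood of a tree with 2 vertices connected by one edge.
    Assumes v > 0 and aligned sequences containing only A, C, G, T."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned (equal length)")
    lnl = 0.0
    for a, b in zip(seq1, seq2):
        # Site likelihood = equilibrium frequency of the starting state (1/4)
        # times the transition probability along the edge. Logs are summed
        # because the product of many small site likelihoods would underflow.
        lnl += math.log(0.25) + math.log(jc69_prob(a == b, v))
    return lnl

def jc69_mle_distance(seq1: str, seq2: str) -> float:
    """Closed-form MLE of the edge length under JC69:
    v = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites."""
    p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    if p >= 0.75:
        raise ValueError("distance is undefined when p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    s1, s2 = "ACGTACGTAC", "ACGTACGTAA"   # made-up toy sequences
    v_hat = jc69_mle_distance(s1, s2)
    print("MLE of edge length:", v_hat)
    print("log-likelihood at MLE:", jc69_pair_lnl(s1, s2, v_hat))
```

Evaluating jc69_pair_lnl at edge lengths slightly above and below the value returned by jc69_mle_distance should give lower log-likelihoods, which is a quick way to convince yourself that the closed-form distance really is the maximum.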

Books on phylogenetics

This is a list of books that you should know about, but none are required texts for this course. They are listed in roughly reverse chronological order.

Yang, Z. 2014. Molecular evolution: a statistical approach. Oxford University Press.

Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)

Garamszegi, L. Z. 2014. Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer-Verlag, Berlin. (Well-written chapters by current leaders in phylogenetic comparative methods.)

Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)

Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)

Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)

Page, R., and Holmes, E. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)

Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapters 11 (“Phylogenetic inference”) and 12 (“Applications of molecular systematics”). (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)