Phylogenetics (EEB 5349)
This is a graduate-level course in phylogenetics, emphasizing primarily maximum likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory so that, by the end of the course, students can understand much of the primary literature on modern phylogenetic methods and know how to apply these methods intelligently to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, GARLI, RAxML, MrBayes, RevBayes, BEAST) and introduces students to the use of remote high-performance computing resources to perform phylogenetic analyses.
EEB 5349 is being taught Spring Semester 2018:
Lecture: Tuesday/Thursday 11-12:15 (lecture instructor: Paul O. Lewis)
Lab: Friday 1:25-3:20 (laboratory instructor: Kevin Keegan; office hour Friday 10-11)
Room: Torrey Life Science (TLS) 181, Storrs Campus
Text: none required; registered students will receive PDF copies of a textbook I am currently writing (see list of optional texts below)
Syllabus
Important! The syllabus below is partially empty because the course is being taught in a different way this semester and the topics are not aligned exactly as they were in the 2016 version of the course.
Literature cited in lecture: I’ve created a page containing papers cited in the lectures.
Bayesian phylogenetics
GammaDir prior, credible vs. confidence intervals, hierarchical models vs. empirical Bayes.
Date | Lecture topics | Lab/Homework |
Tuesday Jan. 16 |
Introduction
The jargon of phylogenetic trees (edges, vertices, leaves, cherries, degree, split, polytomy, taxon, clade); types of genealogies; rooted vs. unrooted trees; Newick descriptions; monophyletic, paraphyletic, and polyphyletic groups; why are phylogenies useful? |
Homework 1: Trees From Splits (due in lecture Tuesday, Jan. 23) |
Thursday Jan. 18 |
Optimality criteria, search strategies
Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms |
Friday, Jan. 19 Lab: Using the UConn Bioinformatics Facility cluster; Introduction to PAUP*; NEXUS format |
Tuesday Jan. 23 |
Consensus trees, the parsimony criterion
Strict, semi-strict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony |
Homework 2: Parsimony (due in class Tuesday, Jan. 30) |
Thursday Jan. 25 |
Bootstrapping, distance methods
Bootstrapping; distance methods: split decomposition, quartet puzzling, neighbor-joining, least squares criterion, minimum evolution criterion |
Friday, Jan. 26 Lab: Searching |
Tuesday Jan. 30 (individual meetings this week) |
Substitution models
Instantaneous rates, expected number of substitutions, equilibrium frequencies, JC69 model, K2P model, F81 model, F84 model, HKY85 model, GTR model |
Homework 3: Work through the Python Primer (nothing to hand in) |
Thursday Feb. 1 (individual meetings this week) |
Maximum likelihood criterion
Likelihood: the probability of data given a model, likelihood of a “tree” with just one vertex and no edges, why likelihoods are always on the log scale, likelihood ratio tests. |
Friday, Feb. 2 Lab: Likelihood |
Tuesday Feb. 6 |
Maximum likelihood (cont.)
Likelihood of a tree with 2 vertices connected by one edge, transition probabilities, maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree. (Transition probability applet) |
Homework 4: Site likelihood (due Tuesday, Feb. 13) |
Thursday Feb. 8 |
Rate heterogeneity
Proportion of invariable sites, discrete gamma, site-specific rates. |
Friday, Feb. 9 Lab: ML analyses of large data sets using RAxML and GARLI |
Tuesday Feb. 13 |
Simulation
How to simulate nucleotide sequence data, and why it’s done.
Long branch attraction
Statistical consistency, long branch attraction |
Homework 5: Rate heterogeneity (due Feb. 27) |
Thursday Feb. 15 |
Codon and secondary structure models
Nonsynonymous vs. synonymous rates, codon models; RNA stem/loop structure, compensatory substitutions, stem models. |
Friday, Feb. 16 Lab: IQ-TREE |
Tuesday Feb. 20 (individual meetings this week) |
Amino acid models
Empirical amino acid rate matrices (PAM, JTT, WAG, LE, etc.); using eigenvectors and eigenvalues to turn rate matrices into transition probability matrices. (Eigenvector/eigenvalue applet.)
Model selection |
Continue working on Homework 5: Rate heterogeneity (due Feb. 27) |
Thursday Feb. 22 (individual meetings this week) |
Model selection (cont.)
Akaike information criterion (AIC); Bayesian information criterion (BIC).
Topology tests |
Friday, Feb. 23 Lab: Simulating sequence data |
Tuesday Feb. 27 |
Bayesian statistics
Bayes’ Rule, prior and posterior probability distributions, marginal probability of the data, probability vs. probability density. (archery priors applet) |
Homework 6: Simulation |
Thursday Mar. 1 |
Markov chain Monte Carlo (MCMC)
MCMC “robot” metaphor, Metropolis-Hastings algorithm, mixing, burn-in, and trace plots. (MCMC Robot applet) |
Friday, Mar. 2 Lab: Using R to explore probability distributions |
Tuesday Mar. 6 |
MCMCMC, MCMC “moves”
Metropolis-coupled MCMC (i.e. “heated chains”), algorithms (a.k.a. updaters, moves, operators, proposals) for updating parameters and trees during MCMC. (applet showing slider proposal is indeed symmetric.) |
Homework: no homework this week |
Thursday Mar. 8 |
Prior distributions used in phylogenetics
Gamma/Exponential/Lognormal distributions for edge lengths and rate ratios, the Beta distribution for proportions, and the Dirichlet distribution for state frequencies and GTR exchangeabilities. |
Friday, Mar. 9 Lab: MrBayes |
Tuesday Mar. 13 |
SPRING BREAK | |
Thursday Mar. 15 |
||
Tuesday Mar. 20 |
Bayesian phylogenetics (continued)
Tree length vs. edge length prior, credible vs. confidence intervals, hierarchical models vs. empirical Bayes |
Homework 8: MCMC |
Thursday Mar. 22 |
Bayes factors and Bayesian model selection
Bayes factors, steppingstone estimation of marginal likelihood. |
Friday, Mar. 23 Lab: Morphology and partitioning in MrBayes |
Tuesday Mar. 27 (individual meetings this week) |
Homework 9: Independent contrasts | |
Thursday Mar. 29 (individual meetings this week) |
Friday, Mar. 30 Lab: BayesTraits | |
Tuesday Apr. 3 |
||
Thursday Apr. 5 |
Friday, Apr. 6 Lab: | |
Tuesday Apr. 10 |
||
Thursday Apr. 12 |
Friday, Apr. 13 Lab: BEAST | |
Tuesday Apr. 17 |
||
Thursday Apr. 19 |
Friday, Apr. 20 Lab: | |
Tuesday Apr. 24 |
||
Thursday Apr. 26 |
Friday, Apr. 27 Lab: | |
Finals week | Individual meetings this week |
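The likelihood topics listed in the syllabus (transition probabilities, site likelihoods, and MLEs of edge lengths) can be previewed with a few lines of Python. The block below is a minimal sketch and not course material: it assumes the JC69 model and two aligned sequences joined by a single edge, the function names are hypothetical, and the toy sequences are made up for illustration.

```python
import math

def jc69_transition_probs(v):
    """Return (p_same, p_diff) under JC69 for an edge of length v,
    where v is the expected number of substitutions per site."""
    decay = math.exp(-4.0 * v / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability the end state equals the start state
    p_diff = 0.25 - 0.25 * decay   # probability of each specific different end state
    return p_same, p_diff

def jc69_pair_log_likelihood(seq1, seq2, v):
    """Log-likelihood of two aligned sequences joined by one edge of length v."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    p_same, p_diff = jc69_transition_probs(v)
    log_like = 0.0
    for a, b in zip(seq1, seq2):
        # Site likelihood = equilibrium frequency (1/4) times transition probability.
        site_like = 0.25 * (p_same if a == b else p_diff)
        # Sum logs instead of multiplying raw site likelihoods.
        log_like += math.log(site_like)
    return log_like

def jc69_distance(seq1, seq2):
    """Closed-form MLE of the edge length under JC69 (requires p < 3/4)."""
    p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical toy data (20 sites, 2 differences), made up for illustration.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACTTACGT"

v_hat = jc69_distance(seq_a, seq_b)
lnL_hat = jc69_pair_log_likelihood(seq_a, seq_b, v_hat)
print(f"MLE of edge length: {v_hat:.5f}")
print(f"log-likelihood at the MLE: {lnL_hat:.5f}")
```

Summing log site likelihoods, rather than multiplying the raw site likelihoods, avoids numerical underflow for long alignments, which is one practical reason for working on the log scale.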
Books on phylogenetics
This is a list of books that you should know about, but none are required texts for this course. They are listed in reverse chronological order.
Yang, Z. 2014. Molecular evolution: a statistical approach. Oxford University Press.
Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)
Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)
Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)
Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)
Page, R., and Holmes, E. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)
Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapters 11 (“Phylogenetic inference”) and 12 (“Applications of molecular systematics”). (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)