Phylogenetics (EEB 5349)
This is a graduate-level course in phylogenetics, emphasizing primarily maximum likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory so that by the end of the course students should be able to understand much of the primary literature on modern phylogenetic methods and know how to intelligently apply these methods to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, GARLI, RAxML, MRBAYES, BEAST) and introduces students to the computing resources of the UConn Bioinformatics Facility.
EEB 5349 is being taught Spring Semester 2016:
Lecture: Tuesdays 11-12:15 and Thursdays 9-10:15 (instructor: Paul O. Lewis)
Lab: Thursdays, 10:15-12:15 (instructor: Suman Neupane)
Room: Torrey Life Science (TLS) 181, Storrs Campus
Text: none required. Registered students will receive PDF copies of a textbook I am currently writing (see list of optional texts below)
Grade: based on midterm exam, final exam, homeworks, and project presentation
Syllabus
This syllabus has been updated for the Spring 2016 version of the course, but will continue to be updated periodically throughout the semester. I will post a PDF version of each lecture after it is given. Homeworks are due 1 week after the date they are assigned in the syllabus.
| Date | Lecture topics | Lab/Homework |
|---|---|---|
| Tuesday Jan. 19 | Introduction: The terminology of phylogenetics, rooted vs. unrooted trees, ultrametric vs. unconstrained, paralogy vs. orthology, lineage sorting, “basal” lineages, crown vs. stem groups | Homework 1: trees from splits (due in lecture Tuesday Jan. 26) |
| Thursday Jan. 21 | Optimality criteria, search strategies: Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms | Lab: Using the UConn Bioinformatics Facility cluster; Introduction to PAUP*; NEXUS format |
| Tuesday Jan. 26 | Consensus trees, the parsimony criterion: Strict, semi-strict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony | Homework 2: Parsimony (due in class Tuesday, Feb. 2) |
| Thursday Jan. 28 | Bootstrapping, distance methods: Bootstrapping; distance methods (split decomposition, quartet puzzling, neighbor-joining, least squares criterion, minimum evolution criterion) | Lab: Python Primer |
| Tuesday Feb. 2 | Substitution models (updated after lecture): Transition probability, instantaneous rates, Poisson processes, JC69 model, K2P model, F81 model, F84 model, HKY85 model, GTR model | Homework 3: Distances |
| Thursday Feb. 4 | Maximum likelihood criterion: Likelihood (the probability of data given a model), maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree, likelihood ratio test | Lab: Searching |
| Tuesday Feb. 9 | Rate heterogeneity: Proportion of invariable sites, discrete gamma, site-specific rates | Homework 4: Likelihood |
| Thursday Feb. 11 | Codon, amino acid, secondary structure models: Empirical amino acid rate matrices, transition probabilities by exponentiating the rate matrix, RNA stem/loop structure, compensatory substitutions, stem models, nonsynonymous vs. synonymous rates, codon models. (Eigenvector demo) | Lab: Likelihood |
| Tuesday Feb. 16 | Model selection: Likelihood ratio test (LRT), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC). Expected number of substitutions: An example derivation for the F81 model | Homework 5: Rate heterogeneity |
| Thursday Feb. 18 | Simulation: How to simulate nucleotide sequence data, and why it’s done. Long branch attraction: Statistical consistency, long branch attraction | Lab: ML analyses of large data sets using RAxML and GARLI |
| Tuesday Feb. 23 | Topology tests: ILD, KH, SH, AU and SOWH tests | Homework 6: Simulation |
| Thursday Feb. 25 | Bayesian statistics: Conditional/joint probabilities, Bayes rule, prior vs. posterior distributions, probability mass vs. probability density | Lab: Exploring probability distributions using R |
| Tuesday Mar. 1 | Markov chain Monte Carlo: Metropolis algorithm, MCMC, mixing, heated chains, Hastings ratio | Homework 7: MCMC |
| Thursday Mar. 3 | Priors used in Bayesian phylogenetics: Commonly used prior distributions (Beta, Gamma, Lognormal, Dirichlet) | Lab: MrBayes 3.2 |
| Tuesday Mar. 8 | Prior miscellany: Hierarchical models and hyperpriors, Empirical Bayes, Dirichlet process priors, MCMC without data. Confidence vs. credible intervals: Frequentist confidence intervals differ from Bayesian credible intervals | Homework 8: LOCAL move |
| Thursday Mar. 10 | Bayesian model selection: Marginal likelihoods and Bayes factors | Lab: Morphology, partitioning and model selection in MRBAYES |
| Tuesday Mar. 15 | SPRING BREAK | |
| Thursday Mar. 17 | SPRING BREAK | |
| Tuesday Mar. 22 | Discrete morphological characters: DNA sequences vs. morphological characters, Symmetric vs. asymmetric 2-state models, Mk model | Review study questions handed out in lecture (will discuss Mar. 29) |
| Thursday Mar. 24 | No lecture (Paul is out of town) | Lab: HyPhy |
| Tuesday Mar. 29 | Discussion of study guide questions | Homework 9: Independent contrasts |
| Thursday Mar. 31 | Correlated discrete character evolution: Pagel’s likelihood ratio test. Correlated continuous character evolution: Felsenstein’s independent contrasts (simulator shown in class) | Lab: BayesTraits |
| Tuesday Apr. 5 | Phylogenetic Generalized Least Squares (PGLS): Linear regression with correlation structure of residuals determined by the phylogeny | Read O’Meara (2012) before Tuesday Apr. 12 |
| Thursday Apr. 7 | Stochastic character mapping: Introduction to the use of stochastic character mapping for estimating ancestral states and character correlation | Lab: APE |
| Tuesday Apr. 12 | Discussion of O’Meara (2012): O’Meara, B. C. 2012. Evolutionary inferences from phylogenies: a review of methods. Ann. Rev. Ecol. Evol. Syst. 43:267-285. Mixture models: Mixture of rate matrices, rjMCMC, heterotachy models, covarion models, Dirichlet process models. | Read Maddison and FitzJohn (2015) before Tuesday Apr. 19 |
| Thursday Apr. 14 | Divergence time estimation: Thorne/Kishino autocorrelated log-normal model; BEAST uncorrelated log-normal model; Yule tree priors; Fossilized Birth-Death Prior | Lab: BEAST |
| Tuesday Apr. 19 | Discussion of Maddison & FitzJohn (2015): Maddison, W. P., and FitzJohn, R. G. 2015. The unsolved challenge to phylogenetic correlation tests for categorical characters. Syst. Biol. 64(1):127-136. Just enough coalescent theory: Coalescent theory needed for understanding the multispecies coalescent model | Final exam (take-home) will be handed out (due May 5 at 5:30pm) |
| Thursday Apr. 21 | Gene trees within species trees: *BEAST, ASTRAL2, SVDQuartets. | Lab: *BEAST |
| Tuesday Apr. 26 | Medley of topics: Polytomy priors and community phylogenetics | No homework (use time to work on final) |
| Thursday Apr. 28 | Medley (cont.): Bayesian information content | Astral-2, SVDQuartets (GARLI bootstrap trees) |
| Thursday May 5 | Final exam due by 5:30pm | |
Books on phylogenetics
These are books that you should know about, but none is a required text for this course. They are listed in reverse chronological order.
Yang, Z. 2014. Molecular evolution: a statistical approach. Oxford University Press.
Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)
Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)
Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)
Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)
Page, R., and Holmes, E. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)
Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapters 11 (“Phylogenetic inference”) and 12 (“Applications of molecular systematics”). (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)
