Phylogenetics (EEB 5349)
This is a graduate-level course in phylogenetics, emphasizing primarily maximum likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory so that by the end of the course students should be able to understand much of the primary literature on modern phylogenetic methods and know how to intelligently apply these methods to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, IQ-TREE, RAxML, MrBayes, RevBayes, and others) and introduces students to the use of remote high-performance computing resources to perform phylogenetic analyses.
EEB 5349 is being taught Spring Semester 2020:
Lecture: Tuesday/Thursday 11-12:15 (lecture instructor: Paul O. Lewis)
Lab: Friday 1:25-3:20 (laboratory instructor: Katie Taylor; office hour TBA)
Room: Torrey Life Science (TLS) 181, Storrs Campus
Text: none required; registered students will receive PDF copies of a textbook I am currently writing (see the list of optional texts below)
Syllabus
Important! The syllabus below is currently mostly as it was when the course finished in Spring 2018. Modifications will occur as needed as the semester proceeds.
Date  Lecture topics  Lab/Homework 
Tuesday Jan. 21 
Introduction: The jargon of phylogenetics (edges, vertices, leaves, degree, split, polytomy, taxon, clade); types of genealogies; rooted vs. unrooted trees; Newick descriptions; monophyletic, paraphyletic, and polyphyletic groups [slides (1/page)] [slides (4/page)]
Homework 1: Trees From Splits (due in lecture Tuesday Jan 28) 
Thursday Jan. 23 
Optimality criteria, search strategies, consensus trees: Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms; consensus trees [slides (1/page)] [slides (4/page)]
Friday, Jan. 24 Lab: Using the Xanadu cluster; Introduction to PAUP*; NEXUS format 
Tuesday Jan. 28 
The parsimony criterion: Strict, semistrict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony [slides (1/page)] [slides (4/page)]
Homework 2: Parsimony 
Thursday Jan. 30 
Distance methods: Least squares criterion, minimum evolution criterion, neighbor-joining [slides (1/page)] [slides (4/page)]
Friday, Jan. 31 Lab: Searching 
Tuesday Feb. 4 
Substitution models: Instantaneous rates, expected number of substitutions, equilibrium frequencies, JC69 model. [slides (1/page)] [slides (4/page)]
Homework 3: Least squares distances (working through the Python Primer first will make this homework much easier) 
Thursday Feb. 6 
Maximum likelihood criterion: JC distance formula; common substitution models: K2P, F81, F84, HKY85, and GTR; likelihood: the probability of data given a model, likelihood of a “tree” with just one vertex and no edges, why likelihoods are always on the log scale, likelihood ratio tests. (Transition Probability Applet) [slides (1/page)] [slides (4/page)]
Friday, Feb. 7 Lab: Likelihood 
Tuesday Feb. 11 
Maximum likelihood (cont.): Likelihood of a tree with 2 vertices connected by one edge, transition probabilities, maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree. [slides (1/page)] [slides (4/page)]
Homework 4: Site likelihoods 
Thursday Feb. 13 
Bootstrapping, Rate heterogeneity: Nonparametric bootstrapping, ultrafast bootstrapping [slides (1/page)] [slides (4/page)] Rate heterogeneity, I model, G model, site-specific rates, mixture models. [slides (1/page)] [slides (4/page)]
Friday, Feb. 14 Lab: IQ-TREE tutorial
Tuesday Feb. 18 
Simulation: How to simulate nucleotide sequence data, and why it’s done [slides (1/page)] [slides (4/page)]
Homework 5: Rate heterogeneity (Python program to modify)
Thursday Feb. 20 
Long branch attraction: Statistical consistency, long branch attraction. Topology tests: KH, SH, and AU tests. [slides (1/page)] [slides (4/page)]
Friday, Feb. 21 Lab: Simulating sequence data using PAUP* 
Tuesday Feb. 25 
Codon and secondary structure models: Nonsynonymous vs. synonymous rates, codon models; RNA stem/loop structure, compensatory substitutions, stem models. Amino acid models: Empirical amino acid rate matrices (PAM, JTT, WAG, LG, etc.). [slides (1/page)] [slides (4/page)]
Homework 6: Simulation 
Thursday Feb. 27 
Calculating expected substitutions/site; using eigenvectors and eigenvalues to turn rate matrices into transition probability matrices.
Bayes’ Rule 
Friday, Feb. 28 Lab: Using HyPhy to test hypotheses 
Tuesday Mar. 3 
Bayesian statistics: Probability vs. probability density. (archery priors applet) Markov chain Monte Carlo (MCMC): MCMC “robot” metaphor, Metropolis-Hastings algorithm, mixing, burn-in, and trace plots. (MCMC Robot applet) [slides (1/page)] [slides (4/page)]
Homework 7: MCMC 
Thursday Mar. 5 
MCMCMC, topology proposals
Metropolis-coupled MCMC (i.e. “heated chains”), algorithms (a.k.a. updaters, moves, operators, proposals) for updating parameters and trees during MCMC.
Friday, Mar. 6 Lab: Using R to explore probability distributions and plot trees 
Tuesday Mar. 10 
Prior distributions used in phylogenetics
Discrete Uniform (topology), Gamma (kappa, omega), Beta (pinvar), Dirichlet (base frequencies, GTR exchangeabilities); Tree length prior; induced split prior. 
No homework assigned this week 
Thursday Mar. 12 
Quiz 1
20-point quiz on first half of course. Expect some general brief-essay questions about what we’ve covered, and you will be given some choice in which topics to answer.
Friday, Mar. 13 Lab: MrBayes
Tuesday Mar. 17 
SPRING BREAK  

Thursday Mar. 19 

Tuesday Mar. 24 
Bayesian phylogenetics (continued)
Dirichlet process priors, credible vs. confidence intervals. [CI applet] [Stick-breaking applet] See HuskyCT for video mini-lectures.
Homework: no homework this week 
Thursday Mar. 26 
Bayes factors and Bayesian model selection
Bayes factors, stepping-stone estimation of marginal likelihood. See HuskyCT for video mini-lectures and links to the slides.
Friday, Mar. 27 Lab: Morphology and partitioning in MrBayes 
Tuesday Mar. 31 
Discrete morphological models
Dirichlet process prior models revisited; introduction to discrete morphological models; Mk model; conditioning on variability. 
Homework 8: read and summarize the main points in the Maddison & FitzJohn (2015) paper (please see the assignment linked here before starting)
Thursday Apr. 2 
Testing for evolutionary dependence
BIC; Pagel’s (1994) test for correlated evolution among discrete traits; reversible-jump MCMC. [lecture videos and slide pdfs are posted in HuskyCT]
Friday, Apr. 3 Lab: BayesTraits 
Tuesday Apr. 7 
Stochastic character mapping
An alternative to Pagel’s (1994) test for assessing whether correlation among characters goes beyond what is expected from inheritance alone. [one 22 min lecture video and PDFs of slides have been posted in the HuskyCT course]

Homework 9: Summarize what you plan to do for your course project. What data will you use? Why are these data of interest? What is at least one question you plan to address using these data? 
Thursday Apr. 9 
Evolutionary Correlation: Continuous Traits
Independent Contrasts and Phylogenetic Generalized Least Squares (PGLS). [a 13.5 min lecture on PIC, a 21 min lecture on PGLS, and PDFs of slides have been posted in the HuskyCT course]
Friday, Apr. 10 Lab: Using sMap to perform stochastic mapping analyses 
Tuesday Apr. 14 
PGLS (cont.): Estimating ancestral states in PGLS. Ornstein-Uhlenbeck model vs. Brownian motion.
Homework: no homework this week 
Thursday Apr. 16 
Phylogenetic signal in comparative data: Measuring the amount of phylogenetic information in continuous traits (Pagel’s lambda, Blomberg’s K). Introduction to the coalescent: Just enough coalescent theory to understand the multispecies coalescent used to estimate species trees given possibly conflicting gene trees. Two videos (13.5 min and 14 min) are on the HuskyCT course site. There is also an applet that I forgot to mention in the “signal” video.
Friday, Apr. 17 Lab: APE 
Tuesday Apr. 21 
Species Tree Estimation (cont.)
Deep coalescence, incomplete lineage sorting, gene tree discordance due to ILS, estimating species trees using the multispecies coalescent. The SVDQuartets and ASTRAL species tree methods. 
Homework 10: TBA 
Thursday Apr. 23 
Divergence time estimation
Strict vs. relaxed clocks, correlated vs. uncorrelated relaxed clocks, calibrating the clock using fossils. 
Friday, Apr. 24 Lab: Divergence time estimation with RevBayes 
Tuesday Apr. 28 
Diversification rate evolution
State-dependent diversification models (BiSSE and its descendants); BAMM: estimating the number of shifts in diversification regime and where these occur on the tree.

No homework (work on projects) 
Thursday Apr. 30 
Quiz 2
Open book, due 8pm Wednesday of finals week (i.e. the end time of our scheduled final exam). Optional lecture: Estimating phylogenetic information, a talk I gave at the American Museum of Natural History in January. (Quiz 2 and the lecture are available in HuskyCT.)
Friday, May 1: Project presentations: please time your presentation to last no longer than 15 minutes, leaving 5 minutes for questions/comments (20 min × 6 = 2 hours total).
Finals week  Quiz 2 due Wednesday by 8pm 
Books on phylogenetics
This is a list of books that you should know about, but none are required texts for this course. Listed in reverse chronological order.
Harmon, L. 2019. Phylogenetic comparative methods. (Version 1.4, released 15 March 2019). Published online by the author.
Yang, Z. 2014. Molecular evolution: a statistical approach. Oxford University Press.
Garamszegi, L. Z. 2014. Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer-Verlag, Berlin. (Well-written chapters by current leaders in phylogenetic comparative methods.)
Baum, D. A., and S. D. Smith. 2013. Tree thinking: an introduction to phylogenetic biology. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)
Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)
Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)
Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)
Page, R., and Holmes, E. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)
Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapters 11 (“Phylogenetic inference”) and 12 (“Applications of molecular systematics”). (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)