## Phylogenetics (EEB 5349)

This is a graduate-level course in phylogenetics, emphasizing primarily Maximum Likelihood and Bayesian approaches to estimating phylogenies, which are genealogies at or above the species level. A primary goal is to provide an accessible introduction to the theory, so that by the end of the course students can understand much of the primary literature on modern phylogenetic methods and apply these methods intelligently to their own problems. The laboratory provides hands-on experience with several important phylogenetic software packages (PAUP*, GARLI, RAxML, MRBAYES, BEAST) and introduces students to the computing resources of the UConn Bioinformatics Facility.

### EEB 5349 is being taught Spring Semester 2014:

**Lecture:** Tuesdays and Thursdays, 11-12:15 (instructor: Paul O. Lewis)

**Lab:** Thursdays, 9-11 (instructors: Paul O. Lewis and Geert Goemans)

**Room:** Torrey Life Science (TLS) 181, Storrs Campus

**Text:** none required; registered students will receive printed copies of a textbook I am currently writing (see the list of optional texts below)

**Grade:** based on midterm exam, final exam, homeworks, and project presentation

## Syllabus

This syllabus will be updated periodically through the semester. I will post a PDF version of each lecture after it is given. Homeworks are due one week after the date they are assigned in the syllabus.

| Date | Lecture topics | Lab/Homework |
|---|---|---|
| Tuesday Jan. 21 | **Introduction**: The terminology of phylogenetics, rooted vs. unrooted trees, ultrametric vs. unconstrained, paralogy vs. orthology, lineage sorting, “basal” lineages, crown vs. stem groups | Homework 1: trees from splits (due in lecture Tuesday Jan 28) |
| Thursday Jan. 23 | **Optimality criteria, search strategies**: Exhaustive enumeration, branch-and-bound search, algorithmic methods (star decomposition, stepwise addition, NJ), heuristic search strategies (NNI, SPR, TBR), evolutionary algorithms | Lab: Using the UConn Bioinformatics Facility cluster; Introduction to PAUP\*; NEXUS format |
| Tuesday Jan. 28 | **Consensus trees, the parsimony criterion**: Strict, semi-strict, and majority-rule consensus trees; maximum agreement subtrees; Camin-Sokal, Wagner, Fitch, Dollo, and transversion parsimony; step matrices and generalized parsimony | Homework 2: Parsimony |
| Thursday Jan. 30 | **Bootstrapping, distance methods**: Bootstrapping; distance methods: split decomposition, quartet puzzling, neighbor-joining, least squares criterion, minimum evolution criterion | Lab: Python Primer |
| Tuesday Feb. 4 | **Substitution models**: Transition probability, instantaneous rates, Poisson processes, JC69 model, K2P model, F81 model, F84 model, HKY85 model, GTR model | Homework 3: Distances |
| Thursday Feb. 6 | **Maximum likelihood criterion**: Likelihood: the probability of data given a model, maximum likelihood estimates (MLEs) of model parameters, likelihood of a tree, likelihood ratio test | Lab: Searching |
| Tuesday Feb. 11 | **Rate heterogeneity**: Proportion of invariable sites, discrete gamma, site-specific rates | Homework 4: Likelihood |
| Thursday Feb. 13 | Snow day: lecture canceled | Snow day: lab canceled |
| Tuesday Feb. 18 | **Codon, amino acid, secondary structure models**: Empirical amino acid rate matrices, transition probabilities by exponentiating the rate matrix, RNA stem/loop structure, compensatory substitutions, stem models, nonsynonymous vs. synonymous rates, codon models | Homework 5: Rate heterogeneity |
| Thursday Feb. 20 | **Model selection**: Likelihood ratio test (LRT), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC). **Expected number of substitutions**: An example derivation for the F81 model | Lab: Likelihood |
| Tuesday Feb. 25 | **Simulation**: How to simulate nucleotide sequence data, and why it’s done. **Long branch attraction**: Statistical consistency, long branch attraction | Homework 6: Simulation |
| Thursday Feb. 27 | **Review session**: Bring (or email) questions and I’ll try to help you build the big-picture view of the semester thus far. **Topology tests**: KH test, SH test, SOWH test. **RAxML CAT model**: The site-specific rates alternative to the discrete gamma mixture model, used in RAxML | Lab: ML analyses of large data sets using RAxML and GARLI |
| Tuesday Mar. 4 | **Bayesian statistics**: Conditional/joint probabilities, Bayes rule, prior vs. posterior distributions, probability mass vs. probability density | Homework 7: MCMC |
| Thursday Mar. 6 | **Markov chain Monte Carlo**: Metropolis algorithm, MCMC, mixing, heated chains, Hastings ratio | Lab: Exploring prior distributions using R |
| Tuesday Mar. 11 | **Priors used in Bayesian phylogenetics**: Commonly-used prior distributions: Beta, Gamma, Lognormal, Dirichlet | No homework assigned: study for test |
| Thursday Mar. 13 | **Midterm Exam** (on material up to and including Mar. 6) | Lab: MrBayes 3.2 |
| Tuesday Mar. 18 | SPRING BREAK | |
| Thursday Mar. 20 | SPRING BREAK | |
| Tuesday Mar. 25 | **Prior miscellany**: Hierarchical models and hyperpriors, Empirical Bayes, Dirichlet process priors, MCMC without data. **Confidence vs. credible intervals**: Frequentist confidence intervals differ from Bayesian credible intervals | Homework 8: LOCAL move |
| Thursday Mar. 27 | **Bayesian model selection**: Marginal likelihoods and Bayes factors. **Discrete morphological characters**: DNA sequences vs. morphological characters, symmetric vs. asymmetric 2-state models, Mk model | Lab: Morphology, partitioning and model selection in MRBAYES |
| Tuesday Apr. 1 | **Correlated discrete character evolution**: Pagel’s likelihood ratio test. **Correlated continuous character evolution**: Felsenstein’s independent contrasts | Homework 9: Independent contrasts |
| Thursday Apr. 3 | **Ancestral state estimation**: Likelihood, (empirical) Bayesian, and parsimony reconstruction of ancestral states. **Stochastic character mapping**: Introduction to stochastic character mapping | Lab: Compositional heterogeneity |
| Tuesday Apr. 8 | **Stochastic character mapping (continued)**: SIMMAP demo: using stochastic mapping for estimating ancestral states and character correlation. **Mixture models**: Mixture of Rate Matrices, rjMCMC, heterotachy models | Homework 10: Character Mapping |
| Thursday Apr. 10 | **Mixture models (cont.), BIC, plus polytomies and Bayesian analyses**: How is the Bayesian Information Criterion Bayesian? The Bayesian Star Tree Paradox and an rjMCMC solution | Lab: BayesTraits |
| Tuesday Apr. 15 | **Divergence time estimation**: Thorne/Kishino autocorrelated log-normal model; BEAST uncorrelated log-normal model; coalescent, exponential growth coalescent, and Yule tree priors | No homework: work on presentation |
| Thursday Apr. 17 | **Divergence time estimation (cont.)**: Dirichlet process and birth-death methods, random local clocks approach, nonparametric rate smoothing/penalized likelihood | Lab: BEAST |
| Tuesday Apr. 22 | **Species tree estimation**: Mostly \*BEAST, but will mention BEST, STEM, and BUCKy | No homework: work on presentation |
| Thursday Apr. 24 | **Miscellaneous final topics**: A few topics we did not yet get a chance to talk about: BUCKy, IC, CPO, KL divergence | Lab: FDPPDiv and Seq-Gen |
| Tuesday Apr. 29 | Student presentations | No homework: work on presentation |
| Thursday May 1 | Student presentations | Lab: student presentations |
| Tuesday May 6 | **Final Exam** (10:30-12:30, TLS 181) | |

## Books on phylogenetics

This is a list of books that you should know about, but none are required texts for this course. They are listed in reverse chronological order.

Baum, D. A., and S. D. Smith. 2013. **Tree thinking: an introduction to phylogenetic biology**. Roberts and Company Publishers, Greenwood Village, Colorado. (This book is probably the most useful companion volume for this course, introducing the methods in a very accessible way but also providing lots of practice interpreting phylogenies correctly.)

Hall, B. G. 2011. **Phylogenetic trees made easy: a how-to manual** (4th edition). Sinauer Associates, Sunderland. (A guide to running some of the most important phylogenetic software packages.)

Lemey, P., Salemi, M., and Vandamme, A.-M. 2009. **The phylogenetic handbook: a practical approach to phylogenetic analysis and hypothesis testing** (2nd edition). Cambridge University Press, Cambridge, UK. (Chapters on theory are paired with practical chapters on software related to the theory.)

Felsenstein, J. 2004. **Inferring phylogenies**. Sinauer Associates, Sunderland. (Comprehensive overview of both history and methods of phylogenetics.)

Page, R., and Holmes, E. 1998. **Molecular evolution: a phylogenetic approach**. Blackwell Science. (Very nice and accessible pre-Bayesian-era introduction to the field.)

Hillis, D., Moritz, C., and Mable, B. 1996. Molecular systematics (2nd ed.). Sinauer Associates, Sunderland. Chapters 11 (“**Phylogenetic inference**”) and 12 (“**Applications of molecular systematics**”). (Still a very valuable compendium of pre-Bayesian-era phylogenetic methods.)