Instructors: Tadashi Fukami and Jes Coyle


Course Goals:

Students will be able to:

  • Design statistically sound data collection strategies to answer a given research question.
  • Choose among modern statistical tools and analyze data using R.
  • Present results effectively using R for peer-reviewed papers.
  • Advise colleagues about statistical analyses.

Class Format:

During the first few weeks, each student will briefly introduce their research question and data, and we will go over the basics of using R and of interacting with the course content on this website through R markdown files and GitHub. From then on, each class will cover one selected topic. First, the instructor will briefly summarize the assigned reading (~15 min), going over key concepts and analyses, the relevant R packages, how to use them, and how to choose among them, with time for questions and clarification. Then a student will present an example of the topic (~15 min). This could be (1) walking through an example presented in the text, including the procedure and the details of the code that they think are important; or (2) applying the techniques to the student’s own data, walking others through the analysis, and presenting the results and any challenges faced. Each class concludes with a ~20-min discussion led by the student, who will come prepared with questions to facilitate discussion and will share novel, important, or interesting functions and pieces of R code.

Each student will lead 1 or 2 topics during the quarter. We will also have mid-term and final presentations and discussions of student projects. Students will prepare their lesson content ahead of time by modifying the R markdown webpage on this site.

Final product:

Each student will complete either a results section to include in a paper to be submitted for publication or a data collection plan for a research proposal.

Tentative Schedule

| Week | Day | Date | Topic & Reading | Summary | Discussion |
|---|---|---|---|---|---|
| 1 | Mon | Sept. 25 | Intro to R & RStudio | Tad & Jes | ———— |
| | Wed | Sept. 27 | Topic selection, Intro to R markdown & class website | Tad & Jes | ———— |
| | Fri | Sept. 29 | R workshop (optional) | Jes | ———— |
| 2 | Mon | Oct. 2 | Intro to ggplot2 and tidyr | Jes | ———— |
| | Wed | Oct. 4 | Intro to GitHub & git | Jes | ———— |
| | Fri | Oct. 6 | Each person introduces their questions & data | Tad & Jes | ———— |
| 3 | Mon | Oct. 9 | Topic 1. Data exploration | Tad | |
| | Wed | Oct. 11 | Topic 2. Linear models | Tad | |
| | Fri | Oct. 13 | Topic 3. Dealing with heterogeneity | Tad | |
| 4 | Mon | Oct. 16 | Topic 4. Mixed-effects models with nested data | Tad | |
| | Wed | Oct. 18 | Topic 5. GLM with binary & proportional data | Tad | |
| | Fri | Oct. 20 | Topic 6. GLM with zero-inflated data | Tad | |
| 5 | Mon | Oct. 23 | Topic 7. Bayesian linear models | Jes | |
| | Wed | Oct. 25 | Topic 8. Bayesian inference with prior information | Jes | |
| | Fri | Oct. 27 | Topic 9. Advanced Bayesian model example | Jes | |
| 6 | Mon | Oct. 30 | Topic 10. Unconstrained ordination | Tad | |
| | Wed | Nov. 1 | Topic 11. Constrained ordination | Tad | |
| | Fri | Nov. 3 | Topic 12. Comparing multivariate data | Jes | |
| 7 | Mon | Nov. 6 | Mid-term project presentation & discussion | | |
| | Wed | Nov. 8 | Mid-term project presentation & discussion | | |
| | Fri | Nov. 10 | Mid-term project presentation & discussion | | |
| 8 | Mon | Nov. 13 | Topic 13. Spatial analysis, Zuur et al. approach | Jes | |
| | Wed | Nov. 15 | Topic 14. Spatial analysis, Borcard et al. approach | Jes | |
| | Fri | Nov. 17 | Topic 15. Spatial analysis, Bayesian approach | Jes | |
| RECESS | | | Thanksgiving | | |
| 9 | Mon | Nov. 27 | Topic 16. To be selected by class | | |
| | Wed | Nov. 29 | Topic 17. To be selected by class | | |
| | Fri | Dec. 1 | Topic 18. To be selected by class | | |
| 10 | Mon | Dec. 4 | Final project presentations | | |
| | Wed | Dec. 6 | Final project presentations | | |
| | Fri | Dec. 8 | Final project presentations | | |

Before Class

Before the first class, please read through the computer setup instructions, which walk you through how to set up your computer to run R and RStudio. Even if you already have these programs installed, check that you are running the latest versions of R and RStudio (the instructions explain how to do this).

Our course website runs from a repository on GitHub. You can view this repository here. Notice that the repository has 2 files for each lesson webpage (named 00-lesson-topic): one is an HTML file (.html) that is displayed as a webpage, and the other is an R markdown (.Rmd) file that allows us to write R code and explanatory text in the same file. The HTML file is created by RStudio from the R markdown file. The repository also has a folder called “data”, where we will put the data files used by our example code. Note that this repository can be downloaded by anyone on the internet. If you have data that you wish to use in class but do not wish to share publicly, you are welcome to email the data files to the rest of the class prior to your lesson.

During class you will be editing the course webpages to include R code for the lesson discussion that you lead. If you are familiar with git and have a GitHub account, you should fork the repository to your account so that you can make changes there. If that is gibberish to you, don’t worry! You can either edit the course webpages by downloading them and emailing your edits to Jes (see computer setup instructions), or learn a little git and create your own GitHub account during an optional class session in the first week.

References & Readings

Assigned Readings

All assigned readings can be obtained for free through Stanford library online resources.

  • Borcard, D., Gillet, F. and Legendre, P. (2011), Numerical Ecology with R. Springer. Stanford Full Text
  • Korner-Nievergelt et al. (2015), Bayesian data analysis in ecology using linear models with R, BUGS, and Stan. Elsevier. Stanford Full Text
  • Martin, T. G., Wintle, B. A., Rhodes, J. R., Kuhnert, P. M., Field, S. A., Low-Choy, S. J., Tyre, A. J. and Possingham, H. P. (2005), Zero tolerance ecology: improving ecological inference by modelling the source of zero observations. Ecology Letters, 8: 1235-1246. DOI: 10.1111/j.1461-0248.2005.00826.x
  • Ramette, A. (2007), Multivariate analyses in microbial ecology. FEMS Microbiology Ecology, 62: 142-160. DOI: 10.1111/j.1574-6941.2007.00375.x
  • Wickham, H. (2016), ggplot2: Elegant Graphics for Data Analysis. Springer. Stanford Full Text
  • Zuur, A. F., Ieno, E. N., Walker, N., Saveliev, A. A. and Smith, G. M. (2009), Mixed Effects Models and Extensions in Ecology with R. Springer. Stanford Full Text
  • Zuur, A. F., Ieno, E. N. and Elphick, C. S. (2010), A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1: 3-14. DOI: 10.1111/j.2041-210X.2009.00001.x

Additional Suggested References

  • Beale, C. M., Lennon, J. J., Yearsley, J. M., Brewer, M. J. and Elston, D. A. (2010), Regression analysis of spatial data. Ecology Letters, 13: 246-264. DOI: 10.1111/j.1461-0248.2009.01422.x
  • Legendre, P. and L. Legendre. (2012), Numerical Ecology. Elsevier. Stanford Full Text
  • McElreath, R. (2016), Statistical rethinking: a Bayesian course with examples in R and Stan. CRC Press. Available in the Science and Marine libraries Author’s website
  • Venables, W. N., Smith, D. M. and the R Core Team. (2012), An Introduction to R: Notes on R: A Programming Environment for Data Analysis and Graphics. Version 2.15.1. full text
  • Wagner, H. H. and Fortin, M.-J. (2005), Spatial analysis of landscapes: concepts and statistics. Ecology, 86: 1975–1987. DOI: 10.1890/04-0914
  • Zuur, A. F., J. M. Hilbe and E. N. Ieno. (2013), A beginner’s guide to GLM and GLMM with R: a frequentist and Bayesian perspective for ecologists. Highland Statistics Ltd. Stanford Library Record