Assigned Reading:
Chapters 1.1, 1.2, 2, 3, 9 and 10 in: Wickham, H. 2009. ggplot2: Elegant Graphics for Data Analysis. Springer. DOI: Stanford Full Text
The sections on mutating and filtering joins in this vignette
Optional Reading:
Chapter 4 in: Wickham, H. 2009. ggplot2: Elegant Graphics for Data Analysis. Springer. DOI: Stanford Full Text
RStudio has produced several helpful cheatsheets on data management and visualization with ggplot2 and tidyr. You may want to view or print one or several of them.
Download this R script and save it in your code folder in the research project you created on the first day of class. You will need to click the ‘Raw’ button and then download the text file that appears. Be sure to remove the .txt extension that may have been added when you saved the file to your computer so that the file ends in .R. You may want to run the first part of the script that downloads the data to be sure that there will be no issues with data access during class.
The tidyr package contains functions for reshaping data:
- spread and gather change the format of data tables.
- separate and unite change the format of variables.
The dplyr package contains functions for manipulating data:
- filter creates a subset of a data table.
- mutate creates new variables.
- group_by and summarise are used to summarize observations according to the values of their variables.
- %>% is used to link several operations together (i.e., function composition).
The ggplot2 package is a set of functions for visualizing data:
- aes must minimally describe which variables define the plot coordinate space.
- facet_wrap creates plots for specified data subsets.
Open the R project that you created on the first day of class. When RStudio opens, open the R script file that you downloaded for today’s class (see above). This script downloads a data set from our course website and then manipulates and summarises the data. These data contain three tables from the Winter 2016 BIO46 class on lichen microorganisms:
The R script explores algal diversity among lichens and trees. Working in pairs, execute each line of code and then add a comment (using #) above each line of code with a description of what the code does.
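Before working through the script, it may help to see the functions listed above applied to a very small table. The following is only a sketch on made-up data: the table toy_wide and its columns (LichenID, TreeID, GenoA, GenoB) are hypothetical and are not the names used in the class data set or in the downloaded script.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Hypothetical toy data (NOT the class data): sequence counts of two
# algal genotypes for four lichens on two trees, in "wide" format.
toy_wide <- data.frame(
  LichenID = c("L1", "L2", "L3", "L4"),
  TreeID   = c("T1", "T1", "T2", "T2"),
  GenoA    = c(3, 0, 2, 1),
  GenoB    = c(1, 4, 0, 1),
  stringsAsFactors = FALSE
)

# gather: wide -> long, one row per lichen-genotype combination
toy_long <- toy_wide %>%
  gather(key = "Genotype", value = "Count", GenoA, GenoB)

# spread: long -> wide again (the inverse of gather)
toy_back <- toy_long %>%
  spread(key = Genotype, value = Count)

# unite pastes two variables into one; separate splits them apart again
toy_united <- toy_long %>%
  unite("Sample", TreeID, LichenID, sep = "_")
toy_separated <- toy_united %>%
  separate(Sample, into = c("TreeID", "LichenID"), sep = "_")

# filter, group_by, summarise and mutate linked together with %>%
tree_summary <- toy_long %>%
  filter(Count > 0) %>%                      # keep genotypes that were detected
  group_by(TreeID, Genotype) %>%             # one group per tree and genotype
  summarise(n_lichens = n(),                 # lichens carrying the genotype
            total = sum(Count)) %>%          # total sequences in the group
  ungroup() %>%
  mutate(mean_per_lichen = total / n_lichens)

# aes maps variables to the x and y coordinates;
# facet_wrap draws one panel per genotype
p <- ggplot(toy_long, aes(x = LichenID, y = Count)) +
  geom_col() +
  facet_wrap(~ Genotype)
print(p)
```

Running the lines one at a time and inspecting each intermediate table (for example with head(toy_long)) is a good way to practice describing what each step does before commenting the class script.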
When you have finished commenting the code, try to complete the following challenge:
Challenge:
How could you modify the code that creates the lichenXgeno table so that it displays the fraction of successful sequences belonging to each GenotypeID for each lichen?
Working with your partner, create the following plots using ggplot2: