Assigned Reading:
Chapters 1.1, 1.2, 2, 3, 9 and 10 in: Wickham, H. 2009. ggplot2: Elegant Graphics for Data Analysis. Springer. DOI: Stanford Full Text
The sections on mutating and filtering joins in this vignette
Optional Reading:
Chapter 4 in: Wickham, H. 2009. ggplot2: Elegant Graphics for Data Analysis. Springer. DOI: Stanford Full Text
RStudio has produced several helpful cheatsheets on data management and visualization with ggplot2 and tidyr. You may want to view or print one or several of them.
Download this R script and save it in your code folder in the research project you created on the first day of class. You will need to click the ‘Raw’ button and then download the text file that appears. Be sure to remove the .txt extension that may have been added when you saved the file to your computer so that the file ends in .R. You may want to run the first part of the script that downloads the data to be sure that there will be no issues with data access during class.
The tidyr and dplyr packages contain functions for manipulating data:
spread and gather change the format of data tables.
separate and unite change the format of variables.
filter creates a subset of a data table.
mutate creates new variables.
group_by and summarise are used to summarize observations according to the values of their variables.
%>% is used to link several operations together (i.e. function composition).
The ggplot2 package is a set of functions for visualizing data:
aes must minimally describe which variables define the plot coordinate space.
facet_wrap creates plots for specified data subsets.
Open the R project that you created on the first day of class. When RStudio opens, open the R script file that you downloaded for today’s class (see above). This script downloads a data set from our course website and then manipulates and summarises the data. The data comprise three tables from the Winter 2016 BIO46 class on lichen microorganisms.
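The data-manipulation functions listed above can be tried on a small made-up table before working with the class data. The sketch below is only an illustration: the table `samples` and its columns (`SampleID`, `Genotype`, `Reads`) are hypothetical and are not the course tables, and the `.groups` argument assumes dplyr 1.0 or later. spread and gather still work in current tidyr, although pivot_wider and pivot_longer are now the recommended replacements.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy table (NOT the course data): one row per sequenced sample.
samples <- data.frame(
  SampleID = c("L1_a", "L1_b", "L2_a", "L2_b", "L2_c"),
  Genotype = c("G1", "G2", "G1", "G1", "G2"),
  Reads    = c(120, 80, 200, 50, 0),
  stringsAsFactors = FALSE
)

per_lichen <- samples %>%
  # separate() splits one variable into two
  separate(SampleID, into = c("Lichen", "Replicate"), sep = "_") %>%
  # mutate() creates a new variable flagging samples that produced reads
  mutate(Success = Reads > 0) %>%
  # filter() keeps only the rows where the condition is TRUE
  filter(Success) %>%
  # group_by() + summarise() collapse the rows to one per lichen and genotype
  group_by(Lichen, Genotype) %>%
  summarise(TotalReads = sum(Reads), .groups = "drop")

# spread() makes the table wide: one column per genotype, missing cells set to 0
wide <- per_lichen %>%
  spread(key = Genotype, value = TotalReads, fill = 0)

# gather() reverses that step and returns a long table
long <- wide %>%
  gather(key = "Genotype", value = "TotalReads", -Lichen)

# unite() pastes two variables together into one
labelled <- long %>%
  unite("Label", Lichen, Genotype, sep = ":")
```

Printing `per_lichen`, `wide`, `long` and `labelled` one after another shows how each step changes the shape of the table.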
The R script explores algal diversity among lichens and trees. Working in pairs, execute each line of code and then add a comment (using #) above each line of code with a description of what the code does.
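As a rough illustration of what line-by-line commenting can look like, here is a short ggplot2 snippet with a comment above each line. The data frame `counts` and its columns are invented for this example and do not come from the class script; the snippet also shows aes defining the plot coordinates and facet_wrap making one panel per subset.

```r
library(ggplot2)

# Hypothetical toy data (NOT the course tables): read counts per genotype on two trees
counts <- data.frame(
  Tree     = c("T1", "T1", "T2", "T2"),
  Genotype = c("G1", "G2", "G1", "G2"),
  Reads    = c(120, 80, 250, 30)
)

# Start a plot from the counts table; aes() maps Genotype to x and Reads to y
p <- ggplot(counts, aes(x = Genotype, y = Reads)) +
  # Draw one bar per row, with the bar height taken directly from Reads
  geom_col() +
  # Split the plot into one panel per tree
  facet_wrap(~ Tree)

# Display the plot
print(p)
```

Comments placed between the lines of a `+` chain are ignored by R, so each layer can carry its own description.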
When you have finished commenting the code, try to complete the following challenge:
Challenge:
How could you modify the code that creates the lichenXgeno table so that it displays the fraction of successful sequences belonging to each GenotypeID for each lichen?
Working with your partner, create the following plots using ggplot2: