Getting started
The research crew wants to determine whether there is a relationship between isotope levels and temperature. We have temperature data for each of the islands, which has been helpfully uploaded to GitHub.
- https://raw.githubusercontent.com/tidy-MN/R-camp-penguins/main/public/data/Biscoe_temperatures.csv
- https://raw.githubusercontent.com/tidy-MN/R-camp-penguins/main/public/data/Dream_temperatures.csv
- https://raw.githubusercontent.com/tidy-MN/R-camp-penguins/main/public/data/Torgersen_temperatures.csv
Let’s get all of this data into R so we can combine it with our penguin data. Hint: use read_csv.
library(tidyverse)  # provides read_csv(), mutate(), and %>%

# Read each island's temperature file straight from GitHub
biscoe <- read_csv("https://raw.githubusercontent.com/tidy-MN/R-camp-penguins/main/public/data/Biscoe_temperatures.csv")
dream <- read_csv("https://raw.githubusercontent.com/tidy-MN/R-camp-penguins/main/public/data/Dream_temperatures.csv")
torg <- read_csv("https://raw.githubusercontent.com/tidy-MN/R-camp-penguins/main/public/data/Torgersen_temperatures.csv")
Clean-up time
The island names are in the file names, but not in the data itself. Let’s add an island column to use for joining later. Remember that R is picky about names matching, so it’s important to make sure the names are exactly correct, including capitalization. The island names are Biscoe, Dream, and Torgersen. Copy/paste is your friend here.
biscoe <- biscoe %>%
  mutate(island = "Biscoe")
dream <- dream %>%
  mutate(island = "Dream")
torg <- torg %>%
  mutate(island = "Torgersen")
Do you notice anything about the dates? They are all in different formats! Luckily, we have lubridate to come to the rescue. Let’s convert those dates.
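Here is a minimal sketch of the conversion. The real column names and date formats are not shown in this section, so it uses small made-up data frames. The date column name and the three formats (month/day/year, day-month-year, year-month-day) are assumptions. Inspect the actual files first, for example with glimpse(), and pick the lubridate parser whose letters match the order of the date parts.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-ins for the real temperature data frames;
# the column names and date formats here are assumptions.
biscoe_demo <- tibble(date = c("11/11/2007", "11/12/2007"), temperature = c(-1.2, -0.8))
dream_demo  <- tibble(date = c("11-11-2007", "12-11-2007"), temperature = c(-0.5, -0.3))
torg_demo   <- tibble(date = c("2007-11-11", "2007-11-12"), temperature = c(-2.0, -1.7))

# Each parser is named for the order of the parts: m = month, d = day, y = year
biscoe_demo <- biscoe_demo %>% mutate(date = mdy(date))
dream_demo  <- dream_demo %>% mutate(date = dmy(date))
torg_demo   <- torg_demo %>% mutate(date = ymd(date))

# All three date columns are now real Date values in the same format
class(biscoe_demo$date)
```

Once every date column is a proper Date, the values can be compared and joined directly, no matter how they were originally written.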
Putting it all together
To join the temperature data to our penguin data easily, we want to combine all 3 data frames into one data frame. Keeping in mind tidy data, what is the best way to do this?
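One possible approach is to stack the three data frames on top of each other with bind_rows from dplyr. That keeps one row per observation and lets the island column record where each row came from. The sketch below uses made-up values, and the date and temperature column names are assumptions rather than the real ones.

```r
library(dplyr)

# Hypothetical one-row versions of the cleaned island data frames
biscoe_demo <- tibble(date = as.Date("2007-11-11"), temperature = -1.2, island = "Biscoe")
dream_demo  <- tibble(date = as.Date("2007-11-11"), temperature = -0.5, island = "Dream")
torg_demo   <- tibble(date = as.Date("2007-11-11"), temperature = -2.0, island = "Torgersen")

# Stack the rows; columns are matched by name
temps_demo <- bind_rows(biscoe_demo, dream_demo, torg_demo)

temps_demo
```

Stacking rows only works cleanly when the three data frames use the same column names and types, which is one reason to convert the dates before this step.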
Now that we have a tidy penguin data frame and a tidy temperature data frame, it’s time to combine them into one mega data frame. Both data frames have an island name and a date. What operation do we want to use? Which type is most appropriate here? The resulting data frame should have the same number of rows as the penguin data frame.
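One option that fits the row-count requirement is a left join with the penguin data on the left, since it keeps every penguin row and adds temperature where a match exists. The sketch below uses made-up data frames; the column names date, delta_15_n, and temperature are assumptions, and the real names may differ.

```r
library(dplyr)

# Hypothetical penguin observations (column names are assumptions)
penguins_demo <- tibble(
  island     = c("Biscoe", "Dream", "Torgersen", "Biscoe"),
  date       = as.Date(c("2007-11-11", "2007-11-11", "2007-11-12", "2007-11-13")),
  delta_15_n = c(8.9, 9.1, 8.7, 8.8)
)

# Hypothetical combined temperature data, one row per island and date
temps_demo <- tibble(
  island      = c("Biscoe", "Dream", "Torgersen"),
  date        = as.Date(c("2007-11-11", "2007-11-11", "2007-11-12")),
  temperature = c(-1.2, -0.5, -2.0)
)

# Keep every penguin row; penguins without a matching reading get NA
penguins_temps <- left_join(penguins_demo, temps_demo, by = c("island", "date"))

# Check that the join did not add or drop penguin rows
nrow(penguins_temps) == nrow(penguins_demo)
```

The row count stays the same only if each island and date pair appears at most once in the temperature data. Duplicate pairs would duplicate penguin rows, so this check is worth running on the real data.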