# 12 Answer Key

## 12.1 Chapter 4 - Object Types in R Programming

1. This object was an array.
2. This object was a vector.
3. This would output a vector.
4. This would output a data frame.
5. To output a factor, you would run the following code:

       data(mtcars)
       factor(mtcars$gear)
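
Answers 1 through 4 name object types; one way to confirm a type at the console is `class()`. The sketch below uses hypothetical stand-in objects (`v`, `a`, `df`, and `f` are not the objects from the chapter's exercises), so treat it as an illustration of the check rather than a replay of the questions.

```r
# Hypothetical objects, one for each type named in the answers above
v  <- c(1, 2, 3)                                  # vector
a  <- array(1:24, dim = c(2, 3, 4))               # three-dimensional array
df <- data.frame(x = 1:3, y = c("a", "b", "c"))   # data frame
data(mtcars)
f  <- factor(mtcars$gear)                         # factor, as in answer 5

class(v)    # "numeric"
class(a)    # "array"
class(df)   # "data.frame"
class(f)    # "factor"
levels(f)   # "3" "4" "5"
```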

## 12.2 Chapter 5 - How to Filter and Transform Data in Base R

1. Filter the following vector to values greater than 2.

       q1 <- seq(1,20,2)
       q1[q1 > 2]

1. Filter the following vector to values between 20 and 30, but only for the first three entries that meet those criteria. (Hint: add [n:n] for the range of values after you determine which values meet those criteria.)

       q2 <- round(rnorm(20,32,7),0)
       q2 >= 20 & q2 <= 30
       q2[q2 >= 20 & q2 <= 30][1:3]

1. Multiply the following matrices together.

       q3_1 <- matrix(round(seq(1,40,3.27),0),3)
       q3_2 <- matrix(seq(1,8,1),4)
       q3_1 %*% q3_2

1. Subtract 41 from every entry in the second column of the following matrix. Replace the column with those new values.

       q4 <- matrix(seq(1,120,4),10,3)
       q4
       q4[,2] <- q4[,2] - 41
       q4

1. Select the second row of each matrix in the following array. Subtract 5 from those rows.

       q5 <- array(data=c(matrix(seq(1,15,1),5,3),
                          matrix(seq(4,60,4),5,3),
                          matrix(seq(2,30,2),5,3)),
                   dim=c(5,3,3))
       q5
       q5[2,,]-5

1. Filter the following data frame to Bond films starring Roger Moore.

       bond[bond["actor"]=="Roger Moore",]

1. Filter the following data frame to Bond films starring Sean Connery made after 1966.

       bond[bond["actor"]=="Sean Connery" & bond["year"] > 1966,]
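
The last two answers assume a `bond` data frame that is not defined in this key. To try the filters on their own, here is a minimal sketch with a small hypothetical stand-in: the `actor` and `year` columns come from the answers, while the `film` column and the six rows chosen are assumptions made for illustration. It uses `bond$actor`, which returns a plain vector, in place of `bond["actor"]`.

```r
# Hypothetical stand-in for the chapter's bond data frame (six films only).
bond <- data.frame(
  film  = c("Dr. No", "Goldfinger", "You Only Live Twice",
            "Diamonds Are Forever", "Live and Let Die", "Moonraker"),
  year  = c(1962, 1964, 1967, 1971, 1973, 1979),
  actor = c(rep("Sean Connery", 4), rep("Roger Moore", 2))
)

# Films starring Roger Moore
moore <- bond[bond$actor == "Roger Moore", ]

# Films starring Sean Connery made after 1966
connery_late <- bond[bond$actor == "Sean Connery" & bond$year > 1966, ]

moore
connery_late
```

The trailing comma inside the brackets matters: the condition selects rows, and leaving the column position empty keeps every column.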

## 12.3 Chapter 6 - How to Filter and Transform Data with the Dplyr Package

1. You would use the %>% notation, the filter() function, and the operators |, ==, and > to accomplish this.

       data(mtcars)
       library(dplyr)
       mtcars %>% filter(gear==4 | hp > 115)

1. In addition to the same script as above, you would use the select() function to reduce the columns.

       data(mtcars)
       library(dplyr)
       mtcars %>%
         filter(gear==4 | hp > 115) %>%
         select(mpg,cyl,gear,hp)

1. Instead of using select() in the previous script, you would use transmute(). This function lets you transform columns and keeps only the columns you name.

       data(mtcars)
       library(dplyr)
       mtcars %>%
         filter(gear==4 | hp > 115) %>%
         transmute(mpg_log=log(mpg),cyl,gear,hp)

1. You would use the filter(), group_by(), and summarize() functions to pull this summary data.

       data(mtcars)
       library(dplyr)
       mtcars %>%
         filter(wt > 2) %>%
         group_by(gear) %>%
         summarize(avg_mpg=mean(mpg))
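
One way to check the grouped summary in the last answer is to compute the same figures without dplyr. The sketch below is a base R cross-check, not the chapter's method; the names `heavy` and `avg_by_gear` are introduced here for illustration.

```r
data(mtcars)

# Same filter as filter(wt > 2)
heavy <- subset(mtcars, wt > 2)

# Same grouping and summary as group_by(gear) %>% summarize(avg_mpg=mean(mpg))
avg_by_gear <- aggregate(mpg ~ gear, data = heavy, FUN = mean)
names(avg_by_gear)[2] <- "avg_mpg"

avg_by_gear
```

The result should have one row per gear value, matching the rows returned by the dplyr pipeline.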

## 12.4 Chapter 7 - Understanding and Using R Packages

1. To install the tidyverse set of packages, run the script install.packages("tidyverse").
2. To load the dplyr package, run the script library(dplyr).
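
Running install.packages() every session re-downloads a package that is already present. A common pattern, sketched here with a hypothetical helper named `install_if_missing` (not a function from the chapter or from any package), is to install only when requireNamespace() reports the package as unavailable.

```r
# Hypothetical helper: install a package only if it is not already available,
# then report whether its namespace can be loaded.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so nothing is downloaded in this call.
install_if_missing("stats")
```

Calling `install_if_missing("dplyr")` followed by `library(dplyr)` would combine both answers above, at the cost of a download the first time it runs.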

## 12.5 Chapter 8 - How to Write Functions

1. Modify the simple standard deviation function we wrote and change it to calculate the mean. Do this without using the built-in mean function.

       avg.simple <- function(data,field) {
         field <- data[,paste(field)]
         sum(field)/length(field)
       }

1. Alter the summary.group function to include median, minimum, and maximum values.

       summary.group <- function(data,group,field) {
         groups <- levels(factor(data[,paste(group)]))
         output <- data.frame(group=character(),
                              mean=numeric(),
                              sd=numeric(),
                              median=numeric(),
                              minimum=numeric(),
                              maximum=numeric())
         for(i in 1:length(groups)) {
           subdata <- data[data[,paste(group)]==groups[i],
                           paste(field)]
           output[i,1:6] <- data.frame(groups[i],
                                       mean(subdata),
                                       sd(subdata),
                                       median(subdata),
                                       min(subdata),
                                       max(subdata))
         }
         output
       }

1. Write a function for the Fibonacci Sequence, which ends at a number you choose. You’ll need to use a control flow to accomplish this and a default value for the end of the sequence. (Hint: You won’t use the for(var in seq) expr control flow. Execute ?Control to use a different version.)

       fib <- function(end=55){
         x <- c(0,1)
         n <- length(x)
         while(x[n]<end){
           x[n+1] <- x[n]+x[n-1]
           n <- length(x)
         }
         x
       }
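
A quick way to test the functions above is to call them on known inputs. The sketch below copies `avg.simple` and `fib` from the answers so that it runs on its own; using `mtcars` and the `mpg` column as test input is an assumption made for illustration.

```r
# Copied from the answers above so this block is self-contained.
avg.simple <- function(data, field) {
  field <- data[, paste(field)]
  sum(field) / length(field)
}

fib <- function(end = 55) {
  x <- c(0, 1)
  n <- length(x)
  while (x[n] < end) {
    x[n + 1] <- x[n] + x[n - 1]
    n <- length(x)
  }
  x
}

data(mtcars)

# The hand-written mean should agree with the built-in mean().
all.equal(avg.simple(mtcars, "mpg"), mean(mtcars$mpg))  # TRUE

fib()     # 0 1 1 2 3 5 8 13 21 34 55
fib(50)   # also ends at 55: the loop stops at the first term >= end
```

Note that when `end` is not itself a Fibonacci number, the sequence runs one term past it, because the while condition is checked only after the next term has been appended.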

## 12.6 Chapter 10 - How to Plot Data in R

1. Use the ggplot(), aes(), and geom_point() functions to construct a plot.

       library(ggplot2)
       data(mtcars)
       ggplot(data=mtcars,
              mapping=aes(x=hp,y=mpg)) +
         geom_point(size=3)

1. Map factor(cyl) to the color argument in the aes() function.

       library(ggplot2)
       data(mtcars)
       ggplot(data=mtcars,
              mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
         geom_point(size=3)

1. Use the x, y, and color arguments in the labs() function to give the axes and legend more intuitive names.

       library(ggplot2)
       data(mtcars)
       ggplot(data=mtcars,
              mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
         geom_point(size=3) +
         labs(x="Horsepower",
              y="Miles Per Gallon",
              color="Cylinders")

1. Use the title argument in the labs() function.

       library(ggplot2)
       data(mtcars)
       ggplot(data=mtcars,
              mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
         geom_point(size=3) +
         labs(x="Horsepower",
              y="Miles Per Gallon",
              color="Cylinders",
              title="Car Performance")

1. Use the theme_few() theme from the ggthemes package.

       library(ggplot2)
       library(ggthemes)
       data(mtcars)
       ggplot(data=mtcars,
              mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
         geom_point(size=3) +
         labs(x="Horsepower",
              y="Miles Per Gallon",
              color="Cylinders",
              title="Car Performance") +
         theme_few()

## 12.7 Chapter 11 - Statistical Functions in R

1. After loading your data with data(), use the lm() function to build a model.

       data(iris)
       PracticeModel <- lm(Petal.Length~Sepal.Length+Sepal.Width,data=iris)

1. Use the summary() function on your model to assess model performance, including the p-values of the coefficients.

       summary(PracticeModel)

1. Use the predict() function to make model predictions on a new data set.

       NewData <- data.frame(Sepal.Length=5,Sepal.Width=3.25)
       predict(PracticeModel,NewData)

1. Use the confint() function to determine the confidence intervals for the model's coefficients.

       confint(PracticeModel)
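
To see what predict() does with the fitted model, the prediction can be rebuilt by hand from the coefficients. The sketch below refits the model from the first answer so that it runs on its own; the names `b`, `manual`, and `ci` are introduced here for illustration.

```r
data(iris)
PracticeModel <- lm(Petal.Length ~ Sepal.Length + Sepal.Width, data = iris)
NewData <- data.frame(Sepal.Length = 5, Sepal.Width = 3.25)

# Prediction by hand: intercept plus each coefficient times its new value
b <- coef(PracticeModel)
manual <- b["(Intercept)"] + b["Sepal.Length"] * 5 + b["Sepal.Width"] * 3.25

# Should agree with predict() once the names are dropped
all.equal(unname(manual), unname(predict(PracticeModel, NewData)))  # TRUE

# predict() can also return an interval around the fitted value
ci <- predict(PracticeModel, NewData, interval = "confidence")
colnames(ci)  # "fit" "lwr" "upr"
```

confint() reports intervals for the coefficients, while the `interval` argument of predict() reports an interval for the predicted value itself, so the two answer different questions.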