# 12 Answer Key

## 12.1 Chapter 4 - Object Types in R Programming

- This object was an array.
- This object was a vector.
- This would output a vector.
- This would output a data frame.
- To output a factor, you would run the following code:
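As a minimal sketch of that conversion, assuming a hypothetical character vector `sizes` (the exercise's own object may differ), `factor()` returns a factor:

```r
# Hypothetical character vector; not the object from the Chapter 4 exercise
sizes <- c("small", "large", "medium", "small")

# factor() stores the distinct values as levels, sorted alphabetically by default
sizes_factor <- factor(sizes)

class(sizes_factor)
levels(sizes_factor)
```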

## 12.2 Chapter 5 - How to Filter and Transform Data in Base R

- Filter the following vector to values greater than 2

```
q1 <- seq(1,20,2)
q1[q1 > 2]
```

- Filter the following vector to values between 20 and 30, but only for the first three entries that meet that criterion. (Hint: add `[n:n]` for the range of values after you determine which values meet that criterion.)
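One way to apply the hint, sketched with a hypothetical vector `q2` and assuming "between" means strictly greater than 20 and strictly less than 30:

```r
# Hypothetical vector; the exercise's own vector may differ
q2 <- seq(10, 40, 2)

# Step 1: keep only the values strictly between 20 and 30
in_range <- q2[q2 > 20 & q2 < 30]

# Step 2: keep the first three of those values
first_three <- in_range[1:3]
first_three
```

The two steps can also be chained as `q2[q2 > 20 & q2 < 30][1:3]`.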

- Multiply the following matrices together.
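A minimal sketch of matrix multiplication with the `%*%` operator, using two hypothetical matrices `a` and `b` rather than the ones from the exercise. Note that `*` would multiply element by element instead.

```r
# Hypothetical matrices; matrix() fills values column by column
a <- matrix(1:6, nrow=2, ncol=3)
b <- matrix(1:6, nrow=3, ncol=2)

# The number of columns in a (3) must match the number of rows in b (3)
product <- a %*% b
product
```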

- Subtract 41 from every entry in the second column of the following matrix. Replace the column with those new values.
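A minimal sketch of the column replacement, assuming a hypothetical matrix `q4` in place of the one from the exercise:

```r
# Hypothetical 4 x 3 matrix; the exercise's own matrix may differ
q4 <- matrix(seq(10, 120, 10), nrow=4, ncol=3)

# Select the second column, subtract 41, and assign the result back
q4[,2] <- q4[,2] - 41
q4
```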

- Select the second row of each matrix in the following array. Subtract 5 from those rows.

```
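# Build a 5 x 3 x 3 array from three 5 x 3 matrices
q5 <- array(data=c(matrix(seq(1,15,1),5,3),
                   matrix(seq(4,60,4),5,3),
                   matrix(seq(2,30,2),5,3)),
            dim=c(5,3,3))
q5
# Second row of every matrix in the array, minus 5
q5[2,,]-5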
```

- Filter the following data frame to Bond films starring Roger Moore.

`bond[bond["actor"]=="Roger Moore",]`

- Filter the following data frame to Bond films starring Sean Connery made after 1966.

`bond[bond["actor"]=="Sean Connery" & bond["year"] > 1966,]`

## 12.3 Chapter 6 - How to Filter and Transform Data with the Dplyr Package

- You would use the `%>%` notation, the `filter()` function, and the operators `|`, `==`, and `>` to accomplish this.

- In addition to the same script as above, you would use the `select()` function to reduce the columns.

- Instead of using `select()` in the previous script, you would use `transmute()`. This function lets you both transform a column and keep only the columns you name.

```
data(mtcars)
library(dplyr)
mtcars %>%
  filter(gear==4 | hp > 115) %>%
  transmute(mpg_log=log(mpg),cyl,gear,hp)
```

- You would use the `filter()`, `group_by()`, and `summarize()` functions to pull this summary data.
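A hedged sketch of how those three functions chain together. The filter condition (`hp > 100`), the grouping column (`cyl`), and the summary columns are assumptions for illustration, not the exercise's exact requirements.

```r
library(dplyr)
data(mtcars)

# Assumed criteria: cars with more than 100 horsepower, summarized per cylinder count
cyl_summary <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarize(mean_mpg=mean(mpg),
            cars=n())
cyl_summary
```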

## 12.4 Chapter 7 - Understanding and Using R Packages

- To install the `tidyverse` set of packages, run the script `install.packages("tidyverse")`.

- To load the `dplyr` package, run the script `library(dplyr)`.

## 12.5 Chapter 8 - How to Write Functions

- Modify the simple standard deviation function we wrote and change it to calculate the mean. Do this without using the built-in `mean` function.
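A minimal sketch of one possible answer. The function name `my.mean` is hypothetical, and the original standard deviation function is not assumed here; the mean is computed as the sum of the values divided by their count.

```r
# Hypothetical name; computes the arithmetic mean without calling mean()
my.mean <- function(x) {
  sum(x) / length(x)
}

my.mean(c(2, 4, 6, 8))
```

Comparing the result against the built-in function, for example `all.equal(my.mean(mtcars$mpg), mean(mtcars$mpg))`, is a quick way to check the work.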

- Alter the `summary.group` function to include median, minimum, and maximum values.

```
summary.group <- function(data,group,field) {
  # Distinct values of the grouping column
  groups <- levels(factor(data[,paste(group)]))
  # Empty data frame that receives one row per group
  output <- data.frame(group=character(),
                       mean=numeric(),
                       sd=numeric(),
                       median=numeric(),
                       minimum=numeric(),
                       maximum=numeric())
  for(i in 1:length(groups)) {
    # Values of the field for the current group only
    subdata <- data[data[,paste(group)]==groups[i],
                    paste(field)]
    output[i,1:6] <- data.frame(groups[i],
                                mean(subdata),
                                sd(subdata),
                                median(subdata),
                                min(subdata),
                                max(subdata))
  }
  output
}
```

- Write a function for the *Fibonacci Sequence*, which ends at a number you choose. You’ll need to use a control flow to accomplish this and a default value for the end of the sequence. (Hint: You won’t use the `for(var in seq) expr` control flow. Execute `?Control` to use a different version.)
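A sketch of one possible answer using a `while` loop. It assumes that "ends at a number you choose" means no term may exceed the argument `end`, and the default of 100 is an arbitrary choice; the function and argument names are hypothetical.

```r
# Hypothetical function: returns Fibonacci terms that do not exceed `end`
fibonacci <- function(end=100) {
  sequence <- c(0, 1)
  # Append the next term while it stays at or below `end`
  while (sum(tail(sequence, 2)) <= end) {
    sequence <- c(sequence, sum(tail(sequence, 2)))
  }
  # Drop the starting 1 if `end` is below 1
  sequence[sequence <= end]
}

fibonacci(20)
```

Calling `fibonacci()` with no argument falls back to the default end value.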

## 12.6 Chapter 10 - How to Plot Data in R

- Use the `ggplot()`, `aes()`, and `geom_point()` functions to construct a plot.
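A minimal sketch of that basic scatter plot. It assumes the `mtcars` data with `hp` on the x-axis and `mpg` on the y-axis, which matches the later answers in this section; the object name `base_plot` is hypothetical.

```r
library(ggplot2)
data(mtcars)

# Assumed mapping: horsepower on x, miles per gallon on y
base_plot <- ggplot(data=mtcars,
                    mapping=aes(x=hp,y=mpg)) +
  geom_point()
base_plot
```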

- Simply add `factor(cyl)` to the **color** argument in the `aes()` function.

```
library(ggplot2)
data(mtcars)
ggplot(data=mtcars,
       mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
  geom_point(size=3)
```

- Use the **x**, **y**, and **color** arguments in the `labs()` function to use a more intuitive naming convention.

```
library(ggplot2)
data(mtcars)
ggplot(data=mtcars,
       mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
  geom_point(size=3) +
  labs(x="Horsepower",
       y="Miles Per Gallon",
       color="Cylinders")
```

- Use the **title** argument in the `labs()` function.

```
library(ggplot2)
data(mtcars)
ggplot(data=mtcars,
       mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
  geom_point(size=3) +
  labs(x="Horsepower",
       y="Miles Per Gallon",
       color="Cylinders",
       title="Car Performance")
```

- Use the `theme_few()` theme from the `ggthemes` package.
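A sketch of how the theme could be added to the plot from the previous answers, assuming the `ggthemes` package is already installed; the object name `themed_plot` is hypothetical.

```r
library(ggplot2)
library(ggthemes)  # provides theme_few()
data(mtcars)

# Same plot as the previous answer, with the theme added as a final layer
themed_plot <- ggplot(data=mtcars,
                      mapping=aes(x=hp,y=mpg,color=factor(cyl))) +
  geom_point(size=3) +
  labs(x="Horsepower",
       y="Miles Per Gallon",
       color="Cylinders",
       title="Car Performance") +
  theme_few()
themed_plot
```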

## 12.7 Chapter 11 - Statistical Functions in R

- Use the `summary()` function on your model to determine model performance, such as p-values.

`summary(PracticeModel)`

- Use the `predict()` function to make model predictions on a new data set.

```
NewData <- data.frame(Sepal.Length=5,Sepal.Width=3.25)
predict(PracticeModel,NewData)
```

- Use the `confint()` function to determine the confidence intervals for a model.

`confint(PracticeModel)`
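
The three calls above assume a fitted model object named `PracticeModel` already exists. As a hedged end-to-end sketch, the block below fits a linear model on the built-in `iris` data. The formula is an assumption inferred from the columns in `NewData`, and the response `Petal.Length` is a guess, not necessarily the model from Chapter 11.

```r
data(iris)

# Assumed model: the response and predictors are guesses based on NewData
PracticeModel <- lm(Petal.Length ~ Sepal.Length + Sepal.Width, data=iris)

# Coefficients, p-values, and R-squared
summary(PracticeModel)

# Prediction for a single new observation
NewData <- data.frame(Sepal.Length=5,Sepal.Width=3.25)
prediction <- predict(PracticeModel,NewData)
prediction

# 95% confidence intervals for each coefficient (the default level)
intervals <- confint(PracticeModel)
intervals
```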