# SPU Scatterplot Two Quantitative Variables Question

1 assignment . Excel Instructions: ScatterplotAs you can see, our dataset contains the following variables:
gender: 0 = male, 1 = female.
height: in inches.
weight: in pounds.
First we will create a scatterplot to examine how weight is related to height, ignoring gender.
To do that in Excel:

Sort the data by gender:
Hold down the Control key (Command key on MacOS) and click the A key to select all
of the data in the worksheet.
Select the Home tab, then the Editing group Sort & Filter -> Custom Sort.
In the pop-up window, make sure that My list has headers box is checked and then
choose gender from the pull-down menu next to Sort by. Click OK.
Now select all of the data in columns B and C, select the Insert tab and in
the Charts group choose Scatter.
Choose the first scatterplot option (Scatter with only Markers).
Now we have a scatterplot, but the data is all on the right of the plot. To fix this:
Right-click on the x-axis, and choose Format Axis from the pop-up menu.
Make sure that Axis Options is selected on the left, and then next
to Minimum enter 50 in the textbox. Click the X button to close the menu.
Unit 3 Assignment 1: Scatterplot
In this exercise we will:
• Learn how to create a scatterplot.
• Use the scatterplot to examine the relationship between two quantitative variables.
• Learn how to create a labeled scatterplot.
• Use the labeled scatterplot to better understand the form of a relationship.
In this activity we explore the relationship between weight and height for 81 adults. We will use height as
the explanatory variable. Weight is the response variable.
We will then label the men and women by adding the categorical variable gender to the scatterplot. We
will see if separating the groups contributes to our understanding of the form of the relationship between
height and weight.
Question 1:
Describe the relationship between the height and weight of the subjects. To describe the relationship
write about the pattern (direction, form, and strength) and any deviations from the pattern (outliers).
So far we have studied the relationship between height and weight for all of the males and females
together. It may be interesting to examine whether the relationship between height and weight is different
for males and females. To visualize the effect of the third variable, gender, we will indicate in the
scatterplot which observations are males and which are females.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
Excel
Question 2:
Compare and contrast the relationship between height and weight for males and females. To compare
and contrast the relationships by gender write about the pattern (direction, form, and strength) and any
deviations from the pattern (outliers) for each group.
Discuss how the patterns for the two groups are similar and how they are different.
gender
height
0
0
0
1
1
1
1
1
0
0
0
0
0
1
1
1
0
0
1
0
1
1
1
1
1
1
1
1
1
0
0
0
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
72
67
65
67
63
54
66
64
72
66
66
71
71
57
66
67
70
73
68
72
65
70
64
64
63
60
69
65
67
67
68
65
62
66
65
63
73
69
70
72
73
69
68
71
71
68
69
67
66
67
72
weight
155
145
125
120
105
120
125
125
160
133
175
205
175
82
125
133
175
163
133
180
107
170
110
140
110
110
125
120
180
120
140
130
122
114
115
125
195
135
145
170
172
168
155
185
175
158
185
146
135
150
160
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
68
75
68
73
72
72
72
72
74
68
73
68
70
72
70
67
67
71
72
73
68
72
68
67
70
71
70
67
58
56
155
230
149
240
170
198
163
230
170
151
220
145
130
160
210
145
185
237
205
147
170
181
150
150
200
175
155
167
100
100
Unit 3 Assignment 2: Linear Relationships
In this activity we will:
• Learn how to compute the correlation.
• Practice interpreting the value of the correlation.
• See an example of how including an outlier can increase the correlation.
Recall the following example: The average gestation period, or time of pregnancy, of an animal is closely
related to its longevity—the length of its lifespan. Data on the average gestation period and longevity (in
captivity) of 40 different species of animals have been recorded.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
Remember that the correlation is only an appropriate measure of the linear relationship between two
quantitative variables. First produce a scatterplot to verify that gestation and longevity are nearly linear in
their relationship.
Instructions
Click on the link to see instructions for completing the activity, and then answer the questions below.
Here’s a reminder of how to do this in Excel:

Select all of the data in columns B and C, and then in the Insert tab choose Scatter in
the Charts group.
Choose the first scatterplot option (Scatter with only Markers).
Observe that the relationship between gestation period and longevity is linear and positive. Now we will
compute the correlation between gestation period and longevity.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
To do that in Excel:

Click in a cell outside of the first three columns of data.
Type =correl(B2:B41,C2:C41)
Question 1:
Report the correlation between gestation and longevity and comment on the strength and direction of the
relationship. Interpret your findings in context.
Now return to the scatterplot that you created earlier. Notice that there is an outlier in both longevity (40
years) and gestation (645 days). Note: This outlier corresponds to the longevity and gestation period of
the elephant.
What do you think will happen to the correlation if we remove this outlier?
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
To do this in Excel:

Scroll down to row 16 of the data. You will see that this contains the values of
the variables for the elephant.
Click on the row header 16 to select the entire row of data.
Right-click and choose Delete from the pop-up menu to delete the row.
You will see that the values of the variables for the elephant have been removed from
the data. Notice also that the correlation between gestation and longevity has changed.
Question 2:
Report the new value for the correlation between gestation and longevity and compare it to the value you
found earlier when the outlier was included. What is it about this outlier that results in the fact that its
inclusion in the data causes the correlation to increase? (Hint: look at the scatterplot.)
Comment
In the last activity, we saw an example where there was a positive linear relationship between the two
variables, and including the outlier just “strengthened” it. Consider the hypothetical data displayed by the
following scatterplot:
In this case, the low outlier gives an “illusion” of a positive linear relationship, whereas in reality, there is
no linear relationship between X and Y.
animal
gestation longevity
baboon
187
20
bear, black
219
18
bear, grizzly
225
25
bear, polar
240
20
beaver
122
5
buffalo
278
15
camel
406
12
cat
63
12
chimpanzee
231
20
chipmunk
31
6
cow
284
15
deer
201
8
dog
61
12
donkey
365
12
elephant
645
40
elk
250
15
fox
52
7
giraffe
425
10
goat
151
8
gorilla
257
20
guinea pig
68
4
hippopotamus
238
25
horse
330
20
kangaroo
42
7
leopard
98
12
lion
100
15
monkey
164
15
moose
240
12
mouse
21
3
opossum
15
1
pig
112
10
puma
90
12
rabbit
31
5
rhinoceros
450
15
sea lion
350
12
sheep
154
12
squirrel
44
10
tiger
105
16
wolf
63
5
zebra
365
15
Year
1896
1900
1904
1908
1912
1920
1924
1928
1932
1936
1948
1952
1956
1960
1964
1968
1972
1976
1980
1984
1988
1992
1996
2000
Time
273.2
246
245.4
243.4
236.8
241.8
233.6
233.2
231.2
227.8
229.8
225.1
221.2
215.6
218.1
214.9
216.3
219.2
218.4
212.53
215.96
220.12
215.78
212.32
1 assignment . Excel Instructions: Scatterplot
As you can see, our dataset contains the following variables:
gender: 0 = male, 1 = female.
height: in inches.
weight: in pounds.
First we will create a scatterplot to examine how weight is related to height, ignoring gender.
To do that in Excel:

Sort the data by gender:
Hold down the Control key (Command key on MacOS) and click the A key to select all
of the data in the worksheet.
Select the Home tab, then the Editing group Sort & Filter -> Custom Sort.
In the pop-up window, make sure that My list has headers box is checked and then
choose gender from the pull-down menu next to Sort by. Click OK.
Now select all of the data in columns B and C, select the Insert tab and in
the Charts group choose Scatter.
Choose the first scatterplot option (Scatter with only Markers).
Now we have a scatterplot, but the data is all on the right of the plot. To fix this:
Right-click on the x-axis, and choose Format Axis from the pop-up menu.
Make sure that Axis Options is selected on the left, and then next
to Minimum enter 50 in the textbox. Click the X button to close the menu.
Unit 3 Assignment 1: Scatterplot
In this exercise we will:
• Learn how to create a scatterplot.
• Use the scatterplot to examine the relationship between two quantitative variables.
• Learn how to create a labeled scatterplot.
• Use the labeled scatterplot to better understand the form of a relationship.
In this activity we explore the relationship between weight and height for 81 adults. We will use height as
the explanatory variable. Weight is the response variable.
We will then label the men and women by adding the categorical variable gender to the scatterplot. We
will see if separating the groups contributes to our understanding of the form of the relationship between
height and weight.
Question 1:
Describe the relationship between the height and weight of the subjects. To describe the relationship
write about the pattern (direction, form, and strength) and any deviations from the pattern (outliers).
So far we have studied the relationship between height and weight for all of the males and females
together. It may be interesting to examine whether the relationship between height and weight is different
for males and females. To visualize the effect of the third variable, gender, we will indicate in the
scatterplot which observations are males and which are females.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
Excel
Question 2:
Compare and contrast the relationship between height and weight for males and females. To compare
and contrast the relationships by gender write about the pattern (direction, form, and strength) and any
deviations from the pattern (outliers) for each group.
Discuss how the patterns for the two groups are similar and how they are different.
gender
height
0
0
0
1
1
1
1
1
0
0
0
0
0
1
1
1
0
0
1
0
1
1
1
1
1
1
1
1
1
0
0
0
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
72
67
65
67
63
54
66
64
72
66
66
71
71
57
66
67
70
73
68
72
65
70
64
64
63
60
69
65
67
67
68
65
62
66
65
63
73
69
70
72
73
69
68
71
71
68
69
67
66
67
72
weight
155
145
125
120
105
120
125
125
160
133
175
205
175
82
125
133
175
163
133
180
107
170
110
140
110
110
125
120
180
120
140
130
122
114
115
125
195
135
145
170
172
168
155
185
175
158
185
146
135
150
160
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
68
75
68
73
72
72
72
72
74
68
73
68
70
72
70
67
67
71
72
73
68
72
68
67
70
71
70
67
58
56
155
230
149
240
170
198
163
230
170
151
220
145
130
160
210
145
185
237
205
147
170
181
150
150
200
175
155
167
100
100
Unit 3 Assignment 2: Linear Relationships
In this activity we will:
• Learn how to compute the correlation.
• Practice interpreting the value of the correlation.
• See an example of how including an outlier can increase the correlation.
Recall the following example: The average gestation period, or time of pregnancy, of an animal is closely
related to its longevity—the length of its lifespan. Data on the average gestation period and longevity (in
captivity) of 40 different species of animals have been recorded.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
Remember that the correlation is only an appropriate measure of the linear relationship between two
quantitative variables. First produce a scatterplot to verify that gestation and longevity are nearly linear in
their relationship.
Instructions
Click on the link to see instructions for completing the activity, and then answer the questions below.
Here’s a reminder of how to do this in Excel:

Select all of the data in columns B and C, and then in the Insert tab choose Scatter in
the Charts group.
Choose the first scatterplot option (Scatter with only Markers).
Observe that the relationship between gestation period and longevity is linear and positive. Now we will
compute the correlation between gestation period and longevity.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
To do that in Excel:

Click in a cell outside of the first three columns of data.
Type =correl(B2:B41,C2:C41)
Question 1:
Report the correlation between gestation and longevity and comment on the strength and direction of the
relationship. Interpret your findings in context.
Now return to the scatterplot that you created earlier. Notice that there is an outlier in both longevity (40
years) and gestation (645 days). Note: This outlier corresponds to the longevity and gestation period of
the elephant.
What do you think will happen to the correlation if we remove this outlier?
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
To do this in Excel:

Scroll down to row 16 of the data. You will see that this contains the values of
the variables for the elephant.
Click on the row header 16 to select the entire row of data.
Right-click and choose Delete from the pop-up menu to delete the row.
You will see that the values of the variables for the elephant have been removed from
the data. Notice also that the correlation between gestation and longevity has changed.
Question 2:
Report the new value for the correlation between gestation and longevity and compare it to the value you
found earlier when the outlier was included. What is it about this outlier that results in the fact that its
inclusion in the data causes the correlation to increase? (Hint: look at the scatterplot.)
Comment
In the last activity, we saw an example where there was a positive linear relationship between the two
variables, and including the outlier just “strengthened” it. Consider the hypothetical data displayed by the
following scatterplot:
In this case, the low outlier gives an “illusion” of a positive linear relationship, whereas in reality, there is
no linear relationship between X and Y.
Unit 3 Assignment 3: Linear Regression
In this activity we will:

Find a regression line and plot it on the scatterplot.
Examine the effect of outliers on the regression line.
Use the regression line to make predictions and evaluate how reliable these predictions are.
Background
The modern Olympic Games have changed dramatically since their inception in 1896. For example, many
commentators have remarked on the change in the quality of athletic performances from year to year.
Regression will allow us to investigate the change in winning times for one event—the 1,500 meter race.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
Excel Instructions: Linear Regression
To open Excel with the data in the worksheet, click and it will automatically
Our dataset contains the following variables:
Year: the year of the Olympic Games, from 1896 to 2000.
Time: the winning time for the 1,500 meter race, in seconds.
First, let’s explore the relationship between the two quantitative variables—year and
time. Produce a scatterplot and use it to verify that year and time are nearly linear in
their relationship.
Here’s a reminder of how to do this in Excel:

Select all of the data in columns A and B, and then select the Insert tab and
choose Scatter in the Charts group next to the Recommended Charts button.
Choose the first scatterplot option (Scatter with only Markers).
Now we have a scatterplot, but the data is all towards the top of the plot. To fix
this:
Right-click on the Y axis, and choose Format Axis from the bottom of the

In the Format Axis window make sure that Axis Options button is selected,
find the Minimum option and enter 200 in the textbox. Click the X button to
Observe that the form of the relationship between the 1,500 meter race’s winning time and the year is
linear. The least squares regression line is therefore an appropriate way to summarize the relationship
and examine the change in winning times over the course of the last century. We will now find the least
squares regression line and plot it on a scatterplot.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
To do that in Excel:

Right-click on one of the points in the graph and choose Add Trendline from
Make sure that the Trendline Options tab is selected in the Format
Trendline menu box displayed on the right, and then scroll down and check the
boxes next to Display Equation on chart and Display R-squared value on
chart.
Click the X button on the menu window to close it.
(You should see a graph window with the scatterplot and the regression line plotted on
it. The regression line equation appears on the graph next to the line itself.)
Question 1:
Give the equation for the least squares regression line, and interpret it in context.
Instructions
Click on the link corresponding to see instructions for completing the activity, and then answer the
questions below.
Our dataset contains the following variables:
Year: the year of the Olympic Games, from 1896 to 2000.
Time: the winning time for the 1,500 meter race, in seconds.
First, let’s explore the relationship between the two quantitative variables—year and
time. Produce a scatterplot and use it to verify that year and time are nearly linear in
their relationship.
Here’s a reminder of how to do this in Excel:

Select all of the data in columns A and B, and then select the Insert tab and
choose Scatter in the Charts group next to the Recommended Charts button.
Choose the first scatterplot option (Scatter with only Markers).
Now we have a scatterplot, but the data is all towards the top of the plot. To fix
this:
Right-click on the Y axis, and choose Format Axis from the bottom of the
In the Format Axis window make sure that Axis Options button is selected,
find the Minimum option and enter 200 in the textbox. Click the X button to
Question 2:
Give the equation for this new line and compare it with the line you found for the whole dataset,
commenting on the effect of the outlier.
Question 3:
Our least squares regression line associates years as an explanatory variable, with times in the 1,500
meter race as the response variable. Use the least squares regression line you found in question 2 to
predict the 1,500 meter time in the 2008 Olympic Games in Beijing. Comment on your prediction.
animal
gestation longevity
baboon
187
20
bear, black
219
18
bear, grizzly
225
25
bear, polar
240
20
beaver
122
5
buffalo
278
15
camel
406
12
cat
63
12
chimpanzee
231
20
chipmunk
31
6
cow
284
15
deer
201
8
dog
61
12
donkey
365
12
elephant
645
40
elk
250
15
fox
52
7
giraffe
425
10
goat
151
8
gorilla
257
20
guinea pig
68
4
hippopotamus
238
25
horse
330
20
kangaroo
42
7
leopard
98
12
lion
100
15
monkey
164
15
moose
240
12
mouse
21
3
opossum
15
1
pig
112
10
puma
90
12
rabbit
31
5
rhinoceros
450
15
sea lion
350
12
sheep
154
12
squirrel
44
10
tiger
105
16
wolf
63
5
zebra
365
15
Year
1896
1900
1904
1908
1912
1920
1924
1928
1932
1936
1948
1952
1956
1960
1964
1968
1972
1976
1980
1984
1988
1992
1996
2000
Time
273.2
246
245.4
243.4
236.8
241.8
233.6
233.2
231.2
227.8
229.8
225.1
221.2
215.6
218.1
214.9
216.3
219.2
218.4
212.53
215.96
220.12
215.78
212.32

Pages (275 words)
Standard price: \$0.00
Client Reviews
4.9
Sitejabber
4.6
Trustpilot
4.8
Our Guarantees
100% Confidentiality
Information about customers is confidential and never disclosed to third parties.
Original Writing
We complete all papers from scratch. You can get a plagiarism report.
Timely Delivery
No missed deadlines – 97% of assignments are completed in time.
Money Back