# DASC 512 AFIT Practice Worker Mistakes Are Regular and Uniform Problem

detailed explanation and corresponding IPYNB file information in explanation needs to match the python code

1 Cover Sheet and Instructions

There are five (5) pages and nine (9) questions. Submissions must include both a PDF fully detailing your response to the questions (i.e., results, narrative, tables, and graphs) and any Python code you used, in ipynb format. Code does not need to be part of the PDF.

Instructions: For all problems, be sure to give full details of your analysis.

I recommend using an Assume, Given, Find, Solution, Answer method to organize your thoughts and response for problems that are not hypothesis tests. This is not required, but it helps guide your thought process.

For hypothesis tests, be sure to include:

• a non-technical summary of results
• hypothesis statements
• assumptions
• test chosen with justification
• significance level
• appropriate results, such as test statistic, p-value, rejection region, confidence intervals, and/or ANOVA tables
• technical conclusion

Examples of well-formulated solutions are given on the next page.


2 Example Solutions

Example Problem 1: In the board game Gloomhaven, characters start with decks of 20 cards that provide modifiers for an attack: 1 miss (0 damage), 1 -2 (base – 2 damage), 5 -1 (base – 1 damage), 6 +0 (base damage), 5 +1 (base + 1 damage), 1 +2 (base + 2 damage), and 1 critical hit (2x base damage). A character attacks with a 3-damage attack and uses advantage, taking the higher of two random modifiers. What is the probability that they do at least 4 damage?

Assume: The full deck is available and well shuffled, and the two draws are treated as independent. Let X be the damage from a single card draw. Let Y be the damage with advantage.

Given: For X, P(0) = 1/20, P(1) = 1/20, P(2) = 5/20, P(3) = 6/20, P(4) = 5/20, P(5) = 1/20, P(6) = 1/20.

Find: P(Y ≥ 4)

Solution: On the first draw, P(X ≥ 4) = 7/20 and P(X < 4) = 13/20. If the first draw results in less than 4 damage, then P(Y ≥ 4 | X < 4) = 7/20, the chance that the second draw is at least 4. If the first draw results in at least 4 damage, then P(Y ≥ 4 | X ≥ 4) = 1. Thus

P(Y ≥ 4) = P(Y ≥ 4 | X ≥ 4) P(X ≥ 4) + P(Y ≥ 4 | X < 4) P(X < 4) = (1)(7/20) + (7/20)(13/20) = 231/400 = 0.5775

Answer: The probability that they do at least 4 damage is 231/400 = 0.5775.
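The result can be checked by brute-force enumeration. The sketch below assumes, as the worked solution does, that the two modifier draws are independent (drawn with replacement); the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

BASE = 3  # base damage of the attack

# Damage value -> number of cards in the 20-card deck.
deck = {
    0: 1,          # miss
    BASE - 2: 1,   # -2
    BASE - 1: 5,   # -1
    BASE: 6,       # +0
    BASE + 1: 5,   # +1
    BASE + 2: 1,   # +2
    2 * BASE: 1,   # critical hit (2x base)
}
n_cards = sum(deck.values())

# Advantage keeps the higher of two draws (assumed independent here).
p_at_least_4 = sum(
    Fraction(n1 * n2, n_cards ** 2)
    for (d1, n1), (d2, n2) in product(deck.items(), repeat=2)
    if max(d1, d2) >= 4
)
print(p_at_least_4, float(p_at_least_4))
```

Enumerating all ordered pairs of card types avoids having to condition on the first draw by hand.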
Example Problem 2: Chris believes that he rolls dice worse than Beau. They each roll a six-sided die
100 times and record their rolls. Chris rolls a total of 325. Beau rolls a total of 375. Perform a hypothesis
test to determine if Beau rolls better than Chris.
BLUF: This experiment gave sufficient evidence to conclude that Beau rolls better than Chris.
Hypotheses: H0 : µB − µC = 0, Ha : µB − µC > 0.

Assumptions: Both dice are fair dice with 1/6 chance of rolling each outcome, a discrete uniform distribution between 1 and 6. Rolls are iid. Although the underlying distribution is uniform, the sample size is 100 for both groups, so assume each sample mean is approximately normally distributed. The population has µ = 3.5 and σ² = 35/12, the mean and variance of a discrete uniform distribution with 6 outcomes.

Type of test: Two-sample t-test with equal variance. Sample size is sufficient for the Central Limit Theorem to apply to the sampling distribution despite underlying uniform distributions.

Significance: The risk of a type-I error is low, so let α = 0.1.

Test statistic: t = 2.07

Rejection region: t > 1.29

P-value: p = 0.0199

Confidence interval: With 90% confidence, Beau’s rolls are better than Chris’s by between 0.22 and 0.78 on average.

Conclusion: At the 0.1 significance level, we reject the null hypothesis that Chris and Beau roll equally well and conclude that Beau’s rolls are higher on average than Chris’s.
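The test statistic can be reproduced from the stated totals. This is a minimal sketch that assumes the theoretical variance 35/12 from the Assumptions is used for both groups. The p-value here uses a standard normal approximation rather than a t distribution with 198 degrees of freedom, so it comes out slightly smaller than the reported value.

```python
import math
from statistics import NormalDist

n = 100                                # rolls per person
mean_c, mean_b = 325 / n, 375 / n      # sample means for Chris and Beau
var = 35 / 12                          # variance of a fair six-sided die

# Standard error of the difference in means with equal variances.
se = math.sqrt(var / n + var / n)
t_stat = (mean_b - mean_c) / se

# One-sided p-value under a normal approximation (an assumption of this sketch).
p_approx = 1 - NormalDist().cdf(t_stat)

print(f"t = {t_stat:.2f}, approximate one-sided p = {p_approx:.4f}")
```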


Problem 1: 5 points

Suppose that there are four inspectors at a film factory who are supposed to stamp the expiration date on each package of film at the end of the assembly line: John, Tina, Wayne, and Amy. John processes 20% of all packages, and he fails to stamp 1/200 packages that he processes. Tina processes 60% of all packages, and she fails to stamp 1/100 packages. Wayne processes 15% of all packages, and he fails to stamp 1/90 packages. Amy processes 5% of all packages, and she fails to stamp 1/200 packages.

A customer calls to complain that her package of film does not show an expiration date. What is the probability that it was inspected by John?
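One way to organize this kind of calculation is Bayes's rule over the four inspectors. The following is a minimal sketch using only the rates stated in the problem; the dictionary names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Share of packages each inspector processes, and each one's miss rate.
share = {"John": Fraction(20, 100), "Tina": Fraction(60, 100),
         "Wayne": Fraction(15, 100), "Amy": Fraction(5, 100)}
miss_rate = {"John": Fraction(1, 200), "Tina": Fraction(1, 100),
             "Wayne": Fraction(1, 90), "Amy": Fraction(1, 200)}

# Law of total probability: P(package is unstamped).
p_unstamped = sum(share[k] * miss_rate[k] for k in share)

# Bayes's rule: P(inspector | package is unstamped).
posterior = {k: share[k] * miss_rate[k] / p_unstamped for k in share}

for name, p in posterior.items():
    print(f"{name}: {float(p):.4f}")
```

The posterior probabilities sum to 1, which is a quick sanity check that the four inspectors form a partition.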

Problem 2: 5 points

A regional telephone company operates three identical relay stations at different locations. During a one-year period, the number of malfunctions reported by each station and the causes are shown below.

| Causes | Station A | Station B | Station C |
|---|---|---|---|
| Problems with Electricity Supplied | 2 | 6 | 4 |
| Computer Malfunction | 4 | 3 | 1 |
| Malfunctioning Equipment | 3 | 4 | 3 |
| Human Error | 9 | 4 | 7 |

Suppose that a malfunction was reported and it was found to be caused by human error. What is the probability that it came from Station C?
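A conditional probability from a table of counts restricts attention to one row and divides by that row's total. The sketch below assumes each row of counts in the problem lists Stations A, B, and C in that order for one cause; the data structure is illustrative.

```python
from fractions import Fraction

stations = ("A", "B", "C")

# Malfunction counts per cause, in station order (A, B, C).
counts = {
    "Problems with Electricity Supplied": (2, 6, 4),
    "Computer Malfunction": (4, 3, 1),
    "Malfunctioning Equipment": (3, 4, 3),
    "Human Error": (9, 4, 7),
}

# Condition on the cause: only the "Human Error" row matters.
human = counts["Human Error"]
p_c_given_human = Fraction(human[stations.index("C")], sum(human))

print(p_c_given_human, float(p_c_given_human))
```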

Problem 3: 10 points

Describe the effect of sample size, effect size, and level of significance on statistical power.

Problem 4: 5 points

Suppose that we are interested in the IQ of incoming students at AFIT. We want to run a test that will detect a 5 IQ point difference between the true mean and our new class. From previous research, we can assume a standard deviation of 15 (i.e., the parameter is known), and leadership wants to have options for α = 0.05 and α = 0.1. Create a plot for a power analysis where the y-axis is the power of our test and the x-axis is the sample size. It should have a line for each significance level.
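The curves for such a plot can be computed directly from the normal distribution. This sketch assumes a two-sided one-sample z-test (the problem does not say whether the test is one- or two-sided) and an arbitrary grid of sample sizes. It only computes the power values, and the resulting lists can be passed to any plotting library.

```python
from statistics import NormalDist

SIGMA = 15.0      # known population standard deviation
DELTA = 5.0       # difference in means the test should detect
Z = NormalDist()  # standard normal distribution


def power(n, alpha):
    """Power of a two-sided one-sample z-test with sample size n."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    shift = DELTA * n ** 0.5 / SIGMA  # standardized true difference
    # Probability of landing in either tail of the rejection region.
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)


sample_sizes = list(range(5, 201, 5))  # grid chosen for illustration
curves = {alpha: [power(n, alpha) for n in sample_sizes]
          for alpha in (0.05, 0.10)}

for alpha in curves:
    print(f"alpha = {alpha}: power at n = 50 is {power(50, alpha):.3f}")
```

Plotting `sample_sizes` against each list in `curves` gives one line per significance level, with the α = 0.1 line lying above the α = 0.05 line.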


Problem 5: 5 points

The accuracy of a new precision air drop system being tested by the US Air Force follows a normal distribution with a mean of 50 ft and a standard deviation of 10 ft. A particular resupply mission drops 12 payloads. It is considered to be successful if at least 9 of the 12 payloads are delivered between 45 and 60 feet. What is the probability that the resupply mission will be successful?
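This problem chains a normal probability into a binomial one. The sketch below assumes the 12 payloads are independent and each has the same chance of landing between 45 and 60 ft; the variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

drop = NormalDist(mu=50, sigma=10)

# Probability that a single payload lands between 45 and 60 ft.
p_hit = drop.cdf(60) - drop.cdf(45)

# Binomial tail: at least 9 hits out of 12 independent drops.
n, k_min = 12, 9
p_success = sum(
    comb(n, k) * p_hit ** k * (1 - p_hit) ** (n - k)
    for k in range(k_min, n + 1)
)

print(f"P(single hit) = {p_hit:.4f}, P(mission success) = {p_success:.4f}")
```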

Problem 6: 5 points

The diameter of steel rods manufactured by an extruder follows a normal distribution. A sample of rods was taken at random and the diameter of each rod is recorded in the data file titled steel.txt. As the quality control manager for the manufacturing company, it is your job to report evidence that the population standard deviation of rods exceeds 3.3 to the factory manager. Based on the data in the sample, should you report anything to the manager?

Problem 7: 5 points

A random sample of 15 water samples is tested for chlorine content. The results of these tests (recorded in ppm of chlorine) are captured in the data file titled chlorine.txt. Assume the population data are normally distributed. Test the hypothesis that the mean chlorine content is less than 71 ppm and interpret the results.

Problem 8: 5 points

The file dental_crown.csv contains data collected on the hardness of dental gold used on teeth. Column 1 is the dentist who applied the crown, column 2 is the method used, column 3 is the alloy used, and column 4 is the response (diamond pyramid hardness). Construct an ANOVA table and perform hypothesis tests to determine which (if any) factors and interactions are significant.


Problem 9: 55 points

The file faithful.csv records data on 272 eruptions of the Old Faithful geyser at Yellowstone National Park. Column 1 reports the length of the eruption (in minutes) and column 2 reports the length of the interval between the previous eruption and this one (i.e., wait time).

(a): 5 points

Create a scatterplot of the data, with eruptions as the x-axis and wait times as the y-axis.

(b): 5 points

Restricting your analysis to eruptions less than 3 minutes long, perform a test to determine if the wait time averages less than 60 minutes. Use α = 0.05.

(c): 5 points

Create a confidence interval for the mean value of the wait time for eruptions less than 3 minutes long.

(d): 5 points

Restricting your analysis to eruptions at least 3 minutes long, perform a test to determine if the wait time averages less than 80 minutes. Use α = 0.05.

(e): 5 points

Create a confidence interval for the mean value of the wait time for eruptions at least 3 minutes long.

(f): 5 points

Create box-plots for scenarios where eruptions last less than 3 minutes and at least 3 minutes. Plot the box-plots on the same axis for easy comparison (this should be two box-plots on one graph).

(g): 5 points

Conduct a test to determine if the difference in average wait time between the two scenarios is equal to 20 minutes.

(h): 5 points

Create a confidence interval for the mean difference in wait times between eruptions that last less than 3 minutes and those that last at least 3 minutes.

(i): 5 points

Create a histogram for the wait times.

(j): 5 points

Create a QQ-Plot to assess normality of wait times. State your conclusion.

(k): 5 points

Perform an analytical test for normality on wait times. State your conclusion.


Mid-Term Review

Week 1

DASC 512

Mid-Term Review

Week 2

DASC 512

Probability – Events

An event is a set of outcomes of an experiment

Usually denoted by a capital letter

May be one (simple event) or more (complex event) outcomes

Probability – Set operators

Union – $A \cup B$: Either A or B (or both) occur

Intersection – $A \cap B$: Both A and B occur

Complement – $A^c$: A does not occur

Probability – Partitions

Any set of events $B_1, B_2, \ldots, B_k$ that is both exhaustive and mutually exclusive is a partition.

$\sum_i P(B_i) = 1$,

$P(B_i \cap B_j) = 0$, for all $i \neq j$

Probability – Additive Rule

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Conditional Probability

If B occurs, the probability that A also occurred is $P(A \mid B)$.

This is read “A given B”

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
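The set operators, the additive rule, and conditional probability can all be checked on a small finite sample space. The sketch below uses a hypothetical example that is not from the slides: one roll of a fair die, with A = "even" and B = "greater than 3". Events are modeled as Python sets.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # sample space: one roll of a fair die
A = {2, 4, 6}               # event: roll is even
B = {4, 5, 6}               # event: roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & omega), len(omega))


union = prob(A | B)                          # P(A ∪ B)
additive = prob(A) + prob(B) - prob(A & B)   # additive rule
a_given_b = prob(A & B) / prob(B)            # P(A | B)
complement = prob(omega - A)                 # P(A^c)

print(union, additive, a_given_b, complement)
```

The first two printed values agree, which is the additive rule in action.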

Law of Total Probability

If the set of events $B_1, B_2, \ldots, B_k$ is a partition (mutually exclusive and exhaustive)

$$P(A) = P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + \cdots + P(A \mid B_k)\,P(B_k) = \sum_{i=1}^{k} P(A \mid B_i)\,P(B_i)$$

Probability – Bayes’s Rule

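$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

$$\frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)} \quad \text{(Multiplicative Rule; this form is Bayes's Rule)}$$

$$\frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^c)\,P(A^c)} \quad \text{(Total Probability)}$$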
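The two-event form of Bayes's rule, with the denominator expanded by total probability over A and its complement, translates into a short function. This is a minimal sketch; the function name and the input values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def bayes(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A | B), expanding P(B) over the partition {A, not A}."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b


# Hypothetical inputs: P(A) = 1/100, P(B | A) = 9/10, P(B | not A) = 1/10.
posterior = bayes(Fraction(1, 100), Fraction(9, 10), Fraction(1, 10))
print(posterior, float(posterior))
```

Even with a high P(B | A), a small prior P(A) keeps the posterior modest, which is the usual lesson of this formula.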

Probability Example

As of 3 Feb 2022, The Economist’s model of the French election (source: “The Economist’s French election model”, The Economist):

Two-stage election: the top two candidates in the first round face each other in a run-off election.


The Economist predicts these probabilities of candidates being in the runoff election:

Macron: 0.91

Le Pen: 0.48

Pecresse: 0.33

Zemmour: 0.21

Melenchon: 0.05

The probability of each pair being in the run-off is

|           | Macron | Pecresse | Le Pen | Zemmour | Melenchon |
|-----------|--------|----------|--------|---------|-----------|
| Macron    |        | 0.26     | 0.45   | 0.18    | 0.01      |
| Pecresse  | 0.26   |          | 0.02   | 0.01    | 0.03      |
| Le Pen    | 0.45   | 0.02     |        | 0.01    |           |
| Zemmour   | 0.18   | 0.01     | 0.01   |         | 0.01      |
| Melenchon | 0.01   | 0.03     |        | 0.01    |           |
| Other     | 0.01   | 0.01     |        |         |           |
| Total     | 0.91   | 0.33     | 0.48   | 0.21    | 0.05      |


If Valerie Pecresse is in the run-off, what is the probability that Emmanuel Macron is her opponent? Let $M$ and $P$ be the events that Macron and Pecresse, respectively, are in the run-off.

$$P(M \mid P) = \frac{P(M \cap P)}{P(P)} = \frac{0.26}{0.33} = 0.7879$$

The Economist reports that the probability of Emmanuel Macron winning against each candidate is:

Le Pen: 0.88

Pecresse: 0.76

Zemmour: 0.98

Melenchon: 1.00 (I assume)

Macron's head-to-head win probabilities can be written as conditional probabilities. Let $M_2$ be the event that Macron wins the run-off, and let $M$, $L$, $P$, $Z$, and $Me$ be the events that Macron, Le Pen, Pecresse, Zemmour, and Melenchon are in the run-off:

$P(M_2 \mid M \cap L) = 0.88$

$P(M_2 \mid M \cap P) = 0.76$

$P(M_2 \mid M \cap Z) = 0.98$

$P(M_2 \mid M \cap Me) = 1$


If Emmanuel Macron wins the presidency, what is the likelihood that he faced Marine Le Pen in the run-off?

The quantity to find is $P(M \cap L \mid M_2)$: the probability that Macron and Le Pen were the run-off pair ($M \cap L$), given that Macron won the run-off ($M_2$).

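$$
\begin{aligned}
P(M \cap L \mid M_2) &= \frac{P(M \cap L \cap M_2)}{P(M_2)} \\
&= \frac{P(M_2 \mid M \cap L)\,P(M \cap L)}{P(M_2 \mid M \cap L)\,P(M \cap L) + \cdots + P(M_2 \mid M^c)\,P(M^c)} \\
&= \frac{(0.88)(0.45)}{(0.88)(0.45) + (0.76)(0.26) + (0.98)(0.18) + (1)(0.01) + (1)(0.01) + (0)(0.09)} \\
&= \frac{0.396}{0.79} = 0.5013
\end{aligned}
$$

The conditional probabilities used are $P(M_2 \mid M \cap L) = 0.88$ (Le Pen), $P(M_2 \mid M \cap P) = 0.76$ (Pecresse), $P(M_2 \mid M \cap Z) = 0.98$ (Zemmour), $P(M_2 \mid M \cap Me) = 1$ (Melenchon), $P(M_2 \mid M \cap \text{Other}) = 1$, and $P(M_2 \mid M^c) = 0$.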
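The same Bayes calculation can be evaluated numerically. The sketch below takes the run-off pair probabilities involving Macron and his head-to-head win probabilities from the slides above. The dictionary names are illustrative, and the term for Macron missing the run-off is omitted because it contributes zero.

```python
# P(Macron and this opponent are the run-off pair), from the Macron column.
p_pair = {"Le Pen": 0.45, "Pecresse": 0.26, "Zemmour": 0.18,
          "Melenchon": 0.01, "Other": 0.01}

# P(Macron wins the run-off | he faces this opponent).
p_win = {"Le Pen": 0.88, "Pecresse": 0.76, "Zemmour": 0.98,
         "Melenchon": 1.0, "Other": 1.0}

# Law of total probability: P(Macron wins the presidency).
p_macron_wins = sum(p_win[o] * p_pair[o] for o in p_pair)

# Bayes's rule: P(opponent was Le Pen | Macron wins).
p_le_pen_given_win = p_win["Le Pen"] * p_pair["Le Pen"] / p_macron_wins

print(f"P(Macron wins) = {p_macron_wins:.4f}")
print(f"P(faced Le Pen | Macron wins) = {p_le_pen_given_win:.4f}")
```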

Probability Functions: PMF (Discrete only)

The Probability Mass Function of a discrete distribution

If Random Variable X is distributed $X \sim \mathrm{Dist}$, then

$\mathrm{Dist.pmf}(x) = P(X = x)$

Probability Functions: PDF (Continuous only)

The Probability Density Function of a continuous distribution

If Random Variable X is distributed $X \sim \mathrm{Dist}$, then

$\mathrm{Dist.pdf}(x) = f(x)$

$$\int_a^b f(x)\,dx = P(a < X < b)$$

Probability Functions: CDF

The Cumulative Distribution Function

If Random Variable X is distributed $X \sim \mathrm{Dist}$, then

$\mathrm{Dist.cdf}(x) = F(x) = P(X \le x)$

Probability Functions: PPF

The Percent Point Function (inverse CDF)

If Random Variable X is distributed $X \sim \mathrm{Dist}$, then

$\mathrm{Dist.ppf}(\mathrm{Dist.cdf}(x)) = x$

If $\mathrm{Dist.ppf}(q) = x$, then $P(X \le x) = q$.

Probability Functions: SF

The Survival Function

If Random Variable X is distributed $X \sim \mathrm{Dist}$, then

$\mathrm{Dist.sf}(x) = P(X > x)$

Probability Functions: ISF

The Inverse Survival Function

If Random Variable X is distributed $X \sim \mathrm{Dist}$, then

$\mathrm{Dist.sf}(\mathrm{Dist.isf}(q)) = q$

If $\mathrm{Dist.isf}(q) = x$, then $P(X > x) = q$.
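The relationships among these functions can be checked numerically. The names pmf, pdf, cdf, ppf, sf, and isf match the method names on `scipy.stats` distributions. To stay dependency-free, this sketch instead uses the standard library's `statistics.NormalDist`, which provides `pdf`, `cdf`, and `inv_cdf`, and derives the survival functions from them. The normal distribution and the value x = 60 are arbitrary choices for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=50, sigma=10)  # hypothetical distribution
x = 60.0

density = dist.pdf(x)     # PDF: f(x)
cdf = dist.cdf(x)         # CDF: P(X <= x)
sf = 1 - cdf              # SF:  P(X > x)

# PPF (inverse CDF): feeding the CDF value back in recovers x.
ppf_of_cdf = dist.inv_cdf(cdf)

# ISF (inverse SF): isf(q) equals ppf(1 - q), so it also recovers x.
isf_of_sf = dist.inv_cdf(1 - sf)

print(f"pdf = {density:.4f}, cdf = {cdf:.4f}, sf = {sf:.4f}")
print(f"ppf(cdf(x)) = {ppf_of_cdf:.4f}, isf(sf(x)) = {isf_of_sf:.4f}")
```

The identity isf(q) = ppf(1 − q) follows from sf(x) = 1 − cdf(x), and is why the survival functions add convenience and numerical accuracy in the tails but no new information.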

Distributions

Mid-Term Review

Week 3

DASC 512