DASC 512 AFIT Practice Worker Mistakes Are Regular and Uniform Problem

Provide a detailed explanation and the corresponding IPYNB file. The information in the explanation needs to match the Python code.

1 Cover Sheet and Instructions
There are five (5) pages and nine (9) questions. Submissions must include both a PDF fully detailing
your response to the questions (i.e., results, narrative, tables, and graphs) along with any Python code you
used in ipynb format. Code does not need to be part of the PDF.
Instructions: For all problems, be sure to give full details of your analysis.
I recommend using an Assume, Given, Find, Solution, Answer method to organize your thoughts and
response for problems that are not hypothesis tests. This is not required, but it helps guide your thought
process.
For hypothesis tests, be sure to include:
• a non-technical summary of results
• hypothesis statements
• assumptions
• test chosen with justification
• significance level
• appropriate results, such as test statistic, p-value, rejection region, confidence intervals, and/or ANOVA
tables
• technical conclusion
Examples of well-formulated solutions are given on the next page.
2 Example Solutions
Example Problem 1: In the board game Gloomhaven, characters start with decks of 20 cards that
provide modifiers for an attack: 1 miss (0 damage), 1 -2 (base – 2 damage), 5 -1 (base – 1 damage), 6 +0
(base damage), 5 +1 (base + 1 damage), 1 +2 (base + 2 damage), and 1 critical hit (2x base damage). A
character attacks with a 3-damage attack and uses advantage, taking the higher of two random modifiers.
What is the probability that they do at least 4 damage?
Assume: The full deck is available and well shuffled. Let X be the damage from a single card draw. Let
Y be the damage with advantage.
Given: For X, P(0) = 1/20, P(1) = 1/20, P(2) = 5/20, P(3) = 6/20, P(4) = 5/20, P(5) = 1/20, P(6) =
1/20.
Find: P (Y ≥ 4)
Solution: On the first draw, P (X ≥ 4) = 7/20. If the first draw results in less than 4 damage, then
P (Y ≥ 4 | X < 4) = 7/20. If the first draw results in at least 4 damage, then P (Y ≥ 4 | X ≥ 4) = 1. Thus
P (Y ≥ 4) = P (Y ≥ 4 | X ≥ 4) P (X ≥ 4) + P (Y ≥ 4 | X < 4) P (X < 4) = (1)(7/20) + (7/20)(13/20) = 231/400 = 0.5775
Answer: The probability that they do at least 4 damage is 231/400 = 0.5775.
Example Problem 2: Chris believes that he rolls dice worse than Beau. They each roll a six-sided die 100
times and record their rolls. Chris rolls a total of 325. Beau rolls a total of 375. Perform a hypothesis test to
determine if Beau rolls better than Chris.
BLUF: This experiment gave sufficient evidence to conclude that Beau rolls better than Chris.
Hypotheses: H0 : µB − µC = 0, Ha : µB − µC > 0.
Assumptions: Both dice are fair dice with 1/6 chance of rolling each outcome, that is, a uniform distribution
between 1 and 6. Each roll is iid. Although the underlying distribution is uniform, the sample size is 100 for
both groups, so the sample means are approximately normally distributed. Each roll has µ = 3.5 and
σ² = 35/12, the mean and variance of a discrete uniform distribution with 6 outcomes.
Type of test: Two-sample t-test with equal variance. Sample size is sufficient for the Central Limit
Theorem to apply to the sampling distribution despite underlying uniform distributions.
Significance: The risk of a type-I error is low, so let α = 0.1.
Test statistic: t = 2.07
Rejection region: t > 1.29
P-value: p = 0.0199
Confidence interval: With 90% confidence, Beau’s rolls are better than Chris’s by between 0.22 and 0.78
on average.
Conclusion: At the 0.1 significance level, we reject the null hypothesis that Chris and Beau roll equally
well and conclude that Beau’s rolls are higher on average than Chris’s.
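The two example solutions can be checked numerically. The sketch below is only an illustration: it enumerates every ordered pair of cards for Example Problem 1, assuming (as the worked solution does) that the two draws are independent, and it recomputes the Example Problem 2 test statistic from the known die variance 35/12. All variable names are placeholders.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Example Problem 1: damage of each of the 20 modifier cards on a 3-damage attack.
deck = [0] * 1 + [1] * 1 + [2] * 5 + [3] * 6 + [4] * 5 + [5] * 1 + [6] * 1

# Assumption (matches the worked solution): the two draws are independent,
# so every ordered pair of cards is equally likely.
pairs = list(product(deck, repeat=2))
hits = sum(1 for first, second in pairs if max(first, second) >= 4)
p_at_least_4 = Fraction(hits, len(pairs))

# Example Problem 2: test statistic using the known variance of a fair die.
n = 100
mean_beau, mean_chris = 375 / n, 325 / n
variance = 35 / 12
standard_error = sqrt(variance / n + variance / n)
t_stat = (mean_beau - mean_chris) / standard_error

print(p_at_least_4, float(p_at_least_4))  # 231/400 0.5775
print(round(t_stat, 2))                   # 2.07
```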
Problem 1: 5 points
Suppose that there are four inspectors at a film factory who are supposed to stamp the expiration date
on each package of film at the end of the assembly line: John, Tina, Wayne, and Amy. John processes
20% of all packages, and he fails to stamp 1/200 packages that he processes. Tina processes 60% of all
packages, and she fails to stamp 1/100 packages. Wayne processes 15% of all packages, and he fails to
stamp 1/90 packages. Amy processes 5% of all packages, and she fails to stamp 1/200 packages.
A customer calls to complain that her package of film does not show an expiration date. What is
the probability that it was inspected by John?
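One possible way to set this up is with the law of total probability and Bayes's rule. The sketch below uses exact fractions; the dictionary layout and names are illustrative choices, not a required format.

```python
from fractions import Fraction

# (share of packages processed, probability of a missed stamp) for each inspector
inspectors = {
    "John": (Fraction(20, 100), Fraction(1, 200)),
    "Tina": (Fraction(60, 100), Fraction(1, 100)),
    "Wayne": (Fraction(15, 100), Fraction(1, 90)),
    "Amy": (Fraction(5, 100), Fraction(1, 200)),
}

# Law of total probability: P(no stamp) = sum of P(no stamp | inspector) * P(inspector)
p_no_stamp = sum(share * miss for share, miss in inspectors.values())

# Bayes's rule: P(inspector | no stamp) for every inspector
posterior = {
    name: share * miss / p_no_stamp
    for name, (share, miss) in inspectors.items()
}

print(posterior["John"], round(float(posterior["John"]), 4))
```

The posteriors sum to 1, which is a quick sanity check that the four inspectors form a partition.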
Problem 2: 5 points
A regional telephone company operates three identical relay stations at different locations. During a
one-year period, the number of malfunctions reported by each station and the causes are shown below.
Causes                                Station A  Station B  Station C
Problems with Electricity Supplied        2          6          4
Computer Malfunction                      4          3          1
Malfunctioning Equipment                  3          4          3
Human Error                               9          4          7
Suppose that a malfunction was reported and it was found to be caused by human error. What is
the probability that it came from Station C?
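Because the condition restricts attention to the Human Error row, the conditional probability is a ratio of counts. The sketch below assumes the table is read with causes as rows and stations A, B, C as columns; the names are illustrative.

```python
from fractions import Fraction

# Malfunction counts by cause (rows) and station (columns), as read from the table.
counts = {
    "Problems with Electricity Supplied": {"A": 2, "B": 6, "C": 4},
    "Computer Malfunction": {"A": 4, "B": 3, "C": 1},
    "Malfunctioning Equipment": {"A": 3, "B": 4, "C": 3},
    "Human Error": {"A": 9, "B": 4, "C": 7},
}

# P(Station C | Human Error) = count(C and Human Error) / count(Human Error)
human_error = counts["Human Error"]
p_c_given_human_error = Fraction(human_error["C"], sum(human_error.values()))

print(p_c_given_human_error, float(p_c_given_human_error))
```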
Problem 3: 10 points
Describe the effect of sample size, effect size, and level of significance on statistical power.
Problem 4: 5 points
Suppose that we are interested in the IQ of incoming students at AFIT. We want to run a test that
will detect a 5 IQ point difference between the true mean and our new class. From previous research,
we can assume a standard deviation of 15 (i.e., the parameter is known), and leadership wants to have
options for α = 0.05 and α = 0.1. Create a plot for a power analysis where the y-axis is the power of
our test and the x-axis is the sample size. It should have a line for each significance level.
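A minimal sketch of the power calculation follows. It assumes a two-sided, one-sample z-test (the problem does not state the direction), and the function name, default arguments, and grid of sample sizes are all illustrative choices. It uses only the standard library and prints the values that would be plotted.

```python
from statistics import NormalDist


def z_test_power(n, alpha, effect=5.0, sigma=15.0):
    """Power of a two-sided one-sample z-test (assumed form of the test)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma  # true difference in standard-error units
    # Probability that the test statistic lands in either rejection tail
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


sample_sizes = range(5, 201, 5)
power_curves = {
    alpha: [z_test_power(n, alpha) for n in sample_sizes]
    for alpha in (0.05, 0.10)
}

for n, p05, p10 in zip(sample_sizes, power_curves[0.05], power_curves[0.10]):
    print(f"n={n:3d}  power(alpha=0.05)={p05:.3f}  power(alpha=0.10)={p10:.3f}")
```

Each list in `power_curves` can then be drawn as one line against `sample_sizes` with whatever plotting library the notebook uses. The same function also illustrates Problem 3: power rises with sample size, with effect size, and with a larger significance level.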
Problem 5: 5 points
The accuracy of a new precision air drop system being tested by the US Air Force follows a normal
distribution with a mean of 50 ft and a standard deviation of 10 ft. A particular resupply mission drops
12 payloads. It is considered to be successful if at least 9 of the 12 payloads are delivered at between
45 and 60 feet. What is the probability that the resupply mission will be successful?
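One way to structure the calculation is in two stages: a normal probability for a single payload, then a binomial tail for the mission. The sketch below assumes the 12 payloads are independent and identically distributed, which the problem does not state explicitly; variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Stage 1: probability that a single payload lands between 45 and 60 ft
accuracy = NormalDist(mu=50, sigma=10)
p_on_target = accuracy.cdf(60) - accuracy.cdf(45)

# Stage 2: number of on-target payloads is Binomial(12, p_on_target),
# assuming independent drops; success means at least 9 on target.
n_payloads, needed = 12, 9
p_success = sum(
    comb(n_payloads, k) * p_on_target ** k * (1 - p_on_target) ** (n_payloads - k)
    for k in range(needed, n_payloads + 1)
)

print(round(p_on_target, 4), round(p_success, 4))
```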
Problem 6: 5 points
The diameter of steel rods manufactured by an extruder follows a normal distribution. A sample of
rods was taken at random, and the diameter of each rod is recorded in the data file titled steel.txt.
As the quality control manager for the manufacturing company, it is your job to report evidence that
the population standard deviation of rods exceeds 3.3 to the factory manager. Based on the data in the
sample, should you report anything to the manager?
Problem 7: 5 points
A random sample of 15 water samples is tested for chlorine content. The results of these tests (recorded
in ppm of chlorine) are captured in the data file titled chlorine.txt. Assume the population data
are normally distributed. Test the hypothesis that the mean chlorine content is less than 71 ppm and
interpret the results.
Problem 8: 5 points
The file dental_crown.csv contains data collected on the hardness of dental gold
used on teeth. Column 1 is the dentist who applied the crown, column 2 is the method used, column
3 is the alloy used, and column 4 is the response (diamond pyramid hardness). Construct an ANOVA
table and perform hypothesis tests to determine which (if any) factors and interactions are significant.
Problem 9: 55 points
The file faithful.csv records data on 272 eruptions at the Old Faithful geyser at Yellowstone
National Park. Column 1 reports the length of the eruption (in minutes) and column 2 reports the length
of the interval between the previous eruption and this one (i.e., wait time).
(a): 5 points
Create a scatterplot of the data, with eruptions as the x-axis and wait times as the y-axis.
(b): 5 points
Restricting your analysis to eruptions less than 3 minutes long, perform a test to determine if the wait
time averages less than 60 minutes. Use α = 0.05.
(c): 5 points
Create a confidence interval for the mean value of the wait time for eruptions less than 3 minutes long.
(d): 5 points
Restricting your analysis to eruptions at least 3 minutes long, perform a test to determine if the wait
time averages less than 80 minutes. Use α = 0.05.
(e): 5 points
Create a confidence interval for the mean value of the wait time for eruptions at least 3 minutes long.
(f): 5 points
Create box-plots for scenarios where eruptions last less than 3 minutes and at least 3 minutes. Plot the
box-plots on the same axis for easy comparison (this should be two box-plots on one graph).
(g): 5 points
Conduct a test to determine if the difference in average wait time between the two scenarios is equal to
20 minutes.
(h): 5 points
Create a confidence interval for the mean difference in wait times between eruptions that last less than
3 minutes and those that last at least 3 minutes.
(i): 5 points
Create a histogram for the wait times.
(j): 5 points
Create a QQ-Plot to assess normality of wait times. State your conclusion.
(k): 5 points
Perform an analytical test for normality on wait times. State your conclusion.
Mid-Term Review
Week 1
DASC 512
Mid-Term Review
Week 2
DASC 512
Probability – Events
An event is a set of outcomes of an experiment
Usually denoted by a capital letter
May be one (simple event) or more (complex event) outcomes
Probability – Set operators
Union – A ∪ B: Either A or B (or both) occur
Intersection – A ∩ B: Both A and B occur
Complement – Aᶜ: A does not occur
Probability – Partitions
Any set of events that is both exhaustive and mutually exclusive is a
partition.
Σ_i P(B_i) = 1,
P(B_i ∩ B_j) = 0, for all i, j with i ≠ j
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Conditional Probability
If B occurs, the probability that A also occurred is
P(A | B)
This is read “A given B”
P(A | B) = P(A ∩ B) / P(B)
Law of Total Probability
If the set of events B_1, B_2, …, B_k is a partition (mutually exclusive and
exhaustive)
P(A) = P(A | B_1) P(B_1) + P(A | B_2) P(B_2) + ⋯ + P(A | B_k) P(B_k)
     = Σ_{i=1}^{k} P(A | B_i) P(B_i)
Probability – Bayes’s Rule
P(A | B) = P(A ∩ B) / P(B),
         = P(B | A) P(A) / P(B),    (Bayes's Rule)
         = P(B | A) P(A) / [P(B | A) P(A) + P(B | Aᶜ) P(Aᶜ)]    (Total Probability)
Probability Example
As of 3 Feb 2022, The Economist’s model of the French election:
The Economist’s French election model | The Economist
Two-stage election: top two candidates in first round face each other in a
face-off election.
Probability Example
The Economist predicts these probabilities of candidates being in the runoff election:
Macron: 0.91
Le Pen: 0.48
Pecresse: 0.33
Zemmour: 0.21
Melenchon: 0.05
Probability Example
The probability of each pair being in the run-off is
            Macron  Pecresse  Le Pen  Zemmour  Melenchon
Macron         -      0.26     0.45    0.18      0.01
Pecresse     0.26      -       0.02    0.01      0.03
Le Pen       0.45     0.02      -      0.01
Zemmour      0.18     0.01     0.01     -
Melenchon    0.01     0.03                        -
Other        0.01     0.01
Total        0.91     0.33     0.48    0.21      0.05
Probability Example
If Valerie Pecresse is in the run-off, what is the probability that Emmanuel
Macron is her opponent?
With M the event that Macron is in the run-off and P the event that Pecresse is in the run-off,
P(M | P) = P(M ∩ P) / P(P) = 0.26 / 0.33 = 0.7879
Probability Example
The Economist reports that the probability of Emmanuel Macron winning
against each candidate is:
Le Pen: 0.88
Pecresse: 0.76
Zemmour: 0.98
Melenchon: 1.00 (I assume)
Probability Example
The Economist reports that the probability of Emmanuel Macron winning
against each candidate is:
P(M2 | M ∩ L) = 0.88
P(M2 | M ∩ P) = 0.76
P(M2 | M ∩ Z) = 0.98
P(M2 | M ∩ Me) = 1
Here M2 is the event that Macron wins the second round, and M ∩ L is the event that Macron and
Le Pen meet in the run-off (likewise P for Pecresse, Z for Zemmour, and Me for Melenchon).
Probability Example
If Emmanuel Macron wins the presidency, what is the likelihood that he
faced Marine Le Pen in the runoff?
Find: P(M ∩ L | M2), where M2 is the event that Macron wins the second round
Given: P(M2 | M ∩ L) = 0.88, P(M2 | M ∩ P) = 0.76, P(M2 | M ∩ Z) = 0.98, P(M2 | M ∩ Me) = 1
Probability Example
P(M ∩ L | M2)
= P(M ∩ L ∩ M2) / P(M2)
= P(M2 | M ∩ L) P(M ∩ L) / [P(M2 | M ∩ L) P(M ∩ L) + ⋯ + P(M2 | Mᶜ) P(Mᶜ)]
= (0.88)(0.45) / [(0.88)(0.45) + (0.76)(0.26) + (0.98)(0.18) + (1)(0.01) + (1)(0.01) + (0)(0.09)]
= 0.396 / 0.79
= 0.5013
P(M2 | M ∩ L) = 0.88, P(M2 | M ∩ P) = 0.76, P(M2 | M ∩ Z) = 0.98, P(M2 | M ∩ Me) = 1,
P(M2 | M ∩ Other) = 1, P(M2 | Mᶜ) = 0
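The calculation can be reproduced with a few lines of Python. This is only a sketch using the rounded values from the table and the list above; the dictionary names are illustrative. Evaluated term by term, the denominator comes to 0.79 and the ratio to about 0.5013.

```python
# Probability that each opponent meets Macron in the run-off (Macron row of the table).
p_runoff_with_macron = {
    "Le Pen": 0.45, "Pecresse": 0.26, "Zemmour": 0.18, "Melenchon": 0.01, "Other": 0.01,
}
# Probability that Macron wins the second round against each opponent.
p_macron_wins = {
    "Le Pen": 0.88, "Pecresse": 0.76, "Zemmour": 0.98, "Melenchon": 1.0, "Other": 1.0,
}

# Law of total probability. The case "Macron not in the run-off" (probability 0.09)
# contributes 0 because he cannot win without reaching the second round.
p_macron_president = sum(
    p_macron_wins[c] * p_runoff_with_macron[c] for c in p_runoff_with_macron
)

# Bayes's rule: P(faced Le Pen | Macron wins)
posterior_le_pen = (
    p_macron_wins["Le Pen"] * p_runoff_with_macron["Le Pen"] / p_macron_president
)

print(round(p_macron_president, 4), round(posterior_le_pen, 4))
```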
Probability Functions: PMF (Discrete only)
The Probability Mass Function of a discrete distribution
If Random Variable X is distributed X~Dist, then
Dist.pmf(x) = P(X = x)
Probability Functions: PDF (Continuous only)
The Probability Density Function of a continuous distribution
If Random Variable X is distributed X~Dist, then
Dist.pdf(x) = f(x)
∫_a^b f(x) dx = P(a < X < b)
Probability Functions: CDF
The Cumulative Distribution Function
If Random Variable X is distributed X~Dist, then
Dist.cdf(x) = F(x) = P(X ≤ x)
Probability Functions: PPF
The Percent Point Function (inverse CDF)
If Random Variable X is distributed X~Dist, then
Dist.ppf(Dist.cdf(x)) = x
If Dist.ppf(q) = x, then P(X ≤ x) = q.
Probability Functions: SF
The Survival Function
If Random Variable X is distributed X~Dist, then
Dist.sf(x) = P(X > x)
Probability Functions: ISF
The Inverse Survival Function
If Random Variable X is distributed X~Dist, then
Dist.isf(Dist.sf(x)) = x
If Dist.isf(q) = x, then P(X > x) = q.
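The relationships among these functions can be illustrated with the standard library's `statistics.NormalDist`. This is a sketch under assumptions: the standard normal and the point x = 1.5 are arbitrary choices, and `NormalDist` exposes only `pdf`, `cdf`, and `inv_cdf`, so the survival function and its inverse are built from the CDF here. In `scipy.stats`, the corresponding methods are named `pmf`, `pdf`, `cdf`, `ppf`, `sf`, and `isf`.

```python
from statistics import NormalDist

dist = NormalDist(mu=0, sigma=1)  # standard normal, chosen only for illustration
x = 1.5

pdf_x = dist.pdf(x)                 # density f(x)
cdf_x = dist.cdf(x)                 # F(x) = P(X <= x)
sf_x = 1 - cdf_x                    # survival function P(X > x)
ppf_back = dist.inv_cdf(cdf_x)      # inverse CDF (PPF) recovers x
isf_back = dist.inv_cdf(1 - sf_x)   # inverse survival function recovers x

# For a continuous distribution, P(a < X < b) is a difference of CDF values.
p_between = dist.cdf(1.0) - dist.cdf(-1.0)

print(round(cdf_x, 4), round(sf_x, 4), round(ppf_back, 4), round(isf_back, 4))
print(round(p_between, 4))
```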
Distributions
Mid-Term Review
Week 3
DASC 512
