NYU Regression Analysis STATA Worksheet

Please answer every question. Note that Q3 has multiple parts. Use Stata and justify your answers.

datenum     datetext  sp_tr    riskfree  hedge    exhedge  exsptr   exsptr1  exsptr2  exsptr3
9/30/2000   09/2000   -5.30%   0.50%     -0.41%   -0.91%   -5.80%   5.69%    -2.07%   1.97%
8/31/2000   08/2000   6.19%    0.51%     3.39%    2.88%    5.69%    -2.07%   1.97%    -2.56%
7/31/2000   07/2000   -1.58%   0.49%     0.11%    -0.38%   -2.07%   1.97%    -2.56%   -3.50%
6/30/2000   06/2000   2.45%    0.48%     3.66%    3.18%    1.97%    -2.56%   -3.50%   9.29%
5/31/2000   05/2000   -2.07%   0.49%     -1.17%   -1.66%   -2.56%   -3.50%   9.29%    -2.37%
4/30/2000   04/2000   -3.03%   0.47%     -4.63%   -5.10%   -3.50%   9.29%    -2.37%   -5.49%
3/31/2000   03/2000   9.77%    0.48%     -2.12%   -2.60%   9.29%    -2.37%   -5.49%   5.44%
2/29/2000   02/2000   -1.91%   0.46%     6.49%    6.03%    -2.37%   -5.49%   5.44%    1.59%
1/31/2000   01/2000   -5.04%   0.44%     -0.10%   -0.54%   -5.49%   5.44%    1.59%    5.91%
12/31/1999  12/1999   5.87%    0.44%     8.53%    8.09%    5.44%    1.59%    5.91%    -3.15%
11/30/1999  11/1999   2.02%    0.42%     4.96%    4.54%    1.59%    5.91%    -3.15%   -0.91%
10/31/1999  10/1999   6.31%    0.40%     2.37%    1.97%    5.91%    -3.15%   -0.91%   -3.52%
9/30/1999   09/1999   -2.76%   0.39%     -0.32%   -0.71%   -3.15%   -0.91%   -3.52%   5.15%
8/31/1999   08/1999   -0.51%   0.40%     -0.90%   -1.30%   -0.91%   -3.52%   5.15%    -2.75%
7/31/1999   07/1999   -3.14%   0.38%     0.26%    -0.12%   -3.52%   5.15%    -2.75%   3.50%
6/30/1999   06/1999   5.53%    0.38%     3.28%    2.90%    5.15%    -2.75%   3.50%    3.61%
5/31/1999   05/1999   -2.38%   0.38%     0.13%    -0.25%   -2.75%   3.50%    3.61%    -3.50%
4/30/1999   04/1999   3.86%    0.36%     2.63%    2.27%    3.50%    3.61%    -3.50%   3.80%
3/31/1999   03/1999   3.98%    0.37%     1.22%    0.85%    3.61%    -3.50%   3.80%    5.38%
2/28/1999   02/1999   -3.12%   0.37%     -1.31%   -1.68%   -3.50%   3.80%    5.38%    5.67%
1/31/1999   01/1999   4.16%    0.36%     0.80%    0.44%    3.80%    5.38%    5.67%    7.78%
12/31/1998  12/1998   5.75%    0.37%     3.03%    2.66%    5.38%    5.67%    7.78%    5.99%
11/30/1998  11/1998   6.04%    0.37%     1.36%    0.99%    5.67%    7.78%    5.99%    -14.89%
10/31/1998  10/1998   8.12%    0.34%     -4.57%   -4.91%   7.78%    5.99%    -14.89%  -1.50%
9/30/1998   09/1998   6.39%    0.40%     -2.31%   -2.71%   5.99%    -14.89%  -1.50%   3.63%
8/31/1998   08/1998   -14.47%  0.41%     -7.55%   -7.96%   -14.89%  -1.50%   3.63%    -2.15%
7/31/1998   07/1998   -1.08%   0.41%     0.90%    0.49%    -1.50%   3.63%    -2.15%   0.57%
6/30/1998   06/1998   4.05%    0.42%     1.59%    1.17%    3.63%    -2.15%   0.57%    4.69%
5/31/1998   05/1998   -1.74%   0.42%     0.26%    -0.16%   -2.15%   0.57%    4.69%    6.77%
4/30/1998   04/1998   0.99%    0.42%     0.95%    0.53%    0.57%    4.69%    6.77%    0.67%
3/31/1998   03/1998   5.10%    0.42%     5.94%    5.52%    4.69%    6.77%    0.67%    1.27%
2/28/1998   02/1998   7.20%    0.43%     1.96%    1.53%    6.77%    0.67%    1.27%    4.18%
1/31/1998   01/1998   1.09%    0.42%     -1.21%   -1.63%   0.67%    1.27%    4.18%    -3.77%
12/31/1997  12/1997   1.70%    0.43%     3.22%    2.79%    1.27%    4.18%    -3.77%   5.05%
11/30/1997  11/1997   4.61%    0.43%     1.00%    0.57%    4.18%    -3.77%   5.05%    -6.05%
10/31/1997  10/1997   -3.36%   0.41%     -1.64%   -2.05%   -3.77%   5.05%    -6.05%   7.52%
9/30/1997   09/1997   5.46%    0.41%     3.99%    3.58%    5.05%    -6.05%   7.52%    4.05%
8/31/1997   08/1997   -5.62%   0.43%     -1.26%   -1.69%   -6.05%   7.52%    4.05%    5.64%
7/31/1997   07/1997   7.94%    0.42%     6.99%    6.57%    7.52%    4.05%    5.64%    5.52%
6/30/1997   06/1997   4.46%    0.41%     2.26%    1.85%    4.05%    5.64%    5.52%    -4.55%
5/31/1997   05/1997   6.07%    0.43%     0.88%    0.45%    5.64%    5.52%    -4.55%   0.35%
4/30/1997   04/1997   5.95%    0.43%     2.86%    2.43%    5.52%    -4.55%   0.35%    5.81%
3/31/1997   03/1997   -4.13%   0.43%     -1.41%   -1.84%   -4.55%   0.35%    5.81%    -2.40%
2/28/1997   02/1997   0.77%    0.42%     1.30%    0.88%    0.35%    5.81%    -2.40%   7.12%
1/31/1997   01/1997   6.23%    0.42%     5.48%    5.06%    5.81%    -2.40%   7.12%    2.32%
12/31/1996  12/1996   -2.00%   0.41%     0.31%    -0.10%   -2.40%   7.12%    2.32%    5.18%
11/30/1996  11/1996   7.54%    0.42%     5.04%    4.62%    7.12%    2.32%    5.18%    1.67%
10/31/1996  10/1996   2.74%    0.42%     3.35%    2.93%    2.32%    5.18%    1.67%    -4.87%
9/30/1996   09/1996   5.61%    0.43%     2.50%    2.07%    5.18%    1.67%    -4.87%   -0.06%
8/31/1996   08/1996   2.09%    0.42%     2.48%    2.06%    1.67%    -4.87%   -0.06%   2.14%
7/31/1996   07/1996   -4.43%   0.43%     -4.13%   -4.56%   -4.87%   -0.06%   2.14%    1.04%
6/30/1996   06/1996   0.36%    0.43%     1.74%    1.31%    -0.06%   2.14%    1.04%    0.53%
5/31/1996   05/1996   2.56%    0.42%     1.93%    1.51%    2.14%    1.04%    0.53%    0.50%
4/30/1996   04/1996   1.46%    0.42%     3.12%    2.70%    1.04%    0.53%    0.50%    2.97%
3/31/1996   03/1996   0.95%    0.41%     1.06%    0.65%    0.53%    0.50%    2.97%    1.48%
2/29/1996   02/1996   0.91%    0.41%     -3.59%   -4.00%   0.50%    2.97%    1.48%    3.93%
1/31/1996   01/1996   3.39%    0.42%     6.97%    6.55%    2.97%    1.48%    3.93%    -0.82%
12/31/1995  12/1995   1.91%    0.43%     3.06%    2.63%    1.48%    3.93%    -0.82%   3.76%
11/30/1995  11/1995   4.37%    0.45%     3.14%    2.69%    3.93%    -0.82%   3.76%    -0.22%
10/31/1995  10/1995   -0.37%   0.44%     -0.11%   -0.55%   -0.82%   3.76%    -0.22%   2.84%
9/30/1995   09/1995   4.20%    0.44%     0.38%    -0.06%   3.76%    -0.22%   2.84%    1.85%
8/31/1995   08/1995   0.23%    0.45%     6.00%    5.55%    -0.22%   2.84%    1.85%    3.51%
7/31/1995   07/1995   3.30%    0.46%     2.76%    2.30%    2.84%    1.85%    3.51%    2.46%
6/30/1995   06/1995   2.31%    0.46%     0.37%    -0.09%   1.85%    3.51%    2.46%    2.46%
5/31/1995   05/1995   3.98%    0.47%     1.01%    0.54%    3.51%    2.46%    2.46%    3.40%
4/30/1995   04/1995   2.93%    0.47%     1.45%    0.98%    2.46%    2.46%    3.40%    2.09%
3/31/1995   03/1995   2.93%    0.48%     3.48%    3.00%    2.46%    3.40%    2.09%    1.00%
2/28/1995   02/1995   3.88%    0.48%     0.45%    -0.03%   3.40%    2.09%    1.00%    -4.10%
1/31/1995   01/1995   2.58%    0.48%     -1.97%   -2.45%   2.09%    1.00%    -4.10%   1.82%
12/31/1994  12/1994   1.47%    0.47%     -0.21%   -0.68%   1.00%    -4.10%   1.82%    -2.85%
11/30/1994  11/1994   -3.66%   0.44%     0.41%    -0.03%   -4.10%   1.82%    -2.85%   3.71%
10/31/1994  10/1994   2.23%    0.41%     -1.35%   -1.76%   1.82%    -2.85%   3.71%    2.90%
9/30/1994   09/1994   -2.46%   0.39%     0.67%    0.28%    -2.85%   3.71%    2.90%    -2.82%
8/31/1994   08/1994   4.08%    0.37%     2.77%    2.40%    3.71%    2.90%    -2.82%   1.28%
7/31/1994   07/1994   3.27%    0.36%     0.35%    -0.01%   2.90%    -2.82%   1.28%    0.96%
6/30/1994   06/1994   -2.47%   0.35%     -0.81%   -1.16%   -2.82%   1.28%    0.96%    -4.67%
5/31/1994   05/1994   1.62%    0.35%     2.23%    1.88%    1.28%    0.96%    -4.67%   -3.00%
4/30/1994   04/1994   1.27%    0.31%     -1.74%   -2.05%   0.96%    -4.67%   -3.00%   3.13%
3/31/1994   03/1994   -4.38%   0.29%     -3.57%   -3.86%   -4.67%   -3.00%   3.13%    0.94%
2/28/1994   02/1994   -2.73%   0.27%     -4.09%   -4.36%   -3.00%   3.13%    0.94%    -1.23%
1/31/1994   01/1994   3.38%    0.25%     1.14%    0.89%    3.13%    0.94%    -1.23%   1.80%
b) [6 points] Devise a naming scheme for eight more variables. These will be for two groups of
interactions. The first group will be four interaction variables for up markets. For the first
variable in this group, multiply the contemporaneous excess market return (exsptr) times the
up-market indicator for the contemporaneous month. For the next variable in this group,
multiply the first lag excess market return (exsptr1) times the up-market indicator for the first
lag. For the next variable, multiply the second lag excess market return (exsptr2) times the
up-market indicator for the second lag. Follow the same process for the remaining up market
variable.
Follow a similar process for the down-market interaction variables. For the first variable in
this group, multiply the contemporaneous excess market return (exsptr) times the down-
market indicator for the contemporaneous month. For the next variable in this group,
multiply the first lag excess market return (exsptr1) times the down-market indicator for the
first lag. Create the remaining two variables similarly.
These eight new variables should now be populated with a mix of zeros and return values.
Only positive return values (along with the zeros) should appear in the up-market interaction
variables. Only negative return numbers (along with the zeros) should appear in the down-
market interaction variables. For credit for this question part, summarize, describe, inspect,
view or otherwise show that you have created these data. As a quantitative reference point,
what is the value for September 2000 (top of the list) of the third lag for down markets – a 0
or a negative return value?
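The interaction step described above can be sketched in Stata as follows. This is a minimal illustration, not a prescribed solution: the names upx0-upx3 and dnx0-dnx3 are hypothetical, and the indicators up0-up3 and dn0-dn3 are assumed to exist already from part a) under an equally hypothetical naming scheme. It also assumes the return variables are stored as numbers rather than as strings containing "%".

```stata
* Sketch only: upx*/dnx* and up*/dn* are hypothetical variable names.
* Up-market interactions: the excess return when the market was up, else 0.
generate upx0 = exsptr  * up0
generate upx1 = exsptr1 * up1
generate upx2 = exsptr2 * up2
generate upx3 = exsptr3 * up3

* Down-market interactions: the excess return when the market was down, else 0.
generate dnx0 = exsptr  * dn0
generate dnx1 = exsptr1 * dn1
generate dnx2 = exsptr2 * dn2
generate dnx3 = exsptr3 * dn3

* The minimum of each upx* should be 0 and the maximum of each dnx* should be 0.
summarize upx0 upx1 upx2 upx3 dnx0 dnx1 dnx2 dnx3
list datetext exsptr3 upx3 dnx3 in 1/5
```

The `summarize` output gives a quick sign check, and `list ... in 1/5` shows the top rows, including the September 2000 observation if the data are sorted as in the table.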
c) [6 points] Run a regression (robust) of the excess hedge fund returns (exhedge) on the eight
interaction terms that you created. In addition to supplying your regression output, list the
six numeric values of the items below for grader review. Also answer this question: Do your
results essentially match those of the top row of exhibit 5 of page 14 (pdf page 9) of the
article?
Up Markets
Contemp. Beta
Sum of lag 1, lag 2 and lag 3 betas
Sum of all four betas
Down Markets
Contemp. Beta
Sum of lag 1, lag 2 and lag 3 betas
Sum of all four betas
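One possible way to obtain the regression and the summed betas is sketched below. The regressor names upx0-upx3 and dnx0-dnx3 are hypothetical and stand for the eight interaction terms from part b); `vce(robust)` requests heteroskedasticity-robust standard errors, which is one reading of "regression (robust)".

```stata
* Sketch only: upx*/dnx* are hypothetical names for the eight interaction terms.
regress exhedge upx0 upx1 upx2 upx3 dnx0 dnx1 dnx2 dnx3, vce(robust)

* Up markets: sum of the lagged betas, then the sum of all four betas.
lincom upx1 + upx2 + upx3
lincom upx0 + upx1 + upx2 + upx3

* Down markets: the same two sums.
lincom dnx1 + dnx2 + dnx3
lincom dnx0 + dnx1 + dnx2 + dnx3
```

`lincom` reports each sum together with a standard error and confidence interval, so the sums do not need to be added up by hand from the coefficient table.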
d) [6 points] Examine the confidence intervals calculated by the software for the two
contemporaneous betas. Do the confidence intervals overlap? Using that information, and
making an educated guess about the summed betas for up markets and for down markets,
does it seem as though the hedge fund managers might be employing a different asset pricing
style in up markets versus down markets?
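Comparing confidence intervals by eye is informal. As a complement, a direct Wald test of equality can be sketched as below; this goes slightly beyond what the question asks, and the regressor names are the same hypothetical ones assumed for part c).

```stata
* Sketch only: upx*/dnx* are hypothetical names for the eight interaction terms.
regress exhedge upx0 upx1 upx2 upx3 dnx0 dnx1 dnx2 dnx3, vce(robust)

* H0: the contemporaneous beta is the same in up and down markets.
test upx0 = dnx0

* H0: the summed beta is the same in up and down markets.
test upx0 + upx1 + upx2 + upx3 = dnx0 + dnx1 + dnx2 + dnx3
```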
e) [6 points] Submit a PDF file, or similar Gradescope-acceptable format, copy of your code.
Problem 3 [30 points] In the questions below, you will be asked to further investigate the hedge
fund return dataset that you used in the previous two problem sets. Again you are asked for two
deliverables. The first deliverable will be your answer sheet with verbal answers to some
question parts and copies of the relevant portion of your Stata or R command and results in other
question parts. The second deliverable will be a PDF copy (or legibly clear pictures if need be)
of the .do or .R file that you create. You may choose to clean up this .do or R file before
submitting the PDF copy of it, but you should ensure that what you submit, and save, actually
executes properly.
As a reminder, the authors of Do Hedge Funds Hedge? are exploring the notion that hedge fund
managers price their securities in a way that makes their risk appear lower than the true riskiness.
Because managers may have some pricing flexibility, perhaps they price differently when the
market goes up than when the market goes down. You will “divide” the data into observations
when the excess S&P return is positive, up markets, and when it is negative, down markets, and
calculate the beta for each group.
a) [6 points] Start with the data supplied with this problem set, which reflects the work through
problem set 3. The goal of this step is to create eight new indicator variables. Devise a
naming scheme such that each variable name reflects two pieces of information, 1) whether it
relates to up markets or down markets, and 2) whether it is for the contemporaneous month,
one month prior, two months prior or three months prior. Thus, there will be four new
variables relating to up markets (contemporaneous, 1 lag, 2 lag, 3 lag) and four relating to
down markets.
The data names of the contemporaneous and lagged excess market returns in the supplied
dataset are,
exsptr
exsptr1
exsptr2
exsptr3
For the first of the eight variables that you create, focus on the contemporaneous excess
market return. Using some type of an “if” statement in your code, value the new variable as
“1” if the excess market return (exsptr) is positive, and as a “0” if it is zero or negative. For
the second new variable that you create, focus on the first lagged excess market return
(exsptr1). Value the new variable as 1 if the lagged excess market return is positive, and 0 if
it is zero or negative. Create the next two variables similarly, focusing on the next two lags
of the excess market return.
The other four new variables must be populated with the opposite coding. In other words,
these variables are valued as 1 if the excess market had a return of zero or was down during
the month. They are coded as 0 if the market had a positive excess return. These are the
indicators for a down market.
For credit for this question part, summarize, describe, inspect, view or otherwise show that
you have created these data. As a quantitative reference point, what is the value for
September 2000 (top of the list) of the third lag for down markets – a 0 or a 1?
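A minimal Stata sketch of the indicator step described above follows. The names up0-up3 and dn0-dn3 are a hypothetical naming scheme (the digit is the lag), and the code assumes the excess return variables are numeric.

```stata
* Sketch only: up0-up3 and dn0-dn3 are hypothetical names; the digit is the lag.
* Each up indicator is 1 for a positive excess return and 0 for zero or negative.
generate byte up0 = (exsptr  > 0) if !missing(exsptr)
generate byte up1 = (exsptr1 > 0) if !missing(exsptr1)
generate byte up2 = (exsptr2 > 0) if !missing(exsptr2)
generate byte up3 = (exsptr3 > 0) if !missing(exsptr3)

* Down-market indicators carry the opposite coding.
forvalues k = 0/3 {
    generate byte dn`k' = 1 - up`k'
}

* Show that the variables exist and look sensible.
summarize up0 up1 up2 up3 dn0 dn1 dn2 dn3
list datetext exsptr3 up3 dn3 in 1/5
```

The `if !missing()` qualifier matters because Stata treats a missing value as larger than any number, so without it a missing return would be coded as an up market.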
Problem 1 [35 points] You are studying firms in an industry in which the production function is
$Y = K^{\alpha} L^{\beta} U$,
for some values of $\alpha$ and $\beta$. You have data on prices (cost of capital $p_K$, wage rate $p_L$, price of
output $p_Y$), inputs (capital $K$ and labor $L$), and output ($Y$) from a random sample of firms. You
may assume that the unobservable $U$ is always positive and is independent of prices and inputs.
(a) [7 points] How would you obtain estimates of $\alpha$ and $\beta$? Describe the steps that you would
take if you had access to this dataset. What variables would you construct? What model
would you estimate? What are the coefficients of interest?
(b) [7 points] You want to test the null hypothesis that the production function exhibits
increasing returns to scale or constant returns to scale against the alternative hypothesis that
it does not. Express the null hypothesis in terms of $\alpha$ and $\beta$. How would you test this
hypothesis? In other words, what test statistic would you construct and when would you
reject the null hypothesis at the 5% significance level?
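One way to approach parts (a) and (b) is sketched below under stated assumptions. Taking logs of the production function gives $\ln Y = \alpha \ln K + \beta \ln L + \ln U$, which is linear in the parameters. The variable names Y, K and L are hypothetical, and the one-sided reading of the null ($\alpha + \beta \ge 1$) is an interpretation of the question rather than something the worksheet states.

```stata
* Sketch only: Y, K and L are hypothetical variable names for output and inputs.
generate lnY = ln(Y)
generate lnK = ln(K)
generate lnL = ln(L)

* The coefficients on lnK and lnL estimate alpha and beta.
regress lnY lnK lnL, vce(robust)

* Returns to scale: estimate alpha + beta and its standard error.
lincom lnK + lnL

* t statistic for H0: alpha + beta >= 1 against H1: alpha + beta < 1.
* With a large sample, reject at the 5% level if t is below about -1.645.
display "t = " (r(estimate) - 1) / r(se)
```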
(c) [Information] Assume constant returns to scale. Economic theory predicts that
expenditure shares are constant. In particular, when inputs are optimally chosen and there
are constant returns to scale, then
$\frac{p_K K}{p_Y Y} = \alpha.$
(d) [7 points] An implication is that $\frac{p_K K}{p_Y Y}$ does not depend on the price of labor. In non-Cobb-
Douglas production functions, in contrast, the capital expenditure share $\frac{p_K K}{p_Y Y}$ would in
general depend on the price of labor. How would you use this idea to formally test the
hypothesis that firms maximize a constant-returns-to-scale Cobb-Douglas production
function? What specification would you estimate? What test statistic would you construct?
When would you reject the null hypothesis at the 1% significance level?
(e) [7 points] Suppose that, when the firm produces some amount $Y$, you observe $\tilde{Y} = Y \times V$,
where $V$ is independent of $(Y, K, L, U, p_K, p_L, p_Y)$ and has mean 1. You do not observe the
true $Y$. How would this affect your estimates in part (a)? In particular, would your estimates
of $\alpha$ and $\beta$ be biased?
(f) [7 points] Suppose that, when the firm uses an amount $K$ of capital, you observe $\tilde{K} = K \times V$,
where $V$ is independent of $(Y, K, L, U, p_K, p_L, p_Y)$ and has mean 1, but you do not observe the
true $K$. How would this affect your estimates in part (a)? In particular, would your estimates
of $\alpha$ and $\beta$ be biased?
Problem 2 [35 points] In this problem we will try and work towards understanding whether
government-subsidized savings accounts help people save towards retirement, and if so, by how
much. We’ll do this using the 401ksubs.dta dataset attached to this problem set. This is a dataset
of a cross-section of individuals and includes information on basic demographics, their income
and wealth, and whether they participate in a 401(k) account.
a) [7 points] Start by running a naïve regression. Regress net total assets on the dummy variable
indicating whether the respondent has a 401(k) account. Interpret the sign and magnitude of
the coefficient. Can you give this estimate a causal interpretation? Why (not)?
b) [7 points] Now add in the dummy for eligibility for a 401(k) account and interpret the
coefficient (Hint: Can you have a 401(k) account if you are not eligible?). Does the
coefficient on eligibility imply that being eligible for a 401(k) lowers savings? Why (not)?
What omitted factors do you think are being picked up here?
c) [7 points] Now let’s drop eligibility from the regression, but let’s add in a set of controls.
Add in the dummy for IRA participation, age, age squared, family size, income, income
squared, the male dummy, and the marriage dummy. Interpret five of the coefficients. How
does the coefficient on p401k change? Now do you think you can interpret the coefficient on
p401k as causal? Why (not)?
d) [7 points] Let’s explore the possibility that the controls matter differently for men than for
women. Run a regression of net total assets on the dummy for 401(k) participation and then all
the controls as well as their interactions with the male dummy. Interpret the coefficient on
p401k and the interaction with the male dummy for two of the controls. Test whether all of the
interactions of the controls with the male dummy are jointly significant. How does this change
whether you think the coefficient on p401k is causal?
e) [7 points] Finally, let’s see whether 401(k) participation affects savings differentially for men
vs women. Run the regression from part d) but also interact the 401k participation dummy
with the male dummy. What does this regression imply is the effect on savings of 401k
participation for women? For men? Test whether the effect for men is = 0. Test whether the
effect for women is = 0. Test whether the effect is the same for men as for women.
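Parts d) and e) can be sketched as follows, assuming the 401ksubs dataset is already in memory and uses the variable names from the Wooldridge distribution (nettfa, p401k, pira, age, agesq, fsize, inc, incsq, male, marr). The `mx_` prefix and the name `male_p401k` are hypothetical choices for the interaction variables.

```stata
* Sketch only: mx_* and male_p401k are hypothetical names for the interactions.
* Interact every control except male itself with the male dummy.
foreach v in pira age agesq fsize inc incsq marr {
    generate mx_`v' = male * `v'
}

* Part d): controls plus their interactions with male, then a joint test.
regress nettfa p401k male pira age agesq fsize inc incsq marr mx_*, vce(robust)
testparm mx_*

* Part e): let the 401(k) effect itself differ between men and women.
generate male_p401k = male * p401k
regress nettfa p401k male_p401k male pira age agesq fsize inc incsq marr mx_*, vce(robust)

lincom p401k                 // effect of participation for women
lincom p401k + male_p401k    // effect of participation for men
test male_p401k              // H0: same effect for men and women
```

Building the interactions by hand keeps the coefficient names explicit; Stata's factor-variable notation would be an alternative way to specify the same model.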
Incidentally, you can access this dataset, and all the other datasets used in the Stock & Watson
(https://fmwww.bc.edu/ec-p/data/stockwatson/datasets.list.html) and Wooldridge
(http://fmwww.bc.edu/ec-p/data/wooldridge/datasets.list.html) textbooks, through Boston
College. They even set up a nice Stata command called bcuse that lets you read them straight into
Stata. To install it, type “ssc install bcuse” into Stata. Then you can load this dataset by typing
“bcuse 401ksubs.dta, clear”
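Put together, loading the data and running the naive regression from part a) might look like the sketch below. It assumes an internet connection and that the outcome and participation variables are named nettfa and p401k, as in the Wooldridge distribution of this dataset; `describe` lets you confirm the names before relying on them.

```stata
* Sketch only: assumes the variable names nettfa and p401k (check with describe).
ssc install bcuse
bcuse 401ksubs, clear

* Confirm variable names and labels before running anything.
describe

* Part a): naive regression of net total financial assets on 401(k) participation.
regress nettfa p401k, vce(robust)
```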
