# Southern New Hampshire University Statistics Worksheet

## Scenario

You have been hired by the D. M. Pan National Real Estate Company to develop a model to predict housing prices for homes sold in 2019. The CEO of D. M. Pan wants to use this information to help the company's real estate agents better judge how well square footage works as a benchmark for listing prices on homes. Your task is to provide a report predicting housing prices based on square footage. To complete this task, use the provided real estate data set for all U.S. home sales as well as the national descriptive statistics and graphs provided.

## Directions

Using the Project One Template located in the What to Submit section, generate a report including your tables and graphs to determine if the square footage of a house is a good indicator for what the listing price should be. Reference the National Statistics and Graphs document for national comparisons and the Real Estate Data spreadsheet (both found in the Supporting Materials section) for your statistical analysis.

**Note:** Present your data in clearly labeled tables and clearly labeled graphs.

Specifically, include the following in your report:

**Introduction**

- Describe the report: Give a brief description of the purpose of your report.
  - Define the question your report is trying to answer.
  - Explain when using linear regression is most appropriate.
  - When using linear regression, what would you expect the scatterplot to look like?
  - Explain the difference between response and predictor variables in a linear regression to justify the selection of variables.

**Data Collection**

- Sampling the data: Select a random sample of 50 houses. Identify your response and predictor variables.
- Scatterplot: Create a scatterplot of your response and predictor variables to ensure they are appropriate for developing a linear model.
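The sampling step can be sketched in Python. This is only an illustration under stated assumptions: the `population` list is randomly generated stand-in data, not the Real Estate Data spreadsheet, and the names `draw_sample`, `square_feet`, and `listing_price` are hypothetical. In the actual report, the rows would come from the provided spreadsheet.

```python
import random


def draw_sample(rows, n=50, seed=2019):
    """Return n rows chosen at random without replacement.

    The fixed seed is an assumption made so the draw can be repeated.
    """
    rng = random.Random(seed)
    return rng.sample(rows, n)


# Hypothetical stand-in for the spreadsheet: 1,000 (square feet, listing price) rows.
# These values are generated for illustration only and are NOT real housing data.
gen = random.Random(0)
population = []
for _ in range(1_000):
    sqft = gen.randint(900, 4_000)
    price = 50_000 + 120 * sqft + gen.randint(-20_000, 20_000)
    population.append((sqft, price))

sample = draw_sample(population)

# Predictor (x) is square footage; response (y) is listing price.
square_feet = [sqft for sqft, _ in sample]
listing_price = [price for _, price in sample]
```

Sampling without replacement means no house is counted twice, and treating square footage as the predictor matches the scenario, where listing price is the value being predicted.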

**Data Analysis**

- Histogram: For your two variables, create histograms.
- Summary statistics: For your two variables, create a table to show the mean, median, and standard deviation.
- Interpret the graphs and statistics:
  - Based on your graphs and sample statistics, interpret the center, spread, shape, and any unusual characteristics (outliers, gaps, etc.) for the two variables.
  - Compare and contrast the shape, center, spread, and any unusual characteristics for your sample of house sales with the national population. Is your sample representative of national housing market sales?
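The summary statistics table can be sketched with Python's standard `statistics` module. The five data points below are made-up illustrative values, not taken from the assignment data, and `summarize` is a hypothetical helper name. The sketch uses the sample standard deviation (`statistics.stdev`), which is an assumption about what the report should show for a sample of houses.

```python
import statistics


def summarize(values):
    """Return the mean, median, and sample standard deviation of a list."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample (n - 1) standard deviation
    }


# Made-up illustrative values, NOT real housing data.
square_feet = [1200, 1500, 1800, 2100, 2400]
listing_price = [150_000, 180_000, 200_000, 260_000, 310_000]

table = {
    "square_feet": summarize(square_feet),
    "listing_price": summarize(listing_price),
}

# Print a small labeled table, one row per variable.
print(f"{'variable':<15}{'mean':>12}{'median':>12}{'std_dev':>12}")
for name, stats in table.items():
    print(
        f"{name:<15}{stats['mean']:>12,.0f}"
        f"{stats['median']:>12,.0f}{stats['std_dev']:>12,.0f}"
    )
```

In this toy data the mean listing price is above the median, which is the kind of pattern that can hint at a right-skewed shape and is worth checking against the histogram.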

**Develop Your Regression Model**

- Scatterplot: Provide a graph of the scatterplot of the data with a line of best fit. Explain if a regression model is appropriate to develop based on your scatterplot.
- Discuss associations: Based on the scatterplot, discuss the association (direction, strength, form) in the context of your model.
  - Identify any possible outliers or influential points and discuss their effect on the correlation.
  - Discuss keeping or removing outlier data points and what impact your decision would have on your model.
- Find r: Find the correlation coefficient (r). Explain how the r value you calculated supports what you noticed in your scatterplot.
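As a rough illustration of the correlation step, the sketch below computes Pearson's r by hand and shows how a single unusual point can change it. All values are made up for illustration and are not from the assignment data; `pearson_r` and the extra point (a large house with a low price) are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative values, NOT real housing data.
square_feet = [1200, 1500, 1800, 2100, 2400]
listing_price = [150_000, 180_000, 200_000, 260_000, 310_000]

r_without = pearson_r(square_feet, listing_price)

# Add one hypothetical outlier: a very large house listed at a low price.
r_with = pearson_r(square_feet + [4000], listing_price + [200_000])

print(f"r without the outlier: {r_without:.3f}")
print(f"r with the outlier:    {r_with:.3f}")
```

A value of r close to 1 agrees with a scatterplot that shows a strong, positive, linear pattern. The drop in r after adding one point is the kind of effect to discuss when deciding whether to keep or remove an outlier.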

**Determine the Line of Best Fit.** Clearly define your variables. Find and interpret the regression equation. Assess the strength of the model.

- Regression equation: Write the regression equation (i.e., line of best fit) and clearly define your variables.
- Interpret regression equation: Interpret the slope and intercept in context.
- Strength of the equation: Provide and interpret R-squared. Determine the strength of the linear regression equation you developed.
- Use regression equation to make predictions: Use your regression equation to predict how much you should list your home for based on the square footage of your home.
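The steps in this list (equation, R-squared, prediction) can be sketched with ordinary least squares computed by hand. This is a minimal sketch, assuming made-up data rather than the assignment spreadsheet; the helper names `fit_line`, `r_squared`, and `predict`, and the 2,000 square foot example house, are hypothetical.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_value):
    """Predicted response for one predictor value."""
    return intercept + slope * x_value


def r_squared(x, y, slope, intercept):
    """Share of the variation in y explained by the line: 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum((b - predict(slope, intercept, a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


# Made-up illustrative values, NOT real housing data.
square_feet = [1200, 1500, 1800, 2100, 2400]              # predictor x
listing_price = [150_000, 180_000, 200_000, 260_000, 310_000]  # response y

slope, intercept = fit_line(square_feet, listing_price)
r2 = r_squared(square_feet, listing_price, slope, intercept)
price_2000 = predict(slope, intercept, 2000)

print(f"listing_price = {intercept:,.2f} + {slope:,.2f} * square_feet")
print(f"R-squared = {r2:.4f}")
print(f"Predicted price for 2,000 sq ft: {price_2000:,.2f}")
```

In context, the slope is the predicted change in listing price for each additional square foot, and the intercept is the predicted price at zero square feet, which usually has no practical meaning on its own. Predictions are most trustworthy for square footage inside the range of the sample.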

**Conclusions**

- Summarize findings: In one paragraph, summarize your findings in clear and concise plain language for the CEO to understand. Summarize your results.
  - Did you see the results you expected, or was anything different from your expectations or experiences?
  - What changes could support different results, or help to solve a different problem?
  - Provide at least one question that would be interesting for follow-up research.

You can use the following tutorial that is specifically about this assignment. Make sure to check the assignment prompt for specific numbers used for national statistics. The videos may use different national statistics. You should use the national statistics posted with this assignment.

Median Housing Price Prediction Model for D. M. Pan National Real Estate Company

[Note: To complete this template, replace the bracketed text with your own content. Remove this note before you submit your outline.]

Report: Housing Price Prediction Model for D. M. Pan National Real Estate Company

[Your Name]

Southern New Hampshire University

Introduction

[Describe the report: Include in this section a brief overview, including the purpose of the report and your approach.]

Data Collection

[Sampling the data: Outline how you obtained your sample data, including the response and predictor variables.]

[Scatterplot: Insert a correctly labeled scatterplot of your chosen variables.]

Data Analysis

[Histogram: Insert a histogram for each of the two variables. Be sure to include appropriate labels.]

[Summary statistics: Insert a table to show the summary statistics.]

[Interpret the graphs and statistics: Describe the shape, center, spread, and any unusual characteristics (outliers, gaps, etc.) and what they mean based on your sample data and the graphs you created.]

[Explain how these characteristics of the sample data compare to the same characteristics of the national population. Also, determine whether your sample is representative of the national housing market sales.]

The Regression Model

[Scatterplot: Include the scatterplot graph of the sample with a line of best fit and the regression equation.]

[Based on your graph, explain whether a regression model can be developed for the data and how.]

[Discuss associations: Explain the associations in the scatterplot, including the direction, strength, and form, in the context of your model.]

[Find r: Calculate the correlation coefficient and explain how it aligns with your interpretation of the data from the scatterplot.]

The Line of Best Fit

[Regression equation: Insert the regression equation.]

[Interpret regression equation: Interpret the slope and intercept in context.]

[Strength of the equation: Interpret the strength of the regression equation, R-squared.]

[Use regression equation to make predictions: Use the regression equation to make a sample prediction.]

Conclusions

[Summarize findings: Summarize your findings in clear and concise plain language. Outline any questions arising from the study that might be interesting for follow-up research.]