Fitting a regression model – STAT 351 Project

Assignment Requirements

Answer all questions fully and include all requested answers and output in a single .pdf, .doc, or .docx file. All written work for the report must be typed, and all requested plots must be generated in R. All plots must be correctly labeled and easy to read and understand. In addition to the typed project, upload the .R source file used to perform all requested statistical operations. Failure to upload the .R source file will result in a substantial loss of points. Data for completing this project may be found under the “Files/Project/Project 2” page on Canvas.

Assignment Details

Select one of the following datasets to complete your project.

Modeling Real Estate prices in King County

The dataset kingcounty.csv contains information about housing prices in King County (which contains Seattle, Washington). Data includes information about square footage, number of bedrooms, number of bathrooms, whether or not the house is a waterfront property, etc. It also contains information on the square footage, etc. of its 15 closest houses. A complete description of all variables in the dataset can be found in the file kingcountydesc.pdf on Canvas.

  • For this project, we want to determine whether being on the waterfront significantly increases the price of a house.
  • The variable price is the response.
  • The variable waterfront is the main predictor of interest.

Assessing the efficacy of the National Supported Work Demonstration

The dataset WorkDemo.csv contains demographic data on a number of individuals, income data, and an indicator variable for whether or not each person participated in the National Supported Work Demonstration (NSW). A complete description of all variables in the dataset can be found in the file WorkDemoDesc.pdf on Canvas.

  • For this project, we want to determine whether participating in the program increases real earnings in 1978.
  • The variable re78 is the response.
  • The variable program is the main predictor of interest.

More information on this program can be found here: LaLonde, Robert J. “Evaluating the econometric evaluations of training programs with experimental data.” The American Economic Review (1986).

Predicting College GPA

The dataset GpaAdmissions.csv contains data on college GPA, high school GPA, SAT score, major, and several other variables. A complete description of all variables in the dataset can be found in Table A.2 in Appendix A of the Business Analytics textbook.

  • For this project, we want to determine whether students with higher SAT scores have higher College GPAs.
  • The variable College GPA is the response.
  • The variable SAT is the main predictor of interest.

Modeling NBA Salaries

The dataset NBASalaries.csv contains data on NBA player salaries and player statistics during the 2015–2016 NBA season. A complete description of all variables in the dataset can be found in Table A.5 in Appendix A of the Business Analytics textbook.

  • For this project, we want to determine whether players averaging more points per game are paid more.
  • The variable Salary is the response.
  • The variable Points is the main predictor of interest.

Assignment Instructions

A. Model fitting

  1. Fit three regression models to predict the response variable. Ensure that each model includes the main predictor of interest and at least two predictor variables in total. For each model, give the predicted regression equation. Also, include a one-to-two sentence description of why you chose the predictors in your model.
  2. Determine, using appropriate criteria, the best model for predicting your response variable out of the ones considered in part A(1). You will use this model to answer the questions in sections B, C, and D.
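
As a rough illustration of part A, the sketch below fits three candidate models and compares them by adjusted R-squared, AIC, and BIC. It runs on simulated stand-in data; the column names (y, x_main, x2, x3) are hypothetical and do not come from any of the course datasets, and the choice of comparison criteria is an assumption rather than something the assignment prescribes.

```r
# Simulated stand-in data; all column names are hypothetical.
set.seed(351)
n <- 200
dat <- data.frame(
  x_main = rbinom(n, 1, 0.3),     # main predictor of interest (an indicator here)
  x2     = rnorm(n, 2000, 500),   # a continuous predictor
  x3     = rpois(n, 3)            # a count predictor
)
dat$y <- 50 + 40 * dat$x_main + 0.1 * dat$x2 + 5 * dat$x3 + rnorm(n, sd = 30)

# Three candidate models; each includes the main predictor and
# at least two predictors in total.
models <- list(
  m1 = lm(y ~ x_main + x2, data = dat),
  m2 = lm(y ~ x_main + x3, data = dat),
  m3 = lm(y ~ x_main + x2 + x3, data = dat)
)

# The estimated coefficients define each fitted regression equation.
print(lapply(models, coef))

# Collect some common model-comparison criteria in one table.
comparison <- data.frame(
  model  = names(models),
  adj_r2 = sapply(models, function(m) summary(m)$adj.r.squared),
  aic    = sapply(models, AIC),
  bic    = sapply(models, BIC)
)
print(comparison)

# One possible rule: keep the model with the smallest AIC.
best <- models[[which.min(comparison$aic)]]
print(formula(best))
```

Higher adjusted R-squared and lower AIC or BIC both favor a model, but the criteria can disagree, so whichever one is used should be stated in the report.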

B. Residual analysis

  1. Make a residual plot where the predicted (fitted) values from the regression line are on the x-axis. Ensure that the plot is labeled correctly. From the residual plot, do there appear to be any obvious violations of the standard regression assumptions? Explain.
  2. Repeat part B(1), but this time, use a predictor variable from your model as the x-axis of your plot.
  3. Find the observations that correspond to the three largest and three smallest residuals.
  4. Interpret the observations that correspond to these residuals in two sentences or less.
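
The residual checks in part B might be sketched as follows. The data are simulated and the names (y, x_main, x2, fit) are hypothetical. "Smallest" is read here as most negative rather than closest to zero, which is an assumption about the intended meaning.

```r
# Simulated stand-in data and model; all names are hypothetical.
set.seed(351)
n <- 200
dat <- data.frame(
  x_main = rbinom(n, 1, 0.3),
  x2     = rnorm(n, 2000, 500)
)
dat$y <- 50 + 40 * dat$x_main + 0.1 * dat$x2 + rnorm(n, sd = 30)
fit <- lm(y ~ x_main + x2, data = dat)

res <- resid(fit)

# Residuals against fitted values, with a reference line at zero.
plot(fitted(fit), res,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs. fitted values")
abline(h = 0, lty = 2)

# Residuals against one predictor from the model.
plot(dat$x2, res,
     xlab = "x2 (hypothetical predictor)", ylab = "Residuals",
     main = "Residuals vs. x2")
abline(h = 0, lty = 2)

# Rows with the three most negative and three most positive residuals.
ord <- order(res)
idx_small <- head(ord, 3)
idx_large <- tail(ord, 3)
smallest <- cbind(dat[idx_small, ], residual = res[idx_small])
largest  <- cbind(dat[idx_large, ], residual = res[idx_large])
print(smallest)
print(largest)
```

A funnel shape in either plot would suggest non-constant variance, and a curved pattern would suggest a missing nonlinear term.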

C. Regression interpretation

  1. In your regression model, what is the estimated coefficient for your main predictor of interest, and what is the standard error of this coefficient?
  2. Interpret the value of the estimated coefficient in two sentences or less.
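
One way to pull the estimate and its standard error out of a fitted model is through the coefficient table returned by summary(). This is a minimal sketch on simulated data, with hypothetical names (y, x_main, x2, fit).

```r
# Simulated stand-in data and model; all names are hypothetical.
set.seed(351)
n <- 200
dat <- data.frame(
  x_main = rbinom(n, 1, 0.3),
  x2     = rnorm(n, 2000, 500)
)
dat$y <- 50 + 40 * dat$x_main + 0.1 * dat$x2 + rnorm(n, sd = 30)
fit <- lm(y ~ x_main + x2, data = dat)

# Coefficient table: one row per term, with columns
# "Estimate", "Std. Error", "t value", and "Pr(>|t|)".
coefs <- coef(summary(fit))

est <- coefs["x_main", "Estimate"]
se  <- coefs["x_main", "Std. Error"]
cat("Estimate:", est, "\nStd. error:", se, "\n")
```

In a multiple regression, the estimate is read as the expected change in the response for a one-unit change in that predictor with the other predictors in the model held fixed. For an indicator variable, a one-unit change is the difference between the two groups.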

D. Hypothesis testing using regression

Using regression output, conduct a hypothesis test at the α = 0.05 significance level to answer the question located in the description of the dataset.

  1. State the null and alternative hypotheses in terms of regression coefficients.
  2. Give the t-statistic for this hypothesis test.
  3. Under the null hypothesis, this t-statistic has a t-distribution with how many degrees of freedom?
  4. Give the p-value for your test statistic.
  5. Define what a p-value is in 2 sentences or less.
  6. What do you conclude for your research question?
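
The quantities in part D can be read from, or recomputed from, the same coefficient table. The sketch below uses simulated data with hypothetical names. It assumes a one-sided alternative (coefficient greater than zero), because each research question asks whether the predictor increases the response; if a two-sided test is intended, the p-value printed by summary() applies directly.

```r
# Simulated stand-in data and model; all names are hypothetical.
set.seed(351)
n <- 200
dat <- data.frame(
  x_main = rbinom(n, 1, 0.3),
  x2     = rnorm(n, 2000, 500)
)
dat$y <- 50 + 40 * dat$x_main + 0.1 * dat$x2 + rnorm(n, sd = 30)
fit <- lm(y ~ x_main + x2, data = dat)
coefs <- coef(summary(fit))

# t-statistic for H0: beta_main = 0 is estimate / standard error.
t_stat <- coefs["x_main", "Estimate"] / coefs["x_main", "Std. Error"]

# Degrees of freedom: n minus the number of estimated coefficients
# (intercept included).
df <- df.residual(fit)

# Two-sided p-value, as reported by summary().
p_two <- coefs["x_main", "Pr(>|t|)"]

# One-sided p-value for Ha: beta_main > 0 (an assumed alternative).
p_one <- pt(t_stat, df, lower.tail = FALSE)

alpha <- 0.05
cat("t =", t_stat, " df =", df,
    "\ntwo-sided p =", p_two,
    "\none-sided p =", p_one,
    "\nreject H0 at alpha = 0.05 (one-sided):", p_one < alpha, "\n")
```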

E. R Source File

Attach your R source file used for performing all model selection, residual analysis, estimation, statistical inference, and construction of plots.
