Project 2 | Complete Solution

For this assignment use the data set senic.xlsx. This data set consists of a random sample of 113 hospitals. The objective is to study the infection risk and what factors influence it. The variables from the data set are:

| Variable Name | Description |
|---|---|
| Identification number | 1-113 |
| Length of stay | Average length of stay in hospital (in days) |
| Age | Average age of patients (in years) |
| Infection risk | Average estimated probability of acquiring infection in hospital (in percent) |
| Routine culturing ratio | Ratio of number of cultures performed to number of patients without signs or symptoms of pneumonia, times 100 |
| Routine chest X-ray ratio | Ratio of number of X-rays performed to number of patients without signs or symptoms of pneumonia, times 100 |
| Number of beds | Average number of beds in hospital |
| Medical school affiliation | 0 = Yes, 1 = No |
| Average daily census | Average number of patients in hospital per day |
| Number of nurses | Average number of full-time licensed practical nurses |
| Available facilities and services | Percent of 35 potential facilities and services that are provided by the hospital |

The goal is to fit the best multiple regression model to the response (infection risk).

Do an analysis using the first 108 observations.

Use the stepwise regression method to see which model is the best. Repeat using subset regression. Do they agree?
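
One way to compare the two selection methods is sketched below. It is only an illustration under stated assumptions: the data are synthetic (not senic.xlsx), AIC is used as the selection criterion, and the stepwise part is forward-only, whereas many packages use F-to-enter and F-to-remove tests with both add and drop steps. The function names `forward_stepwise` and `best_subset` are hypothetical helpers, not part of any library.

```python
import itertools
import numpy as np

def fit_rss(X, y):
    """Residual sum of squares of an OLS fit that includes an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def aic(X, y):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    n = len(y)
    k = X.shape[1] + 1  # slopes plus intercept
    return n * np.log(fit_rss(X, y) / n) + 2 * k

def forward_stepwise(X, y):
    """Greedily add the predictor that lowers AIC most; stop when none does."""
    selected, remaining = [], list(range(X.shape[1]))
    best = aic(X[:, []], y)  # intercept-only model
    while remaining:
        score, j = min((aic(X[:, selected + [j]], y), j) for j in remaining)
        if score >= best:
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return sorted(selected), best

def best_subset(X, y):
    """Exhaustively score every subset of columns and keep the lowest AIC."""
    p = X.shape[1]
    subsets = (list(c) for r in range(p + 1)
               for c in itertools.combinations(range(p), r))
    cols = min(subsets, key=lambda c: aic(X[:, c], y))
    return cols, aic(X[:, cols], y)

# Synthetic stand-in for the first 108 hospitals (assumption, not real data):
# only columns 0 and 2 actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(108, 5))
y = 4 + 1.5 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=108)

step_cols, step_aic = forward_stepwise(X, y)
sub_cols, sub_aic = best_subset(X, y)
print("stepwise:", step_cols, "best subset:", sub_cols)
print("methods agree:", step_cols == sub_cols)
```

Because the exhaustive search scores every subset, its AIC can never be worse than the stepwise result; the two methods disagree only when the greedy path misses a better combination.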

Are there any outliers in the data? Look for x-outliers, y-outliers, and high-influence points.
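
The three kinds of unusual points can be screened with standard regression diagnostics: leverage (hat values) for x-outliers, externally studentized residuals for y-outliers, and Cook's distance for influence. The sketch below assumes synthetic data with one planted leverage point and one planted y-outlier; the cutoffs (2p/n for leverage, 3 for studentized residuals) are common rules of thumb rather than fixed requirements, and `diagnostics` is a hypothetical helper name.

```python
import numpy as np

def diagnostics(X, y):
    """Return leverage, externally studentized residuals, and Cook's distance."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])   # design matrix with intercept
    p = A.shape[1]
    H = A @ np.linalg.inv(A.T @ A) @ A.T   # hat matrix
    h = np.diag(H)                         # leverage of each observation
    e = y - H @ y                          # ordinary residuals
    mse = e @ e / (n - p)
    r = e / np.sqrt(mse * (1 - h))         # internally studentized residuals
    t = r * np.sqrt((n - p - 1) / (n - p - r**2))  # externally studentized
    cooks = r**2 * h / (p * (1 - h))       # Cook's distance
    return h, t, cooks

# Synthetic data (assumption): 108 rows, two predictors.
rng = np.random.default_rng(1)
n = 108
X = rng.normal(size=(n, 2))
X[0] = [8.0, 8.0]                          # planted x-outlier (high leverage)
y = 1 + 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=n)
y[1] += 6.0                                # planted y-outlier

h, t, cooks = diagnostics(X, y)
p = X.shape[1] + 1
print("x-outliers (h > 2p/n):", np.where(h > 2 * p / n)[0])
print("y-outliers (|t| > 3):", np.where(np.abs(t) > 3)[0])
print("largest Cook's distance at row:", int(np.argmax(cooks)))
```

A point with high leverage is not automatically influential: row 0 here sits far from the other x values but still follows the fitted relationship, so it is worth checking Cook's distance before deciding to drop anything.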

Come up with one model that you think best describes the data and can be used for future predictions. Show the residual plot for this one. Does the model seem appropriate?

Use this model to predict (using prediction interval) y for the last 5 observations of the data and see if the model is doing well.
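
The hold-out check can be sketched as follows: fit on the first 108 rows, build a prediction interval for each of the last 5, and see whether the observed values fall inside. This is a minimal illustration on synthetic data, not the real hospital sample. As a simplifying assumption it uses the normal quantile in place of the t quantile; with roughly 100 residual degrees of freedom the two are close (about 1.96 versus 1.98), so the intervals are very slightly too narrow. `prediction_intervals` is a hypothetical helper name.

```python
import numpy as np
from statistics import NormalDist

def prediction_intervals(X_train, y_train, X_new, level=0.95):
    """OLS point predictions with approximate prediction intervals."""
    n = len(y_train)
    A = np.column_stack([np.ones(n), X_train])
    A_new = np.column_stack([np.ones(len(X_new)), X_new])
    p = A.shape[1]
    XtX_inv = np.linalg.inv(A.T @ A)
    beta = XtX_inv @ A.T @ y_train
    resid = y_train - A @ beta
    mse = resid @ resid / (n - p)
    # Variance of a new observation: MSE * (1 + x0' (X'X)^-1 x0)
    lev = np.einsum("ij,jk,ik->i", A_new, XtX_inv, A_new)
    se = np.sqrt(mse * (1 + lev))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation to t
    pred = A_new @ beta
    return pred, pred - z * se, pred + z * se

# Synthetic stand-in for the 113 hospitals (assumption, not real data).
rng = np.random.default_rng(2)
X = rng.normal(size=(113, 3))
y = 4 + 0.8 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=113)

X_train, y_train = X[:108], y[:108]   # first 108 observations for fitting
X_test, y_test = X[108:], y[108:]     # last 5 observations held out

pred, lower, upper = prediction_intervals(X_train, y_train, X_test)
covered = (y_test >= lower) & (y_test <= upper)
for yi, lo, hi, ok in zip(y_test, lower, upper, covered):
    print(f"observed {yi:6.2f}  interval [{lo:6.2f}, {hi:6.2f}]  inside: {ok}")
```

With only 5 hold-out points, one miss at the 95% level is not strong evidence against the model; the check is mainly useful for spotting predictions that are far off.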

Available Solution
• Submitted On 25 Apr, 2015 04:47:20
First, we checked for outliers in the data, considering all variables and the first 108 observations. There are a few outliers, as shown in the tables below, which classify them by type. The same can be seen in the box plots below. We tr...