Statistics project

Order Instructions:

Using any data of interest to your group, compile a data set comprised of one predictor and one
response variable with at least 20 observations (data points), and answer the questions below.
Your project needs to be typed and plots can be made using any software of your choice. Only one
project (with each member’s name) per group needs to be submitted. Your project should include
all the observations used.

Provide a brief description of your project. Make sure to identify the predictor and response
variables, as well as discussing the objective of your regression model.

1. (20%) All your answers must be in the order in which the questions are asked, otherwise you will be
deducted 20%. Note: Even if only one answer is out of order you will still be deducted 20%.

2. (15%) For your predictor and response variables:
(a) compute the range and IQR.
(b) make a histogram of your data.
(c) make a boxplot of your data.

3. (25%) Make a scatterplot of your data and describe the:
(a) Direction
(b) Form
(c) Strength
(d) Correlation
(g) Outliers

4. (40%) Based on your data, construct a linear regression model of your response variable as a function
of your predictor variable following the steps below:
(a) Compute x̄ and ȳ
(b) Compute sx and sy
(c) Compute r
(d) Compute a and b
(e) Construct the respective least-squares line and plot it over your scatter plot.
(f) Compute the respective R² and interpret your results.
(g) For your model, compute and plot the residuals vs x. Describe what you observe from
this plot.
(h) Are there any outliers? If so, are they high leverage and/or influential.
(i) Based on your model, make 3 predictions for your response variable (i.e., use 3 different
values of x that are not in your data, and compute the respective y value).

SAMPLE ANSWER

Statistics project

Question One

The data below were obtained from an organization that wanted to estimate the cost of leasing a building given the contract value for constructing it. The contract value is therefore the predictor variable, and the estimated cost is the response variable.

Observation  Estimated cost  Contract value
1            85,000          100,000
2            70,000          120,000
3            110,000         150,000
4            90,000          80,000
5            130,000         180,000
6            160,000         190,000
7            160,000         200,000
8            280,000         350,000
9            130,000         180,000
10           320,000         380,000
11           310,000         360,000
12           305,000         370,000
13           180,000         200,000
14           170,000         250,000
15           160,000         300,000
16           110,000         160,000
17           150,000         210,000
18           180,000         230,000
19           175,000         250,000
20           180,000         270,000

Question Two

(a) Compute the range and IQR.

Range (maximum - minimum)

Contract value = 380,000 - 80,000 = 300,000

Estimated cost = 320,000 - 70,000 = 250,000

Interquartile range (Q3 - Q1)

Contract value = 300,000 - 175,000 = 125,000

Estimated cost = 197,500 - 125,000 = 72,500
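The range and IQR can be rechecked with a short script. The sketch below is only an illustration: the variable and function names are made up for this example, and it assumes the "inclusive" quartile convention of Python's statistics.quantiles. Quartile conventions differ between textbooks and software, so the quartiles it prints need not equal the figures quoted above.

```python
from statistics import quantiles

# Data from the table in Question One (predictor and response).
contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]


def value_range(data):
    """Range = largest observation minus smallest observation."""
    return max(data) - min(data)


def quartiles_and_iqr(data):
    """Return (Q1, Q3, IQR) under the assumed 'inclusive' convention."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q1, q3, q3 - q1


for name, data in (("Contract value", contract_value), ("Estimated cost", estimated_cost)):
    q1, q3, iqr = quartiles_and_iqr(data)
    print(f"{name}: range={value_range(data):,}  Q1={q1:,.0f}  Q3={q3:,.0f}  IQR={iqr:,.0f}")
```

Switching method="inclusive" to method="exclusive" shows how much the quartiles move under a different convention.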

(b) Make a histogram of your data.
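A histogram is a count of observations per bin, so the counts behind the plot can be sketched without plotting software. The example below is a minimal illustration for the estimated cost; the bin width of 50,000 and the helper name bin_counts are assumptions made for this example, not choices stated in the project.

```python
from collections import Counter

estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]

BIN_WIDTH = 50_000  # assumed bin width; a different width gives a different picture


def bin_counts(data, width):
    """Count observations per bin, keyed by the left edge of each bin."""
    counts = Counter((value // width) * width for value in data)
    return dict(sorted(counts.items()))


# Text histogram: one '#' per observation (bins with no observations are not listed).
for left_edge, count in bin_counts(estimated_cost, BIN_WIDTH).items():
    print(f"{left_edge:>7,} to {left_edge + BIN_WIDTH:>7,}: {'#' * count}")
```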

 

(c) make a boxplot of your data.

 

Question Three

(a) Direction

The direction of a relationship tells whether the values of two variables go up and down together, and the scatter plot shows it. If two variables have a positive direction, then as the values of one variable go up, so do the values of the other. The data used has a positive direction because the points of the scatter plot run from the lower left to the upper right. This implies that as the contract value goes up, so does the estimated cost, and vice versa.

(b) Form

The form of a scatter plot is described by its shape: the points may follow a curve or a straight line. If the relationship is linear, the points appear as a cloud that follows a generally straight and consistent path. In the plot above, the points follow a straight and consistent path, i.e., there is a linear relationship between the estimated cost and the contract value.

(c) Strength

The strength of the relationship is determined by how tightly the plotted points cluster around the overall pattern, here a straight line. Points that lie close to the line indicate a strong relationship, while widely scattered points indicate a weak one. In this case, most of the points lie close to a straight line, so the relationship between the variables is strong.

(d) Correlation

The correlation between two variables measures the strength and direction of their linear relationship. Both have been established in the previous paragraphs, and the correlation coefficient computed in Question Four (r = 0.944) agrees with them. Therefore, we conclude that there is a strong positive relationship between the variables.

(g) Outliers

Outliers are points that lie far from the bulk of the data in a scatter plot. In this case, there are four outliers, which the box plot also shows.
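One common way a box plot flags outliers is the 1.5 × IQR fence rule: any value below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR is marked. The sketch below applies that rule to both variables. It is an assumption that the box plot used this rule and the "inclusive" quartile convention; the function name is hypothetical.

```python
from statistics import quantiles

contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]


def fence_outliers(data, k=1.5):
    """Return the sorted values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")  # assumed convention
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sorted(value for value in data if value < lower or value > upper)


print("Estimated cost outliers:", fence_outliers(estimated_cost))
print("Contract value outliers:", fence_outliers(contract_value))
```

Under these assumptions the rule flags four estimated-cost values and no contract values.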

Question Four

(a) Compute x̄ and ȳ

The mean of a variable is the sum of all the observations divided by the number of observations. The contract value is the predictor (x) and the estimated cost is the response (y).

Mean contract value:

x̄ = 4,530,000/20 = 226,500

Mean estimated cost:

ȳ = 3,455,000/20 = 172,750

(b) Compute sx and sy

The sample standard deviation is found by summing the squared deviations from the mean, dividing by the number of observations less one, and taking the square root.

The standard deviation for the estimated cost is

sy = (107,323,750,000/19)^(1/2) = 75,157.29

The standard deviation for the contract value is

sx = (152,055,000,000/19)^(1/2) = 89,458.90
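The means and sample standard deviations can be recomputed directly from the data. The following is a minimal sketch using Python's statistics module, where stdev divides by n - 1; the variable names are illustrative only.

```python
from statistics import mean, stdev

contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]

# x is the predictor (contract value), y is the response (estimated cost).
x_bar, y_bar = mean(contract_value), mean(estimated_cost)

# stdev() is the sample standard deviation: sqrt(sum((v - mean)^2) / (n - 1)).
s_x, s_y = stdev(contract_value), stdev(estimated_cost)

print(f"x_bar = {x_bar:,.2f}   s_x = {s_x:,.2f}")
print(f"y_bar = {y_bar:,.2f}   s_y = {s_y:,.2f}")
```

Each deviation must be taken from that variable's own mean; using the other variable's mean inflates the sum of squares.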

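(c) Compute r

The correlation coefficient is given by the following formula, where n = 20 is the number of observations and the sums are taken from the last row of the table below.

r = [nΣXY - (ΣX)(ΣY)] / sqrt{[nΣX² - (ΣX)²][nΣY² - (ΣY)²]}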

Obs.   Estimated cost (Y)  Contract value (X)  XY               Y²               X²
1      85,000              100,000             8,500,000,000    7,225,000,000    10,000,000,000
2      70,000              120,000             8,400,000,000    4,900,000,000    14,400,000,000
3      110,000             150,000             16,500,000,000   12,100,000,000   22,500,000,000
4      90,000              80,000              7,200,000,000    8,100,000,000    6,400,000,000
5      130,000             180,000             23,400,000,000   16,900,000,000   32,400,000,000
6      160,000             190,000             30,400,000,000   25,600,000,000   36,100,000,000
7      160,000             200,000             32,000,000,000   25,600,000,000   40,000,000,000
8      280,000             350,000             98,000,000,000   78,400,000,000   122,500,000,000
9      130,000             180,000             23,400,000,000   16,900,000,000   32,400,000,000
10     320,000             380,000             121,600,000,000  102,400,000,000  144,400,000,000
11     310,000             360,000             111,600,000,000  96,100,000,000   129,600,000,000
12     305,000             370,000             112,850,000,000  93,025,000,000   136,900,000,000
13     180,000             200,000             36,000,000,000   32,400,000,000   40,000,000,000
14     170,000             250,000             42,500,000,000   28,900,000,000   62,500,000,000
15     160,000             300,000             48,000,000,000   25,600,000,000   90,000,000,000
16     110,000             160,000             17,600,000,000   12,100,000,000   25,600,000,000
17     150,000             210,000             31,500,000,000   22,500,000,000   44,100,000,000
18     180,000             230,000             41,400,000,000   32,400,000,000   52,900,000,000
19     175,000             250,000             43,750,000,000   30,625,000,000   62,500,000,000
20     180,000             270,000             48,600,000,000   32,400,000,000   72,900,000,000
Total  3,455,000           4,530,000           903,200,000,000  704,175,000,000  1,178,100,000,000

r = 0.94439147
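The value of r can be reproduced from the column totals. The sketch below assumes the usual computational form of the Pearson correlation coefficient, r = [nΣXY - ΣXΣY] / sqrt([nΣX² - (ΣX)²][nΣY² - (ΣY)²]), and builds the sums from the raw data; the function name is illustrative.

```python
from math import sqrt

contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]


def pearson_r(x, y):
    """Pearson correlation from the sums of X, Y, XY, X^2 and Y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(contract_value, estimated_cost)
print(f"r = {r:.8f}")
```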

(d) Compute a and b

For the least-squares line y = a + bx, a is the intercept and b is the slope.

a = -6,958.173

b = 0.793

(e) Construct the respective Least Squares line and plot it over your scatter plot.

Estimated cost = -6,958.173 + 0.793 × contract value
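The intercept and slope can be recomputed from the data. This is a minimal sketch assuming the standard least-squares formulas b = Sxy/Sxx and a = ȳ - b·x̄, where Sxy and Sxx are sums of products and squares of deviations from the means; the function name least_squares is hypothetical.

```python
contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]


def least_squares(x, y):
    """Return (a, b) for the fitted line y = a + b * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


a, b = least_squares(contract_value, estimated_cost)
print(f"Estimated cost = {a:,.3f} + {b:.3f} * contract value")
```

The same slope follows from b = r·sy/sx, which is a useful cross-check on parts (b) and (c).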

(f) Compute the respective R² and interpret your results.

R² = 0.89187525

This implies that about 89 percent of the variation in the estimated cost is explained by the variation in the contract value.
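R² can also be computed from its definition, R² = 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total sum of squares of the response about its mean. The sketch below follows that definition; for simple linear regression the result equals r². All names are illustrative.

```python
contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]


def r_squared(x, y):
    """R^2 = 1 - SSE/SST for the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)                     # total variation
    return 1 - sse / sst


r2 = r_squared(contract_value, estimated_cost)
print(f"R^2 = {r2:.8f}")
```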

(g) For your model, compute and plot the residuals vs. x. Describe what you observe from this plot.

The residual plot above shows no pattern: the residuals are scattered consistently around zero regardless of the contract value, which suggests constant variance and independent errors. The normal probability plot below also indicates that the residuals approximately follow a normal distribution.
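The residuals behind such a plot are the observed values minus the fitted values, e = y - (a + bx). The sketch below computes them for each observation so they can be plotted against x in any software; the helper names are assumptions for illustration.

```python
contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b * x_bar, b


a, b = fit_line(contract_value, estimated_cost)

# Residual = observed response minus the value predicted by the line.
residuals = [yi - (a + b * xi) for xi, yi in zip(contract_value, estimated_cost)]

for xi, ei in sorted(zip(contract_value, residuals)):
    print(f"x = {xi:>7,}   residual = {ei:>11,.2f}")
```

Least-squares residuals always sum to zero, so the plot should be judged by its spread and shape rather than by its average level.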

(h) Are there any outliers? If so, are they high leverage and/or influential?

There are outliers in the data, but they are neither high-leverage nor influential.
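Leverage can be quantified rather than judged by eye. In simple linear regression the leverage of observation i is h_i = 1/n + (x_i - x̄)²/Sxx. The sketch below computes it and compares it with the cutoff 2p/n (p = 2 coefficients). That cutoff is only one common rule of thumb and is an assumption here, not a criterion stated in the project; a point slightly past a cutoff is not necessarily a problem.

```python
contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]


def leverages(x):
    """Leverage h_i = 1/n + (x_i - x_bar)^2 / S_xx for each observation."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]


h = leverages(contract_value)
cutoff = 2 * 2 / len(contract_value)  # assumed rule of thumb: 2p/n with p = 2

# Show the three observations with the largest leverage next to the cutoff.
for xi, hi in sorted(zip(contract_value, h), key=lambda pair: -pair[1])[:3]:
    print(f"x = {xi:>7,}   leverage = {hi:.4f}   cutoff = {cutoff:.2f}")
```

High leverage alone does not make a point influential; influence also depends on how far the point falls from the fitted line.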

(i) Based on your model, make 3 predictions for your response variable

The predictions use the fitted equation: Estimated cost = -6,958.173 + 0.793 × contract value.

The predicted estimated cost for three contract values that are not in the data is shown in the table below.

Contract value            276,000       302,000       144,000
Predicted estimated cost  212,023.9716  232,652.7243  107,293.3807
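The three predictions can be reproduced by refitting the line and evaluating it at the new contract values. The sketch below keeps the slope at full precision; rounding it to 0.793 before predicting would shift each result by roughly a hundred. The function names are illustrative only.

```python
contract_value = [100000, 120000, 150000, 80000, 180000, 190000, 200000, 350000, 180000, 380000,
                  360000, 370000, 200000, 250000, 300000, 160000, 210000, 230000, 250000, 270000]
estimated_cost = [85000, 70000, 110000, 90000, 130000, 160000, 160000, 280000, 130000, 320000,
                  310000, 305000, 180000, 170000, 160000, 110000, 150000, 180000, 175000, 180000]


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b * x_bar, b


a, b = fit_line(contract_value, estimated_cost)


def predict(x_new):
    """Predicted estimated cost for a given contract value."""
    return a + b * x_new


for x_new in (276000, 302000, 144000):
    print(f"contract value {x_new:,} -> predicted estimated cost {predict(x_new):,.2f}")
```

All three contract values lie inside the observed range of 80,000 to 380,000, so the predictions are interpolations rather than extrapolations.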
