Data Interpretation Practicum in Discriminant Analysis
Order Instructions: Data Interpretation Practicum
This week, you will run either correlation, regression, or discriminant analysis on your chosen data. This Application requires you to engage in data interpretation and to select the appropriate analyses for your hypotheses and for the data that you have at your disposal. Toward that end, you should consider which analyses will inform the reader and allow you to pursue your questions.
Your submission to your Instructor should include the SPSS output file of your selected statistical analysis in a Word document, along with each of the following elements:
1. Your SPSS output, including graphical representations;
2. Your narrative interpretation;
3. The governing assumptions of the analyses you ran;
4. The viable and nonviable hypotheses (null and alternative);
5. The relevant values (such as a p-value indicating statistical significance or a lack thereof).
Be sure to indicate to your Instructor why you selected the analyses that you did. In other words, why did you select to engage in discriminant analysis, regression, or correlation? How is this analysis related to the hypothesis?
Sample Answer
Introduction
The primary objective of this study is to analyze the provided data and draw inferences about the safety of people at different working sites. The analysis looks for evidence of a relationship (correlation) between the injury rate at a working site and the gender of the site’s supervisor, the number of employees at each of the three sites, and the number of hours the employees work. The study is significant because it can be applied in the practice of human resource management to assess the risk that employees face in different fields.
Thus, the core of this analysis is to establish whether these variables are related and, if so, to what extent. To this end, a bivariate correlation analysis will be carried out and conclusions drawn about the relationships among the variables. In particular, the analysis will examine whether the supervisor’s gender contributes to a high injury rate at a site, whether an increase in the number of employees increases the injury rate, and whether an increase in working hours is positively correlated with the injury rate. The analysis is based on the following hypotheses:
H0: There is no significant relationship between the injury rate at a working site and the supervisor’s gender, the number of employees, and the number of hours at work.
H1: There is a significant relationship between the injury rate at a working site and the supervisor’s gender, the number of employees, and the number of hours at work.
These hypotheses were vital in formulating the research question, which can be regarded as the backbone of any successful research (Wilcox, 2012). Inference about the population parameters will be performed at the α = 0.05 significance level. At the end of the data analysis, a conclusion will sum up the inferences made.
Analysis
To investigate the data distribution, a descriptive statistics analysis was carried out and the results are as illustrated in Table 1.
Table 1: Descriptive Statistics
| Statistic | Number of employees | Site | Number of hours at work | Injury rate | Supervisors gender |
| --- | --- | --- | --- | --- | --- |
| N (valid) | 51 | 51 | 51 | 51 | 51 |
| N (missing) | 0 | 0 | 0 | 0 | 0 |
| Mean | 24.0196 | 2.04 | 49960.7843 | 15.1755 | .47 |
| Std. Deviation | 7.49531 | .799 | 15590.23590 | 17.47447 | .504 |
| Variance | 56.180 | .638 | 243055455.373 | 305.357 | .254 |
| Skewness | .056 | -.072 | .056 | 2.046 | .121 |
| Std. Error of Skewness | .333 | .333 | .333 | .333 | .333 |
| Kurtosis | .506 | -1.419 | .506 | 4.309 | -2.068 |
| Std. Error of Kurtosis | .656 | .656 | .656 | .656 | .656 |
Table 1 shows that all the variables had positive skewness except site. Positive skewness indicates an asymmetric distribution with a longer tail to the right (Ho, & Carol, 2015), although only injury rate (2.046) is strongly skewed; the other skewness values are small relative to the standard error of .333. Relative to the standardized normal curve, the number of employees, the number of hours at work, and the injury rate have positive kurtosis, which indicates that these variables have a more peaked distribution than the normal curve (Wilcox, 2012), while site and supervisor’s gender have negative kurtosis.
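The standard errors in Table 1 depend only on the sample size, so they can be cross-checked from N = 51. The sketch below uses the usual small-sample formulas for the standard error of skewness and kurtosis and then divides each skewness value from Table 1 by its standard error. The cutoff of about 1.96 for a "notable" ratio is a common rule of thumb assumed here for illustration, not something stated in the SPSS output.

```python
import math


def se_skewness(n: int) -> float:
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n: int) -> float:
    """Standard error of sample excess kurtosis for a sample of size n."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n = 51  # valid cases in Table 1
ses = se_skewness(n)
sek = se_kurtosis(n)

# Skewness values copied from Table 1.
skewness = {
    "Number of employees": 0.056,
    "Site": -0.072,
    "Number of hours at work": 0.056,
    "Injury rate": 2.046,
    "Supervisors gender": 0.121,
}

# Ratio of skewness to its standard error; |ratio| > 1.96 is the assumed cutoff.
z_skew = {name: value / ses for name, value in skewness.items()}

print(f"SE skewness = {ses:.3f}, SE kurtosis = {sek:.3f}")
for name, z in z_skew.items():
    print(f"{name}: skewness / SE = {z:.2f}")
```

Under that rule of thumb, injury rate is the only variable whose skewness stands out from sampling noise.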
To find a model that can be used to estimate the injury rate at different working sites, a regression analysis was performed; the results are summarized in Tables 2, 3, and 4.
Table 2: ANOVA (a)
| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 6244.698 | 3 | 2081.566 | 10.843 | .000 (b) |
| 1 | Residual | 9023.158 | 47 | 191.982 | | |
| 1 | Total | 15267.856 | 50 | | | |
a. Dependent Variable: injury rate
b. Predictors: (Constant), number of hours at work, site, supervisors gender
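The p-value for the overall regression (reported as .000, that is, p < .001) is less than the set level of significance, α = 0.05. Thus, there is significant evidence to reject the null hypothesis: taken together, the predictors explain a significant share of the variation in injury rate, F(3, 47) = 10.843.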
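The F statistic in Table 2 can be reproduced from the sums of squares and degrees of freedom in the same table. The sketch below does that arithmetic with the standard library only. The critical value of roughly 2.80 for F(3, 47) at α = 0.05 is read from a standard F table and is an assumption of this example, not part of the SPSS output.

```python
# Values copied from Table 2.
ss_regression = 6244.698
ss_residual = 9023.158
df_regression = 3
df_residual = 47

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of the regression mean square to the residual mean square.
f_statistic = ms_regression / ms_residual

# The total sum of squares should equal regression + residual.
ss_total = ss_regression + ss_residual

# Approximate critical value for F(3, 47) at alpha = 0.05 (assumed, from a table).
F_CRITICAL_APPROX = 2.80
reject_null = f_statistic > F_CRITICAL_APPROX

print(f"MS regression = {ms_regression:.3f}")
print(f"MS residual   = {ms_residual:.3f}")
print(f"F             = {f_statistic:.3f}")
print(f"Reject H0 at alpha = 0.05: {reject_null}")
```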
Table 3: Coefficients (a)
| Model | Term | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 50.082 | 7.842 | | 6.387 | .000 |
| 1 | Site | .308 | 2.481 | .014 | .124 | .902 |
| 1 | supervisors gender | 2.259 | 4.013 | .065 | .563 | .576 |
| 1 | number of hours at work | -.001 | .000 | -.654 | -5.604 | .000 |
a. Dependent Variable: injury rate
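The fitted regression model, which can be used to estimate the injury rate at a working site for a given supervisor’s gender and number of working hours, is given below. Only the coefficient for number of hours at work is statistically significant (p < .001); the coefficients for site (p = .902) and supervisor’s gender (p = .576) are not.
Injury rate = 50.082 + 0.308 * (Site) + 2.259 * (Supervisors gender) - 0.001 * (Number of hours at work)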
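As a rough illustration, the fitted equation can be wrapped in a small function. Two things here are assumptions rather than facts from the output: the variable codings (site as 1 to 3 and supervisor's gender as 0/1, guessed from the means in Table 1) and the function and constant names, which are hypothetical. Because SPSS prints the hours coefficient rounded to -.001, the sketch also recovers a less-rounded slope from the standardized beta in Table 3 and the standard deviations in Table 1, using the identity b = beta * sd(y) / sd(x).

```python
# Coefficients as printed in Table 3 (rounded to three decimals).
INTERCEPT = 50.082
B_SITE = 0.308
B_GENDER = 2.259
B_HOURS_ROUNDED = -0.001

# Less-rounded hours slope: beta * sd(injury rate) / sd(hours at work),
# with beta from Table 3 and the standard deviations from Table 1.
B_HOURS_RECOVERED = -0.654 * 17.47447 / 15590.23590


def predict_injury_rate(site, supervisor_gender, hours_at_work,
                        b_hours=B_HOURS_RECOVERED):
    """Estimate injury rate; site assumed coded 1-3, gender assumed coded 0/1."""
    return (INTERCEPT
            + B_SITE * site
            + B_GENDER * supervisor_gender
            + b_hours * hours_at_work)


# Evaluate the model at the variable means reported in Table 1.
at_means_recovered = predict_injury_rate(2.04, 0.47, 49960.7843)
at_means_rounded = predict_injury_rate(2.04, 0.47, 49960.7843,
                                       b_hours=B_HOURS_ROUNDED)

print(f"Recovered hours slope: {B_HOURS_RECOVERED:.6f}")
print(f"Prediction at the means (recovered slope): {at_means_recovered:.2f}")
print(f"Prediction at the means (rounded slope):   {at_means_rounded:.2f}")
```

A least-squares fit passes through the point of means, so the prediction at the means should land near the mean injury rate of 15.1755. The recovered slope does that, while the rounded -0.001 gives a much lower estimate, which suggests the printed equation should be used with the unrounded coefficient whenever actual predictions are needed.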
Table 4: Model Summary
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .640 (a) | .409 | .371 | 13.85576 |
a. Predictors: (Constant), number of hours at work, site, supervisors gender
From the model summary table, the coefficient of determination (R² = .409) shows that the fitted regression model explains only 40.9% of the variation in injury rate (37.1% after adjusting for the number of predictors). The remaining 59.1% is unexplained; hence this model is not well suited to estimating the injury rate.
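The figures in Table 4 follow directly from the sums of squares in Table 2, which makes it easy to check which percentage belongs to R Square and which to Adjusted R Square. The sketch below recomputes them using the standard definitions, with n = 51 cases and k = 3 predictors taken from the tables.

```python
import math

# Values copied from Table 2.
ss_regression = 6244.698
ss_total = 15267.856
n = 51  # cases
k = 3   # predictors: hours at work, site, supervisors gender

# R squared: share of total variation explained by the model.
r_squared = ss_regression / ss_total

# Multiple correlation coefficient R.
r = math.sqrt(r_squared)

# Adjusted R squared penalizes for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Standard error of the estimate: square root of the residual mean square.
se_estimate = math.sqrt((ss_total - ss_regression) / (n - k - 1))

print(f"R = {r:.3f}")
print(f"R squared = {r_squared:.3f} (unexplained: {1 - r_squared:.3f})")
print(f"Adjusted R squared = {adjusted_r_squared:.3f}")
print(f"Std. error of the estimate = {se_estimate:.5f}")
```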
Table 5: Paired Samples Correlations
| Pair | Variables | N | Correlation | Sig. |
| --- | --- | --- | --- | --- |
| Pair 1 | Injury rate & number of employees | 51 | -.636 | .000 |
| Pair 2 | Injury rate & number of hours at work | 51 | -.636 | .000 |
| Pair 3 | Injury rate & supervisors gender | 51 | -.090 | .532 |
| Pair 4 | Injury rate & site | 51 | -.074 | .606 |
These results indicate a fairly strong negative correlation between injury rate and number of hours at work, and between injury rate and number of employees (r = -.636 in both cases), which is statistically significant since the p-value < 0.001. In contrast, injury rate and supervisor’s gender (r = -.090, p = .532) and injury rate and site (r = -.074, p = .606) have weak negative correlations that are not statistically significant since the p-value > α = 0.05.
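The significance of each correlation in Table 5 can be checked by converting r to a t statistic with n - 2 degrees of freedom, t = r * sqrt(n - 2) / sqrt(1 - r²). The sketch below applies that conversion to the four pairs. The two-tailed critical value of about 2.01 for 49 degrees of freedom at α = 0.05 is taken from a standard t table and is an assumption of this example.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


n = 51
T_CRITICAL_APPROX = 2.01  # two-tailed, alpha = 0.05, df = 49 (assumed, from a table)

# Correlations with injury rate, copied from Table 5.
correlations = {
    "number of employees": -0.636,
    "number of hours at work": -0.636,
    "supervisors gender": -0.090,
    "site": -0.074,
}

results = {}
for name, r in correlations.items():
    t = t_for_r(r, n)
    results[name] = (t, abs(t) > T_CRITICAL_APPROX)
    print(f"Injury rate & {name}: r = {r:.3f}, t = {t:.3f}, "
          f"significant = {results[name][1]}")
```

The outcome agrees with the Sig. column: only the correlations with number of employees and number of hours at work exceed the critical value.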
Conclusion
The analysis achieved the objectives set at the beginning of the paper: the hypotheses were tested and inferences made about the population. The results showed a statistically significant relationship between injury rate and the predictors taken together, which led to the rejection of the null hypothesis. On this basis, it was concluded that the injury rate at a working site is significantly related to the number of employees and the number of hours at work, whereas its relationships with the supervisor’s gender and the site were not statistically significant.
References
Ho, A. D., & Carol, C. Y. (2015). Descriptive statistics for modern test score distributions: Skewness, kurtosis, discreteness, and ceiling effects. Educational and Psychological Measurement, 75(3), 365-388.
Lowry, R. (2014). Concepts and applications of inferential statistics.
O’Leary, Z. (2013). The essential guide to doing your research project. Sage.
Wilcox, R. R. (2012). Introduction to robust estimation and hypothesis testing. Academic Press.