Epidemiology and biostatistics

Project description

Please answer all questions thoroughly. Please give a detailed rationale for all multiple-choice answers selected. Please show all work on the calculations.

Part of the assignment is in SPSS, but I changed it to Excel to upload here. If there is an email address I can send the original to, let me know, because the Excel file may not have all the data needed. Thank you.

Assignment 2

This Assignment is formed of two sections.

The FIRST SECTION includes questions from “Grove: Statistics for Health Care Research: A practical Workbook, Graded Questions” and other sources.

Grove, S. K. (2007). Statistics for Health Care Research: A Practical Workbook. Philadelphia, PA: Saunders. ISBN: 9781455709960

The SECOND SECTION includes questions related to analyzing and interpreting the data. Download and use the file “Sample_Data_Assignment2_Blackboard.sav”

This assignment includes 40 points and represents 20% of the course grade. Each question is one point.

Please write your name on the top of each page.

Section I: Grove: Statistics for Health Care Research

A practical Workbook

Instructions: For each exercise section – questions to be graded, in the book “statistics for health care research, a practical workbook”, answer only the specified questions written in bold. Show your calculations and provide rationale for your answers for full credit.

For example: for exercise 1, “questions to be graded”, you need to answer only questions 5 and 7.

Below is the list of exercise numbers, titles, and the specific questions that you will need to answer. You need to read the sections in the workbook in order to answer the questions.

Exercise 1: Identifying Level of Measurement: Nominal

Answer questions: 5 and 7

5. What number and percentage of the 44 depressed subjects were treated with antidepressant medications? Do you think an adequate number received treatment with medication? Provide a rationale for your answer.

7. The researchers excluded persons from the study who had a history of psychiatric illness. Provide a rationale for excluding these persons.

Exercise 2: Identifying Level of Measurement: Ordinal

Answer questions: 2 and 7

2. What statistics were used to describe the demographic variable Estimated Yearly Family Income in this study? Were these appropriate?

7. Should the demographic variable Educational level be analyzed with parametric or non parametric statistical analysis techniques? Provide a rationale for your answer.

Exercise 3: Identifying Level of Measurement: Interval/Ratio

Answer question: 6, 9

6. Looking at Table II, what descriptive analysis techniques were performed on the interval/ratio data?

9. Are there significant differences between the intervention and the control groups for any of the variables in Table I? Provide a rationale for your answer.

Exercise 4: Understanding Percentages

Answer questions: 4 and 6

4. What number and percentage (%) of the total number of respondents had a current CRC test?

6. Explain why the number of total subjects’ data in Table 2 is for 859 subjects when the total sample for the study was 869 subjects.

Exercise 5: Frequency Distributions with Percentages

Answer questions: 4 and 10

4. What level of education achieved by the mothers is the mode for this variable? Document your answer as both a frequency and percentage.

10. Do you think that this study and its results can be generalized to the United States? Provide a rationale for your answer.

Exercise 6: Cumulative Percentages and Percentile Ranks

Answer questions: 5 and 9

5. What number and percentage of nurses documented a different pain score from the grimacing patient’s self-reported pain score of 8?

9. Is this study only applicable to the elderly population? Do you think younger patients’ self-reports of pain are believed and their pain appropriately treated?

Exercise 7: Interpreting Histograms

Answer question: 7, 9

7. In Figure 2, which variable is placed on the x-axis? Which variable is placed on the y-axis?

9. Examine Figures 1 and 2 and compare their distribution patterns. Are the distribution patterns similar? Provide a rationale for your answer.

Exercise 8: Interpreting Line Graphs

Answer questions: 6 and 10

6. The breastfeeding rate post-intervention was greater than the pre-intervention rate over the 12 months of the study. Is this statement true or false? Provide a rationale for your answer.

10. What implications for practice do you note from these study results?

Exercise 11: Using Statistics to Describe a Study Sample

Answer questions: 3 and 8

3. What other statistic could have been used to describe the length of labor? Provide a rationale for your answer.

8. Can the findings from this study be generalized to Black women? Provide a rationale for your answer.

Exercise 15: Measurement of Central Tendency: Mean, Median, and Mode

Answer questions: 1 and 9

1. The following list represents the number of nursing students enrolled in a particular nursing program between the years of 2001 and 2007, respectively: 563, 593, 606, 520, 563, 610, and 577. Determine the mean, median, and mode of the number of the nursing students enrolled in the above program between 2001 and 2007. Show your calculations.

9. Assuming that α = 0.01, which nursing specialties demonstrated a significant change in popularity between stages 1 and 2 of the research questionnaire administration? Provide a rationale for your response.

Exercise 16: Mean and Standard Deviation

Answer questions: 1, 4

1. The researchers analyzed the data they collected as though it were at what level of measurement?

a. Nominal

b. Ordinal

c. Interval/ratio

d. Experimental

4. Compare the mean baseline and posttest depression scores of the control group. Do these scores strengthen or weaken the validity of the research results? Provide a rationale for your answer.

Exercise 19: Determining Skewness of a Distribution

Answer questions: 1, 3

1. The age distribution of people diagnosed with cystic fibrosis is most likely to be:

a. negatively skewed.

b. normally distributed.

c. positively skewed.

d. bimodal.

3. Does a set of scores with most of its values above the mean have a negatively or positively skewed distribution? Provide a rationale for your answer.

Exercise 22: Scatterplot

Answer questions: 2 and 7

2. What type of relationship does Figure 22–2 illustrate? Provide a rationale for your answer.

7. Does Figure 1 from the Hitchings and Moynihan (1998) study have any outliers? Provide a rationale for your answer.

Exercise 23: Pearson’s Product-Moment Correlation Coefficient

Answer questions: 4 and 10

4. Without using numbers, describe the relationship between the Hamstring strength index 120°/s and the Triple hop index.

10. Consider the relationship reported for the Quadriceps strength index 120º/s and the Hop index (r = 0.744**, p = 0.000). What do these r and p values indicate related to statistical significance and clinical importance? [alpha is set at 0.05].

Exercise 29: t-Test for Independent Groups I

Answer questions: 2 and 5

2. t = –3.15 describes the difference between women and men for what variable in this study? Is this value significant? Provide a rationale for your answer. [alpha is set at 0.05]

5. Consider t = –2.50 and t = –2.74. Which t ratio has the smaller p value? Provide a rationale for your answer. What does this result mean?

Exercise 36: Analysis of Variance (ANOVA) I

Answer questions: 3 and 6

3. The researchers stated that the participants in the intervention group reported a reduction in mobility difficulty at week 12. Was this result statistically significant, and if so at what probability?

6. Can ANOVA be used to test proposed relationships or predicted correlations between variables in a single group? Provide a rationale for your answer.

Section II: USING THE DATA SET DOWNLOADED FROM BLACKBOARD

“Sample_Data_Assignment2_Blackboard.sav”

1. Do frequency for the following variables and interpret the findings:

[1 POINT]

Age category, Gender, Bp (Blood pressure), Active (physically active), dhosp (died in hospital).

2. Do descriptive statistics and histogram with normal distribution and interpret the results for the following variables: [1 POINT]

Age (years), cost (total cost of hospitalization and rehabilitation), and log_COST (log transformed data to change it to normal distribution)

– Are these variables normally distributed?

3. Is there a difference between those who died in the hospital and those who did not die in the hospital in age (age at admission in years)? [1 POINT]

a. What statistical test will you use? Why?

b. Is there a statistically significant difference between the groups? Explain and interpret the finding

4. Is there a difference between the blood pressure groups in age (age at admission in years)? [1 POINT]

a. What statistical test will you use? Why?

b. Is there a statistically significant difference between the groups? Explain and interpret the finding

5. Is there a correlation between age (age at admission in years) and cost (total cost of hospitalization and rehabilitation)? [1 POINT]

a. Run correlation between the two variables and report the correlation coefficient (r) [direction and strength] and interpret the results.

6. Is there a relationship between the patient’s death in the hospital (dhosp) and the following variables: Bp (Blood pressure), Active (physically active)? [DO CROSS-TABLES] [1 POINT]

a. What statistical test will you use? Why?

b. Is there a statistically significant difference between the groups? Explain and interpret the finding

7. Write a summary report for the results of the study and its impact on nursing practice (i.e., summarize the findings from question 1 to 6) [Two POINTS]

**SAMPLE ANSWER**

Assignment on Epidemiology and Biostatistics

EXERCISE 1

5. What number and percentage of the 44 depressed subjects were treated with antidepressant medications? Do you think an adequate number received treatment with medication?

Of the 44 depressed subjects, only 13% were reported to be using antidepressants, and 37% were not being treated with any conventional intervention or with alternatives such as exercise and herbal remedies. This does not appear to be an adequate level of treatment. The chi-square analysis also did not identify antidepressant use as a significant intervention strategy.

7. The researchers excluded persons from the study who had a history of psychiatric illness. Provide a rationale for excluding these persons.

In a clinical trial, the investigators must specify **Inclusion and exclusion criteria** for participation in the study. Inclusion criteria are characteristics that the prospective subjects must have if they are to be included in the study, while exclusion criteria are those characteristics that disqualify prospective subjects from inclusion in the study. **Inclusion and exclusion criteria** may include factors such as age, sex, race, ethnicity, type and stage of disease, the subject’s previous treatment history, and the presence or absence (as in the case of the “healthy” or “control” subject) of other medical, psychosocial, or emotional conditions. One would say in this case

Exercise 2

2. What statistics were used to describe the demographic variable Estimated Yearly Family Income?

Measures of central tendency, distribution, and dispersion.

7. Should the demographic variable Educational level be analyzed with parametric or nonparametric statistical techniques?

Educational level is an ordinal variable here, so it should be analyzed with nonparametric statistical techniques. Nonparametric methods are useful for the analysis of nominal or ordinal data. They are also useful whenever there are doubts about the underlying assumptions of a counterpart parametric procedure for interval or ratio data. In general, parametric procedures have nonparametric counterparts, although the hypothesis tested will not always be exactly the same. For example, a parametric two-sample test for differences in means may have a nonparametric counterpart that is a two-sample test for differences in medians.

Exercise 3

6. Looking at Table II, what descriptive analysis techniques were performed on the interval/ratio data?

The interval/ratio data were analysed using descriptive statistics, which describe and summarize data in a meaningful way so that patterns can emerge. Descriptive statistics do not allow us to draw conclusions beyond the data we have analysed or to test any hypotheses we might have made; they are simply a way to describe the data. They consisted of:

- Measures of central tendency: ways of describing the central position of a frequency distribution for a group of data. The central position can be described with a number of statistics, including the mode, median, and mean.
- Measures of spread: ways of summarizing a group of data by describing how spread out the scores are. A number of statistics are available for this, including the range, quartiles, absolute deviation, variance, and standard deviation.

9. Are there significant differences between the intervention and the control groups for any of the variables in Table I?

The chi-square tests did not show any significant difference between the intervention and control groups. A difference would be significant only if p < 0.05, and that threshold was not met, which rules out a significant difference.

Exercise 4

4. What number and percentage (%) of the total number of respondents had a current CRC test?

Each entry in the table contains the frequency, or count, of the occurrences of values within a particular group or interval, and in this way the table summarizes the distribution of values in the sample. Generally the class interval, or class width, is the same for all classes. The classes taken together must cover at least the distance from the lowest value (minimum) in the data set up to the highest (maximum) value. In our data the frequency is 24 and the percentage is 45%.

6. Explain why the number of total subjects’ data in Table 2 is for 859 subjects when the total sample for the study was 869 subjects.

Because of missing values and exclusion criteria: data for 10 of the 869 subjects are not included in Table 2.

Exercise 5: Frequency Distributions with Percentages

Answer questions: 4 and 10

4. What level of education achieved by the mothers is the mode for this variable?

The mode is the group with the highest frequency, which is **11 – 15 schooling years**. For grouped data, the following formula estimates the mode within the modal group:

$$
\text{Estimated Mode} = L + \frac{f_{m} - f_{m-1}}{(f_{m} - f_{m-1}) + (f_{m} - f_{m+1})} \times w
$$

where:

- $L$ is the lower class boundary of the modal group
- $f_{m-1}$ is the frequency of the group before the modal group
- $f_{m}$ is the frequency of the modal group
- $f_{m+1}$ is the frequency of the group after the modal group
- $w$ is the group width
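As a rough illustration of how the grouped-mode formula above is applied, here is a minimal Python sketch. The function name `estimated_mode` and the frequencies in the example are hypothetical and are not taken from the workbook table; they only show the arithmetic.

```python
def estimated_mode(lower_boundary, f_prev, f_modal, f_next, width):
    """Estimate the mode of grouped data by interpolating inside the modal group.

    lower_boundary: L, lower class boundary of the modal group
    f_prev, f_modal, f_next: frequencies of the group before, the modal group,
        and the group after
    width: w, the group width
    """
    denominator = (f_modal - f_prev) + (f_modal - f_next)
    if denominator == 0:
        # All three frequencies are equal, so the formula is undefined;
        # falling back to the midpoint of the group is an assumption of this sketch.
        return lower_boundary + width / 2
    return lower_boundary + (f_modal - f_prev) / denominator * width


# Hypothetical frequencies for illustration only (not the study's data).
example = estimated_mode(lower_boundary=10.5, f_prev=8, f_modal=20, f_next=12, width=5)
print(example)
```

With these made-up numbers the estimate falls inside the modal group, closer to the neighbouring group with the higher frequency, which is how the formula is meant to behave.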

10. Do you think that this study and its results can be generalized to the United States?

No. The sample is not inclusive of all demographic groups: it does not seem to include other races and ethnicities such as African Americans and Hispanics, among others. The sample is also too small.

Exercise 6

Cumulative percentage and Percentile ranks

5. What number and percentage of nurses documented a different pain score from the grimacing patient’s self-reported pain score?

25%

9. Is this study only applicable to the elderly population?

No. The study is inclusive, and its findings can validly be applied to elderly as well as younger patients.

Exercise 7

7. In Figure 2, which variable is placed on the x-axis and which on the y-axis?

Y axis is the dependent variable while X axis is the independent variable

9. Examine Figures 1 and 2 and compare their distribution patterns. Are the patterns similar?

Figure 1 shows normally distributed data while Figure 2 shows left-skewed data, so the patterns are not similar. When a sample is normally distributed, you can legitimately use either the mean or the median as the measure of central tendency. In fact, in any symmetrical, unimodal distribution the mean, median, and mode are equal. In this situation, however, the mean is widely preferred as the best measure of central tendency, because it includes all the values in the data set in its calculation, and a change in any of the scores will affect its value. This is not the case with the median or mode.

Exercise 8

- The breastfeeding rates post-intervention were better than the pre-intervention rates, as shown by the mean as well as the standard deviation.
- The implication for practice is that the breastfeeding intervention should be advocated and promoted, since rates were higher after it than before it.

Exercise 11

3. What other statistic could have been used to describe the length of labor?

Mean, mode, maximum, variance, and standard deviation.

8. Can the findings from this study be generalized to Black women?

Yes, because the sample was adequate and inclusive.

Exercise 15

1. Determine the mean, median, and mode of the number of nursing students enrolled between 2001 and 2007:

563, 593, 606, 520, 563, 610, and 577

Mean

To find the mean, add up all the numbers, then divide by how many numbers there are:

563 + 593 + 606 + 520 + 563 + 610 + 577 = 4032

4032 / 7 = 576

Mode

To find the mode, or modal value, place the numbers in value order, then count how many times each number appears. The mode is the number that appears most often (there can be more than one mode). In this case 563 appears twice, so the mode is 563.

Median

To find the median, place the numbers in value order and find the middle number (or the mean of the middle two numbers when the count is even). There are seven values here, so the median is the 4th value:

520, 563, 563, 577, 593, 606, 610

The median is 577.
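The three calculations above can be checked with a short Python sketch that uses only the standard-library `statistics` module and the enrollment figures given in the question. The variable names are illustrative.

```python
import statistics

# Enrollment figures for 2001 through 2007, as listed in the question.
enrollment = [563, 593, 606, 520, 563, 610, 577]

total = sum(enrollment)                    # sum of all seven values
mean = total / len(enrollment)             # arithmetic mean
median = statistics.median(enrollment)     # middle value after sorting
mode = statistics.mode(enrollment)         # most frequent value

print("sorted:", sorted(enrollment))
print("total:", total, "mean:", mean, "median:", median, "mode:", mode)
```

Sorting first makes it easy to see that the 4th of the seven values is the median and that 563 is the only value that repeats.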

9. Assuming α = 0.01, which nursing specialties demonstrated a significant change in popularity between stages 1 and 2 of the questionnaire administration?

Here α is the level of significance, not Cronbach’s alpha. Cronbach’s alpha is a different statistic: it measures the internal consistency, or average correlation, of the items in a survey instrument to gauge its reliability, and it ranges in value from 0 to 1. With α set at 0.01, a specialty demonstrated a significant change in popularity only if the p value reported for its change between stage 1 and stage 2 is less than 0.01.

Exercise 16

1. The researchers analysed the data they collected as though it were at what level of measurement?

c. Interval/ratio. The researchers reported mean depression scores, and the mean is calculated for data treated as interval/ratio.

4. Comparing the mean baseline and posttest depression scores of the control group, it is clear that including the control group strengthens the experiment, because it gives an opportunity to analyze the experiment holistically and reduces bias in interpretation.

Exercise 19 Skewness of a distribution

Bimodal

A histogram with two peaks is called “bimodal” since it has two values or data ranges that appear most often in the data. In a process that is repeated over time, we typically expect the data to follow the familiar bell-shaped curve of the normal distribution, so a bimodal histogram can signal something out of the ordinary.

3. Negatively skewed

In a negatively skewed distribution, the mode is higher than the median, which is higher than the mean. A data set with most of its scores above the mean is therefore negatively skewed: a few very low scores pull the mean down, and the tail of the distribution points toward the low scores. Skewness is measured by the third moment about the mean.
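To make the link between "most scores above the mean" and negative skewness concrete, here is a minimal Python sketch. The scores are hypothetical, and the skewness used is the simple population version (third central moment divided by the standard deviation cubed), which is one of several conventions.

```python
import statistics


def sample_skewness(values):
    """Population skewness: third central moment divided by (second central moment)^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical scores: one very low value, the rest clustered near the top.
scores = [1, 8, 9, 9, 10, 10, 10]

mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)
skew = sample_skewness(scores)

above_mean = sum(1 for x in scores if x > mean_score)
print("mean:", mean_score, "median:", median_score, "mode:", mode_score)
print("scores above the mean:", above_mean, "of", len(scores))
print("skewness is negative:", skew < 0)
```

In this made-up data set the single low score drags the mean below the median and mode, and the skewness comes out negative, matching the ordering described above.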

Exercise 22

2. The relationship is positive: the two variables increase together, so an increase in the independent variable is accompanied by an increase in the dependent variable.

- Figure 1 shows an extreme value, also called an outlier: a point that lies far from the rest of the data. An outlier can inflate the mean and so interfere with a normal distribution.

Exercise 23 Pearson product-moment

4. There is a significant association between the Hamstring strength index 120°/s and the Triple hop index, with a p value less than 0.05. The Pearson product-moment correlation coefficient is a measure of the strength and direction of the association that exists between two variables measured on at least an interval scale.

10. r is a measure of the correlation between the two variables. R squared (r²) is the square of this correlation and indicates the proportion of the variance in one variable that is accounted for by the other. In essence, it is a measure of how well one variable can be predicted by knowing the other. In this case r = 0.744, so r² = 0.744² ≈ 0.55, which indicates that about 55% of the variance is shared. Because p = 0.000 is less than alpha = 0.05, the relationship is statistically significant.
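The percentage of shared variance follows directly from squaring r. A minimal Python sketch of that arithmetic, using the r value quoted in the question (the variable names are illustrative):

```python
# Correlation reported in the question for the Quadriceps strength index 120°/s and the Hop index.
r = 0.744

# Coefficient of determination: proportion of variance the two variables share.
r_squared = r ** 2
shared_variance_pct = round(r_squared * 100, 1)

print("r squared:", round(r_squared, 4))
print("shared variance (%):", shared_variance_pct)
```

Squaring rather than reading r itself as a percentage is the step that is easy to miss: an r of 0.744 corresponds to a little over half of the variance, not three quarters.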

Exercise 29 t-test for independent groups

2. The t value of -3.15 is significant at p < 0.05, indicating a statistically significant difference between women and men, because its absolute value exceeds the critical value for the study. For a two-tailed test, if the calculated value of t exceeds the tabled value, report the p value in the table. For a one-tailed test, the p value is divided by two, so “p < 0.05” becomes “p < 0.025.”

The table should include values for p = 0.1 so that a one-tailed test can be conducted at the p = 0.05 level. Negative t values: the sign of a t value tells us the direction of the difference in sample means.

5.

We compare t values by their absolute values, so whether the sign is negative or positive does not matter here. The absolute value 2.50 is smaller than 2.74, so t = -2.74 has the smaller p value: the larger the absolute t value, the smaller the p value. Case I represents the null hypothesis (H₀: µ₁ = µ₂), indicating that the mean of group one equals the mean of group two; both samples come from the same population. In a drug trial, for example, this would signify that the drug had no effect on blood pressure. The difference in the means is small, suggesting that they come from the same population. Case II represents the alternate hypothesis (Hₐ: µ₁ ≠ µ₂), indicating that the mean of group one does not equal the mean of group two; the two sample means are from different populations. The difference in the means is too large to come from one population in most cases, so the means are probably coming from two different populations. A t-test helps decide whether the null hypothesis can be rejected.
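The relationship between the size of t and its p value can be sketched numerically. The example below integrates the Student t density with Simpson's rule using only the standard library. The degrees of freedom (`DF = 30`) are a hypothetical value chosen for illustration, since the study's actual degrees of freedom are not given here, and the function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus the area between -|t| and +|t|.

    The area from 0 to |t| is approximated with Simpson's rule (steps must be even)
    and doubled, because the distribution is symmetric about zero.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area


DF = 30  # hypothetical degrees of freedom, assumed for this sketch only

p_250 = two_tailed_p(-2.50, DF)
p_274 = two_tailed_p(-2.74, DF)

print("p for t = -2.50:", round(p_250, 4))
print("p for t = -2.74:", round(p_274, 4))
print("t = -2.74 has the smaller p value:", p_274 < p_250)
```

Whatever the true degrees of freedom are, the ordering holds: for a fixed df, a larger absolute t always leaves less area in the tails and so gives a smaller p value.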

Exercise 36 ANOVA

3.

Participants in the intervention group reported a reduction in mobility difficulty at 12 weeks, and this is significant, as shown by a p value of < 0.05.

6. The one-way analysis of variance (ANOVA) is used to determine whether there are any significant differences between the means of three or more independent (unrelated) groups. It is therefore not appropriate for testing relationships or correlations between variables in a single group.
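To show why ANOVA needs more than one group, here is a minimal Python sketch of the one-way F statistic, which compares variation between group means with variation within groups. The three groups of scores are hypothetical and the function name is illustrative.

```python
def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)

    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: how far each score is from its own group mean.
    ss_within = sum(
        (x - sum(group) / len(group)) ** 2 for group in groups for x in group
    )

    df_between = len(groups) - 1                # needs at least two groups to be non-zero
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for three independent groups (illustration only).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(groups)
print("F =", f_stat)
```

With a single group the between-groups degrees of freedom would be zero, so the F ratio cannot be formed; that is the computational reason ANOVA compares groups rather than testing a correlation inside one group.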
