Changing String Variables to Numeric: Sample Answer

First, five variables were stored as strings and had to be converted to a numeric type.

The variables were ID_Num, Age_BP, Systolic_E, diastolic, and BMI. After the data type is changed, incorrectly entered values are automatically deleted. In ID_Num, the incorrect entry “1o0” was corrected by entering the number 11. In Age_BP, the wrong entry “25y” was corrected by deleting the “y.” Although the coding of the sex variable is not specified, it should have only two categories, coded either 0 and 1 or 1 and 2. Therefore, any entry in a third category should be corrected using the original data set.
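The string-to-numeric conversion described above can be sketched in Python. This is only an illustration of the idea, not the procedure used in the original analysis: the helper name `to_number` and the neighbouring values in `raw_age` are assumptions, and only “25y” and “1o0” come from the text. The sketch lists the entries that would be lost in the conversion so they can be fixed first.

```python
import math

def to_number(text):
    """Return the entry as a float, or None if it is not a valid finite number."""
    try:
        value = float(text.strip())
    except ValueError:
        return None
    return value if math.isfinite(value) else None

# Hypothetical Age_BP column; only "25y" is taken from the text.
raw_age = ["34", "25y", "41"]

# Values as they would look after the type change (invalid entries become missing).
converted = [to_number(v) for v in raw_age]

# Row numbers (1-based) and raw text of entries that would be lost.
flagged = [(row, v) for row, v in enumerate(raw_age, start=1) if to_number(v) is None]
```

Listing the flagged rows before converting makes it possible to correct entries such as “25y” by hand instead of losing them.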

In the systolic variable, the first three entries were incorrectly entered with letters mixed into the numbers. The letters “s,” “sy,” and “sys” were removed to give the values 139, 170, and 151, respectively. A similar correction was made in the diastolic column, where the letters “d,” “di,” and “dia” were removed to give the values 81, 110, and 109. Also, the slash “/” was removed from one entry to give 85. In the BMI column, the value “.25” was changed to 25, and “223lbs” was corrected using the original data since there is a close association between the values.
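Removing stray letters and slashes from otherwise valid readings could be sketched as follows. The exact layout of the raw entries is not given in the text, so the strings below (letters placed before the digits, a trailing slash) are assumptions, and `extract_number` is a hypothetical helper name.

```python
import re

def extract_number(text):
    """Drop letters, slashes, and spaces; return the rest as a float, or None."""
    cleaned = re.sub(r"[A-Za-z/\s]", "", text)
    # Accept only plain numbers such as "139" or "24.5".
    if re.fullmatch(r"\d+(\.\d+)?", cleaned):
        return float(cleaned)
    return None

# Assumed raw layouts; the digits and letters are the ones named in the text.
raw_systolic = ["s139", "sy170", "sys151"]
raw_diastolic = ["d81", "di110", "dia109", "85/"]

systolic = [extract_number(v) for v in raw_systolic]
diastolic = [extract_number(v) for v in raw_diastolic]
```

A rule like this is only safe when the digits themselves are trusted. Applied to “223lbs” it would return 223, which is not a plausible BMI, so an entry of that kind still has to be checked against the original data. The entry “.25” is deliberately left as missing by this sketch so that it is reviewed by hand.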

In the death age column, the value “-81” was changed to 81, and in the diabetic column, the value “-1” was changed to 1. For the place column, the values 1.5 and 3.5 were checked against the original data for correction. The corrected data were saved and are shown in the screenshot below. Notably, the data points shown in black are to be corrected using the original data for accuracy.
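The checks in this paragraph, negative values in columns that cannot be negative and fractional values in a coded column, might be automated along these lines. The surrounding values in each list are hypothetical; only -81, 1.5, and 3.5 come from the text.

```python
# Hypothetical columns after conversion to numbers.
death_age = [81.0, -81.0, 77.0]
place = [1.0, 1.5, 3.0, 3.5]

# Rows (1-based) where an age is negative, which is impossible.
negative_rows = [row for row, v in enumerate(death_age, start=1) if v < 0]

# Rows where a category code is not a whole number.
fractional_rows = [row for row, v in enumerate(place, start=1) if v != int(v)]

# A stray minus sign can be fixed directly by taking the absolute value.
corrected_death_age = [abs(v) for v in death_age]
```

Dropping the sign is a reasonable fix when the minus is clearly a typing slip. A fractional code such as 1.5 carries no hint of the intended category, which is why those rows are only flagged for comparison with the original records.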

