
Predicting Readmission of Diabetic Patients: Solution

Answer:

In this project, we can frame the problem in several ways. We can ask how many patients have diabetes, how many male and how many female patients have the disease, or how many patients will be readmitted to hospital. Questions of this kind can be answered by analysing the dataset. Analysing the data may also suggest further problems, but whatever problem we frame, we first analyse the dataset and then answer the question with appropriate statistical methods.

Pre-processing approach:

To solve the problems mentioned above, we first download the dataset from the link given in the project. After downloading the dataset, we examine which attributes it contains. Based on these attributes we formulate our problems, and for each problem we try to find an exact solution.

Next, we check the attributes for missing and inappropriate values. In analytics, removing missing and inappropriate values is known as the pre-processing phase, so we apply data pre-processing to the dataset. For this dataset, pre-processing consisted of removing the missing values and the special characters. After pre-processing, the data has no missing or inappropriate values. After removing the special characters, the data looks as follows:

statistics-assignment-solution-13-img1
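The cleaning step described above can be illustrated with a minimal Python sketch. It is not the method used to produce the figure. The missing-value tokens, the definition of a "special character", and the column names and records are all assumptions made for illustration, since the source does not state how its dataset encodes these.

```python
import re

# Assumption: these tokens mark a missing value; the source does not say how
# missing values are encoded in its dataset.
MISSING_TOKENS = {"", "?", "NA", "NULL"}

# Assumption: a "special character" is anything other than letters, digits,
# spaces, dots, hyphens and underscores.
SPECIAL_CHARS = re.compile(r"[^A-Za-z0-9 ._-]")


def clean_value(value):
    """Strip special characters and surrounding whitespace from one cell."""
    return SPECIAL_CHARS.sub("", value).strip()


def clean_rows(rows):
    """Drop rows that contain a missing value, then clean every remaining cell."""
    cleaned = []
    for row in rows:
        if any(v is None or v.strip() in MISSING_TOKENS for v in row.values()):
            continue  # remove the whole record if any attribute is missing
        cleaned.append({key: clean_value(v) for key, v in row.items()})
    return cleaned


# Hypothetical records, for illustration only (not taken from the dataset).
sample = [
    {"gender": "Female#", "age": "[70-80)", "readmitted": "NO"},
    {"gender": "?", "age": "[50-60)", "readmitted": "NO"},
    {"gender": "Male", "age": "[60-70)", "readmitted": "YES"},
]
cleaned_sample = clean_rows(sample)
```

One caution with this kind of rule: if an attribute uses symbols that carry meaning (for example a comparison sign inside a category label), stripping them would merge distinct categories, so such attributes should be recoded rather than stripped.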

In the above diagram we can see that no special characters are present in the dataset. Removing special characters in this way is part of the pre-processing phase. The data provided here comes from the healthcare industry. We select attributes on the basis of the problem, that is, according to the questions we have raised. So, while analysing the data, we left out some attributes and did not consider them when solving the problem; we included only the attributes that are most relevant and important for our analysis. The attributes we did not select include weight and payer code. This is how we pre-processed the dataset. We can also use PROC SQL to analyse the dataset. In the diagram below, the last column shows the number of missing values for each attribute, and every value is 0. Hence the data has no missing values.

statistics-assignment-solution-13-img2
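The attribute selection and the per-attribute missing-value check can be sketched as follows. The text mentions PROC SQL for this kind of analysis; the sketch below uses plain Python instead, purely as an illustration. The column names (`weight`, `payer_code`, `time_in_hospital`), the missing-value tokens, and the sample records are assumptions, not values taken from the dataset.

```python
# Assumption: missing values are encoded as an empty string or "?".
MISSING_TOKENS = {"", "?"}

# Attributes left out of the analysis (the text names weight and payer code;
# the exact column spellings here are hypothetical).
DROPPED_ATTRIBUTES = {"weight", "payer_code"}


def drop_attributes(rows, dropped=DROPPED_ATTRIBUTES):
    """Return copies of the rows without the attributes we chose not to use."""
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


def count_missing(rows):
    """Count the missing values in each attribute across all rows."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            is_missing = value is None or value.strip() in MISSING_TOKENS
            counts[key] = counts.get(key, 0) + int(is_missing)
    return counts


# Hypothetical records, for illustration only.
records = [
    {"gender": "Female", "weight": "?", "payer_code": "?", "time_in_hospital": "3"},
    {"gender": "Male", "weight": "?", "payer_code": "MC", "time_in_hospital": "5"},
]
selected = drop_attributes(records)
missing_per_attribute = count_missing(selected)
```

A count of 0 for every remaining attribute corresponds to the situation the diagram reports: no missing values left after pre-processing.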

After cleaning the data, we apply statistical methods to obtain descriptive statistics for the dataset. These values describe the nature of the data, and we base our decisions on them. From the data we try to understand the patients' conditions, since the dataset comes from the healthcare industry. This will assist in designing follow-up services that cater specifically to chronic illnesses, thereby reducing costs and improving healthcare services. Some statistical results are as follows:

statistics-assignment-solution-13-img3 statistics-assignment-solution-13-img4
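Descriptive values of the kind discussed above, together with the counts asked about at the start (patients by gender, patients readmitted or not), can be computed along these lines. This is a sketch using Python's standard library, not the procedure that produced the results shown. The attribute names and the records are hypothetical.

```python
from collections import Counter
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def frequency(rows, attribute):
    """Count how often each category of an attribute occurs."""
    return Counter(row[attribute] for row in rows)


# Hypothetical cleaned records, for illustration only.
records = [
    {"gender": "Female", "readmitted": "NO", "time_in_hospital": 2},
    {"gender": "Male", "readmitted": "YES", "time_in_hospital": 4},
    {"gender": "Female", "readmitted": "NO", "time_in_hospital": 6},
    {"gender": "Male", "readmitted": "NO", "time_in_hospital": 8},
]
stay_summary = describe([r["time_in_hospital"] for r in records])
readmitted_counts = frequency(records, "readmitted")
gender_counts = frequency(records, "gender")
```

Numeric attributes go through `describe`, while categorical attributes such as gender or readmission status are summarised with `frequency`.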