In this assignment we consider data collected from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan. About every three months, the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The assignment uses data collected on a random sample of 748 donors. The data were obtained from the UCI Machine Learning Repository and were assembled by Prof. I-Cheng Yeh.

The file "**transfusion.csv**" contains the data. The file contains 5 variables:

**recency** = The number of months since the last donation. (numeric)

**frequency** = The total number of donations. (numeric)

**monetary** = Total blood donated (in c.c.). (numeric)

**time** = The number of months since the first donation. (numeric)

**march2007** = An indicator of whether the donor gave blood in March 2007. (factor)

In this assignment we consider the variables **frequency** and **monetary**.

*Descriptive Statistics*

Save the data set on your computer and read it into R. Compute the mean, the median, the interquartile range, and the standard deviation of the variable **frequency**, and plot its histogram. In Tasks 1-3 you are asked to describe the distribution of this variable on the basis of the computations and the plot.
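As a minimal sketch of these computations, the R code below defines a small helper (the name `describe` is hypothetical) and applies it to a short made-up vector rather than to the real data. Once the file has been read into a data frame, for example one named `d1`, the same call would be `describe(d1$frequency)`.

```r
# Summary statistics for a numeric vector (helper name is illustrative).
describe <- function(x) {
  c(mean = mean(x), median = median(x), iqr = IQR(x), sd = sd(x))
}

# Toy vector for illustration only; these are not the donor data.
toy <- c(1, 1, 2, 2, 3, 4, 6, 9, 15)

stats <- describe(toy)
print(stats)

# hist() normally draws the histogram; plot = FALSE returns the bin counts instead.
h <- hist(toy, plot = FALSE)
print(h$counts)
```

Calling `hist(toy)` without `plot = FALSE` draws the plot. A mean that exceeds the median, together with a long right tail in the histogram, points to a right-skewed distribution.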

*Estimating Parameters*

In Tasks 4-6 you are asked to estimate the expectation and the standard deviation of the variable **frequency**. The estimator of the expectation has its own standard deviation, called the standard error, and you are required to estimate it as well. For each estimation task, describe which estimator was used.

*Estimating the MSE*

Consider the variable **monetary**. We assume that the distribution of this variable is Exponential(λ) and are interested in the estimation of the parameter λ. The proposed estimator is 1/X, where X is the sample average. In Tasks 7-8 you are required to estimate the value of the parameter and estimate the mean square error (MSE) of the estimator.

You may apply a method called **The Bootstrap** to estimate the MSE. The bootstrap begins by estimating the parameter λ from the provided data. It then runs a simulation in which λ is set to that estimated value and computes the MSE from the simulated samples.

For the assignment you should complete the following 8 tasks. Tasks 1-3 refer to the **descriptive statistics** problem presented above, Tasks 4-6 refer to the problem of **estimating parameters**, and Tasks 7-8 refer to estimating the parameter of an Exponential distribution and **estimating the MSE** of the estimator. Your answers should be short and clear. We recommend that you copy and paste the tasks below into the form titled "Submit your Assignment using this Form". You can then write your answers to the tasks in the designated positions that are marked in the text:

__Tasks__

*Descriptive Statistics:*

1. The distribution of the variable "frequency" is:

__ Skewed to the left, __ Symmetric, __ **Skewed to the right**.

**Mark the most appropriate option and explain your selection**

**Since the mean is greater than the median, the distribution of the variable "frequency" is skewed to the right. The histogram shows the same pattern.**

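2. The number of outlier observations in the variable "frequency" is: **45**.

**Explain each step in the computation of the number of outlier observations**

**Outliers are observations that fall below the lower fence, Q1 - 1.5 IQR, or above the upper fence, Q3 + 1.5 IQR. The upper fence is Q3 + 1.5 IQR = 7 + 1.5 × 5 = 14.5, and 45 observations lie above it. The lower fence is Q1 - 1.5 IQR = 2 - 1.5 × 5 = -5.5, and no observation lies below it because the number of donations cannot be negative. Therefore there are 45 outliers.**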
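The fence rule used in this answer can be sketched in R as follows. The function name `count_outliers` and the toy vector are hypothetical and only illustrate the steps. With the data loaded into a data frame named `d1`, the call would be `count_outliers(d1$frequency)`.

```r
# Count observations outside the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR.
count_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # Q1 and Q3
  iqr <- q[2] - q[1]
  lower <- q[1] - 1.5 * iqr
  upper <- q[2] + 1.5 * iqr
  sum(x < lower | x > upper)
}

# Toy vector for illustration only; these are not the donor data.
toy <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
n_out <- count_outliers(toy)
print(n_out)
```

The sketch relies on R's default quantile definition, so quartiles computed by other software may differ slightly.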

3. Which of the following theoretical models is most appropriate to describe the distribution of the variable "frequency"?

__ Binomial, __ Poisson, __ Uniform, __ **Exponential**, __ Normal.

**Mark the most appropriate option and explain your selection**

**The Exponential distribution is the most appropriate theoretical model for the variable "frequency". In the histogram, the first class, 0-5, has by far the largest count, the next class, 5-10, has roughly half the count of the first, and so on. This rapid, steady decay in the counts matches the shape of the Exponential density.**

*Estimating Parameters:*

4. The estimated value of the expectation of the measurement "frequency" is: **5.5147**.

**Explain your answer**

**The sample mean is an estimate of the population mean. Therefore, the estimated value of the expectation of the measurement "frequency" is the sample mean of the variable “frequency” i.e. 5.5147.**

5. The estimated value of the standard deviation of the measurement "frequency" is: **5.8393**.

**Explain your answer**

**The sample standard deviation is an estimate of the population standard deviation. Therefore, the estimated value of the standard deviation of the measurement "frequency" is the sample standard deviation of the variable “frequency” i.e. 5.8393.**

6. The estimated value of the standard deviation of the estimator that produced the estimate in 4. is: **0.2135**.

**Explain your answer**

**The estimator in 4. is the sample mean. Its standard deviation (the standard error) is estimated by the sample standard deviation divided by the square root of the sample size: 5.8393 / Sqrt (748) = 0.2135.**
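As a quick check of the arithmetic in Task 6, the short R sketch below recomputes the standard error from the two values reported in this assignment. The variable names are illustrative only.

```r
# Values reported in Tasks 5 and 6 of this assignment.
s <- 5.8393   # sample standard deviation of "frequency"
n <- 748      # number of donors in the sample

# Estimated standard error of the sample mean.
se <- s / sqrt(n)
print(round(se, 4))
```

For a raw data vector `x`, the equivalent expression is `sd(x) / sqrt(length(x))`.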

*Estimating the MSE:*

7. The estimated value of λ for the variable "monetary" is: **0.000725**.

**Attach the R code for conducting the computation**

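The R code for conducting the computation is:

# Read the data (assumes transfusion.csv is in the working directory)

d1 = read.csv("transfusion.csv")

# Estimating the parameter Lambda as 1 / (sample average)

x_bar = mean(d1$monetary)

x_bar

Lambda_hat = 1/x_bar

Lambda_hat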

8. The estimated value of the MSE of the estimator of λ is: **0.000000248**.

**Attach the R code for conducting the computation**

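The R code for conducting the computation is:

# Estimating the MSE by simulating from Exponential(Lambda_hat)

# Simulate 1000 samples of 5 observations each (one sample per row)

simdata = rexp(n = 5000, rate = Lambda_hat)

matrixdata = matrix(simdata, nrow = 1000, ncol = 5)

# Estimate of Lambda from each simulated sample

Lambda.exp = 1/apply(matrixdata, 1, mean)

# Estimated bias: average deviation of the estimates from Lambda_hat

bias.exp = Lambda.exp - Lambda_hat

bias <- mean(bias.exp)

# Variance of the estimates, rescaled to use the divisor 1000 instead of 999

variance <- var(Lambda.exp) * (999/1000)

# MSE = bias^2 + variance

mse <- bias^2 + variance

rbind(bias, variance, mse)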
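The simulation above draws 1000 samples of 5 observations each, whereas the data contain 748 donors. A parametric bootstrap usually matches the simulated sample size to the observed one, so the sketch below shows a variant under that assumption. It is not the code that produced the value reported in Task 8. The seed and variable names are arbitrary, and `lambda_hat` is set to the estimate reported in Task 7.

```r
set.seed(1)              # arbitrary seed, used only for reproducibility

lambda_hat <- 0.000725   # estimate of lambda reported in Task 7
n <- 748                 # assumption: simulated samples match the data size
B <- 1000                # number of simulated samples

# Each row is one simulated sample of size n from Exponential(lambda_hat).
sim <- matrix(rexp(n * B, rate = lambda_hat), nrow = B, ncol = n)

# Estimator 1 / (sample average), computed for each simulated sample.
lambda_sim <- 1 / rowMeans(sim)

bias_hat <- mean(lambda_sim) - lambda_hat
var_hat  <- mean((lambda_sim - mean(lambda_sim))^2)   # divisor B
mse_hat  <- mean((lambda_sim - lambda_hat)^2)         # equals bias_hat^2 + var_hat

print(rbind(bias_hat, var_hat, mse_hat))
```

With samples of size 748, the simulated MSE should come out much smaller than with samples of size 5, because the sample average is far less variable.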
