ADMS2320 Assignment Help

ADMS2320 Course Help

Assignment 2

Case 1: Home Delivery

According to the latest census, the number of households in a large metropolitan area is 425,000. The home-delivery department of the local newspaper reports that 104,320 households receive daily home delivery. To increase home-delivery sales, the marketing department launches an expensive advertising campaign. A financial analyst tells the publisher that for the campaign to be successful, home-delivery sales must increase to more than 110,000 households. Anxious to see whether the campaign is working, the publisher authorizes a telephone survey of 400 households within 1 week of the beginning of the campaign and asks each household head whether he or she has the newspaper delivered. The result is shown in the Excel file, home_delivery.xls. The responses were recorded where 2 = yes and 1 = no. Solve the following questions manually. (Calculations are to be done to the 4th decimal place.)

  1. Do these data indicate that the campaign increases home-delivery sales? Solve this question using the rejection region method. Assume α = 5%.
  2. Do these data allow the publisher to conclude that the campaign is successful? Solve this question using the p-value approach. Assume α = 5%.
  3. Estimate with 95% confidence the total households receiving daily home delivery.
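
The three questions can be sketched in Python with the standard library only. This is a rough outline, not a model answer: the count of "yes" responses (`yes_count = 115`) is a made-up placeholder rather than the value in home_delivery.xls, and the function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

N_HOUSEHOLDS = 425_000   # census count from the case
n = 400                  # survey size from the case
yes_count = 115          # HYPOTHETICAL: replace with the number of 2s in home_delivery.xls

def one_sided_z_test(p_hat, p0, n, alpha=0.05):
    """Upper-tail z test of H0: p = p0 against H1: p > p0."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    z_crit = NormalDist().inv_cdf(1 - alpha)   # rejection region: z > z_crit
    p_value = 1 - NormalDist().cdf(z)
    return z, z_crit, p_value

p_hat = yes_count / n

# Question 1: did sales rise above the current 104,320 households?
z1, z_crit, _ = one_sided_z_test(p_hat, 104_320 / N_HOUSEHOLDS, n)
print(f"Q1: z = {z1:.4f}, reject H0 if z > {z_crit:.4f}")

# Question 2: is the campaign successful (more than 110,000 households)?
z2, _, p_value = one_sided_z_test(p_hat, 110_000 / N_HOUSEHOLDS, n)
print(f"Q2: z = {z2:.4f}, p-value = {p_value:.4f}")

# Question 3: 95% confidence interval for the total number of households
half_width = NormalDist().inv_cdf(0.975) * sqrt(p_hat * (1 - p_hat) / n)
lcl = N_HOUSEHOLDS * (p_hat - half_width)
ucl = N_HOUSEHOLDS * (p_hat + half_width)
print(f"Q3: {lcl:,.0f} to {ucl:,.0f} households")
```

Questions 1 and 2 differ only in the null value: the current proportion 104,320/425,000 for "increased", and the target 110,000/425,000 for "successful".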

Case 2: Veggie Cuisine

Kate Smith is the owner of Veggie Cuisine, an upscale coffee shop, selling the finest veggie sandwiches, soups and excellent coffee. To date, Kate has been preparing all the sandwiches, soups and drinks herself, based on her instincts and experience. However, the long hours and the success of the business have necessitated hiring some staff.

Onion soup, one of the soups Kate sells, is an exquisitely rich, flavourful soup with a French touch. In Kate’s experience, the ideal temperature for onion soup is 142°C (hot but not “too” hot).

While training one of the new cooks, Kate took a random sample of thirty-one bowls of onion soup and measured the temperature of each bowl using an instant-read digital kitchen thermometer that is accurate to the nearest whole degree. The data are in the file veggie.xls, on the “Onion soup” tab.

Getting the brew time of espresso right is one of the most difficult tasks for a barista. The quality of espresso is a function of the fineness of the coffee grind, the temperature of the water and the amount of coffee released into the filter basket (all machine settings) plus the amount of pressure used in tamping (pressing down) the coffee into the filter. This last variable is the human factor that baristas need to learn. If not enough pressure is used, the coffee is packed too loosely and the brew time will be too short, causing the espresso to have a weak flavour. If too much pressure is used, the coffee is packed too tightly, causing the brew time to be too long and resulting in a coffee that is excessively acidic and bitter. During one shift, Kate timed a random sample of thirty-two espressos that one of the new baristas was brewing. The data, in seconds, are in the file veggie.xls, on the “Espresso” tab.

  1. At α=0.10 does the sample data provide evidence to show that the true mean temperature of the onion soup differs from 142 degrees?
  2. At α=0.10 does the sample data provide evidence to show that the true standard deviation of the onion soup is greater than 3 degrees?
  3. Estimate with 95% confidence the barista’s mean brew times of the Espresso. Does your interval show that the brew times are consistent with the ideal brew time which is 22 seconds? What will be your conclusion?
  4. Estimate with 95% confidence the barista’s variability of the brew times of the Espresso.
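
The four questions map onto standard one-sample procedures. The formulas below are a sketch, assuming the temperatures and brew times are approximately normally distributed (the case does not say so). Here \bar{x} and s are the sample mean and standard deviation from the relevant tab of veggie.xls, and a subscript such as \chi^2_{a,\,df} denotes the value with area a to its right.

```latex
% Q1: two-tailed t test of H0: mu = 142, with n = 31 and alpha = 0.10
t = \frac{\bar{x} - 142}{s/\sqrt{31}}, \qquad \text{reject } H_0 \text{ if } |t| > t_{0.05,\,30}

% Q2: upper-tail chi-squared test of H0: sigma^2 = 3^2 = 9, with n = 31 and alpha = 0.10
\chi^2 = \frac{(31-1)\,s^2}{9}, \qquad \text{reject } H_0 \text{ if } \chi^2 > \chi^2_{0.10,\,30}

% Q3: 95% confidence interval for the mean brew time, n = 32
\bar{x} \pm t_{0.025,\,31}\,\frac{s}{\sqrt{32}}

% Q4: 95% confidence interval for the brew-time variance, n = 32
\frac{31\,s^2}{\chi^2_{0.025,\,31}} \;\le\; \sigma^2 \;\le\; \frac{31\,s^2}{\chi^2_{0.975,\,31}}
```

Taking square roots of the limits in Q4 gives the corresponding interval for the standard deviation. For Q3, the brew times are consistent with the ideal only if 22 seconds falls inside the interval.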

Case 3: Cooking Device

A research and development team at a California based appliance firm claims to have developed a new cooking device that consumes an average of no more than 350 W. From previous studies, it is believed that power consumption for these types of cooking devices is normally distributed with a standard deviation of 19.5 W. A consumer watch group, tasked with validating the claims of such firms, suspects the actual average is more than 350 W. The consumer watch group took a sample of these cooking devices which can be found in the file cooking device.xls. (Calculations are to be done to the 4th decimal place.)

  1. Test at the 99% confidence level whether the cooking device consumes more than the appliance firm indicates.
  2. Determine the 95% confidence interval estimate of the population mean and describe the results.
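
Because the population standard deviation is given, both parts use the z distribution. A minimal Python sketch follows; the ten wattage readings are invented placeholders for the contents of cooking device.xls, and "99% confidence level" is read as α = 0.01.

```python
from math import sqrt
from statistics import NormalDist, mean

SIGMA = 19.5   # known population standard deviation (W), from the case
MU_0 = 350     # claimed maximum average consumption (W)

# HYPOTHETICAL readings in watts; replace with the values in "cooking device.xls"
sample = [362, 355, 371, 348, 366, 359, 344, 368, 357, 363]

n = len(sample)
x_bar = mean(sample)
std_error = SIGMA / sqrt(n)

# Question 1: upper-tail z test of H0: mu <= 350 against H1: mu > 350 at alpha = 0.01
z = (x_bar - MU_0) / std_error
z_crit = NormalDist().inv_cdf(0.99)
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.4f}, critical value = {z_crit:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if z > z_crit else "Do not reject H0")

# Question 2: 95% confidence interval for the population mean
margin = NormalDist().inv_cdf(0.975) * std_error
print(f"95% CI: {x_bar - margin:.4f} W to {x_bar + margin:.4f} W")
```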

Case #1 – Micro Systems Inc.

Creativity, innovation and productivity from its employees are essential for the development and progression of Micro Systems. Satisfaction levels within the workplace, along with other criteria, have long been attributed to creativity, innovation and productivity. Micro would like to understand whether the company as a whole is doing well in this department. It is interested in the development of its employees and wants to better understand the current satisfaction level of its employees in terms of career progression, training and employee attitude towards their benefits.

A survey was developed and distributed to 2000 employees. Included with the survey was a letter that discussed Micro’s philosophy; the survey asked several key questions regarding the current level of satisfaction. Of the 2000 surveys mailed, 1214 employees responded.

To get help with the analysis of the survey data, Micro approached ADMS 2320 classes with the hopes of having a statistics student serve as an intern with the company. The intern’s first task was to key the data into a file that could be analyzed using a spreadsheet or a statistical software package. The survey contained seven questions that were keyed into eight columns as follows:

Column A: Respondent number
Column B: Employee Level
5= Senior Management, 4=Mid-level Manager, 3=Supervisor, 2=Developer, 1=Finance
Column C: Satisfaction with career progression
Column D: Satisfaction with Micro’s training programs for staff
Column E: Satisfaction with Micro’s employee benefits program
Columns C, D and E were recorded on an ordinal scale as follows:
1= Very unsatisfied, 2=unsatisfied, 3=neutral, 4=satisfied, 5=very satisfied
Column F: Number of years the respondent has been an employee
Column G: Gender (recorded as 1=male, 2= female)
Column H: Employee age

You have been hired as the local intern at Micro and the Manager has asked you to summarize this data using descriptive graphical and numerical techniques. She has limited you to using charts, graphs and tables to help Micro’s human resources team understand their employees better.

REQUIRED:

Use the appropriate descriptive graphical and numerical techniques to answer the following questions:

  1. Overall analysis of information (total satisfaction levels, total employee level, demographics of gender related to employee level, etc...).
  2. Cross-related analysis of satisfaction information (Satisfaction levels in relation to other information in the case, for example, Satisfaction Level and Gender may be ONE consideration).
  3. Overall, one paragraph write-up of key findings – in point form

The data file for this case is Micro_Systems.xls.

Case 2: Tax Code

The Governor’s Office Manager of the State of Arkansas, Mr. Fox, wants to know if people think that the US tax code is spread fairly across income groups, age groups and education levels. The information will help him prepare the governor’s speech in the White House in April 2011. Mr. Fox asks Fleishman-Hillard Inc. to run a survey that will sample 1000 people from Arkansas, of different age groups, which will ask the following question:

“Do you think the U.S. Tax Code is spread fairly across income groups, age groups and education levels?”

The respondents are asked to indicate whether they think the Tax Code is fairly spread or unfairly spread between the groups, and to also indicate in which of the following age groups they classify themselves to be:

  • 18 - 24
  • 25 - 34
  • 35 – 44
  • 45 – 54
  • 55 - 64
  • 65 or older

The responses to the survey, along with the age groups, can be found in the data file Survey_Data.xls.

In order to better understand people’s attitude towards the Tax Code, Mr. Fox wants to know for each age category, what is the probability that the people felt that the U.S. Tax Code is unfairly spread across income groups, age groups and education levels.

Show all your work to support your calculation.
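
The probability asked for is a conditional one: the number of "unfair" answers within an age group divided by the number of respondents in that group. A short Python sketch of that calculation follows; the eight records are invented stand-ins for the rows of Survey_Data.xls, and the labels "fair" and "unfair" are assumed codings.

```python
from collections import Counter

# HYPOTHETICAL (age group, response) pairs; replace with the rows of Survey_Data.xls
responses = [
    ("18 - 24", "unfair"), ("18 - 24", "fair"), ("18 - 24", "unfair"),
    ("25 - 34", "fair"), ("25 - 34", "unfair"),
    ("65 or older", "fair"), ("65 or older", "fair"), ("65 or older", "unfair"),
]

group_totals = Counter(age for age, _ in responses)
unfair_counts = Counter(age for age, answer in responses if answer == "unfair")

# P(unfair | age group) = (# unfair in group) / (# respondents in group)
p_unfair_given_age = {
    age: unfair_counts[age] / total for age, total in group_totals.items()
}

for age, prob in p_unfair_given_age.items():
    print(f"P(unfair | {age}) = {prob:.4f}")
```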

Case 1 - US Cell phone carrier

In 2011 the companies that dominated the U.S. cell phone carrier market were AT&T, Sprint, Verizon, and T-Mobile. The following table shows the results of a survey that asked 40 people to name the company that provides their current cell phone service. This data can be found in the file Cell_Phone_Carrier.xls posted on the course website with the assignment files.

Sprint Verizon Sprint T-Mobile
AT&T Verizon Sprint Verizon
AT&T AT&T Sprint T-Mobile
AT&T T-Mobile Verizon AT&T
T-Mobile Sprint AT&T AT&T
Sprint T-Mobile AT&T AT&T
Sprint Sprint AT&T Sprint
AT&T Sprint AT&T Sprint
T-Mobile Verizon Verizon Sprint
Verizon Verizon T-Mobile AT&T

In class we introduced a few types of tools that will allow you to describe the data above. Choose the best two tabular and two graphical tools to describe the data above and analyze the results verbally.
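
As a starting point for the tabular tools, the frequency and relative frequency distributions can be tallied directly from the 40 responses listed above. The Python sketch below shows one way to do it; which tools count as "best" is left to the reader.

```python
from collections import Counter

# The 40 survey responses, copied row by row from the table in the case
responses = """
Sprint Verizon Sprint T-Mobile
AT&T Verizon Sprint Verizon
AT&T AT&T Sprint T-Mobile
AT&T T-Mobile Verizon AT&T
T-Mobile Sprint AT&T AT&T
Sprint T-Mobile AT&T AT&T
Sprint Sprint AT&T Sprint
AT&T Sprint AT&T Sprint
T-Mobile Verizon Verizon Sprint
Verizon Verizon T-Mobile AT&T
""".split()

counts = Counter(responses)
n = len(responses)

# Frequency and relative frequency table, most common carrier first
print(f"{'Carrier':<10}{'Frequency':>10}{'Relative':>10}")
for carrier, count in counts.most_common():
    print(f"{carrier:<10}{count:>10}{count / n:>10.3f}")
```

The same counts feed the usual graphical counterparts for nominal data, a bar chart and a pie chart.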

Case 2 - College Salaries

You have decided that you want to become a College professor. You want to identify factors that will lead you to success in this profession. For that purpose you use the information that was obtained from a random sample of 100 professors at an Eastern Ontario College. It is shown in data file posted on the course web site (College_Salaries.xls).

The data are:

  1. Salary: 2011 salary
  2. Seniority: Number of years since first time academic appointment
  3. Doctorate: 1 = holds a PhD or equivalent 0 = does not
  4. Teaching: Average teaching evaluation score out of 5.0
  5. Citations: Number of times any publication of the author’s is cited in another refereed publication
  6. Gender: 1 = male 0 = female

Required:

  1. Use the appropriate graphical techniques, along with the appropriate numerical techniques, to describe the salaries, seniority, teaching evaluations and citation count.
  2. Do the measures used in part 1 appear to differ for:
    1. males and females; and
    2. non-doctorates and holders of doctorates?
  3. Use the appropriate graphical techniques, along with the appropriate numerical techniques, to identify and describe the possible relationship between
    1. salary and number of citations; and
    2. salary and teaching evaluations.
  4. Which of these three variables (seniority, number of citations and teaching evaluations) have the strongest relationship with salary?
  5. Do these measures appear to be different based on gender or on a PhD degree?

Case 3

Greiner et al. (2004) examined the impact of occupational stressors on hypertension among urban transit operators. You wish to validate this study using a sample of 200 transit workers in another city. You administer a stress scale and find a mean score of 18.5 stress scale points with a standard deviation of 4.5. The distribution is normal. Set X = stress scale score.

  1. Any worker who scores 14 or higher on the stress scale is eligible to participate in an intensive interview portion of your study. Given your statistics, how many of the 200 workers in your sample will be given an intensive interview?
  2. What is the probability that the next worker tested will score 10 or below?
  3. Those workers scoring in the highest 15 percent on the stress scale are to be given extensive cardiac testing. What stress score qualifies a participant for these services? Please show all formulas and calculations.
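
All three questions are normal-distribution lookups with mean 18.5 and standard deviation 4.5. The Python sketch below uses the standard library's `NormalDist` in place of a z table and treats X as continuous; the variable names are illustrative.

```python
from statistics import NormalDist

stress = NormalDist(mu=18.5, sigma=4.5)   # X = stress scale score
n_workers = 200

# Question 1: expected number of workers with X >= 14
p_interview = 1 - stress.cdf(14)
expected_interviews = n_workers * p_interview

# Question 2: P(X <= 10)
p_low = stress.cdf(10)

# Question 3: score that cuts off the highest 15% (the 85th percentile)
cutoff = stress.inv_cdf(0.85)

print(f"P(X >= 14) = {p_interview:.4f}, about {expected_interviews:.0f} workers")
print(f"P(X <= 10) = {p_low:.4f}")
print(f"85th percentile = {cutoff:.2f}")
```

By hand, each step standardizes with z = (x − 18.5)/4.5. For Question 1, z = −1, so about 84% of the workers, roughly 168 of 200, are expected to qualify. A z table rounded to two decimals may give a slightly different cutoff in Question 3 than the software value.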

Case 1

People have been debating for years whether children who are in child care facilities while their parents work experience negative effects. A recent study of 10,000 children, discussed in the March 2011 issue of Developing Psychology, found “no permanent negative effects caused by their mothers’ absence.” In fact, the study indicated that there might be some positive benefits from the day care experience. To investigate this premise, a non-profit organization called Child Connections conducted a small study in which children were observed playing in a neutral setting (not at home or at a day care). Over a period of 20 hours of observation, 15 children who did not go to day care and 21 children who had spent much time in day care were observed. The variable of interest was the total time of play in which each child was actively interacting with other students. Assume the data are normal. Data are listed on worksheet “Case 1” of the EXCEL file.

  1. Child Connections leaders claimed that the children who had been in day care have a higher average time in interactive situations than the stay-at-home children. Test this claim at the 5% significance level.
  2. Estimate the average interaction time for children who did attend daycare with 90% confidence. Interpret this estimate.
  3. Estimate the average interaction time for children who did not go to daycare with 99% confidence. Interpret this estimate.
  4. Estimate the average difference in interaction time between children who do go to daycare and children who do not with 95% confidence. Interpret this estimate.
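
For Questions 1 and 4, one common approach is the equal-variances (pooled) t procedure. The formulas below are a sketch under that assumption, which the case does not state; if the two sample variances look very different, the unequal-variances version is the safer choice. Subscript 1 denotes the 21 day care children and subscript 2 the 15 stay-at-home children.

```latex
% Question 1: H0: mu_1 - mu_2 = 0   vs   H1: mu_1 - mu_2 > 0, alpha = 0.05
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
      = \frac{20\,s_1^2 + 14\,s_2^2}{34}

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{21} + \frac{1}{15}\right)}},
\qquad \text{reject } H_0 \text{ if } t > t_{0.05,\,34}

% Question 4: 95% confidence interval for mu_1 - mu_2
(\bar{x}_1 - \bar{x}_2) \pm t_{0.025,\,34}\sqrt{s_p^2\left(\frac{1}{21} + \frac{1}{15}\right)}
```

Questions 2 and 3 are one-sample t intervals, \bar{x} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}, with n = 21 at 90% confidence and n = 15 at 99% confidence respectively.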

Case 2

The general trend over the last century is that each generation is more educated than its predecessor. Has this trend continued? To answer this question, determine whether there is sufficient evidence that Americans (EDUC on worksheet “Case 2” of the data file) are more educated than their parents (PAEDUC for fathers and MAEDUC for mothers). Use a 95% confidence level.

Case 3

Quick Lube is a company that offers an oil-change service while the customer waits. Its market has been broken down into the following segments:

  1. Working men and women too busy to wait at a dealer or service center
  2. Spouses who work in the home
  3. Retired persons
  4. Other

A random sample of car owners was drawn. All owners classified their market segment and also reported whether they usually use such services as Quick Lube (1 = yes, and 2 = no). These data are stored in stacked format on worksheet “Case 3” of the EXCEL file.

  1. Determine whether members of segment 1 are more likely than members of segment 4 to respond that they usually use services such as Quick Lube.
  2. Can we infer that retired persons and spouses who work in the home differ in their use of services such as Quick Lube?
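
Both questions compare two population proportions, so a pooled two-proportion z test is a natural fit. The Python sketch below assumes that approach; the counts are invented placeholders for the tallies from worksheet “Case 3”, and the helper name `two_proportion_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

# HYPOTHETICAL counts of "yes" (code 1) answers and segment sizes;
# replace with the tallies from worksheet "Case 3"
z_q1 = two_proportion_z(x1=60, n1=150, x2=30, n2=100)   # segment 1 vs segment 4
z_q2 = two_proportion_z(x1=45, n1=120, x2=40, n2=130)   # segment 3 vs segment 2

# Question 1 is one-tailed (H1: p1 > p4); Question 2 is two-tailed (H1: p3 != p2)
p_value_q1 = 1 - NormalDist().cdf(z_q1)
p_value_q2 = 2 * (1 - NormalDist().cdf(abs(z_q2)))
print(f"Q1: z = {z_q1:.4f}, p-value = {p_value_q1:.4f}")
print(f"Q2: z = {z_q2:.4f}, p-value = {p_value_q2:.4f}")
```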

Case 1: Goesling

Goesling (2001) examined the phenomenon of world income inequality, both within and between nations around the world. Goesling is more interested in measuring the descriptive statistics and particularly in determining the modal monthly income. Suppose the following are a sample of monthly incomes for residents of the United States:

$2,347, $2,434, $1,636, $1,963, $2,358, $1,968, $2,683.

Required:

  1. Compute the mean, median, and modal monthly income.
  2. Compute the range.
  3. Compute the standard deviation.
  4. Compute the variance and Coefficient of Variation (CV) and interpret the significance of the result of CV in this case
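
The four requirements can be checked with Python's `statistics` module. The sketch below treats the seven incomes as a sample, so the variance divides by n − 1; the variable names are illustrative.

```python
from statistics import mean, median, multimode, stdev, variance

incomes = [2347, 2434, 1636, 1963, 2358, 1968, 2683]   # monthly incomes from the case

avg = mean(incomes)
mid = median(incomes)
modes = multimode(incomes)            # returns every value when none repeats
value_range = max(incomes) - min(incomes)
var = variance(incomes)               # sample variance (divides by n - 1)
sd = stdev(incomes)                   # sample standard deviation
cv = sd / avg                         # coefficient of variation

print(f"mean = {avg:.4f}, median = {mid}, range = {value_range}")
print("no single mode" if len(modes) == len(incomes) else f"mode(s) = {modes}")
print(f"variance = {var:.4f}, std dev = {sd:.4f}, CV = {cv:.2%}")
```

Because no income occurs more than once, the sample has no meaningful mode, which is worth noting given the emphasis on modal income. The CV comes out at roughly 16%, meaning the standard deviation is about one sixth of the mean.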

Case 2: Hotel

A suburban hotel derives its gross income from both its hotel and restaurant operations. The owners are interested in getting a statistical understanding of these component pieces of their operation and wonder if there is a relationship between the number of rooms occupied on a nightly basis and the revenue per day in the restaurant. Below is a sample of 25 days (Monday through Thursday) from last year showing the restaurant income at breakfast and number of rooms occupied

  1. For both the # of rooms occupied and breakfast income from the restaurant determine:
    1. The mean, median and mode
    2. The variance, standard deviation, and range.
  2. What is the coefficient of correlation between the variables and what does this tell you?
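
The coefficient of correlation in part 2 can be computed from the sums of squares and cross-products. The Python sketch below shows the calculation; because the 25-day table is not reproduced here, the six data pairs are invented purely to make the example run.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample (Pearson) coefficient of correlation between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# HYPOTHETICAL values; replace with the 25 observations from the case
rooms_occupied = [23, 47, 21, 39, 37, 29]
breakfast_income = [1452, 1361, 1426, 1470, 1456, 1430]

r = correlation(rooms_occupied, breakfast_income)
print(f"r = {r:.4f}")
```

A value of r near +1 or −1 indicates a strong linear relationship between occupancy and breakfast income, and a value near 0 indicates little or no linear relationship.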

Case 3: Smoking

The following table shows the number of persons 70 years old in terms of their gender and smoking habits.

                    Smoking Habit
Gender    Current Smoker    Former Smoker    Never Smoked
Male               7,000           16,000           9,000
Female             6,000            9,000          33,000

  1. Convert the table above into a probability table, showing joint and marginal probabilities.
  2. Are gender and smoking habit dependent or independent events?
  3. What is the probability that a 70-year-old person is male, given that a person is a current smoker?
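
The three parts follow directly from the counts in the table. The Python sketch below shows the steps; the dictionary layout and names are illustrative choices.

```python
counts = {
    "Male":   {"Current": 7_000, "Former": 16_000, "Never": 9_000},
    "Female": {"Current": 6_000, "Former": 9_000,  "Never": 33_000},
}
total = sum(sum(row.values()) for row in counts.values())

# Question 1: joint and marginal probabilities
joint = {(g, h): c / total for g, row in counts.items() for h, c in row.items()}
p_gender = {g: sum(row.values()) / total for g, row in counts.items()}
p_habit = {h: sum(counts[g][h] for g in counts) / total for h in counts["Male"]}

# Question 2: independent only if every joint probability equals the product of its marginals
independent = all(
    abs(joint[g, h] - p_gender[g] * p_habit[h]) < 1e-9 for g, h in joint
)

# Question 3: P(Male | Current smoker) = P(Male and Current) / P(Current)
p_male_given_current = joint["Male", "Current"] / p_habit["Current"]

print(joint, p_gender, p_habit, sep="\n")
print(f"independent: {independent}")
print(f"P(Male | Current) = {p_male_given_current:.4f}")
```

For example, P(Male and Current) = 7,000/80,000 = 0.0875, while P(Male) × P(Current) = 0.40 × 0.1625 = 0.065. The two differ, so gender and smoking habit are dependent.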

Case 4: Market Segmentation

Data was collected for a Consumer Electronic Study using a web survey of a sample of IP Addresses in the Greater Toronto area. The marketing manager for a major electronic retailer wants to know how important quality is to her customers. A recent market report that is based on past research indicated that 30.10% of all consumers in Ontario are more interested in less expensive products than high-priced quality products. The marketing manager suspects that customers from her store are different and that customers of different age groups might have different views as well. As a result, she surveys a sample of 1000 customers of different age groups and asks the following question:

“I would generally buy quality higher-priced premium goods than less expensive generic goods.”

The respondents are asked to indicate whether they “agree” with the statement (meaning they are willing to pay more money for quality goods) or “disagree” with the statement (meaning that they would prefer to purchase lower quality products at cheaper prices), and to also indicate in which of the following age groups they classify themselves to be:

  • 18 - 24
  • 25 - 34
  • 35 – 44
  • 45 – 54
  • 55 - 64
  • 65 or older

The responses to the survey along with the age groups can be found in the data file Electronic_Segmentation.xls file.

Required: In order to better understand customer attitude towards price and quality, the manager wants to know given each age category, what is the probability that the customer agrees that higher price and quality is more important than less quality for cheaper price?

Case 5: Analysis of Mutual Fund Managers

There are thousands of mutual funds available. There is no shortage of sources of information about them. Newspapers regularly report the value of each unit, mutual fund companies and brokers advertise extensively, and there are books on the subject. Many of the advertisements imply that individuals should invest in the advertiser’s mutual fund because it has performed well in the past. Unfortunately there is little evidence to infer that past performance is a predictor of the future. However, it may be possible to acquire useful information by examining the managers of mutual funds. Several researchers have studied the issue. One project gathered data concerning the performance of 2,029 funds.

The performance of each fund was measured by its risk-adjusted excess return, which is the difference between the return on investment of the fund and a return that is considered a standard. The standard is based on a variety of variables including the risk-free rate.

There are four variables that describe the fund manager. They are age, tenure (how many years the manager has been in charge), whether the manager had an MBA (1 = yes, 0 = no), and a measure of the quality of the manager’s education [the average Scholastic Achievement Test (SAT) score of students at the university where the manager received his or her undergraduate degree].

This case is based on Judith Chevalier and Glenn Ellison, “Are Some Mutual Fund Managers Better Than Others? Cross-sectional Patterns in Behavior and Performance,” Working Paper 5852, National Bureau of Economic Research.

The data are found in Mutual.xls:

  • Column 1 Return - Performance of the fund
  • Column 2 SAT - Measure of the manager’s education (average SAT score of students at the university where the manager received his/her undergraduate degree)
  • Column 3 Code (MBA) - 0 = manager has no MBA, 1 = manager has MBA
  • Column 4 Age - Age of Manager
  • Column 5 Tenure - How many years manager has been in charge

Required: An analysis of the data
  1. Use the appropriate graphical techniques to compare the Performance of funds managed by those who have an MBA to those who do not have an MBA. Comment on the distributions (limit your comments to 1 to 2 sentences.)
  2. Use appropriate numerical descriptive statistics
    1. to describe the Performance of funds managed by managers who have an MBA to those who do not have an MBA.
    2. to compare the performance of funds based on the years a manager has been in charge, more specifically - compare two groups – Managers with less than 5 years tenure to Managers with 5 and more years experience.
    3. Comment on the results found in parts 1 and 2.
  3. Use the appropriate graphical and numerical technique to describe the relationship between rate of return and the SAT score of students where the manager received his/her undergraduate degree. Comment on your findings (limit to one sentence).

Case 1: Rocky - Health and Fitness club

In the file SurveyResults.xls, you will find the results of over 1200 questionnaires sent out by Rocky health and fitness club to its members.

The columns are as follows:

Column A: Weight and Exercise Equipment Satisfaction
Column B: Club Staff Satisfaction
Column C: Exercise Programs Satisfaction
Column D: Overall Club Satisfaction
Scale for Columns A – D:
5 = Very Satisfied, 4 = Satisfied, 3 =Neutral, 2 = Dissatisfied, 1 = Very Dissatisfied
Column E: Years with the club
Column F: Gender (1 = Male, 2 = Female)
Column G: Number of Visits per Week
Column H: Age of Customer

Management is interested in the following:

  1. Use proper descriptive statistical techniques (Tables/Appropriate charts/ Appropriate Numerical values) to describe the following key variables;
    1. Overall Club Satisfaction,
    2. Years with the club,
    3. Number of Visits per Week
    4. Age of Customers.
  2. A graph and a table to present a comparison of the customer’s gender versus Exercise Programs Satisfaction.
  3. A graph and a table to present a comparison of the customer’s gender versus Overall Club Satisfaction.
  4. The appropriate chart and numerical value/s to explore the relationship between Years with the club and customer’s age.

**For each of the following requirements, add comments about the findings

Case 2: Guilty or Not

  1. Your initial belief is that a defendant in a court case is guilty with probability 0.5. A witness comes forward claiming he saw the defendant commit the crime. You know the witness is not totally reliable and tells the truth with probability p. Calculate the posterior probability that the defendant is guilty, based on the witness’s evidence.
  2. A second witness, equally unreliable, comes forward and claims she saw the defendant commit the crime. Assuming the witnesses are not colluding, what is your posterior probability of guilt?
  3. In total, n equally unreliable witnesses claim that they saw the defendant commit the crime. If there is no collusion among them, what is your posterior probability of guilt?
  4. Compare the answers to questions 1, 2 and 3. How do you explain this result?
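
The posterior probabilities follow from Bayes’ theorem. The derivation below is a sketch under one reading of the case: G is the event that the defendant is guilty, W_i is the event that witness i claims to have seen the crime, P(W_i | G) = p and P(W_i | not G) = 1 − p, and the witnesses are conditionally independent. The symbol n for the total number of witnesses is an assumption introduced here.

```latex
% One witness
P(G \mid W_1) = \frac{p \cdot 0.5}{p \cdot 0.5 + (1-p) \cdot 0.5} = p

% Two independent witnesses
P(G \mid W_1, W_2) = \frac{p^2 \cdot 0.5}{p^2 \cdot 0.5 + (1-p)^2 \cdot 0.5}
                   = \frac{p^2}{p^2 + (1-p)^2}

% n independent witnesses
P(G \mid W_1, \dots, W_n) = \frac{p^n}{p^n + (1-p)^n}
```

When p > 0.5, each additional witness pushes the posterior closer to 1. When p = 0.5 the testimony carries no information and the posterior stays at 0.5, and when p < 0.5 more witnesses make guilt less likely.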

Case 3: Violence on TV

In a newspaper poll concerning violence on television, 600 people were asked, “what is your opinion of the amount of violence on prime time television – is there too much violence on television?” Their responses are indicated in the table below.

             Yes (Y)   No (N)   Don’t know   Total
Men (M)          162       95           23     280
Women (W)        256       45           19     320
Total            418      140           42     600

Suppose we label the events in the following manner: W is the event that a response is from a woman, M is the event that a response is from a man, Y is the event that a response is yes, and N is the event that a response is no.

Use the given table to find the following probabilities and state the events in words for questions 3 to 6.

  1. P(N)
  2. P(W)
  3. P(N|W)
  4. P(W|N)
  5. P(W or N)
  6. P(M and Y)
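
Each probability is a ratio of cell counts from the table. The Python sketch below keeps the results as exact fractions; the cell labels ("DK" for Don’t know) and the helper `prob` are illustrative choices.

```python
from fractions import Fraction

counts = {
    ("M", "Y"): 162, ("M", "N"): 95, ("M", "DK"): 23,
    ("W", "Y"): 256, ("W", "N"): 45, ("W", "DK"): 19,
}
total = sum(counts.values())

def prob(event):
    """Probability of the set of cells for which event(gender, answer) is true."""
    return Fraction(sum(c for cell, c in counts.items() if event(*cell)), total)

p_n = prob(lambda g, a: a == "N")
p_w = prob(lambda g, a: g == "W")
p_w_and_n = prob(lambda g, a: g == "W" and a == "N")

results = {
    "P(N)": p_n,
    "P(W)": p_w,
    "P(N|W)": p_w_and_n / p_w,                  # conditional probability
    "P(W|N)": p_w_and_n / p_n,
    "P(W or N)": p_w + p_n - p_w_and_n,         # addition rule
    "P(M and Y)": prob(lambda g, a: g == "M" and a == "Y"),
}
for name, value in results.items():
    print(f"{name} = {value} = {float(value):.4f}")
```

In words, P(N|W) is the probability that a response is "no" given that it comes from a woman (45/320), whereas P(W|N) is the probability that a "no" response comes from a woman (45/140).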

Case 4: Registered Voters

A researcher is conducting a survey of county residents in a Midwestern state. Foolishly, she uses lists of registered voters as a sampling frame and randomly draws phone numbers and addresses from the lists. As it turns out, thirty-seven percent of eligible adults are not registered to vote. In addition, of those registered to vote, seven percent do not have accurate addresses or telephone contact information.

What percent of the county’s adults have a chance of being contacted by the researcher? Describe your results.
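
An adult can be contacted only if they are registered and their contact details are accurate, so the multiplication rule applies. A brief Python sketch, with illustrative variable names:

```python
p_registered = 1 - 0.37                  # share of eligible adults on the voter lists
p_reachable_if_registered = 1 - 0.07     # share of registered voters with accurate contact details

# Multiplication rule: P(registered and reachable) = P(registered) * P(reachable | registered)
p_contactable = p_registered * p_reachable_if_registered
print(f"{p_contactable:.2%} of adults can be contacted")
```

This gives 0.63 × 0.93 = 0.5859, so a little under 59% of the county’s adults can end up in the sample.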

PRODUCTION BUSINESS INDUSTRY

The motion picture industry is a competitive business. Many studios produce more than 500 motion pictures each year, and the financial success of each picture varies considerably. Success of the picture mainly depends on the business it produced in the opening week.

Data consists of common variables used to measure the success of a motion picture, which are opening weekend gross sales ($ millions), the total gross sales ($ millions), the number of theatres the movie was shown in, the numbers of weeks the motion picture was in the top 60, and type of picture (1 = Action, 2 = Horror, 3 = Romance, 4 = Family, 5 = Comedy).

Data were collected for 120 movies in the year 2009. The data file “Data.xls” has been provided.

Use the appropriate graphical techniques, along with the appropriate numerical descriptive statistics to learn how these variables contribute to the success of a motion picture.

At least the following should be included:

  1. Numerical descriptive statistics and tabular and graphical summaries for each of the variables, along with a discussion of what each summary tells about the motion picture industry. (Hint: categorize each type of movie.)
  2. What motion pictures, if any, should be considered high-performance outliers? Explain.
  3. Draw and discuss a scatter diagram to explore the relationship between total gross sales and other appropriate variables.
  4. Compute and interpret numerical descriptive statistics showing the relationship between total gross sales and other appropriate variables.

Case 2

To investigate how college students use selected education-related technology in the classroom, an online survey method was used to gather data for this study. The data collection took place in March/April 2011 at an American university in New York State and a total of 230 responses were obtained. The responses to the survey, along with the gender, class standing, number of times the student accessed the course website per week and the functional use of the laptop in class, can be found in the data file Survey-Data.xls.

  1. Find the cross table values and percentages by gender and the number of times the student accessed the course website per week (i.e. number and percentage of males and females who accessed the site 1-3, 4-7, etc. times per week); by gender and the functional use of the laptop in class; by class standing and the number of times the student accessed the course website per week; and by class standing and the functional use of the laptop in class.
  2. Then, in order to understand students’ attitude towards the number of times the student accessed the course website per week, evaluate the probabilities that the gender is female across the number of times they access the course website; 1-3, 4-7, 8-11 and 12+. For example, given the student is a female, what is the probability that they will access the site 1-3 times per week, 4-7 times per week, etc...
  3. Also, in order to understand students’ attitude towards the functional use of the laptop in class, evaluate the probabilities that the student is a junior across laptop use in class (research class-related material, check e-mail, play games, get on social media, not bring laptop to class and complete work for other classes). For example, given the student is a junior, what is the probability that they use the laptop for research class-related material, check e-mail, etc...

Case 1: BeyondTheRack

BeyondTheRack (BTR), a division of UK Clothing Group, is a British chain of women’s apparel stores operating across the United Kingdom.

The chain recently ran a promotion in which discount coupons were sent to customers of other UK Clothing Group stores. Data collected for a sample of 100 in-store credit card transactions at BTR stores during one day while the promotion was running are contained in the file named BTR.xls.

The proprietary card method of payment refers to charges made using a UK Clothing Group charge card. Customers who made a purchase using a discount coupon are referred to as promotional customers and customers who made a purchase but did not use a discount coupon are referred to as regular customers. Because the promotional coupons were not sent to regular BTR stores customers, management considers the sales made to people presenting the promotional coupons as sales it would not otherwise make. Of course, BTR also hopes that the promotional customers will continue to shop at its stores. Most of the variables shown in the file BTR.xls are self-explanatory, but two of the variables require some clarification:

Items: The total number of items purchased
Net Sales: The total amount ($) charged to the credit card

BTR’s management would like to use this sample data to learn about its customer base and to evaluate the promotion involving discount coupons.

Required:

Use the tabular and graphical methods of descriptive statistics to help management develop a customer profile and to evaluate the promotional campaign.

You should provide the following:

  1. Relative frequency distributions for the key variables: # of Items, Net Sales, Method of Payment, Gender, Marital Status & Age (Tables & Appropriate charts). Also add the appropriate numerical descriptive statistics values.
  2. A frequency distribution and appropriate chart(s) of the type of customer versus method of payment.
  3. The appropriate chart to explore the relationship between net sales and customer age.

**For each of the following requirements, add comments about the findings

Notes:

  • Organization: Your assignment must be well-organized. Students have lost marks in the past when we were not able to find questions or certain parts of questions.

Case 2: MoneyBall

Over the Christmas holiday, a stats professor, Dr. Bright, was enjoying some well-deserved rest once the marking, from the previous semester’s courses, had been completed. With the snow piled high and the thermometer plummeting, he decided to stay cozy and catch up on a number of films that he hadn’t seen in a long time. One of the films on his list was Moneyball, the story of how Billy Beane turned the baseball world on its head by introducing, wait for it – objective statistical analysis. This was baseball heresy and the ‘traditional’ baseball world was salivating at the thought of the monumental catastrophe awaiting Billy and the Oakland A’s.

Seeing as the film was both immensely entertaining and filled with the fun and practical implementation of stats (has a better movie ever been made?), he decided to research the validity of the claims made in the film regarding the # of wins, the size of a team’s payroll and the impact of playing Moneyball. In the movie, a Yale economics grad, Peter Brand (in real life it was actually Paul DePodesta – he didn’t want his real name used in the film, and he actually went to Harvard, not Yale), spent countless hours examining reams of data about the OBA (on-base average, i.e. percentage) of thousands of professional baseball players in both the major and minor leagues.

After being highly entertained, Dr Bright went and looked at the data since the 2002 season (the season in which the film took place) and discovered the following:

42% of teams now use the Moneyball approach. The league can be divided into 2 classes of teams – the rich ($100+ Million in annual payroll) and the poor (?) with smaller payrolls (the smallest being the Houston Astros at just over $24M, in 2013). 74% of the teams using the Moneyball strategy are ‘poor’, whereas 74% of the teams not using the Moneyball strategy are rich.

  1. What percentage of teams have payrolls below $100M?
  2. If you randomly select a team from amongst the ‘poor’ what is the probability it employs the Moneyball strategy?
  3. If you randomly select from the set of ‘rich teams’ what is the probability it employs the Moneyball strategy?
  4. What is the probability that either a team plays Moneyball and has a payroll below $100M, or is rich and doesn’t play Moneyball?
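
The four answers come from the multiplication rule, the law of total probability and Bayes’ rule. The Python sketch below assumes every team is either "rich" or "poor" and that "poor" means a payroll below $100M; the variable names are illustrative.

```python
p_mb = 0.42                 # P(Moneyball)
p_poor_given_mb = 0.74      # P(poor | Moneyball)
p_rich_given_not_mb = 0.74  # P(rich | not Moneyball)

# Joint probabilities from the multiplication rule
p_mb_poor = p_mb * p_poor_given_mb
p_mb_rich = p_mb * (1 - p_poor_given_mb)
p_not_rich = (1 - p_mb) * p_rich_given_not_mb
p_not_poor = (1 - p_mb) * (1 - p_rich_given_not_mb)

p_poor = p_mb_poor + p_not_poor                 # Question 1: total probability
p_mb_given_poor = p_mb_poor / p_poor            # Question 2: Bayes' rule
p_mb_given_rich = p_mb_rich / (1 - p_poor)      # Question 3: Bayes' rule
p_q4 = p_mb_poor + p_not_rich                   # Question 4: two mutually exclusive events

print(f"Q1: P(poor) = {p_poor:.4f}")
print(f"Q2: P(Moneyball | poor) = {p_mb_given_poor:.4f}")
print(f"Q3: P(Moneyball | rich) = {p_mb_given_rich:.4f}")
print(f"Q4: {p_q4:.4f}")
```

A probability tree with the Moneyball split first and the rich/poor split second lays out the same four joint probabilities.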

Case 3: Geography and Education

A firm selling products geared to those with higher education hired a demographer to tell them more about geographical location and education achievement. The demographer provided the firm with the following joint probabilities.

Education                             Maritimes   Quebec   Ontario   Man/Sask   Alta/BC
Less than secondary school diploma         .008     .031      .030       .021      .019
Secondary diploma                          .042     .072      .108       .059      .043
Some postsecondary schooling               .008     .032      .021       .016      .019
Undergraduate degree                       .032     .059      .152       .038      .080
Post-graduate degree                       .010     .018      .031       .008      .043

Required:

  1. What is the probability that someone living in Ontario has an undergraduate degree?
  2. What is the probability that someone is living in Alberta or British Columbia?
  3. What is the probability that someone living in the Maritimes does not have a secondary school diploma?
  4. What is the probability that someone living in Quebec has a University degree?
  5. What is the probability that someone living west of the Ottawa River (Ontario, Manitoba, Saskatchewan, Alberta, and British Columbia) has a post-graduate degree? Compare that to the probability for someone living east of the Ottawa River (Quebec and Maritimes) having a post-graduate degree?
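
Most of these questions are conditional probabilities: a joint probability divided by the marginal probability of the region. The Python sketch below works from the joint table; it reads "does not have a secondary school diploma" as the first row and "University degree" as undergraduate or post-graduate, and the helper names `p` and `p_given` are illustrative.

```python
regions = ["Maritimes", "Quebec", "Ontario", "Man/Sask", "Alta/BC"]
joint = {
    "Less than secondary": [0.008, 0.031, 0.030, 0.021, 0.019],
    "Secondary":           [0.042, 0.072, 0.108, 0.059, 0.043],
    "Some postsecondary":  [0.008, 0.032, 0.021, 0.016, 0.019],
    "Undergraduate":       [0.032, 0.059, 0.152, 0.038, 0.080],
    "Post-graduate":       [0.010, 0.018, 0.031, 0.008, 0.043],
}

def p(levels, places):
    """Joint probability of having one of `levels` and living in one of `places`."""
    cols = [regions.index(r) for r in places]
    return sum(joint[level][c] for level in levels for c in cols)

def p_given(levels, places):
    """Conditional probability P(levels | places)."""
    return p(levels, places) / p(joint, places)

q1 = p_given(["Undergraduate"], ["Ontario"])
q2 = p(joint, ["Alta/BC"])                      # marginal probability of the region
q3 = p_given(["Less than secondary"], ["Maritimes"])
q4 = p_given(["Undergraduate", "Post-graduate"], ["Quebec"])
q5_west = p_given(["Post-graduate"], ["Ontario", "Man/Sask", "Alta/BC"])
q5_east = p_given(["Post-graduate"], ["Maritimes", "Quebec"])

for name, value in [("Q1", q1), ("Q2", q2), ("Q3", q3), ("Q4", q4),
                    ("Q5 west", q5_west), ("Q5 east", q5_east)]:
    print(f"{name}: {value:.4f}")
```

For instance, Question 1 is 0.152 divided by the Ontario column total of 0.342, not the joint probability 0.152 itself.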