# STAT 200 Week 7 Homework Problems

10.1.2

Table #10.1.6 contains the value of the house and the amount of rental income in a year that the house brings in ("Capital and rental," 2013). Create a scatter plot and find a regression equation between house value and rental income. Then use the regression equation to find the rental income for a house worth \$230,000 and for a house worth \$400,000. Which rental income that you calculated do you think is closer to the true rental income? Why?

Table #10.1.6: Data of House Value versus Rental

| Value | Rental | Value | Rental | Value | Rental | Value | Rental |
|---|---|---|---|---|---|---|---|
| 81000 | 6656 | 77000 | 4576 | 75000 | 7280 | 67500 | 6864 |
| 95000 | 7904 | 94000 | 8736 | 90000 | 6240 | 85000 | 7072 |
| 121000 | 12064 | 115000 | 7904 | 110000 | 7072 | 104000 | 7904 |
| 135000 | 8320 | 130000 | 9776 | 126000 | 6240 | 125000 | 7904 |
| 145000 | 8320 | 140000 | 9568 | 140000 | 9152 | 135000 | 7488 |
| 165000 | 13312 | 165000 | 8528 | 155000 | 7488 | 148000 | 8320 |
| 178000 | 11856 | 174000 | 10400 | 170000 | 9568 | 170000 | 12688 |
| 200000 | 12272 | 200000 | 10608 | 194000 | 11232 | 190000 | 8320 |
| 214000 | 8528 | 208000 | 10400 | 200000 | 10400 | 200000 | 8320 |
| 240000 | 10192 | 240000 | 12064 | 240000 | 11648 | 225000 | 12480 |
| 289000 | 11648 | 270000 | 12896 | 262000 | 10192 | 244500 | 11232 |
| 325000 | 12480 | 310000 | 12480 | 303000 | 12272 | 300000 | 12480 |

Scatterplot with regression equation:

Hence, the line of best fit is:

Y = 0.0244x + 5363.9

where x is the house value and Y is the predicted annual rental income.

For x = \$230,000:

Predicted Rental = 0.0244 × 230000 + 5363.9 = \$10,975.90

For x = \$400,000:

Predicted Rental = 0.0244 × 400000 + 5363.9 = \$15,123.90

The rental income calculated for \$230,000 is likely to be closer to the true rental income. The house values in the data run from \$67,500 to \$325,000, so \$230,000 lies within the range of the original data, whereas \$400,000 lies outside it and the prediction is an extrapolation.
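The fit can also be reproduced without Excel. The following is a minimal sketch, assuming the 48 (value, rental) pairs are typed in from Table #10.1.6; the names `houses` and `fit_line` are illustrative and not part of the original solution.

```python
# Ordinary least-squares line for rental income vs. house value.
# Data pairs (value, rental) are transcribed from Table #10.1.6.
houses = [
    (81000, 6656), (77000, 4576), (75000, 7280), (67500, 6864),
    (95000, 7904), (94000, 8736), (90000, 6240), (85000, 7072),
    (121000, 12064), (115000, 7904), (110000, 7072), (104000, 7904),
    (135000, 8320), (130000, 9776), (126000, 6240), (125000, 7904),
    (145000, 8320), (140000, 9568), (140000, 9152), (135000, 7488),
    (165000, 13312), (165000, 8528), (155000, 7488), (148000, 8320),
    (178000, 11856), (174000, 10400), (170000, 9568), (170000, 12688),
    (200000, 12272), (200000, 10608), (194000, 11232), (190000, 8320),
    (214000, 8528), (208000, 10400), (200000, 10400), (200000, 8320),
    (240000, 10192), (240000, 12064), (240000, 11648), (225000, 12480),
    (289000, 11648), (270000, 12896), (262000, 10192), (244500, 11232),
    (325000, 12480), (310000, 12480), (303000, 12272), (300000, 12480),
]


def fit_line(pairs):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(houses)
x_min = min(x for x, _ in houses)
x_max = max(x for x, _ in houses)

for value in (230_000, 400_000):
    predicted = intercept + slope * value
    # A prediction outside [x_min, x_max] is an extrapolation.
    note = "within data range" if x_min <= value <= x_max else "extrapolation"
    print(f"value = {value}: predicted rental = {predicted:.2f} ({note})")
```

Because the sketch keeps the unrounded slope, its predictions can differ by a few dollars from the hand calculations, which use the rounded slope 0.0244.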

10.1.4

The World Bank collected data on the percentage of GDP that a country spends on health expenditures ("Health expenditure," 2013) and also the percentage of women receiving prenatal care ("Pregnant woman receiving," 2013). The data for the countries where this information is available for the year 2011 are in table #10.1.8. Create a scatter plot of the data and find a regression equation between percentage spent on health expenditure and the percentage of women receiving prenatal care. Then use the regression equation to find the percent of women receiving prenatal care for a country that spends 5.0% of GDP on health expenditure and for a country that spends 12.0% of GDP. Which prenatal care percentage that you calculated do you think is closer to the true percentage? Why?

Table #10.1.8: Data of Health Expenditure versus Prenatal Care

| Health Expenditure (% of GDP) | Prenatal Care (%) |
|---|---|
| 9.6 | 47.9 |
| 3.7 | 54.6 |
| 5.2 | 93.7 |
| 5.2 | 84.7 |
| 10.0 | 100.0 |
| 4.7 | 42.5 |
| 4.8 | 96.4 |
| 6.0 | 77.1 |
| 5.4 | 58.3 |
| 4.8 | 95.4 |
| 4.1 | 78.0 |
| 6.0 | 93.3 |
| 9.5 | 93.3 |
| 6.8 | 93.7 |
| 6.1 | 89.8 |

Scatterplot along with line of best fit:

The line of best fit is:

Y = 1.6606x + 69.739

where x is health expenditure (% of GDP) and Y is the predicted percentage of women receiving prenatal care.

For health expenditure x = 5%:

Predicted Prenatal care = 1.6606 × 5 + 69.739 = 78.042%

For health expenditure x = 12%:

Predicted Prenatal care = 1.6606 × 12 + 69.739 = 89.67%

The prenatal care percentage predicted for 5% health expenditure is likely to be closer to the actual value. The expenditures in the data run from 3.7% to 10.0%, so 5% lies within the range of the original data, whereas 12% lies outside it and the prediction is an extrapolation.
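The two predictions, and the check on whether each one falls inside the observed data range, can be sketched as follows. This is only an illustration: the coefficients are the rounded values quoted in the solution, the range limits are the smallest and largest expenditures in Table #10.1.8, and the function name `predict_prenatal` is hypothetical.

```python
# Evaluate the fitted line and flag predictions made outside the observed x-range.
SLOPE, INTERCEPT = 1.6606, 69.739  # rounded coefficients quoted in the solution
X_MIN, X_MAX = 3.7, 10.0           # smallest and largest expenditure in Table #10.1.8


def predict_prenatal(expenditure):
    """Return (predicted prenatal care %, is_extrapolation) for an expenditure in % of GDP."""
    predicted = SLOPE * expenditure + INTERCEPT
    is_extrapolation = not (X_MIN <= expenditure <= X_MAX)
    return predicted, is_extrapolation


for pct in (5.0, 12.0):
    predicted, is_extrapolation = predict_prenatal(pct)
    label = "extrapolation" if is_extrapolation else "within data range"
    print(f"{pct}% of GDP: predicted prenatal care = {predicted:.2f}% ({label})")
```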

10.2.2

Table #10.1.6 contains the value of the house and the amount of rental income in a year that the house brings in ("Capital and rental," 2013).  Find the correlation coefficient and coefficient of determination and then interpret both.

I used the Excel Data Analysis tool to calculate the correlation coefficient between house value and rental.

According to the Excel output, the Pearson correlation coefficient is r = 0.764716.

This indicates a fairly strong positive linear relationship between house value and rental.

Coefficient of determination r² = (0.7647)² = 0.585

So, 58.5% of the variation in rental can be explained by the linear relationship with house value. This percentage is high enough to make the regression equation reasonably useful for predicting rental income from house value.

10.2.4

The World Bank collected data on the percentage of GDP that a country spends on health expenditures ("Health expenditure," 2013) and also the percentage of women receiving prenatal care ("Pregnant woman receiving," 2013). The data for the countries where this information is available for the year 2011 are in table #10.1.8. Find the correlation coefficient and coefficient of determination and then interpret both.

I used the Excel Data Analysis tool to calculate the correlation coefficient between health expenditure and prenatal care.

According to the Excel output, the Pearson correlation coefficient is r = 0.1715.

This indicates a weak positive linear relationship between health expenditure and prenatal care.

Coefficient of determination r² = (0.1715)² = 0.0294

So, only 2.94% of the variation in prenatal care can be explained by the linear relationship with health expenditure. Because this proportion is very low, the regression equation is not a good tool for predicting prenatal care from health expenditure.
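As a cross-check on the spreadsheet value, r and r² can be computed directly from the data. The following is a minimal sketch, assuming the 15 pairs are transcribed from Table #10.1.8; `health` and `pearson_r` are illustrative names.

```python
import math

# (health expenditure % of GDP, prenatal care %) pairs transcribed from Table #10.1.8
health = [
    (9.6, 47.9), (3.7, 54.6), (5.2, 93.7), (5.2, 84.7), (10.0, 100.0),
    (4.7, 42.5), (4.8, 96.4), (6.0, 77.1), (5.4, 58.3), (4.8, 95.4),
    (4.1, 78.0), (6.0, 93.3), (9.5, 93.3), (6.8, 93.7), (6.1, 89.8),
]


def pearson_r(pairs):
    """Pearson correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(health)
r_squared = r ** 2  # share of variation in y explained by the linear fit
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```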

10.3.2

Table #10.1.6 contains the value of the house and the amount of rental income in a year that the house brings in ("Capital and rental," 2013).

Test at the 5% level for a positive correlation between house value and rental amount.

Hypotheses for a test of positive correlation:

Null Hypothesis: H0: ρ = 0

Alternate Hypothesis: H1: ρ > 0 (claim)

Level of significance α = 0.05

Excel output for regression analysis:

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.764716 |
| R Square | 0.58479 |
| Adjusted R Square | 0.575764 |
| Standard Error | 1441.625 |
| Observations | 48 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 1.35E+08 | 1.35E+08 | 64.78736 | 2.5E-10 |
| Residual | 46 | 95600982 | 2078282 | | |
| Total | 47 | 2.3E+08 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 5363.865 | 567.2408 | 9.456062 | 2.34E-12 | 4222.068 | 6505.661 | 4222.068 | 6505.661 |
| Value | 0.024358 | 0.003026 | 8.04906 | 2.5E-10 | 0.018267 | 0.03045 | 0.018267 | 0.03045 |

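Test statistic t = 8.04906

P ≈ 0 (Excel reports 2.5E-10 for the two-sided test, so the one-sided P-value is even smaller)

As P < 0.05 (level of significance), we reject the null hypothesis. So, there is sufficient evidence to support the claim that there is a positive correlation between house value and rental amount.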
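The t statistic in the Excel output can be recovered from r and n alone, using t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom. The sketch below assumes the values r = 0.764716 and n = 48 from the output; the critical value 1.679 is an assumed figure read from a standard t table (one-tailed, 5%, df = 46), and `correlation_t` is an illustrative name.

```python
import math


def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.764716, 48     # Multiple R and Observations from the Excel output
t_stat = correlation_t(r, n)

T_CRITICAL = 1.679      # assumed: one-tailed 5% critical value for df = 46, from a t table
reject_null = t_stat > T_CRITICAL
print(f"t = {t_stat:.3f}, df = {n - 2}, reject H0: {reject_null}")
```

Comparing the statistic with a tabulated critical value avoids needing a statistics library for the P-value; the decision is the same either way.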

10.3.4

The World Bank collected data on the percentage of GDP that a country spends on health expenditures ("Health expenditure," 2013) and also the percentage of women receiving prenatal care("Pregnant woman receiving," 2013).  The data for the countries where this information is available for the year 2011 are in table #10.1.8.

Test at the 5% level for a correlation between percentage spent on health expenditure and the percentage of women receiving prenatal care.

Null Hypothesis: H0: ρ = 0

Alternate Hypothesis: H1: ρ ≠ 0 (claim)

Level of significance α = 0.05

Excel output for regression analysis:

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.171505 |
| R Square | 0.029414 |
| Adjusted R Square | -0.04525 |
| Standard Error | 19.92675 |
| Observations | 15 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 156.4362 | 156.4362 | 0.393971 | 0.541089 |
| Residual | 13 | 5161.981 | 397.0755 | | |
| Total | 14 | 5318.417 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 69.7394 | 17.00601 | 4.100869 | 0.001251 | 33.00015 | 106.4786 | 33.00015 | 106.4786 |
| Health Expenditure (% of GDP) | 1.660599 | 2.645652 | 0.627671 | 0.541089 | -4.05498 | 7.376182 | -4.05498 | 7.376182 |

According to the above output, the test statistic is t = 0.6277.

P = 0.5411

As P > 0.05 (level of significance), we fail to reject the null hypothesis. So, there is insufficient evidence to support the claim that there is a correlation between percentage spent on health expenditure and the percentage of women receiving prenatal care.

11.1.2

Researchers watched groups of dolphins off the coast of Ireland in 1998 to determine what activities the dolphins partake in at certain times of the day ("Activities of dolphin," 2013).  The numbers in table #11.1.6 represent the number of groups of dolphins that were partaking in an activity at certain times of days.  Is there enough evidence to show that the activity and the time period are independent for dolphins?  Test at the 1% level.

Table #11.1.6: Dolphin Activity

| Activity | Morning | Noon | Afternoon | Evening | Row Total |
|---|---|---|---|---|---|
| Travel | 6 | 6 | 14 | 13 | 39 |
| Feed | 28 | 4 | 0 | 56 | 88 |
| Social | 38 | 5 | 9 | 10 | 62 |
| Column Total | 72 | 15 | 23 | 79 | 189 |

Null Hypothesis: the activity and the time period are independent of each other for dolphins.

Alternate Hypothesis: the activity and the time period are dependent on each other for dolphins.

Level of significance α = 0.01

Expected value table:

| Activity | Morning | Noon | Afternoon | Evening | Row Total |
|---|---|---|---|---|---|
| Travel | 14.857 | 3.095 | 4.746 | 16.302 | 39 |
| Feed | 33.524 | 6.984 | 10.709 | 36.783 | 88 |
| Social | 23.619 | 4.921 | 7.545 | 25.915 | 62 |
| Column Total | 72 | 15 | 23 | 79 | 189 |

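Chi-Square test:

| Statistic | DF | Value | P-value |
|---|---|---|---|
| Chi-square | 6 | 68.464567 | < 0.0001 |

Chi square = 68.465

DF = (3 – 1)(4 – 1) = 6

P ≈ 0

As P < 0.01 (level of significance), we reject the null hypothesis.

Hence, there is sufficient evidence to support the claim that the activity and the time period are dependent on each other for dolphins.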
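The expected counts and the chi-square statistic can be reproduced from the observed table alone. The following is a minimal sketch; `chi_square_independence` is an illustrative helper, and the critical value 16.812 is an assumed figure taken from a standard chi-square table (1% level, df = 6).

```python
# Observed counts from Table #11.1.6 (columns: Morning, Noon, Afternoon, Evening)
observed = [
    [6, 6, 14, 13],   # Travel
    [28, 4, 0, 56],   # Feed
    [38, 5, 9, 10],   # Social
]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Under independence: expected = row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


chi2, dof, expected = chi_square_independence(observed)
CRITICAL_1PCT_DF6 = 16.812  # assumed: 1% critical value for df = 6, from a chi-square table
print(f"chi-square = {chi2:.3f}, df = {dof}, reject H0: {chi2 > CRITICAL_1PCT_DF6}")
```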

11.1.4

A person’s educational attainment and age group was collected by the U.S. Census Bureau in 1984 to see if age group and educational attainment are related.  The counts in thousands are in table #11.1.8 ("Education by age," 2013).  Do the data show that educational attainment and age are independent?  Test at the 5% level.

Table #11.1.8: Educational Attainment and Age Group

| Education | 25-34 | 35-44 | 45-54 | 55-64 | >64 | Row Total |
|---|---|---|---|---|---|---|
| Did not complete HS | 5416 | 5030 | 5777 | 7606 | 13746 | 37575 |
| Completed HS | 16431 | 1855 | 9435 | 8795 | 7558 | 44074 |
| College 1-3 years | 8555 | 5576 | 3124 | 2524 | 2503 | 22282 |
| College 4 or more years | 9771 | 7596 | 3904 | 3109 | 2483 | 26863 |
| Column Total | 40173 | 20057 | 22240 | 22034 | 26290 | 130794 |

Null Hypothesis: educational attainment and age are independent of each other

Alternate Hypothesis: educational attainment and age are dependent on each other

Level of significance α = 0.05

Expected value table:

| Education | 25-34 | 35-44 | 45-54 | 55-64 | >64 | Row Total |
|---|---|---|---|---|---|---|
| Did not complete HS | 11541.05 | 5762.05 | 6389.19 | 6330.01 | 7552.69 | 37575 |
| Completed HS | 13537.20 | 6758.66 | 7494.27 | 7424.86 | 8859.01 | 44074 |
| College 1-3 years | 6843.85 | 3416.90 | 3788.80 | 3753.70 | 4478.75 | 22282 |
| College 4 or more years | 8250.89 | 4119.39 | 4567.74 | 4525.43 | 5399.55 | 26863 |
| Column Total | 40173 | 20057 | 22240 | 22034 | 26290 | 130794 |

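Chi-Square test:

| Statistic | DF | Value | P-value |
|---|---|---|---|
| Chi-square | 12 | ≈ 22,370 | < 0.0001 |

Chi square = sum of (O-E)^2/E over all 20 cells ≈ 22,370. The observed counts differ sharply from the expected counts; for example, the first cell alone contributes (5416 – 11541.05)^2/11541.05 ≈ 3250.7.

DF = (4 – 1)(5 – 1) = 12

P ≈ 0

As P < 0.05 (level of significance), we reject the null hypothesis. So, there is sufficient evidence to support the claim that educational attainment and age are dependent on each other.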
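Since the statistic depends only on the observed counts, it can be recomputed by hand or in a few lines of code as a check on the software output. The sketch below also records each cell's contribution, which shows where observed and expected counts disagree most. The variable names are illustrative, the row and column labels follow the table, and the critical value 21.026 is an assumed figure from a standard chi-square table (5% level, df = 12).

```python
# Observed counts (in thousands) from Table #11.1.8
rows = ["Did not complete HS", "Completed HS", "College 1-3 years", "College 4 or more years"]
cols = ["25-34", "35-44", "45-54", "55-64", ">64"]
observed = [
    [5416, 5030, 5777, 7606, 13746],
    [16431, 1855, 9435, 8795, 7558],
    [8555, 5576, 3124, 2524, 2503],
    [9771, 7596, 3904, 3109, 2483],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Contribution of each cell: (O - E)^2 / E, with E = row total * column total / grand total
contributions = {}
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        contributions[(rows[i], cols[j])] = (o - e) ** 2 / e

chi2 = sum(contributions.values())
dof = (len(rows) - 1) * (len(cols) - 1)
largest_cell = max(contributions, key=contributions.get)

CRITICAL_5PCT_DF12 = 21.026  # assumed: 5% critical value for df = 12, from a chi-square table
print(f"chi-square = {chi2:.1f}, df = {dof}, reject H0: {chi2 > CRITICAL_5PCT_DF12}")
print("largest contribution:", largest_cell)
```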

11.2.4

In Africa in 2011, the number of deaths of a female from cardiovascular disease for different age groups are in table #11.2.6 ("Global health observatory," 2013).  In addition, the proportion of deaths of females from all causes for the same age groups are also in table #11.2.6.  Do the data show that the death from cardiovascular disease are in the same proportion as all deaths for the different age groups?  Test at the 5% level.

Table #11.2.6: Deaths of Females for Different Age Groups

| Age | 5-14 | 15-29 | 30-49 | 50-69 | Total |
|---|---|---|---|---|---|
| Cardiovascular Frequency | 8 | 16 | 56 | 433 | 513 |
| All Cause Proportion | 0.10 | 0.12 | 0.26 | 0.52 | |

Null Hypothesis: Deaths from cardiovascular disease are in the same proportion as all deaths for the different age groups.

Alternate Hypothesis: Deaths from cardiovascular disease are not in the same proportion as all deaths for the different age groups.

Level of significance = 0.05

| Age | Observed Frequency (O) | Proportion | Expected Frequency (E) | (O-E)^2/E |
|---|---|---|---|---|
| 5-14 | 8 | 0.10 | 51.3 | 36.55 |
| 15-29 | 16 | 0.12 | 61.56 | 33.72 |
| 30-49 | 56 | 0.26 | 133.38 | 44.89 |
| 50-69 | 433 | 0.52 | 266.76 | 103.60 |
| Total | 513 | | | 218.76 |

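Chi square = sum of (O-E)^2/E = 218.76

DF = 4 – 1 = 3

P ≈ 0

As P < level of significance (0.05), we reject the null hypothesis. So, there is sufficient evidence to conclude that deaths from cardiovascular disease are not in the same proportion as all deaths for the different age groups.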
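The goodness-of-fit calculation in the table can be sketched in a few lines. This is an illustration only: `goodness_of_fit` is a hypothetical helper name, and the critical value 7.815 is an assumed figure from a standard chi-square table (5% level, df = 3).

```python
def goodness_of_fit(observed, proportions):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    total = sum(observed)
    expected = [p * total for p in proportions]  # E = hypothesized proportion * sample size
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1, expected


# Cardiovascular deaths by age group and all-cause proportions from Table #11.2.6
observed = [8, 16, 56, 433]
proportions = [0.10, 0.12, 0.26, 0.52]

chi2, dof, expected = goodness_of_fit(observed, proportions)
CRITICAL_5PCT_DF3 = 7.815  # assumed: 5% critical value for df = 3, from a chi-square table
print(f"chi-square = {chi2:.2f}, df = {dof}, reject H0: {chi2 > CRITICAL_5PCT_DF3}")
```

The same helper covers an "equally likely" hypothesis by passing equal proportions, for example `[1 / 6] * 6` for six categories.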

11.2.6

A project conducted by the Australian Federal Office of Road Safety asked people many questions about their cars.  One question was the reason that a person chooses a given car, and that data is in table #11.2.8 ("Car preferences," 2013).

Table #11.2.8: Reason for Choosing a Car

| Safety | Reliability | Cost | Performance | Comfort | Looks |
|---|---|---|---|---|---|
| 84 | 62 | 46 | 34 | 47 | 27 |

Do the data show that the frequencies observed substantiate the claim that the reasons for choosing a car are equally likely?  Test at the 5% level.

Null Hypothesis: All reasons for choosing a car are equally likely.

Alternate Hypothesis: The reasons for choosing a car are not all equally likely.

Level of significance = 0.05

| Reason | Observed Frequency (O) | Expected Frequency (E) | (O-E)^2/E |
|---|---|---|---|
| Safety | 84 | 50 | 23.12 |
| Reliability | 62 | 50 | 2.88 |
| Cost | 46 | 50 | 0.32 |
| Performance | 34 | 50 | 5.12 |
| Comfort | 47 | 50 | 0.18 |
| Looks | 27 | 50 | 10.58 |
| Total | 300 | | 42.2 |

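Chi square = sum of (O-E)^2/E = 42.2

DF = 6 – 1 = 5

P ≈ 0

As P < level of significance (0.05), we reject the null hypothesis. So, there is sufficient evidence to conclude that the reasons for choosing a car are not equally likely.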
