STAT 200 Week 7 Homework Problems

 

10.1.2

Table #10.1.6 contains the value of the house and the amount of rental income in a year that the house brings in ("Capital and rental," 2013).  Create a scatter plot and find a regression equation between house value and rental income.  Then use the regression equation to find the rental income for a house worth $230,000 and for a house worth $400,000.  Which rental income that you calculated do you think is closer to the true rental income?  Why?

 

Table #10.1.6: Data of House Value versus Rental

Value     Rental    Value     Rental    Value     Rental    Value     Rental
81000     6656      77000     4576      75000     7280      67500     6864
95000     7904      94000     8736      90000     6240      85000     7072
121000    12064     115000    7904      110000    7072      104000    7904
135000    8320      130000    9776      126000    6240      125000    7904
145000    8320      140000    9568      140000    9152      135000    7488
165000    13312     165000    8528      155000    7488      148000    8320
178000    11856     174000    10400     170000    9568      170000    12688
200000    12272     200000    10608     194000    11232     190000    8320
214000    8528      208000    10400     200000    10400     200000    8320
240000    10192     240000    12064     240000    11648     225000    12480
289000    11648     270000    12896     262000    10192     244500    11232
325000    12480     310000    12480     303000    12272     300000    12480

 

Scatterplot with regression equation:

 

Hence the line of best fit is:

y = 0.0244x + 5363.9

where x is the house value and y is the predicted annual rental income, both in dollars.

For x = $230,000:

Predicted rental = 0.0244 × 230,000 + 5363.9 = $10,975.90

For x = $400,000:

Predicted rental = 0.0244 × 400,000 + 5363.9 = $15,123.90

The prediction for the $230,000 house is likely to be closer to the true rental income. $230,000 lies within the range of the observed house values ($67,500 to $325,000), so that prediction is an interpolation. $400,000 lies outside that range, so that prediction is an extrapolation, and the data give no evidence that the linear pattern continues that far.
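The arithmetic above can be checked with a short Python sketch. It uses the rounded coefficients reported in the solution and the smallest and largest house values in Table #10.1.6; the function names are illustrative only.

```python
# Rounded coefficients of the fitted line reported in the solution.
SLOPE = 0.0244
INTERCEPT = 5363.9

# Smallest and largest house values in Table #10.1.6.
X_MIN, X_MAX = 67_500, 325_000


def predict_rental(value):
    """Predicted annual rental income for a given house value."""
    return SLOPE * value + INTERCEPT


def is_extrapolation(value):
    """True when the value lies outside the observed range of house values."""
    return not (X_MIN <= value <= X_MAX)


for value in (230_000, 400_000):
    kind = "extrapolation" if is_extrapolation(value) else "interpolation"
    print(f"${value:,}: predicted rental ${predict_rental(value):,.2f} ({kind})")
```

Flagging a prediction as an extrapolation does not mean it is wrong, only that the data cannot confirm the line still applies at that value.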

 

10.1.4

The World Bank collected data on the percentage of GDP that a country spends on health expenditures ("Health expenditure," 2013) and also the percentage of women receiving prenatal care ("Pregnant woman receiving," 2013).  The data for the countries where this information is available for the year 2011 are in table #10.1.8.  Create a scatter plot of the data and find a regression equation between percentage spent on health expenditure and the percentage of women receiving prenatal care.  Then use the regression equation to find the percent of women receiving prenatal care for a country that spends 5.0% of GDP on health expenditure and for a country that spends 12.0% of GDP.  Which prenatal care percentage that you calculated do you think is closer to the true percentage?  Why?

 

Table #10.1.8: Data of Health Expenditure versus Prenatal Care

Health Expenditure (% of GDP)    Prenatal Care (%)
9.6                              47.9
3.7                              54.6
5.2                              93.7
5.2                              84.7
10.0                             100.0
4.7                              42.5
4.8                              96.4
6.0                              77.1
5.4                              58.3
4.8                              95.4
4.1                              78.0
6.0                              93.3
9.5                              93.3
6.8                              93.7
6.1                              89.8

 

Scatterplot along with line of best fit:

 

The line of best fit is:

y = 1.6606x + 69.739

where x is health expenditure (% of GDP) and y is the predicted percentage of women receiving prenatal care.

For health expenditure x = 5.0%:

Predicted prenatal care = 1.6606 × 5 + 69.739 = 78.04%

For health expenditure x = 12.0%:

Predicted prenatal care = 1.6606 × 12 + 69.739 = 89.67%

The prediction for 5.0% is likely to be closer to the true value, because 5.0% lies within the range of the observed data (3.7% to 10.0%), whereas 12.0% lies outside that range and requires extrapolation.
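As a cross-check on the spreadsheet trendline, the slope and intercept can be recomputed from the 15 data pairs in Table #10.1.8 with the usual least-squares formulas. This is a minimal sketch in plain Python; the helper name `least_squares` is illustrative.

```python
# Table #10.1.8: (health expenditure % of GDP, prenatal care %)
data = [
    (9.6, 47.9), (3.7, 54.6), (5.2, 93.7), (5.2, 84.7), (10.0, 100.0),
    (4.7, 42.5), (4.8, 96.4), (6.0, 77.1), (5.4, 58.3), (4.8, 95.4),
    (4.1, 78.0), (6.0, 93.3), (9.5, 93.3), (6.8, 93.7), (6.1, 89.8),
]


def least_squares(points):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares(data)
print(f"y = {slope:.4f}x + {intercept:.3f}")
for x in (5.0, 12.0):
    print(f"x = {x}%: predicted prenatal care = {slope * x + intercept:.2f}%")
```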

 

10.2.2

Table #10.1.6 contains the value of the house and the amount of rental income in a year that the house brings in ("Capital and rental," 2013).  Find the correlation coefficient and coefficient of determination and then interpret both.

 

Table #10.1.6: Data of House Value versus Rental

Value     Rental    Value     Rental    Value     Rental    Value     Rental
81000     6656      77000     4576      75000     7280      67500     6864
95000     7904      94000     8736      90000     6240      85000     7072
121000    12064     115000    7904      110000    7072      104000    7904
135000    8320      130000    9776      126000    6240      125000    7904
145000    8320      140000    9568      140000    9152      135000    7488
165000    13312     165000    8528      155000    7488      148000    8320
178000    11856     174000    10400     170000    9568      170000    12688
200000    12272     200000    10608     194000    11232     190000    8320
214000    8528      208000    10400     200000    10400     200000    8320
240000    10192     240000    12064     240000    11648     225000    12480
289000    11648     270000    12896     262000    10192     244500    11232
325000    12480     310000    12480     303000    12272     300000    12480

 

I used the Excel Data Analysis tool to calculate the correlation coefficient between house value and rental income.

 

According to the Excel output, the Pearson correlation coefficient is r = 0.764716.

This indicates a fairly strong positive linear relationship between house value and rental income.

Coefficient of determination: r² = 0.7647² ≈ 0.585

So about 58.5% of the variation in rental income is explained by the linear relationship with house value. This is a reasonably large share, so the regression equation is useful for predicting rental income from house value, although about 41.5% of the variation remains unexplained.
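The correlation coefficient can also be computed directly from the 48 pairs in Table #10.1.6 without a spreadsheet. The sketch below assumes the table is read row by row as (value, rental) pairs; the helper name `pearson_r` is illustrative.

```python
import math

# Table #10.1.6 as (house value, annual rental income) pairs, read row by row.
houses = [
    (81000, 6656), (77000, 4576), (75000, 7280), (67500, 6864),
    (95000, 7904), (94000, 8736), (90000, 6240), (85000, 7072),
    (121000, 12064), (115000, 7904), (110000, 7072), (104000, 7904),
    (135000, 8320), (130000, 9776), (126000, 6240), (125000, 7904),
    (145000, 8320), (140000, 9568), (140000, 9152), (135000, 7488),
    (165000, 13312), (165000, 8528), (155000, 7488), (148000, 8320),
    (178000, 11856), (174000, 10400), (170000, 9568), (170000, 12688),
    (200000, 12272), (200000, 10608), (194000, 11232), (190000, 8320),
    (214000, 8528), (208000, 10400), (200000, 10400), (200000, 8320),
    (240000, 10192), (240000, 12064), (240000, 11648), (225000, 12480),
    (289000, 11648), (270000, 12896), (262000, 10192), (244500, 11232),
    (325000, 12480), (310000, 12480), (303000, 12272), (300000, 12480),
]


def pearson_r(points):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(houses)
print(f"n = {len(houses)}, r = {r:.4f}, r^2 = {r * r:.4f}")
```

Squaring r gives the coefficient of determination, which is why the two numbers always carry the same information in simple linear regression.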

 

10.2.4

The World Bank collected data on the percentage of GDP that a country spends on health expenditures ("Health expenditure," 2013) and also the percentage of women receiving prenatal care ("Pregnant woman receiving," 2013).  The data for the countries where this information is available for the year 2011 are in table #10.1.8.  Find the correlation coefficient and coefficient of determination and then interpret both.

 

Table #10.1.8: Data of Health Expenditure versus Prenatal Care

Health Expenditure (% of GDP)    Prenatal Care (%)
9.6                              47.9
3.7                              54.6
5.2                              93.7
5.2                              84.7
10.0                             100.0
4.7                              42.5
4.8                              96.4
6.0                              77.1
5.4                              58.3
4.8                              95.4
4.1                              78.0
6.0                              93.3
9.5                              93.3
6.8                              93.7
6.1                              89.8

 

I used the Excel Data Analysis tool to calculate the correlation coefficient between health expenditure and prenatal care.

 

According to the Excel output, the Pearson correlation coefficient is r = 0.1715.

This indicates a weak positive linear relationship between health expenditure and prenatal care.

Coefficient of determination: r² = 0.1715² ≈ 0.0294

So only about 2.94% of the variation in prenatal care is explained by health expenditure. Because this proportion is very low, the regression equation is not reliable for predicting prenatal care from health expenditure.

 

10.3.2

Table #10.1.6 contains the value of the house and the amount of rental income in a year that the house brings in ("Capital and rental," 2013). 

Test at the 5% level for a positive correlation between house value and rental amount. 

 

 

Table #10.1.6: Data of House Value versus Rental

Value     Rental    Value     Rental    Value     Rental    Value     Rental
81000     6656      77000     4576      75000     7280      67500     6864
95000     7904      94000     8736      90000     6240      85000     7072
121000    12064     115000    7904      110000    7072      104000    7904
135000    8320      130000    9776      126000    6240      125000    7904
145000    8320      140000    9568      140000    9152      135000    7488
165000    13312     165000    8528      155000    7488      148000    8320
178000    11856     174000    10400     170000    9568      170000    12688
200000    12272     200000    10608     194000    11232     190000    8320
214000    8528      208000    10400     200000    10400     200000    8320
240000    10192     240000    12064     240000    11648     225000    12480
289000    11648     270000    12896     262000    10192     244500    11232
325000    12480     310000    12480     303000    12272     300000    12480

 

Hypotheses for testing for a positive correlation:

Null Hypothesis: H0: ρ = 0

Alternate Hypothesis: H1: ρ > 0 (claim)

Level of significance: α = 0.05

Excel output for regression analysis:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.764716
R Square             0.58479
Adjusted R Square    0.575764
Standard Error       1441.625
Observations         48

ANOVA
              df    SS          MS          F           Significance F
Regression    1     1.35E+08    1.35E+08    64.78736    2.5E-10
Residual      46    95600982    2078282
Total         47    2.3E+08

             Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%    Lower 95.0%    Upper 95.0%
Intercept    5363.865        567.2408          9.456062    2.34E-12    4222.068     6505.661     4222.068       6505.661
Value        0.024358        0.003026          8.04906     2.5E-10     0.018267     0.03045      0.018267       0.03045

 

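Test statistic: t = 8.04906, with 46 degrees of freedom

P ≈ 0 (Excel reports a two-tailed P-value of 2.5E-10, so the one-tailed P-value is half of that)

As P < 0.05 (level of significance), we reject the null hypothesis. So, there is sufficient evidence to support the claim that there is a positive correlation between house value and rental income.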
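The t statistic in the Excel output can be reproduced from r and n alone, using t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom. The sketch below takes r and n from the regression statistics; the critical value 1.679 is an approximate one-tailed value read from a t table for 46 degrees of freedom and is an assumption of this example, as is the function name.

```python
import math


def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r, n = 0.764716, 48  # Multiple R and Observations from the Excel output
t = correlation_t(r, n)

# Approximate one-tailed critical value for alpha = 0.05, df = 46 (t table).
T_CRITICAL = 1.679

print(f"t = {t:.3f} with df = {n - 2}")
print("Reject H0 at the 5% level:", t > T_CRITICAL)
```

Comparing t with a critical value leads to the same decision as comparing the P-value with the significance level.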

 

10.3.4

The World Bank collected data on the percentage of GDP that a country spends on health expenditures ("Health expenditure," 2013) and also the percentage of women receiving prenatal care ("Pregnant woman receiving," 2013).  The data for the countries where this information is available for the year 2011 are in table #10.1.8. 

Test at the 5% level for a correlation between percentage spent on health expenditure and the percentage of women receiving prenatal care. 

 

Table #10.1.8: Data of Health Expenditure versus Prenatal Care

Health Expenditure (% of GDP)    Prenatal Care (%)
9.6                              47.9
3.7                              54.6
5.2                              93.7
5.2                              84.7
10.0                             100.0
4.7                              42.5
4.8                              96.4
6.0                              77.1
5.4                              58.3
4.8                              95.4
4.1                              78.0
6.0                              93.3
9.5                              93.3
6.8                              93.7
6.1                              89.8

 

 

Null Hypothesis: H0: ρ = 0

Alternate Hypothesis: H1: ρ ≠ 0 (claim)

Level of significance: α = 0.05

Excel output for regression analysis:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.171505
R Square             0.029414
Adjusted R Square    -0.04525
Standard Error       19.92675
Observations         15

ANOVA
              df    SS          MS          F           Significance F
Regression    1     156.4362    156.4362    0.393971    0.541089
Residual      13    5161.981    397.0755
Total         14    5318.417

                                 Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%    Lower 95.0%    Upper 95.0%
Intercept                        69.7394         17.00601          4.100869    0.001251    33.00015     106.4786     33.00015       106.4786
Health Expenditure (% of GDP)    1.660599        2.645652          0.627671    0.541089    -4.05498     7.376182     -4.05498       7.376182

 

According to the above output, the test statistic is t = 0.6277, with 13 degrees of freedom

P = 0.5411

As P > 0.05 (level of significance), we fail to reject the null hypothesis. So, there is insufficient evidence to support the claim that there is a correlation between percentage spent on health expenditure and the percentage of women receiving prenatal care. 
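The same conclusion can be read from the 95% confidence interval for the slope in the Excel output: an interval that contains 0 is consistent with no linear relationship. The sketch below rebuilds the t statistic and the interval from the reported slope and standard error. The critical value 2.160 is the two-tailed t-table value for 13 degrees of freedom and is an assumption of this example (Excel uses a more precise value internally).

```python
# Values copied from the Excel coefficient table above.
slope = 1.660599
std_error = 2.645652

# Two-tailed critical value for 95% confidence, df = 13 (from a t table).
T_CRITICAL = 2.160

t_stat = slope / std_error
margin = T_CRITICAL * std_error
lower, upper = slope - margin, slope + margin

print(f"t = {t_stat:.4f}")
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
print("Interval contains 0:", lower <= 0 <= upper)
```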

 

 

 

11.1.2

Researchers watched groups of dolphins off the coast of Ireland in 1998 to determine what activities the dolphins partake in at certain times of the day ("Activities of dolphin," 2013).  The numbers in table #11.1.6 represent the number of groups of dolphins that were partaking in an activity at certain times of day.  Is there enough evidence to show that the activity and the time period are independent for dolphins?  Test at the 1% level.

Table #11.1.6: Dolphin Activity

 

                              Period
Activity        Morning    Noon    Afternoon    Evening    Row Total
Travel          6          6       14           13         39
Feed            28         4       0            56         88
Social          38         5       9            10         62
Column Total    72         15      23           79         189

 

Null Hypothesis: the activity and the time period are independent of each other for dolphins.

Alternate Hypothesis: the activity and the time period are dependent on each other for dolphins.

Level of significance: α = 0.01

Expected value table

           

 

                              Period
Activity        Morning    Noon     Afternoon    Evening    Row Total
Travel          14.857     3.095    4.746        16.302     39
Feed            33.524     6.984    10.709       36.783     88
Social          23.619     4.921    7.545        25.915     62
Column Total    72         15       23           79         189

 

Chi-Square test:

 

Statistic     DF    Value        P-value
Chi-square    6     68.464567    <0.0001

Chi-square = 68.465

P ≈ 0

As P < 0.01 (level of significance), we reject the null hypothesis.

Hence, there is sufficient evidence to support the claim that the activity and the time period are dependent on each other for dolphins.
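The expected counts and the chi-square statistic can be reproduced from the observed table: each expected count is (row total × column total) / grand total, and the statistic sums (O − E)²/E over all cells. This is a minimal sketch in plain Python; the function name is illustrative, and 16.812 is the chi-square table value for α = 0.01 with 6 degrees of freedom.

```python
# Observed counts from Table #11.1.6 (Morning, Noon, Afternoon, Evening).
observed = {
    "Travel": [6, 6, 14, 13],
    "Feed": [28, 4, 0, 56],
    "Social": [38, 5, 9, 10],
}


def chi_square_independence(rows):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(rows, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


statistic, df, expected = chi_square_independence(list(observed.values()))

# Chi-square table value for alpha = 0.01 and df = 6.
CRITICAL_VALUE = 16.812

print(f"chi-square = {statistic:.3f}, df = {df}")
print("Reject H0 at the 1% level:", statistic > CRITICAL_VALUE)
```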

 

 

11.1.4

A person’s educational attainment and age group was collected by the U.S. Census Bureau in 1984 to see if age group and educational attainment are related.  The counts in thousands are in table #11.1.8 ("Education by age," 2013).  Do the data show that educational attainment and age are independent?  Test at the 5% level.

Table #11.1.8: Educational Attainment and Age Group

 

                                           Age Group
Education                  25-34    35-44    45-54    55-64    >64      Row Total
Did not complete HS        5416     5030     5777     7606     13746    37575
Completed HS               16431    1855     9435     8795     7558     44074
College 1-3 years          8555     5576     3124     2524     2503     22282
College 4 or more years    9771     7596     3904     3109     2483     26863
Column Total               40173    20057    22240    22034    26290    130794

 

Null Hypothesis: educational attainment and age are independent of each other

Alternate Hypothesis: educational attainment and age are dependent on each other

Level of significance: α = 0.05

Expected value

 

                                              Age Group
Education                  25-34       35-44      45-54      55-64      >64        Row Total
Did not complete HS        11541.05    5762.05    6389.19    6330.01    7552.69    37575
Completed HS               13537.20    6758.66    7494.27    7424.86    8859.01    44074
College 1-3 years          6843.85     3416.90    3788.80    3753.70    4478.75    22282
College 4 or more years    8250.89     4119.39    4567.74    4525.43    5399.55    26863
Column Total               40173       20057      22240      22034      26290      130794

 

Chi-Square test:

 

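Statistic     DF    Value       P-value
Chi-square    12    ≈ 22,370    <0.0001

Chi-square ≈ 22,370, the sum of (O-E)^2/E over all 20 cells. The first cell alone contributes (5416 - 11541.05)^2/11541.05 ≈ 3,250.7, far above the 5% critical value of 21.026 for 12 degrees of freedom.

P ≈ 0

As P < 0.05, we reject the null hypothesis. So, there is sufficient evidence that educational attainment and age are dependent on each other.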
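Because the observed and expected counts differ by thousands in many cells, it is worth recomputing the statistic directly from the observed table as a check on any software output. The sketch below does that in plain Python, assuming the counts are transcribed as printed in Table #11.1.8; 21.026 is the chi-square table value for α = 0.05 with 12 degrees of freedom.

```python
# Observed counts (thousands) from Table #11.1.8; columns are the age groups
# 25-34, 35-44, 45-54, 55-64, >64.
rows = [
    [5416, 5030, 5777, 7606, 13746],  # Did not complete HS
    [16431, 1855, 9435, 8795, 7558],  # Completed HS
    [8555, 5576, 3124, 2524, 2503],   # College 1-3 years
    [9771, 7596, 3904, 3109, 2483],   # College 4 or more years
]

row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(rows):
    for j, obs in enumerate(row):
        # Expected count under independence.
        exp = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - exp) ** 2 / exp

df = (len(rows) - 1) * (len(col_totals) - 1)

# Chi-square table value for alpha = 0.05 and df = 12.
CRITICAL_VALUE = 21.026

print(f"chi-square = {chi_square:.1f}, df = {df}")
print("Reject H0 at the 5% level:", chi_square > CRITICAL_VALUE)
```

With counts this large, even modest departures from independence produce a very large statistic, so the decision does not depend on rounding in the expected values.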

 

11.2.4

In Africa in 2011, the number of deaths of a female from cardiovascular disease for different age groups are in table #11.2.6 ("Global health observatory," 2013).  In addition, the proportion of deaths of females from all causes for the same age groups are also in table #11.2.6.  Do the data show that the death from cardiovascular disease are in the same proportion as all deaths for the different age groups?  Test at the 5% level.

Table #11.2.6: Deaths of Females for Different Age Groups

Age                         5-14    15-29    30-49    50-69    Total
Cardiovascular Frequency    8       16       56       433      513
All Cause Proportion        0.10    0.12     0.26     0.52

 

 

Null Hypothesis: Deaths from cardiovascular disease are in the same proportion as all deaths for the different age groups.

Alternate Hypothesis: Deaths from cardiovascular disease are not in the same proportion as all deaths for the different age groups.

Level of significance = 0.05

Age      Observed Frequency (O)    Proportion    Expected Frequency (E)    (O-E)^2/E
5-14     8                         0.10          51.30                     36.55
15-29    16                        0.12          61.56                     33.72
30-49    56                        0.26          133.38                    44.89
50-69    433                       0.52          266.76                    103.60
Total    513                                                               218.76

 

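Chi-square = 218.76

DF = 4 - 1 = 3

P ≈ 0

As P < 0.05 (level of significance), we reject the null hypothesis. So, there is sufficient evidence that deaths from cardiovascular disease are not in the same proportion as all deaths for the different age groups.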
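The goodness-of-fit calculation in the table can be reproduced in a few lines: each expected frequency is the all-cause proportion times the total of 513, and the statistic sums (O − E)²/E. This is a sketch in plain Python with illustrative variable names; 7.815 is the chi-square table value for α = 0.05 with 3 degrees of freedom.

```python
# Table #11.2.6: age groups 5-14, 15-29, 30-49, 50-69.
observed = [8, 16, 56, 433]
proportions = [0.10, 0.12, 0.26, 0.52]

total = sum(observed)
expected = [p * total for p in proportions]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)
df = len(observed) - 1

# Chi-square table value for alpha = 0.05 and df = 3.
CRITICAL_VALUE = 7.815

for e, c in zip(expected, contributions):
    print(f"E = {e:.2f}, (O-E)^2/E = {c:.2f}")
print(f"chi-square = {chi_square:.2f}, df = {df}")
print("Reject H0 at the 5% level:", chi_square > CRITICAL_VALUE)
```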

 

 

11.2.6

A project conducted by the Australian Federal Office of Road Safety asked people many questions about their cars.  One question was the reason that a person chooses a given car, and that data is in table #11.2.8 ("Car preferences," 2013).

Table #11.2.8: Reason for Choosing a Car

Safety    Reliability    Cost    Performance    Comfort    Looks
84        62             46      34             47         27

Do the data show that the frequencies observed substantiate the claim that the reasons for choosing a car are equally likely?  Test at the 5% level.

Null Hypothesis: All reasons for choosing a car are equally likely (claim).

Alternate Hypothesis: The reasons for choosing a car are not all equally likely.

Level of significance = 0.05

Reason         Observed Frequency (O)    Expected Frequency (E)    (O-E)^2/E
Safety         84                        50                        23.12
Reliability    62                        50                        2.88
Cost           46                        50                        0.32
Performance    34                        50                        5.12
Comfort        47                        50                        0.18
Looks          27                        50                        10.58
Total          300                       Sum                       42.2

 

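Chi-square = sum of (O-E)^2/E = 42.2

DF = 6 - 1 = 5

P ≈ 0

As P < 0.05 (level of significance), we reject the null hypothesis. So, there is sufficient evidence to reject the claim that the reasons for choosing a car are equally likely.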
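Because every expected frequency equals 300/6 = 50, the statistic reduces to a single fraction. The worked sum below only restates the table in one line, with O_i and E_i denoting the observed and expected frequencies.

```latex
\[
\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}
       = \frac{34^2 + 12^2 + (-4)^2 + (-16)^2 + (-3)^2 + (-23)^2}{50}
       = \frac{1156 + 144 + 16 + 256 + 9 + 529}{50}
       = \frac{2110}{50} = 42.2
\]
```

This exceeds the 5% critical value of 11.070 for 5 degrees of freedom, with Safety (23.12) and Looks (10.58) contributing most of the total.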

 
