STAT 200 Week 6 Homework Problems
9.1.2
Many high school students take the AP tests in different subject areas. In 2007, of the 144,796 students who took the biology exam, 84,199 of them were female. In that same year, of the 211,693 students who took the Calculus AB exam 102,598 of them were female ("AP exam scores," 2013). Estimate the difference in the proportion of female students taking the biology exam and female students taking the calculus AB exam using a 90% confidence level.
Suppose that P1 = proportion of female students taking the biology exam and
P2 = proportion of female students taking the Calculus AB exam.
The sample proportions are:
p1 = 84199/144796 = 0.5815
p2 = 102598/211693 = 0.4847
After that we need to calculate the pooled proportion p = (x1 + x2)/(n1 + n2)
Pooled proportion p = (84199 + 102598)/(144796 + 211693) = 0.524
Critical z for 90% confidence level = 1.645
E = critical z * standard error
Margin of error E = 1.645 * SQRT(0.524(1-0.524)(1/144796 + 1/211693)) = 0.0028
Hence 90% confidence interval
= ((0.5815 – 0.4847) – 0.0028, (0.5815 – 0.4847) + 0.0028)
= (0.094, 0.0996)
Hence, we can be 90% confident that the true difference in the proportion of female students taking the biology exam and female students taking the calculus AB exam will lie within (0.094, 0.0996)
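As a cross-check, the interval can be reproduced with a short Python sketch. The function name two_proportion_ci is my own and not from the textbook. It uses the pooled standard error only to mirror the calculation above; many texts use the unpooled standard error for a confidence interval, which gives a nearly identical margin for these data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence):
    """Interval for p1 - p2 using the pooled standard error (as in the worked solution)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    # Two-sided critical z, e.g. about 1.645 for 90% confidence
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * std_err
    diff = p1 - p2
    return diff - margin, diff + margin


# Biology: 84,199 of 144,796; Calculus AB: 102,598 of 211,693
low, high = two_proportion_ci(84199, 144796, 102598, 211693, 0.90)
print(f"90% CI for P1 - P2: ({low:.4f}, {high:.4f})")
```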
9.1.5
Are there more children diagnosed with Autism Spectrum Disorder (ASD) in states that have larger urban areas over states that are mostly rural? In the state of Pennsylvania, a fairly urban state, there are 245 eight-year-olds diagnosed with ASD out of 18,440 eight-year-olds evaluated. In the state of Utah, a fairly rural state, there are 45 eight-year-olds diagnosed with ASD out of 2,123 eight-year-olds evaluated ("Autism and developmental," 2008). Is there enough evidence to show that the proportion of children diagnosed with ASD in Pennsylvania is more than the proportion in Utah? Test at the 1% level.
Suppose that P1 is the proportion of children diagnosed for Pennsylvania and P2 is the proportion of children diagnosed for Utah.
Null hypothesis H0: P1 = P2
Alternate Hypothesis: H1: P1 > P2 (claim)
Now given that, level of significance is 0.01
Critical z for right tailed test is 2.33.
Hence, rejection region is z > 2.33
Pooled proportion = (245 + 45)/ (18440 + 2123) = 0.0141
Standard error of proportion = SQRT(0.0141(1-0.0141)(1/18440 + 1/2123)) = 0.0027
Calculation of z test,
Z test = (245/18440 - 45/2123)/0.0027 = -2.93
Z test = -2.93 is less than 2.33, so it is not in the rejection region. Hence, we fail to reject the null hypothesis. There is insufficient evidence that the proportion of children diagnosed with ASD in Pennsylvania is more than the proportion in Utah.
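The z test above can be sketched in Python as a check on the hand calculation. The function name two_proportion_z is my own choice, and the critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: P1 = P2 using the pooled proportion."""
    pooled = (x1 + x2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / std_err


# Pennsylvania: 245 of 18,440; Utah: 45 of 2,123
z = two_proportion_z(245, 18440, 45, 2123)
z_crit = NormalDist().inv_cdf(1 - 0.01)  # right-tailed test at the 1% level
p_value = 1 - NormalDist().cdf(z)        # right-tail area beyond z
reject = z > z_crit

print(f"z = {z:.2f}, critical z = {z_crit:.2f}, reject H0: {reject}")
```

Because the sample proportion for Pennsylvania is smaller than the one for Utah, z is negative and the right-tailed p-value is close to 1.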
9.2.3
All Fresh Seafood is a wholesale fish company based on the east coast of the U.S. Catalina Offshore Products is a wholesale fish company based on the west coast of the U.S. Table #9.2.5 contains prices from both companies for specific fish types ("Seafood online," 2013) ("Buy sushi grade," 2013). Do the data provide enough evidence to show that a west coast fish wholesaler is more expensive than an east coast wholesaler? Test at the 5% level.
Table #9.2.5: Wholesale Prices of Fish in Dollars
Fish | All Fresh Seafood Prices | Catalina Offshore Products Prices
Cod | 19.99 | 17.99
Tilapia | 6.00 | 13.99
Farmed Salmon | 19.99 | 22.99
Organic Salmon | 24.99 | 24.99
Grouper Fillet | 29.99 | 19.99
Tuna | 28.99 | 31.99
Swordfish | 23.99 | 23.99
Sea Bass | 32.99 | 23.99
Striped Bass | 29.99 | 14.99
Suppose that µd = mean of paired differences (east coast price - west coast price)
Null Hypothesis: H0: µd = 0
Alternate Hypothesis: H1: µd < 0 (claim)
Given that, level of significance is 0.05
Degree of freedom DF = 9 – 1 = 8
Critical t for the left tailed test is -1.86
Rejection region is t < -1.86.
I used a data analysis tool in excel to perform paired t-test:
t-Test: Paired Two Sample for Means
Statistic | All Fresh Seafood Prices | Catalina Offshore Products Prices
Mean | 24.10222 | 21.65667
Variance | 66.81584 | 31.25
Observations | 9 | 9
Pearson Correlation | 0.473953
Hypothesized Mean Difference | 0
df | 8
t Stat | 0.991517
P(T<=t) one-tail | 0.175236
t Critical one-tail | 1.859548
P(T<=t) two-tail | 0.350472
t Critical two-tail | 2.306004
T test = 0.992
Here, the t statistic is not in the rejection region (0.992 is not less than -1.86). So, we cannot reject the null hypothesis. There is insufficient evidence to support the claim that a west coast fish wholesaler is more expensive than an east coast wholesaler.
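The paired t statistic reported by Excel can also be reproduced from the price table with a few lines of Python. This is only a sketch: the variable names are mine, and the critical value -1.86 is copied from the solution above because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Prices in the order of Table #9.2.5 (Cod ... Striped Bass)
east = [19.99, 6.00, 19.99, 24.99, 29.99, 28.99, 23.99, 32.99, 29.99]
west = [17.99, 13.99, 22.99, 24.99, 19.99, 31.99, 23.99, 23.99, 14.99]

# Paired differences: east coast price minus west coast price
diffs = [e - w for e, w in zip(east, west)]
n = len(diffs)

# t = mean difference / (sample sd of differences / sqrt(n))
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

t_crit = -1.86            # left-tailed critical t for df = 8 at the 5% level (from the text)
reject = t_stat < t_crit

print(f"t = {t_stat:.3f}, df = {n - 1}, reject H0: {reject}")
```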
9.2.6
The British Department of Transportation studied to see if people avoid driving on Friday the 13th. They did a traffic count on a Friday and then again on a Friday the 13th at the same two locations ("Friday the 13th," 2013). The data for each location on the two different dates is in table #9.2.6. Estimate the mean difference in traffic count between the 6th and the 13th using a 90% level.
Table #9.2.6: Traffic Count
Dates | 6th | 13th
1990, July | 139246 | 138548
1990, July | 134012 | 132908
1991, September | 137055 | 136018
1991, September | 133732 | 131843
1991, December | 123552 | 121641
1991, December | 121139 | 118723
1992, March | 128293 | 125532
1992, March | 124631 | 120249
1992, November | 124609 | 122770
1992, November | 117584 | 117263
Calculation of paired differences:
Dates | 6th | 13th | d
1990, July | 139246 | 138548 | 698
1990, July | 134012 | 132908 | 1104
1991, September | 137055 | 136018 | 1037
1991, September | 133732 | 131843 | 1889
1991, December | 123552 | 121641 | 1911
1991, December | 121139 | 118723 | 2416
1992, March | 128293 | 125532 | 2761
1992, March | 124631 | 120249 | 4382
1992, November | 124609 | 122770 | 1839
1992, November | 117584 | 117263 | 321
Now using excel calculator,
Sample mean of differences = 1835.8
Standard deviation s = 1176.014
Critical t for DF 9 and confidence level 90% = 1.833
So, margin of error for confidence interval calculation is E = 1.833*1176.014/SQRT(10) = 681.67
90% confidence interval = (1835.8 – 681.67, 1835.8 + 681.67) = (1154.13, 2517.47)
We can be 90% confident that the true mean difference in traffic count between the 6th and the 13th will lie within (1154.13, 2517.47).
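The same interval can be sketched in Python from the traffic counts. The variable names are my own, and the critical value 1.833 is taken from the solution above instead of being computed.

```python
from math import sqrt
from statistics import mean, stdev

# Traffic counts in the order of Table #9.2.6
sixth = [139246, 134012, 137055, 133732, 123552,
         121139, 128293, 124631, 124609, 117584]
thirteenth = [138548, 132908, 136018, 131843, 121641,
              118723, 125532, 120249, 122770, 117263]

# Paired differences d = count on the 6th minus count on the 13th
diffs = [a - b for a, b in zip(sixth, thirteenth)]

d_bar = mean(diffs)   # sample mean of the differences
s_d = stdev(diffs)    # sample standard deviation of the differences
t_crit = 1.833        # critical t for df = 9, 90% confidence (from the text)

margin = t_crit * s_d / sqrt(len(diffs))
low, high = d_bar - margin, d_bar + margin

print(f"mean = {d_bar:.1f}, s = {s_d:.3f}, E = {margin:.2f}")
print(f"90% CI: ({low:.2f}, {high:.2f})")
```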
9.3.1
The income of males in each state of the United States, including the District of Columbia and Puerto Rico, are given in table #9.3.3, and the income of females is given in table #9.3.4 ("Median income of," 2013). Is there enough evidence to show that the mean income of males is more than of females? Test at the 1% level.
Table #9.3.3: Data of Income for Males
$42,951 | $52,379 | $42,544 | $37,488 | $49,281 | $50,987 | $60,705
$50,411 | $66,760 | $40,951 | $43,902 | $45,494 | $41,528 | $50,746
$45,183 | $43,624 | $43,993 | $41,612 | $46,313 | $43,944 | $56,708
$60,264 | $50,053 | $50,580 | $40,202 | $43,146 | $41,635 | $42,182
$41,803 | $53,033 | $60,568 | $41,037 | $50,388 | $41,950 | $44,660
$46,176 | $41,420 | $45,976 | $47,956 | $22,529 | $48,842 | $41,464
$40,285 | $41,309 | $43,160 | $47,573 | $44,057 | $52,805 | $53,046
$42,125 | $46,214 | $51,630
Table #9.3.4: Data of Income for Females
$31,862 | $40,550 | $36,048 | $30,752 | $41,817 | $40,236 | $47,476
$40,500 | $60,332 | $33,823 | $35,438 | $37,242 | $31,238 | $39,150
$34,023 | $33,745 | $33,269 | $32,684 | $31,844 | $34,599 | $48,748
$46,185 | $36,931 | $40,416 | $29,548 | $33,865 | $31,067 | $33,424
$35,484 | $41,021 | $47,155 | $32,316 | $42,113 | $33,459 | $32,462
$35,746 | $31,274 | $36,027 | $37,089 | $22,117 | $41,412 | $31,330
$31,329 | $33,184 | $35,301 | $32,843 | $38,177 | $40,969 | $40,993
$29,688 | $35,890 | $34,381
Suppose that mean income for male is µ1 and mean income for female is µ2
Null Hypothesis: H0: µ1 = µ2
Alternate Hypothesis: H1: µ1 > µ2 (claim)
Given that level of significance α = 0.01
I used excel to perform independent sample t test.
t-Test: Two-Sample Assuming Unequal Variances
Statistic | Males | Females
Mean | 46446.38 | 36511
Variance | 49473354 | 37676539
Observations | 52 | 52
Hypothesized Mean Difference | 0
df | 100
t Stat | 7.67455
P(T<=t) one-tail | 5.65E-12
t Critical one-tail | 2.364217
P(T<=t) two-tail | 1.13E-11
t Critical two-tail | 2.625891
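T test = 7.675
P = 5.65E-12 (approximately 0)
As P < level of significance 0.01, we reject the null hypothesis. Hence, there is sufficient evidence to support the claim that the mean income of males is more than that of females.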
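The t statistic and degrees of freedom in the Excel output can be reproduced from its summary rows alone (mean, variance, observations). The sketch below uses the unequal-variance (Welch) formulas; the function name welch_from_summary is my own, and the inputs are the rounded values printed by Excel, so the results are approximate.

```python
from math import sqrt


def welch_from_summary(mean1, var1, n1, mean2, var2, n2):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    a, b = var1 / n1, var2 / n2  # squared standard errors of each sample mean
    t_stat = (mean1 - mean2) / sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_stat, df


# Summary values copied from the Excel output for males and females
t_stat, df = welch_from_summary(46446.38, 49473354, 52, 36511, 37676539, 52)

t_crit = 2.364217  # one-tail critical t at the 1% level, from the Excel output
reject = t_stat > t_crit

print(f"t = {t_stat:.3f}, df = {df:.1f}, reject H0: {reject}")
```

The computed df is slightly above 100, which appears to correspond to the df of 100 shown in the output.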
9.3.3
A study was conducted that measured the total brain volume (TBV) (in mm³) of patients that had schizophrenia and patients that are considered normal. Table #9.3.5 contains the TBV of the normal patients and table #9.3.6 contains the TBV of schizophrenia patients ("SOCR data oct2009," 2013). Is there enough evidence to show that the patients with schizophrenia have less TBV on average than a patient that is considered normal? Test at the 10% level.
Table #9.3.5: Total Brain Volume (in mm³) of Normal Patients
1663407 | 1583940 | 1299470 | 1535137 | 1431890 | 1578698
1453510 | 1650348 | 1288971 | 1366346 | 1326402 | 1503005
1474790 | 1317156 | 1441045 | 1463498 | 1650207 | 1523045
1441636 | 1432033 | 1420416 | 1480171 | 1360810 | 1410213
1574808 | 1502702 | 1203344 | 1319737 | 1688990 | 1292641
1512571 | 1635918
Table #9.3.6: Total Brain Volume (in mm³) of Schizophrenia Patients
1331777 | 1487886 | 1066075 | 1297327 | 1499983 | 1861991
1368378 | 1476891 | 1443775 | 1337827 | 1658258 | 1588132
1690182 | 1569413 | 1177002 | 1387893 | 1483763 | 1688950
1563593 | 1317885 | 1420249 | 1363859 | 1238979 | 1286638
1325525 | 1588573 | 1476254 | 1648209 | 1354054 | 1354649
1636119
Suppose that mean volume for normal patients is µ1 and mean volume for Schizophrenia patient is µ2
Null Hypothesis: H0: µ1 = µ2
Alternate Hypothesis: H1: µ1 > µ2 (claim)
I used excel to perform independent sample t test.
t-Test: Two-Sample Assuming Unequal Variances
Statistic | Normal | Schizophrenia
Mean | 1463339 | 1451293
Variance | 1.57E+10 | 2.96E+10
Observations | 32 | 31
Hypothesized Mean Difference | 0
df | 55
t Stat | 0.316843
P(T<=t) one-tail | 0.376281
t Critical one-tail | 1.297134
P(T<=t) two-tail | 0.752562
t Critical two-tail | 1.673034
T test = 0.317
P = 0.3763 (one-tail)
As P > level of significance 0.10, we fail to reject the null hypothesis.
Hence insufficient evidence to support the claim that the patients with schizophrenia have less TBV on average than a patient that is considered normal.
9.3.4
A study was conducted that measured the total brain volume (TBV) (in mm³) of patients that had schizophrenia and patients that are considered normal. Table #9.3.5 contains the TBV of the normal patients and table #9.3.6 contains the TBV of schizophrenia patients ("SOCR data oct2009," 2013). Compute a 90% confidence interval for the difference in TBV of normal patients and patients with Schizophrenia.
Suppose that mean volume for normal patients is µ1 and mean volume for Schizophrenia patient is µ2
Output for two sample t test confidence interval calculator:
Two sample T confidence interval:
µ1 : Mean of Normal
µ2 : Mean of Schizophrenia
µ1 - µ2 : Difference between two means
(without pooled variances)
90% confidence interval results:
Difference | Sample Diff. | Std. Err. | DF | L. Limit | U. Limit
µ1 - µ2 | 12046.025 | 38018.926 | 54.816618 | -51564.575 | 75656.626
90% confidence interval = (-51564.6, 75656.6)
Hence, we can be 90% confident that the true mean difference in TBV of normal patients and patients with Schizophrenia will lie within (-51564.6, 75656.6).
As 0 is within the confidence interval, there is no significant difference between TBV of normal patients and patients with Schizophrenia.
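The limits in the calculator output follow the pattern sample difference ± critical t × standard error. A rough Python sketch of that step is below. The critical value 1.673034 is an assumption on my part: it is the two-tail value Excel printed for df = 55 in problem 9.3.3, while the calculator used df = 54.8, so the limits agree only approximately.

```python
# Values copied from the calculator output above
sample_diff = 12046.025   # mean of Normal minus mean of Schizophrenia
std_err = 38018.926       # standard error of the difference (unpooled)

# Assumed critical t for 90% confidence (Excel value for df = 55, see lead-in)
t_crit = 1.673034

margin = t_crit * std_err
low, high = sample_diff - margin, sample_diff + margin

# If 0 lies inside the interval, the difference is not significant at this level
contains_zero = low < 0 < high

print(f"90% CI: ({low:.1f}, {high:.1f}), contains 0: {contains_zero}")
```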
9.3.8
The number of cell phones per 100 residents in countries in Europe is given in table #9.3.9 for the year 2010. The number of cell phones per 100 residents in countries of the Americas is given in table #9.3.10 also for the year 2010 ("Population reference bureau," 2013). Find the 98% confidence interval for the difference in a mean number of cell phones per 100 residents in Europe and the Americas.
Table #9.3.9: Number of Cell Phones per 100 Residents in Europe
100 | 76 | 100 | 130 | 75 | 84 | 112 | 84 | 138
133 | 118 | 134 | 126 | 188 | 129 | 93 | 64 | 128
124 | 122 | 109 | 121 | 127 | 152 | 96 | 63 | 99
95 | 151 | 147 | 123 | 95 | 67 | 67 | 118 | 125
110 | 115 | 140 | 115 | 141 | 77 | 98 | 102 | 102
112 | 118 | 118 | 54 | 23 | 121 | 126 | 47
Table #9.3.10: Number of Cell Phones per 100 Residents in the Americas
158 | 117 | 106 | 159 | 53 | 50 | 78
66 | 88 | 92 | 42 | 3 | 150 | 72
86 | 113 | 50 | 58 | 70 | 109 | 37
32 | 85 | 101 | 75 | 69 | 55 | 115
95 | 73 | 86 | 157 | 100 | 119 | 81
113 | 87 | 105 | 96
Output for two-sample t-test confidence interval calculator:
Two sample T confidence interval:
µ1 : Mean of Europe
µ2 : Mean of America
µ1 - µ2 : Difference between two means
(without pooled variances)
98% confidence interval results:
Difference | Sample Diff. | Std. Err. | DF | L. Limit | U. Limit
µ1 - µ2 | 20.945815 | 6.9736187 | 74.029011 | 4.3640739 | 37.527556
98% confidence interval = (4.364, 37.528)
Hence, we can be 98% confident that the true difference in mean number of cell phones per 100 residents between Europe and the Americas will lie within (4.364, 37.528).
As the confidence interval does not include 0, there is a significant difference between the mean number of cell phones per 100 residents in Europe and the Americas.
Since the whole interval is positive, the mean number of cell phones per 100 residents in Europe is significantly more than in the Americas.
11.3.2
Levi-Strauss Co manufactures clothing. The quality control department measures weekly values of different suppliers for the percentage difference of waste between the layout on the computer and the actual waste when the clothing is made (called run-up). The data is in table #11.3.3, and there are some negative values because sometimes the supplier is able to layout the pattern better than the computer ("Waste run up," 2013). Do the data show that there is a difference between some of the suppliers? Test at the 1% level.
Table #11.3.3: Run-ups for Different Plants Making Levi Strauss Clothing
Plant 1 | Plant 2 | Plant 3 | Plant 4 | Plant 5
1.2 | 16.4 | 12.1 | 11.5 | 24
10.1 | -6 | 9.7 | 10.2 | -3.7
-2 | -11.6 | 7.4 | 3.8 | 8.2
1.5 | -1.3 | -2.1 | 8.3 | 9.2
-3 | 4 | 10.1 | 6.6 | -9.3
-0.7 | 17 | 4.7 | 10.2 | 8
3.2 | 3.8 | 4.6 | 8.8 | 15.8
2.7 | 4.3 | 3.9 | 2.7 | 22.3
-3.2 | 10.4 | 3.6 | 5.1 | 3.1
-1.7 | 4.2 | 9.6 | 11.2 | 16.8
2.4 | 8.5 | 9.8 | 5.9 | 11.3
0.3 | 6.3 | 6.5 | 13 | 12.3
3.5 | 9 | 5.7 | 6.8 | 16.9
-0.8 | 7.1 | 5.1 | 14.5 |
19.4 | 4.3 | 3.4 | 5.2 |
2.8 | 19.7 | -0.8 | 7.3 |
13 | 3 | -3.9 | 7.1 |
42.7 | 7.6 | 0.9 | 3.4 |
1.4 | 70.2 | 1.5 | 0.7 |
3 | 8.5 | | |
2.4 | 6 | | |
1.3 | 2.9 | | |
Null Hypothesis: Mean Run-ups for all the 5 plants are equal to each other.
Alternate Hypothesis: Mean Run-ups for at least one plant is different from others.
Level of significance α = 0.01
Excel output for one way ANOVA:
Anova: Single Factor
SUMMARY
Groups | Count | Sum | Average | Variance
Plant 1 | 22 | 99.5 | 4.522727 | 100.6418
Plant 2 | 22 | 194.3 | 8.831818 | 235.7289
Plant 3 | 19 | 91.8 | 4.831579 | 19.38784
Plant 4 | 19 | 142.3 | 7.489474 | 13.37433
Plant 5 | 13 | 134.9 | 10.37692 | 91.29859
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 450.9207 | 4 | 112.7302 | 1.159631 | 0.334012 | 3.534992
Within Groups | 8749.088 | 90 | 97.21209
Total | 9200.009 | 94
P = 0.334
As P > level of significance 0.01, we fail to reject the null hypothesis.
So insufficient evidence to support the claim that there is a difference between Run-ups for some of the suppliers.
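The F statistic in the ANOVA table can be rebuilt from the SUMMARY rows (count, average, variance) without the raw data. The sketch below is my own illustration with hypothetical variable names. It uses the rounded values printed by Excel, so small rounding differences are expected.

```python
# (count, average, variance) for each plant, copied from the SUMMARY table
groups = {
    "Plant 1": (22, 4.522727, 100.6418),
    "Plant 2": (22, 8.831818, 235.7289),
    "Plant 3": (19, 4.831579, 19.38784),
    "Plant 4": (19, 7.489474, 13.37433),
    "Plant 5": (13, 10.37692, 91.29859),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-groups SS: weighted squared distance of each group mean from the grand mean
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
# Within-groups SS: each sample variance times its degrees of freedom
ss_within = sum((n - 1) * v for n, _, v in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

f_crit = 3.534992  # critical F at the 1% level, from the Excel output
reject = f_stat > f_crit

print(f"F = {f_stat:.4f} on ({df_between}, {df_within}) df, reject H0: {reject}")
```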
11.3.4
A study was undertaken to see how accurate food labeling for calories is on food that is considered reduced-calorie. The group measured the number of calories for each item of food and then found the percent difference between measured and labeled food. The group also looked at food that was nationally advertised, regionally distributed, or locally prepared. The data is in table #11.3.5 ("Calories datafile," 2013). Do the data indicate that at least two of the mean percent differences between the three groups are different? Test at the 10% level.
Table #11.3.5: Percent Differences Between Measured and Labeled Food
National Advertised | Regionally Distributed | Locally Prepared
2 | 41 | 15
-28 | 46 | 60
-6 | 2 | 250
8 | 25 | 145
6 | 39 | 6
-1 | 16.5 | 80
10 | 17 | 95
13 | 28 | 3
15 | -3 |
-4 | 14 |
-4 | 34 |
-18 | 42 |
10 | |
5 | |
3 | |
-7 | |
3 | |
-0.5 | |
-10 | |
6 | |
Null Hypothesis: Mean Percent Differences Between Measured and Labeled Food for all the 3 groups are equal to each other.
Alternate Hypothesis: Mean Percent Differences Between Measured and Labeled Food for at least one group is different from others.
Level of significance = 0.10
Excel output for data tools one way ANOVA:
Anova: Single Factor
SUMMARY
Groups | Count | Sum | Average | Variance
National Advertised | 20 | 2.5 | 0.125 | 110.6809
Regionally Distributed | 12 | 301.5 | 25.125 | 258.3693
Locally Prepared | 8 | 654 | 81.75 | 7050.786
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 38095.9 | 2 | 19047.95 | 12.97915 | 5.36E-05 | 2.452014
Within Groups | 54300.5 | 37 | 1467.581
Total | 92396.4 | 39
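P = 5.36E-05 (approximately 0)
As P < level of significance 0.10, we reject the null hypothesis. So there is sufficient evidence to support the claim that at least two of the mean percent differences between the three groups are different.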
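The F statistic in the Excel output can be cross-checked directly from the raw data in Table #11.3.5. The Python sketch below is my own illustration (the function name one_way_anova_f is hypothetical). The critical F is copied from the Excel output because the standard library has no F distribution.

```python
from statistics import mean

# Percent differences by group, read column by column from Table #11.3.5
national = [2, -28, -6, 8, 6, -1, 10, 13, 15, -4,
            -4, -18, 10, 5, 3, -7, 3, -0.5, -10, 6]
regional = [41, 46, 2, 25, 39, 16.5, 17, 28, -3, 14, 34, 42]
local = [15, 60, 250, 145, 6, 80, 95, 3]


def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic and its two degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f(national, regional, local)
f_crit = 2.452014  # critical F at the 10% level, from the Excel output
reject = f_stat > f_crit

print(f"F = {f_stat:.3f} on ({df_between}, {df_within}) df, reject H0: {reject}")
```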