Statistics and Research Method for Business Decision Making
Question 1
a) Appropriate Graphical Technique for Comparing the Amount of CO2 Emissions in the year 2009 and 2013
Here, a bar diagram has been used to compare the amount of carbon dioxide (CO2) emissions in 2009 and 2013. The graph shows that CO2 emissions were highest in China in both years, reaching their maximum in 2013. Australia had the lowest CO2 emissions in both 2009 and 2013. Furthermore, the amount of CO2 emissions fell between the two years in almost every country.
b) Appropriate Graphical Technique for Comparing the Percentage Value of the Amount of CO2 Emissions in 2009 and 2013
The percentage value of CO2 emissions in 2009 and 2013 has been drawn with the help of a line diagram. It is observed that the CO2 emissions percentage reaches its maximum in 2013. However, the percentage of CO2 emissions falls between the two years in all the countries mentioned.
c) Here, the amount of CO2 emissions has been compared for the United States, China, Russia, Japan, India, Germany, Canada, the United Kingdom, South Korea, Italy, Iran, South Africa, France, Saudi Arabia and Australia. From the above two diagrams, it is noted that the amount of CO2 emissions decreases from 2009 to 2013 in almost every country. Likewise, the percentage of CO2 emissions decreases in almost every country from 2009 to 2013.
Question 2
Classes | Frequency | Relative Frequency | Cumulative Frequency | Cumulative Relative Frequency
35-44 | 3 | 0.075 | 3 | 0.075
45-54 | 4 | 0.1 | 7 | 0.175
55-64 | 9 | 0.225 | 16 | 0.4
65-74 | 18 | 0.45 | 34 | 0.85
75-84 | 4 | 0.1 | 38 | 0.95
85-94 | 1 | 0.025 | 39 | 0.975
95-104 | 1 | 0.025 | 40 | 1
a) Appropriate Frequency Distribution
A frequency distribution table shows how often each outcome, or class of outcomes, occurs in a sample. More precisely, a frequency table summarizes the distribution of all the values included in a sample (Piepho, 2018). It is a useful technique because it helps the researcher see how the observations are spread across the classes of a dataset. Here, the last cumulative frequency should always equal the total number of observations, which is 40. Relative frequency can be obtained by dividing the frequency of a class by the total number of observations.
Relative Frequency = Frequency / Total Frequency
Classes | Frequency | Cumulative Frequency
35-44 | 3 | 3
45-54 | 4 | 7
55-64 | 9 | 16
65-74 | 18 | 34
75-84 | 4 | 38
85-94 | 1 | 39
95-104 | 1 | 40
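As a minimal sketch of the relative frequency formula, the class frequencies from the table can be divided by their total. The variable names are illustrative only and are not part of the original analysis.

```python
# Class labels and frequencies copied from the frequency table above.
classes = ["35-44", "45-54", "55-64", "65-74", "75-84", "85-94", "95-104"]
frequencies = [3, 4, 9, 18, 4, 1, 1]

# Total number of observations (the 40 workers in the sample).
total = sum(frequencies)

# Relative Frequency = Frequency / Total Frequency
relative = [f / total for f in frequencies]

for label, f, r in zip(classes, frequencies, relative):
    print(f"{label:>7}  {f:>3}  {r:.3f}")
```

The relative frequencies should add up to 1, which is a quick check that no class has been left out.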
b) Appropriate Cumulative Frequency Distribution
A cumulative frequency distribution is a type of frequency distribution in which each entry is the running total of the class frequencies up to and including that class (Nakagawa et al., 2017). The cumulative frequency of a class is calculated by adding its frequency to the frequencies of all the preceding classes. Cumulative relative frequency can be obtained by dividing the cumulative frequency of a class by the total number of observations.
Cumulative Frequencies = F1, (F1 + F2), (F1 + F2 + F3) and so on.
Cumulative Relative Frequency = Cumulative Frequency / Total Frequency
Classes | Frequency | Cumulative Frequency | Cumulative Relative Frequency
35-44 | 3 | 3 | 0.075
45-54 | 4 | 7 | 0.175
55-64 | 9 | 16 | 0.4
65-74 | 18 | 34 | 0.85
75-84 | 4 | 38 | 0.95
85-94 | 1 | 39 | 0.975
95-104 | 1 | 40 | 1
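The running totals in the table can be reproduced with a short sketch. It assumes the seven class frequencies listed above and uses `itertools.accumulate` from the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies copied from the table above, in class order.
frequencies = [3, 4, 9, 18, 4, 1, 1]
total = sum(frequencies)

# Running totals: F1, F1 + F2, F1 + F2 + F3, ...
cumulative = list(accumulate(frequencies))

# Each running total divided by the total number of observations.
cumulative_relative = [c / total for c in cumulative]

for c, cr in zip(cumulative, cumulative_relative):
    print(f"{c:>3}  {cr:.3f}")
```

The last cumulative frequency equals the number of observations, and the last cumulative relative frequency equals 1.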
c) Relative Frequency Histogram
A relative frequency histogram is a type of graph that shows the proportion of observations falling in each class. Here, the data show the time required by assembly line workers to complete a weld at a car assembly plant, and the number of workers is 40. The histogram shows that the relative frequency is highest in the class 65 to 74, at 0.45.
d) Ogive
An ogive is a graph that plots cumulative frequencies against class boundaries and allows the number of observations below a given value to be estimated (Piepho, 2018). Here, the class boundaries have been plotted along the horizontal (x) axis and cumulative frequency on the vertical (y) axis.
e) The Proportion of Data Less than 65
The proportion of data less than 65 is 40%. It has been obtained from the above table, where the cumulative relative frequency up to the class 55-64 is 0.4.
f) The Proportion of Data More than 75
The proportion of data in the classes from 75 upward is 15%. It has been obtained from the above table, where the classes 75-84, 85-94 and 95-104 together hold 6 of the 40 observations.
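Both proportions can be read off the frequency table by summing the frequencies of the relevant classes. The sketch below assumes the cut-off value coincides with a class lower limit (65 or 75), since grouped data cannot resolve values inside a class; the function names are illustrative.

```python
# Lower class limits and frequencies copied from the frequency table.
lower_limits = [35, 45, 55, 65, 75, 85, 95]
frequencies = [3, 4, 9, 18, 4, 1, 1]
total = sum(frequencies)


def proportion_below(limit):
    """Share of observations in classes that start below `limit`.

    Only meaningful when `limit` is itself a class lower limit.
    """
    count = sum(f for lo, f in zip(lower_limits, frequencies) if lo < limit)
    return count / total


def proportion_at_least(limit):
    """Share of observations in classes that start at or above `limit`."""
    count = sum(f for lo, f in zip(lower_limits, frequencies) if lo >= limit)
    return count / total


print(f"Below 65:    {proportion_below(65):.0%}")
print(f"75 or above: {proportion_at_least(75):.0%}")
```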
Question 3
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.038875
R Square | 0.001511
Adjusted R Square | -0.05104
Standard Error | 1268.43
Observations | 21
ANOVA
 | df | SS | MS | F | Significance F
Regression | 1 | 46268.68 | 46268.68 | 0.028758 | 0.867132
Residual | 19 | 30569396 | 1608916 | |
Total | 20 | 30615664 | | |
 | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0%
Intercept | 3874.286 | 696.832 | 5.559857 | 2.31E-05 | 2415.8 | 5332.773 | 2415.8 | 5332.773
Rate of Inflation | 40.30771 | 237.6901 | 0.169581 | 0.867132 | -457.183 | 537.7989 | -457.183 | 537.7989
a) Descriptive Measures (Time Series Data)
To present the time series data graphically, a simple line diagram has been used. A line diagram is best suited to showing the trend of time series data. Here, the two variables, rate of inflation and all ordinaries index, have been drawn with the help of a line diagram. It is observed from the data that the trend line of rate of inflation rises over the years, whereas all ordinaries index follows a roughly straight, horizontal trend line.
b) Scatter Plot
A scatter diagram is a set of points that portrays the relationship between rate of inflation and all ordinaries index (Barnes and Barnes, 2015). It is observed from the data that there is no clear trend between the two variables. It can be stated that there is a very poor correlation between rate of inflation and all ordinaries index. Here, all ordinaries index has been plotted on the vertical (y) axis and rate of inflation on the horizontal (x) axis.
c) Numerical Summary Report
Measure | Rate of Inflation | All-Ordinaries Index
Mean | 2.690476 | 3982.733
Standard Error | 0.260394 | 269.9897
Median | 2.5 | 4127.6
Mode | 2.4 | #N/A
Standard Deviation | 1.193275 | 1237.248
Sample Variance | 1.423905 | 1530783
Kurtosis | 2.038282 | -1.01001
Skewness | 0.798408 | 0.182747
Range | 5.6 | 4336.8
Minimum | 0.3 | 2000.8
Maximum | 5.9 | 6337.6
Sum | 56.5 | 83637.4
Count | 21 | 21
The summary report includes summary measures of rate of inflation and all ordinaries index. The mean, median, range, variance and standard deviation of rate of inflation are 2.690476, 2.5, 5.6, 1.423905 and 1.193275 respectively. On the other hand, the mean, median, range, variance and standard deviation of all ordinaries index are 3982.733, 4127.6, 4336.8, 1530783 and 1237.248 respectively. The three quartiles of rate of inflation are first quartile (Q1) = 2.4, second quartile (Q2) = 2.5 and third quartile (Q3) = 2.9. For all ordinaries index, the first quartile (Q1) is 3032, the second quartile (Q2) is 4127.6 and the third quartile (Q3) is 4933.5.
Coefficient of Variation = (Standard Deviation / Mean)*100
The value of the coefficient of variation (COV) is 44.35% for rate of inflation and 31.07% for all ordinaries index, so rate of inflation is relatively more variable around its mean.
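The coefficient of variation can be checked directly from the standard deviations and means in the numerical summary. This is a minimal sketch; the function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Coefficient of Variation = (Standard Deviation / Mean) * 100."""
    return std_dev / mean * 100


# Standard deviation and mean taken from the numerical summary report.
cv_inflation = coefficient_of_variation(1.193275, 2.690476)
cv_index = coefficient_of_variation(1237.248, 3982.733)

print(f"Rate of Inflation:    {cv_inflation:.2f}%")
print(f"All-Ordinaries Index: {cv_index:.2f}%")
```

Because the coefficient of variation is unit-free, it allows the spread of the two variables to be compared even though they are measured on very different scales.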
d) Correlation Coefficient
 | Rate of Inflation | All-Ordinaries Index
Rate of Inflation | 1 |
All-Ordinaries Index | 0.038875 | 1
The coefficient of correlation (r) between rate of inflation and all ordinaries index is 0.038875. This correlation shows the strength and direction of the linear relation between the variables rate of inflation and all ordinaries index. Since the value is very close to 0, it can be stated that there is only a very weak positive relationship between rate of inflation and all ordinaries index.
e) Estimating Regression Line
Simple Linear Regression Equation is,
All Ordinaries Index = 3874.286 + 40.30771 (Rate of Inflation)
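Here, all ordinaries index is the dependent variable and rate of inflation is the independent variable. The slope of the regression equation is positive: all ordinaries index is estimated to rise by about 40.31 points for each one-unit increase in rate of inflation. However, the relationship is very weak, since the correlation between the two variables is only 0.038875.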
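To illustrate how the estimated regression line is used, the sketch below evaluates it for one inflation rate. The coefficients come from the regression output; the function name and the input value of 2.5 are assumptions chosen only for illustration.

```python
# Coefficients taken from the regression output.
INTERCEPT = 3874.286
SLOPE = 40.30771


def predict_index(inflation_rate):
    """Fitted All Ordinaries Index for a given rate of inflation."""
    return INTERCEPT + SLOPE * inflation_rate


# 2.5 is an illustrative input (it equals the median rate of inflation).
print(f"Fitted index at 2.5: {predict_index(2.5):.2f}")
```

Given the very low correlation between the two variables, such fitted values should be read as rough averages rather than reliable forecasts.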
f) Estimating Coefficient of Determination
The coefficient of determination is the R2 value; it is reported in the regression statistics and can be derived from the ANOVA table of the regression analysis. The value of R2 is 0.001511, which means that only about 0.15% of the variation in all ordinaries index (the dependent variable) is explained by rate of inflation (the independent variable). This indicates that all ordinaries index cannot be reliably predicted with the help of rate of inflation.
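The R2 value can be recovered from the sums of squares in the ANOVA table, and in simple linear regression its square root gives the magnitude of the correlation coefficient. The following is a small sketch with illustrative variable names.

```python
# Sums of squares taken from the ANOVA table.
ss_regression = 46268.68
ss_total = 30615664

# R2 = SS(Regression) / SS(Total)
r_squared = ss_regression / ss_total

# With one predictor, |r| is the square root of R2; the sign follows
# the slope, which is positive here.
r = r_squared ** 0.5

print(f"R2 = {r_squared:.6f}")
print(f"r  = {r:.6f}")
```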
g) Testing of Significance
Null Hypothesis (H0): All ordinaries index has no positive relation with the rate of inflation.
Alternative Hypothesis (H1): All ordinaries index has a positive relation with the rate of inflation.
Testing of significance can be done with the help of the p value, which helps to determine the significance of the result. The p value lies between 0 and 1; if the value is less than 0.05, the test result is significant, and if the value is greater than 0.05, the result is not significant (Altman and Krzywinski, 2015). In this case, the p value obtained is 0.867132, which is greater than 0.05; therefore, the null hypothesis cannot be rejected and the relationship between the two variables is not significant.
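The decision rule can be written out as a short sketch using the slope row of the regression output. The p value is copied from that output rather than recomputed, and the 0.05 significance level is the conventional threshold used in the text.

```python
# Slope estimate, its standard error and p value from the regression output.
slope = 40.30771
slope_se = 237.6901
p_value = 0.867132
alpha = 0.05  # conventional significance level

# t statistic = coefficient / standard error
t_stat = slope / slope_se

# Reject the null hypothesis only when the p value falls below alpha.
reject_null = p_value < alpha

print(f"t = {t_stat:.6f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

With a single predictor, the square of this t statistic matches the F statistic in the ANOVA table (0.028758), so both tests lead to the same conclusion.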
h) Standard Error
The standard error of estimate measures the variation of the observed values around the computed regression line. The prediction accuracy of the model can be measured with the help of the standard error of estimate generated from the regression. The value of the standard error of estimate obtained here is 1268.43. This value is large relative to the mean of all ordinaries index (3982.733) and is even slightly higher than its standard deviation (1237.248), which indicates that the fit of the linear regression model is poor and that the variables are not significantly related.
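The standard error of estimate is the square root of the residual mean square in the ANOVA table, which the sketch below verifies. The comparison with the standard deviation of the index is added only to show how little the regression line reduces the unexplained variation.

```python
import math

# Residual mean square (MS) from the ANOVA table.
ms_residual = 1608916

# Standard deviation of the All-Ordinaries Index from the numerical summary.
index_std_dev = 1237.248

# Standard error of estimate = sqrt(MS residual)
standard_error = math.sqrt(ms_residual)

print(f"Standard error of estimate: {standard_error:.2f}")
print(f"Exceeds the index standard deviation: {standard_error > index_std_dev}")
```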
References
Altman, N. and Krzywinski, M., 2015. Points of significance: simple linear regression.
Barnes, E.A. and Barnes, R.J., 2015. Estimating linear trends: Simple linear regression versus epoch differences. Journal of Climate, 28(24), pp.9969-9976.
Halsey, L.G., Curran-Everett, D., Vowler, S.L. and Drummond, G.B., 2015. The fickle P value generates irreproducible results. Nature methods, 12(3), p.179.
Nakagawa, S., Johnson, P.C. and Schielzeth, H., 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of the Royal Society Interface, 14(134), p.20170213.
Piepho, H.P., 2018. A Coefficient of Determination (R2) for Linear Mixed Models. arXiv preprint arXiv:1805.01124.