Statistics and Research Method for Business Decision Making
Question 1
a) Appropriate Graphical Technique for Comparing the Amount of CO2 Emissions in the Years 2009 and 2013
Here, a bar diagram has been used to compare the amount of carbon dioxide (CO2) emissions in the years 2009 and 2013. The graph shows that the amount of CO2 emissions was highest in China in both years and reached its maximum in 2013. The data also show that Australia had the lowest amount of CO2 emissions in both 2009 and 2013. Furthermore, the amount of CO2 emissions fell between the two years in almost every country.
b) Appropriate Graphical Technique for Comparing the Percentage Value of the Amount of CO2 Emissions in 2009 and 2013
The percentage value of the amount of CO2 emissions in 2009 and 2013 has been drawn with the help of a line diagram. It is observed that the CO2 emissions percentage reaches its maximum in the year 2013. However, the percentage of CO2 emissions falls between the two years in all the countries mentioned.
c) Here, the amount of CO2 emissions has been compared for the United States, China, Russia, Japan, India, Germany, Canada, United Kingdom, South Korea, Italy, Iran, South Africa, France, Saudi Arabia and Australia. From the two diagrams above, it is noted that the amount of CO2 emissions decreases from 2009 to 2013 in almost every country. The percentage of CO2 emissions likewise decreases in almost every country from 2009 to 2013.
Question 2
Classes   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
35-44     3           0.075                3                      0.075
45-54     4           0.1                  7                      0.175
55-64     9           0.225                16                     0.4
65-74     18          0.45                 34                     0.85
75-84     4           0.1                  38                     0.95
85-94     1           0.025                39                     0.975
95-104    1           0.025                40                     1
a) Appropriate Frequency Distribution
A frequency distribution table shows how often each outcome, or class of outcomes, occurs in a sample. More precisely, it summarizes the distribution of all the values included in the sample (Piepho, 2018). It is a useful technique because it helps the researcher to see where the observations are concentrated within a dataset. In the cumulative frequency column, the last value should always equal the total number of observations. Relative frequency is obtained by dividing the class frequency by the total number of observations.
Relative Frequency = Frequency / Total Frequency
Classes   Frequency   Cumulative Frequency
35-44     3           3
45-54     4           7
55-64     9           16
65-74     18          34
75-84     4           38
85-94     1           39
95-104    1           40
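The relative frequency formula above can be sketched in a few lines of Python. This is only an illustration: the class labels and frequencies are copied from the table, and the variable names are assumptions that do not come from the original analysis.

```python
# Class labels and frequencies copied from the frequency table above.
classes = ["35-44", "45-54", "55-64", "65-74", "75-84", "85-94", "95-104"]
frequencies = [3, 4, 9, 18, 4, 1, 1]

# Total frequency = total number of observations.
total = sum(frequencies)

# Relative Frequency = Frequency / Total Frequency
relative_frequencies = [f / total for f in frequencies]

for label, f, rf in zip(classes, frequencies, relative_frequencies):
    print(f"{label:>7}  {f:>3}  {rf:.3f}")
```

The relative frequencies should add up to 1, which is a quick check that no class has been missed.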
b) Appropriate Cumulative Frequency Distribution
A cumulative frequency distribution is a type of frequency distribution in which each entry is the running total of the frequencies up to and including that class (Nakagawa et al., 2017). It is calculated by adding each class frequency to the sum of the frequencies of all the preceding classes. Cumulative relative frequency is obtained by dividing the cumulative frequency of a class by the total number of observations.
Cumulative Frequency of successive classes = F1, (F1 + F2), (F1 + F2 + F3) and so on.
Cumulative Relative Frequency = Cumulative Frequency / Total Frequency
Classes   Frequency   Cumulative Frequency   Cumulative Relative Frequency
35-44     3           3                      0.075
45-54     4           7                      0.175
55-64     9           16                     0.4
65-74     18          34                     0.85
75-84     4           38                     0.95
85-94     1           39                     0.975
95-104    1           40                     1
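The cumulative columns can be reproduced as a running total. The sketch below is a minimal illustration that assumes Python and the class frequencies from the table; the names cumulative and cumulative_relative are illustrative only.

```python
from itertools import accumulate

# Class frequencies for 35-44, 45-54, ..., 95-104, copied from the table.
frequencies = [3, 4, 9, 18, 4, 1, 1]
total = sum(frequencies)

# Running total: F1, F1 + F2, F1 + F2 + F3, ...
cumulative = list(accumulate(frequencies))

# Each cumulative frequency divided by the total number of observations.
cumulative_relative = [c / total for c in cumulative]

for c, cr in zip(cumulative, cumulative_relative):
    print(f"{c:>3}  {cr:.3f}")
```

The last cumulative frequency equals the total number of observations, and the last cumulative relative frequency equals 1.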
c) Relative Frequency Histogram
A relative frequency histogram is a graph that shows the proportion of observations falling in each class. Here, the data record the amount of time required by assembly line workers to complete a weld at a car assembly plant, and the number of workers is 40. The relative frequency is highest in the class 65 to 74, where it is 0.45.
d) Ogive
An ogive plots cumulative frequency on the vertical axis, that is the y-axis, and the class boundaries along the horizontal axis. It is a graph of the cumulative frequencies that allows the number of observations below a given value to be estimated (Piepho, 2018). Here, cumulative frequency has been plotted on the vertical axis.
e) The Proportion of Data Less than 65
The proportion of data less than 65 is 40%. It has been obtained from the above table: the first three classes, up to 64, contain 16 of the 40 observations.
f) The Proportion of Data More than 75
The proportion of data of 75 or more is 15%. It has been obtained from the above table: the last three classes, from 75 upward, contain 4 + 1 + 1 = 6 of the 40 observations.
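The two proportions can be checked directly against the grouped table. The following is a rough sketch, assuming Python and assuming that "more than 75" is read as the classes starting at 75, since grouped data cannot be split inside a class; the tuple layout and names are illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the table.
table = [
    (35, 44, 3), (45, 54, 4), (55, 64, 9), (65, 74, 18),
    (75, 84, 4), (85, 94, 1), (95, 104, 1),
]
total = sum(f for _, _, f in table)

# Classes whose upper limit lies below 65.
below_65 = sum(f for _, hi, f in table if hi < 65) / total

# Classes whose lower limit is 75 or more.
at_least_75 = sum(f for lo, _, f in table if lo >= 75) / total

print(f"Less than 65: {below_65:.0%}")
print(f"75 or more:   {at_least_75:.0%}")
```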
Question 3
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.038875
R Square             0.001511
Adjusted R Square    -0.05104
Standard Error       1268.43
Observations         21

ANOVA
             df    SS          MS          F           Significance F
Regression   1     46268.68    46268.68    0.028758    0.867132
Residual     19    30569396    1608916
Total        20    30615664

                    Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept           3874.286       696.832          5.559857   2.31E-05   2415.8      5332.773    2415.8        5332.773
Rate of Inflation   40.30771       237.6901         0.169581   0.867132   -457.183    537.7989    -457.183      537.7989
a) Descriptive Measures (Time Series Data)
To describe the time series data graphically, a simple line diagram has been used. A line diagram is best suited for showing the trend of time series data. Here, the two variables, rate of inflation and All Ordinaries index, have been drawn with the help of a line diagram. It is observed from the data that the trend line of the rate of inflation slopes upward, that is, it increases over the years, whereas the All Ordinaries index follows a roughly straight-line trend parallel to the horizontal axis.
b) Scatter Plot
It is observed from the data that no trend exists between the variables rate of inflation and All Ordinaries index. A scatter diagram is a set of points that portrays the relationship between two variables, here rate of inflation and All Ordinaries index (Barnes and Barnes, 2015). It can be stated that there is very poor correlation between rate of inflation and All Ordinaries index. Here, the All Ordinaries index has been plotted on the vertical axis (y-axis) and the rate of inflation on the horizontal axis (x-axis).
c) Numerical Summary Report
                     Rate of Inflation   All Ordinaries Index
Mean                 2.690476            3982.733
Standard Error       0.260394            269.9897
Median               2.5                 4127.6
Mode                 2.4                 #N/A
Standard Deviation   1.193275            1237.248
Sample Variance      1.423905            1530783
Kurtosis             2.038282            1.01001
Skewness             0.798408            0.182747
Range                5.6                 4336.8
Minimum              0.3                 2000.8
Maximum              5.9                 6337.6
Sum                  56.5                83637.4
Count                21                  21
The summary report includes summary measures of rate of inflation and All Ordinaries index. The mean, median, range, variance and standard deviation of rate of inflation are 2.690476, 2.5, 5.6, 1.423905 and 1.193275 respectively. On the other hand, the mean, median, range, variance and standard deviation of All Ordinaries index are 3982.733, 4127.6, 4336.8, 1530783 and 1237.248 respectively. The three quartiles of rate of inflation are first quartile (Q1) = 2.4, second quartile (Q2) = 2.5 and third quartile (Q3) = 2.9. For All Ordinaries index, the first quartile (Q1) is 3032, the second quartile (Q2) is 4127.6 and the third quartile (Q3) is 4933.5.
Coefficient of Variation = (Standard Deviation / Mean)*100
The value of the coefficient of variation (COV) is 44.35% for rate of inflation and 31.07% for All Ordinaries index, so rate of inflation is the more variable of the two relative to its mean.
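The coefficient of variation can be recomputed from the summary table. The sketch below assumes Python and uses the rounded standard deviations and means reported above, so the last decimals may differ slightly from a spreadsheet result; the function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Return (Standard Deviation / Mean) * 100."""
    return std_dev / mean * 100

# Standard deviation and mean taken from the numerical summary table.
cv_inflation = coefficient_of_variation(1.193275, 2.690476)
cv_index = coefficient_of_variation(1237.248, 3982.733)

print(f"Rate of inflation:    {cv_inflation:.2f}%")
print(f"All Ordinaries index: {cv_index:.2f}%")
```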
d) Correlation Coefficient

                       Rate of Inflation   All Ordinaries Index
Rate of Inflation      1
All Ordinaries Index   0.038875            1
The coefficient of correlation (r) between rate of inflation and All Ordinaries index is 0.038875. This correlation shows the relation between the variables rate of inflation and All Ordinaries index. Because r is very close to 0, it can be stated that there is only a very weak positive linear relationship between the two variables.
e) Estimating Regression Line
Simple Linear Regression Equation is,
All Ordinaries Index = 3874.286 + 40.30771 (Rate of Inflation)
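Here, All Ordinaries index is the dependent variable and rate of inflation is the independent variable. The slope of the regression equation is positive (40.30771), so the fitted index rises by about 40.3 points for each one-point rise in the rate of inflation. However, the correlation of 0.038875 shows that the linear relationship between All Ordinaries index and rate of inflation is very weak.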
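As an illustration of how the fitted line is used, the sketch below evaluates the regression equation in Python. The coefficients are taken from the regression output; the function name and the choice of 2.5 (the median rate of inflation) as input are assumptions made for the example.

```python
# Coefficients from the regression output.
INTERCEPT = 3874.286
SLOPE = 40.30771

def predict_index(inflation_rate):
    """Fitted All Ordinaries index for a given rate of inflation."""
    return INTERCEPT + SLOPE * inflation_rate

# Fitted value at the median rate of inflation (2.5).
predicted = predict_index(2.5)
print(f"Fitted All Ordinaries index at 2.5: {predicted:.3f}")
```

Because the fit is so weak, such a fitted value is close to the sample mean of the index and carries little predictive information.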
f) Estimating Coefficient of Determination
The coefficient of determination is the R² value; it is reported in the regression statistics of the regression analysis. The value of R² is 0.001511, which is approximately 0. This indicates that only about 0.15% of the variation in All Ordinaries index (the dependent variable) is explained by rate of inflation (the independent variable), so the index cannot be predicted with the help of the rate of inflation.
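The R² value can be checked against the ANOVA sums of squares. The following is a minimal sketch, assuming Python and the standard identities for simple linear regression (R² = SS regression / SS total, and the usual adjusted R² formula); the variable names are illustrative.

```python
import math

# Sums of squares and sample size from the regression output.
ss_regression = 46268.68
ss_total = 30615664
n = 21   # observations
k = 1    # number of independent variables

# Share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R-squared penalises for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# With one predictor, Multiple R is the square root of R-squared.
multiple_r = math.sqrt(r_squared)

print(f"R Square:          {r_squared:.6f}")
print(f"Adjusted R Square: {adjusted_r_squared:.5f}")
print(f"Multiple R:        {multiple_r:.6f}")
```

The adjusted value comes out slightly negative, which happens when the predictor explains less variation than would be expected by chance.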
g) Testing of Significance
Null Hypothesis (H0): All Ordinaries index has no positive relation with the rate of inflation.
Alternative Hypothesis (H1): All Ordinaries index has a positive relation with the rate of inflation.
Testing of significance can be done with the help of the p-value, which helps to determine the significance of the result. The p-value lies between 0 and 1; if it is less than 0.05, the test result is significant, and if it is greater than 0.05, the result is not significant (Altman and Krzywinski, 2015). In this case, the p-value obtained is 0.867132, which is greater than 0.05. Therefore, the null hypothesis cannot be rejected, and the relation between the two variables is not statistically significant.
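The decision rule can be sketched as follows. This assumes Python, a significance level of 0.05, and the slope, standard error and two-sided p-value from the regression output; halving the two-sided p-value for the one-sided alternative is valid here only because the t statistic is positive. All names are illustrative.

```python
# Values from the "Rate of Inflation" row of the regression output.
slope = 40.30771
slope_se = 237.6901
p_value_two_sided = 0.867132
alpha = 0.05  # assumed significance level

# t statistic for the slope; in simple regression F equals t squared.
t_stat = slope / slope_se
f_stat = t_stat ** 2

# One-sided p-value for H1: positive relation (slope > 0).
if t_stat > 0:
    p_value_one_sided = p_value_two_sided / 2
else:
    p_value_one_sided = 1 - p_value_two_sided / 2

reject_null = p_value_one_sided < alpha

print(f"t = {t_stat:.6f}, F = {f_stat:.6f}")
print(f"One-sided p = {p_value_one_sided:.6f}, reject H0: {reject_null}")
```

Whether the one-sided or the two-sided p-value is used, it is far above 0.05, so the conclusion is the same.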
h) Standard Error
The standard error of estimate measures the variation of the observed values around the computed regression line. The accuracy of prediction can be measured with the help of the standard error of estimate generated from the regression. The value of the standard error of estimate obtained here is 1268.43, which is large relative to the mean All Ordinaries index of 3982.733 and about the same size as its standard deviation of 1237.248. This value indicates that the fit of the linear regression model is poor and that the two variables are not significantly related.
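The standard error of estimate follows from the residual sum of squares. The sketch below assumes Python and the standard formula for simple linear regression, the square root of SS residual divided by n - 2; the variable names are illustrative.

```python
import math

# Residual sum of squares and sample size from the regression output.
ss_residual = 30569396
n = 21

# Two parameters (intercept and slope) are estimated, hence n - 2.
standard_error = math.sqrt(ss_residual / (n - 2))

# Size of the typical prediction error relative to the mean index.
mean_index = 3982.733
relative_error = standard_error / mean_index

print(f"Standard error of estimate: {standard_error:.2f}")
print(f"Relative to the mean index: {relative_error:.1%}")
```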
References
Altman, N. and Krzywinski, M., 2015. Points of significance: simple linear regression.
Barnes, E.A. and Barnes, R.J., 2015. Estimating linear trends: Simple linear regression versus epoch differences. Journal of Climate, 28(24), pp.9969-9976.
Halsey, L.G., Curran-Everett, D., Vowler, S.L. and Drummond, G.B., 2015. The fickle P value generates irreproducible results. Nature Methods, 12(3), p.179.
Nakagawa, S., Johnson, P.C. and Schielzeth, H., 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of the Royal Society Interface, 14(134), p.20170213.
Piepho, H.P., 2018. A Coefficient of Determination (R2) for Linear Mixed Models. arXiv preprint arXiv:1805.01124.