The purpose of this statistical project is to explore historical meteorological observations in Australia. The data set was taken from the official website of the Australian Government Bureau of Meteorology. The data represent the mean maximum temperature at the Melbourne Airport location for the years 1970 to 2014. In this paper we look at the basic descriptive statistics of the data and practise hypothesis testing.
The purpose of descriptive statistics is to process empirical data: to classify them, to represent them visually in the form of graphs and tables, and to describe them quantitatively by the main statistical indicators.
Unlike inductive (inferential) statistics, descriptive statistics does not draw conclusions about the population from the study of particular cases. Inductive statistics, by contrast, assumes that the properties and patterns identified in the sample are also inherent in the population.
Descriptive statistics uses three main methods of data aggregation:
- tabular presentation
- graphical presentation
- calculation of statistical indicators
The data set of the mean maximum temperature at the Melbourne Airport location for the years 1970 to 2014 is given below (retrieved from http://www.bom.gov.au/jsp/ncc/cdio/cvg/av):
The number of years observed is 43.
Descriptive Statistics: Temperature
Variable      N  N*   Mean  SE Mean  StDev  Variance  CoefVar  Minimum     Q1
Temperature  12   0  19,76     1,42   4,92     24,19    24,89    13,10  14,95

Variable     Median     Q3  Maximum   IQR  Mode  N for Mode
Temperature   19,75  24,32    26,50  9,37  16,6           2
The average of the monthly mean maximum temperatures is 19.76 °C, with a standard deviation of 4.92 °C and a standard error of the mean of 1.42 °C.
The lowest mean maximum temperature usually occurs in July (13.1 °C) and the highest in January (26.5 °C, the maximum reported in the output above).
The interquartile range is 9.37 °C.
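The derived figures in the summary can be cross-checked from the basic ones. The following is a minimal Python sketch, assuming only the rounded values printed in the Minitab output (the variable names are illustrative and not part of the original analysis):

```python
import math

# Summary values copied from the Minitab output
# (decimal commas rewritten as decimal points).
n = 12          # number of monthly values
mean = 19.76    # mean of the monthly mean maximum temperatures
stdev = 4.92    # sample standard deviation
q1, q3 = 14.95, 24.32

se_mean = stdev / math.sqrt(n)   # standard error of the mean
variance = stdev ** 2            # sample variance
coef_var = 100 * stdev / mean    # coefficient of variation, in percent
iqr = q3 - q1                    # interquartile range

print(f"SE Mean  = {se_mean:.2f}")
print(f"Variance = {variance:.2f}")
print(f"CoefVar  = {coef_var:.2f}")
print(f"IQR      = {iqr:.2f}")
```

Because the inputs are already rounded to two decimals, the recomputed variance and coefficient of variation can differ from the Minitab figures in the last digit.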
We can plot this data on a graph to understand the distribution better:
The distribution does not appear to be normal, because the plot does not follow a bell-shaped curve.
The next step of our project is to test a hypothesis.
A statistical hypothesis is an assumption about the properties of the probability distribution underlying the observed random phenomena. The observation results are usually presented as a realization of a finite or infinite set of random variables. The joint distribution of these random variables is not fully known, and the statistical hypothesis assumes that it belongs to a particular class of distributions. The task is then to test the statistical hypothesis against the observed data.
In this paper we take the mean maximum temperatures at Melbourne Airport in 2010 and compare them with the data above to determine whether 2010 was significantly different (significantly unusual) from the average observations or not.
The data for 2010 are given below:
We follow the usual steps to perform hypothesis testing.
- Set the null and alternative hypotheses:
H0: μ1 = μ2
Ha: μ1 ≠ μ2
- Set the level of significance alpha = 0.05
- Perform a two-sample t-test to compare the means:
We put this data in Minitab and obtain the following:
Two-Sample T-Test and CI: 1970-2014; 2010
Two-sample T for 1970-2014 vs 2010
            N   Mean  StDev  SE Mean
1970-2014  12  19,76   4,92      1,4
2010       11  19,68   5,44      1,6
Difference = mu (1970-2014) - mu (2010)
Estimate for difference: 0,08
95% CI for difference: (-4,45; 4,60)
T-Test of difference = 0 (vs not =): T-Value = 0,04 P-Value = 0,972 DF = 20
Since the p-value of the test is 0.972, which is higher than the level of significance, we fail to reject the null hypothesis. The 2010 data are not unusual and do not differ significantly from the average data for 1970-2014 (at the 5% level of significance).
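The two-sample output above can be cross-checked from the summary statistics alone. The sketch below is not the Minitab procedure itself but a hand-rolled Welch (unpooled) t-test in plain Python. The function names are hypothetical. The p-value is obtained by numerically integrating the t density. The degrees of freedom are truncated to an integer on the assumption that this matches the DF = 20 shown in the output.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch (unpooled) t statistic, degrees of freedom and standard error."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density over [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)

# Rounded summary values from the output above (decimal commas as points).
t, df, se = welch_t(19.76, 4.92, 12, 19.68, 5.44, 11)
df_int = math.floor(df)           # assumption: DF is truncated to an integer
p = two_sided_p(t, df_int)

print(f"t = {t:.2f}, df = {df_int}, p = {p:.3f}")
print("reject H0" if p < 0.05 else "fail to reject H0")
```

With the rounded inputs the sketch gives a t-value close to 0.04 on 20 degrees of freedom, in line with the output. The p-value may differ slightly in the third decimal because the means and standard deviations used here are rounded.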