In this paper we use basic tools of statistics and probability theory to investigate whether it is reasonable to install red light cameras in our town. Data are available from neighboring towns in the form of weekly accident rates at various intersections before and after red light cameras were installed. We perform statistical hypothesis testing on this data set.
A sample of 50 observations was collected from neighboring towns. The data are given in the following table:
To decide whether it is reasonable to install red light cameras, we must be sure that, on average, the number of accidents per week fell after cameras were set up in the neighboring towns. We therefore test the following claim: in the neighboring towns, accident rates decreased significantly after red light cameras were installed.
We formulate the null and alternative hypotheses:
Null hypothesis: the mean accident rate is the same before and after the cameras were installed.
Alternative hypothesis: the mean accident rate is lower after the red light cameras were installed.
We set the level of significance alpha at the most common value, 0.05.
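In symbols, the hypotheses and the test statistic can be written as below. The notation is an assumption introduced here for illustration and does not appear in the original data description: d_i is the difference (before minus after) for the i-th pair, \mu_d is the population mean of these differences, and \bar{d} and s_d are the sample mean and sample standard deviation of the n differences.

```latex
\[
  H_0:\ \mu_d = 0
  \qquad \text{versus} \qquad
  H_1:\ \mu_d > 0,
  \qquad \alpha = 0.05,
\]
\[
  d_i = x_i^{\text{before}} - x_i^{\text{after}},
  \qquad
  t = \frac{\bar{d}}{s_d / \sqrt{n}},
  \qquad
  \text{degrees of freedom} = n - 1 .
\]
```

With this sign convention, a decrease in accidents gives positive differences, so the rejection region lies in the upper tail.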
Next we choose an appropriate test for this analysis. The most suitable test for this claim is the paired t-test. The paired two-sample t-test for means tests a hypothesis about the difference between the averages of two data samples. It does not assume that the populations from which the data are drawn have equal variances. It is used when the samples are naturally paired, for example when the same population is measured twice (before and after treatment, at the beginning and end of an experiment, etc.).
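The computation behind a paired t-test can be sketched in a few lines of Python. This is only a minimal illustration using the standard library. The function name paired_t_statistic and the five pairs of counts are hypothetical and are not the 50 observations used in this paper.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for the paired differences before - after."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Work on the within-pair differences; a decrease gives positive values.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are needed")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical weekly accident counts at five intersections (NOT the paper's data).
before = [6, 8, 5, 7, 9]
after = [4, 5, 4, 6, 5]

t, df = paired_t_statistic(before, after)
print(f"t = {t:.3f}, degrees of freedom = {df}")
```

For a one-tailed test, the null hypothesis is rejected when t exceeds the upper critical value of the t-distribution with the returned degrees of freedom. If SciPy is available, scipy.stats.ttest_rel with alternative="greater" returns the statistic together with the one-tailed p-value.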
Applying this test requires the original data to be normally distributed; for the paired test, this means the within-pair differences. For a two-sample test on independent samples, the variances must also be equal, although there is an alternative t-test for the case of unequal variances. Normality is essential for the t-test to be exact. However, even if the data are not normally distributed, the t-statistic can still be used. In many cases this statistic is asymptotically standard normal, N(0,1), so quantiles of that distribution can be used. Often, however, even in this case the quantile is taken not from the standard normal distribution but from the corresponding t-distribution, as in the exact t-test. The two are asymptotically equivalent, but for small samples the confidence intervals based on the Student distribution are wider and more reliable.
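The gap between the two quantiles can be illustrated numerically. The sketch below computes the one-tailed 5% quantile of the standard normal distribution with Python's standard library and compares it with 1.676551, the value reported in the Excel results of this paper. Treating that value as the one-tailed t critical value for 49 degrees of freedom is an assumption made here, based on the sample size of 50.

```python
from statistics import NormalDist

alpha = 0.05

# One-tailed critical value from the standard normal distribution N(0, 1).
z_crit = NormalDist().inv_cdf(1 - alpha)

# Value reported in the paper's Excel output; assumed here to be the
# one-tailed t critical value for 49 degrees of freedom (50 pairs).
t_crit_df49 = 1.676551

# Relative amount by which the t-based cutoff exceeds the normal one.
extra_width = t_crit_df49 / z_crit - 1

print(f"normal quantile: {z_crit:.4f}")
print(f"t quantile (df = 49, assumed): {t_crit_df49:.4f}")
print(f"t cutoff is larger by {extra_width:.1%}")
```

With 50 pairs the difference is small, which is consistent with the asymptotic equivalence described above; for much smaller samples the t quantile is noticeably larger.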
We use the MS Excel statistics tools to perform a paired t-test on these data:
Note that our test is one-tailed, because we are checking that the accident rates decreased (not just that they “are different”). The results of the test are: the reported t-value is 1.676551, which matches the one-tailed critical value at the 5% level for 49 degrees of freedom, and the p-value is 8.74E-14. Since the p-value is less than the significance level alpha, we reject the null hypothesis. We have enough evidence to say that accident rates decreased after the red light cameras were installed (at the 5% level of significance). We therefore recommend installing red light cameras in our town. Based on this evidence, they can be expected to help reduce the number of crashes on our roads.