## Question 1

The quality-control manager at a memory card manufacturing company needs to determine whether the mean data-transfer rate of a large shipment of cards is equal to 10 MB/s. The population standard deviation is 0.8 MB/s. A random sample of 64 cards indicates a sample mean data-transfer rate of 9.8 MB/s.

a) At the 0.05 level of significance, is there evidence that the mean data transfer rate is different from 10 MB/s?

| Quantity | Symbol | Value | Unit |
|---|---|---|---|
| Population standard deviation | σ | 0.8 | MB/s |
| Sample mean | x̄ | 9.8 | MB/s |
| Sample size | n | 64 | cards |
| Significance level | α | 0.05 | |

First we formulate the hypotheses. The null hypothesis is that the population mean data-transfer rate is 10 MB/s, and the alternative hypothesis is that it is different. This is a two-tailed test.

H0: μ = 10 MB/s

Ha: μ ≠ 10 MB/s

Since the population standard deviation is known and the sample size is large, we can use a z-test.

Calculations:

| Quantity | Symbol | Value |
|---|---|---|
| Standard error | σ/√n | 0.10 |
| Upper rejection point | z0.025 | 1.96 |
| Lower rejection point | −z0.025 | −1.96 |
| z-statistic | z | −2.00 |

The critical values for a two-tailed test at the 0.05 level of significance are ±1.96. The z-statistic is z = (x̄ − μ0) / (σ/√n) = (9.8 − 10) / 0.10 = −2.00. Because it lies beyond the lower critical value of −1.96, we reject the null hypothesis that the mean data-transfer rate equals 10 MB/s at the 0.05 level of significance. Put differently, the sample provides evidence at the 5% level that the mean data-transfer rate is different from 10 MB/s.
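The arithmetic in part a) can be checked with a short Python sketch. It uses only the figures stated in the question. The variable names are illustrative, and `statistics.NormalDist` from the standard library stands in for the Excel functions the solution relied on.

```python
from math import sqrt
from statistics import NormalDist

# Givens from the question
mu0 = 10.0     # hypothesized mean, MB/s
sigma = 0.8    # known population standard deviation, MB/s
x_bar = 9.8    # sample mean, MB/s
n = 64         # sample size
alpha = 0.05   # significance level

std_error = sigma / sqrt(n)                   # standard error of the mean
z = (x_bar - mu0) / std_error                 # z-statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper critical value, two-tailed

reject_h0 = abs(z) > z_crit
print(f"SE = {std_error:.2f}, z = {z:.2f}, critical = ±{z_crit:.2f}, reject H0: {reject_h0}")
```

Comparing |z| with the upper critical value covers both rejection regions of the two-tailed test in one step.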

b) Calculate the P-value and interpret its meaning.

In Excel, we get a p-value of 0.0455. This is the probability, if the null hypothesis is true, of obtaining a z-statistic at least as extreme as −2.00 in either direction, that is, the combined area in both tails (0.0228 in each). It is also the lowest level of significance at which we can reject the null hypothesis. Since the p-value is lower than our required significance level (0.0455 < 0.05), we can say that at the 5% significance level the data provide sufficient evidence to conclude that the mean data-transfer rate differs from 10 MB/s.
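The two-tailed p-value can also be reproduced outside Excel. The sketch below uses Python's standard library and takes the z-statistic of −2.00 from part a) as its input. The variable names are illustrative.

```python
from statistics import NormalDist

z = -2.00     # z-statistic from part a)
alpha = 0.05  # significance level

# Area in one tail beyond |z|, then doubled for a two-tailed test
one_tail = NormalDist().cdf(-abs(z))
p_value = 2 * one_tail

print(f"one tail = {one_tail:.4f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```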
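
c) Construct a 95% confidence interval estimate of the population mean data-transfer rate of the memory cards.

The interval is x̄ ± z0.025 × σ/√n = 9.8 ± 1.96 × 0.10, which runs from 9.604 to 9.996 MB/s (9.60 to 10.00 when rounded to two decimals). The hypothesized mean of 10 MB/s lies just outside the upper limit, so the interval leads to the same decision as in a): we reject the null hypothesis.

d) Compare the results of a) and c). What conclusions do you reach?

The results of a) and c) agree, as they must: a two-tailed z-test at the 0.05 level of significance rejects the null hypothesis exactly when the hypothesized mean falls outside the 95% confidence interval. Here 10 MB/s is just outside the interval, which matches a z-statistic (−2.00) just beyond the critical value (−1.96). The comparison also shows what the 5% level of significance means: in repeated sampling, about 5 out of 100 such intervals would fail to contain the true population mean.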
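The interval in part c) can be checked numerically. This is a minimal sketch using Python's standard library and the figures from the question. Carrying three decimals shows where the hypothesized 10 MB/s falls relative to the upper limit.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 9.8, 0.8, 64  # sample mean, population SD, sample size
confidence = 0.95
mu0 = 10.0                      # hypothesized mean, MB/s

z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * sigma / sqrt(n)  # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"95% CI: ({lower:.3f}, {upper:.3f}) MB/s")
print("10 MB/s inside interval:", lower <= mu0 <= upper)
```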

## Question 2

The operations manager at a light bulb company wants to determine whether there is any difference in the mean life expectancy of LED bulbs manufactured on two different types of machines. The population standard deviation of machine (1) is 29,300 hours and of machine (2) is 34,500 hours. A random sample of 25 light bulbs from machine (1) indicates a sample mean of 100,000 hours, and a similar sample of 25 from machine (2) indicates a sample mean of 95,000 hours.

a) Using the 0.05 level of significance, is there any evidence of a difference in the mean life of bulbs?

Given:

| Quantity | Symbol | Machine 1 | Machine 2 | Unit |
|---|---|---|---|---|
| Population standard deviation | σ | 29,300 | 34,500 | hours |
| Sample size | n | 25 | 25 | bulbs |
| Sample mean | x̄ | 100,000 | 95,000 | hours |

Significance level: α = 0.05

Here we have two independent random samples taken from two populations with known standard deviations of 29,300 hours and 34,500 hours, respectively. Hence, we will use a test concerning the difference between two means with a z-statistic, calculated by the formula:

z = (x̄1 − x̄2) / √(σ1²/n1 + σ2²/n2)

where the hypothesized difference μ1 − μ2 is zero under the null hypothesis.

We want to know whether there is any evidence of a difference in the mean life of bulbs at the 0.05 level of significance. Our hypotheses are:

H0: μ1 = μ2

Ha: μ1 ≠ μ2

As in the previous question, the critical values for a two-tailed test at the 0.05 level of significance are ±1.96. The calculated z-statistic for the difference between the two means is 0.55, which lies between them. Therefore, we cannot reject the null hypothesis at the 0.05 level of significance. In other words, the data do not provide sufficient evidence that the mean life of bulbs from the two machines is different.
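The z-statistic of 0.55 can be reproduced with a few lines of Python. This sketch uses the sample means listed in the givens (100,000 and 95,000 hours). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Givens for the two machines
sigma1, sigma2 = 29_300, 34_500   # population standard deviations, hours
n1, n2 = 25, 25                   # sample sizes
x_bar1, x_bar2 = 100_000, 95_000  # sample means, hours
alpha = 0.05

# Standard error of the difference between two independent sample means
std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = (x_bar1 - x_bar2) / std_error
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

print(f"SE = {std_error:.1f} hours, z = {z:.2f}, reject H0: {abs(z) > z_crit}")
```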

b) Calculate the P-value in a) and interpret its meaning.

The two-tailed p-value for z = 0.55 is approximately 0.58, the combined probability in both tails beyond ±0.55 (about 0.29 in each). Because the p-value is far greater than the 0.05 level of significance, we cannot reject the null hypothesis, reaching the same conclusion as in a).

The p-value indicates the smallest level of significance at which the null hypothesis can be rejected. In this case, the p-value is so large that the samples give no evidence of a difference in the mean life of bulbs from the two machines. This is not proof that the means are equal, only that a difference of this size is well within what sampling variation alone would produce.
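As a check on part b), the two-tailed p-value is twice the upper-tail area beyond |z|. Being a probability, it must lie between 0 and 1. The sketch below uses Python's standard library and the rounded z-statistic from part a).

```python
from statistics import NormalDist

z = 0.55      # z-statistic from part a), rounded
alpha = 0.05  # significance level

# Twice the area to the right of |z| under the standard normal curve
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"p-value = {p_value:.2f}, reject H0: {p_value < alpha}")
```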

## Question 3

Classified ads in the Courier Mail offered several used Toyota Corollas for sale.

a) Assuming a linear relationship between age and advertised price, use the least-squares method to find the regression coefficients, b0 and b1.

Linear regression allows us to estimate the value of one variable based on another variable. The least-squares method fits the line that minimizes the sum of squared errors between the actual observations and the predicted values of the variable.

Inputting the data in Excel allows us to run a regression with the Toyota Corolla's advertised price as a function of the car's age in years. The output from Excel indicates that b0, the intercept coefficient, is equal to $14,286 and b1, the slope coefficient, is equal to -$959.
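For readers who want to see what the least-squares fit computes, here is a minimal sketch of the closed-form solution for simple linear regression. The function name `least_squares` is hypothetical. The points in the usage line are synthetic, chosen to lie exactly on a known line; they are not the Courier Mail data, which is not reproduced in this write-up.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the fitted line y = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations in x
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = y_mean - b1 * x_mean    # intercept
    return b0, b1

# Synthetic check only (NOT the advertised-price data): points on y = 10 - 2x
b0, b1 = least_squares([0, 1, 2, 3], [10, 8, 6, 4])
print(b0, b1)
```

Applied to the age and price columns of the ads, the same function should give the intercept and slope that Excel reports.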

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.971767731 |
| R Square | 0.944332523 |
| Adjusted R Square | 0.94005041 |
| Standard Error | 816.2135371 |
| Observations | 15 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 1.47E+08 | 1.47E+08 | 220.5295 | 1.56E-09 |
| Residual | 13 | 8660659 | 666204.5 | | |
| Total | 14 | 1.56E+08 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 14285.94825 | 448.6727 | 31.84047 | 0.000000 | 13316.65 | 15255.25 |
| X Variable 1 | -959.0459098 | 64.58119 | -14.8502 | 0.000000 | -1098.57 | -819.527 |

A simpler way to get the regression equation and visualize it at the same time is to have Excel construct a scatter plot with a linear trendline.

b) Interpret the meaning of b0 and b1 in this problem.

The intercept, b0, is the expected value of the dependent variable (the car's advertised price) when the independent variable (the car's age) is zero. This means that the expected price of a new Corolla is $14,286 based on our regression. The slope coefficient, b1, is the expected change in the dependent variable for a one-unit change in the independent variable. Based on our regression, each additional year of age lowers the expected advertised price of a Toyota Corolla by about $959.

c) Predict the value of a Corolla if it is 9 years old.

The estimated regression equation is:

y = $14,286 − $959x

where y is the advertised price of a Toyota Corolla and x is its age in years.

Plugging in x = 9 gives y = $14,286 − $959 × 9 = $5,655, the estimated price of a 9-year-old Toyota Corolla.
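The prediction can be verified directly from the unrounded coefficients in the Excel output. This is a small sketch; the function name `predict_price` is hypothetical.

```python
# Coefficients as reported in the Excel regression output
b0 = 14285.94825   # intercept, dollars
b1 = -959.0459098  # slope, dollars per year of age

def predict_price(age_years):
    """Predicted advertised price from the fitted line y = b0 + b1 * x."""
    return b0 + b1 * age_years

price_9 = predict_price(9)
print(f"Predicted price at 9 years: ${price_9:,.0f}")
```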

d) What is the value of R² and what does it mean in this context?

R² indicates what proportion of the variability in the dependent variable is explained by the regression. It tells us how well our model fits the observed data. Our regression has an R² of 0.9443. This indicates that 94.43% of the variability in the advertised price of a Toyota Corolla is explained by its age. Hence, we can be reasonably confident that our model will come up with a price that is close to the real price advertised.
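The reported R² can be cross-checked against other figures in the Excel output in two ways. In the sketch below, the regression sum of squares is recovered as F × MS(residual), because the output prints it only to three significant figures (1.47E+08). That reconstruction is a choice made for this sketch and is not part of the Excel output.

```python
multiple_r = 0.971767731  # "Multiple R" from the regression statistics
ss_residual = 8660659     # residual sum of squares from the ANOVA table
ms_residual = 666204.5    # residual mean square
f_stat = 220.5295         # F statistic (regression df = 1)

# In simple linear regression, R^2 is the square of the correlation coefficient
r2_from_r = multiple_r ** 2

# Equivalent view: share of total variation explained by the regression
ss_regression = f_stat * ms_residual  # since F = MS_regression / MS_residual, df = 1
r2_from_ss = ss_regression / (ss_regression + ss_residual)

print(f"R^2 from Multiple R:      {r2_from_r:.4f}")
print(f"R^2 from sums of squares: {r2_from_ss:.4f}")
```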