18-01-2018

QUESTION-1

In recent times it has been suggested that university graduates can be divided into two groups: those who graduated with a business degree and those who did not.  Suppose you are working for the Government and have been asked to determine whether this is the case.  Of particular importance is the income of a graduate in their first year of employment. It is believed that higher incomes are more likely to be experienced by students who are not business graduates. Below is a statistical summary of a sample of incomes of non-business and business graduates (measured in \$000s).

Excel Descriptive Statistics Table

| Statistic | Non-Business | Business |
|---|---|---|
| Mean | 81 | 69 |
| Standard Error | 1 | 3 |
| Median | 82 | 60 |
| Mode | 82 | 59 |
| Standard Deviation | 5 | 15 |
| Sample Variance | 29 | 236 |
| Kurtosis | -1 | -1 |
| Skewness | 0 | 0 |
| Range | 17 | 54 |
| Minimum | 72 | 31 |
| Maximum | 89 | 85 |
| Sum | 2420 | 1771 |
| Count | 30 | 30 |
| 1st Quartile | 76 | 51 |
| 3rd Quartile | 84 | 71 |

a) Define the mode. What can be concluded from comparing the modes?
b) Calculate the coefficient of variation of both variables. What can be concluded by comparing these statistics?
c) Compare the distributions by sketching the box-and-whisker plots for each variable side by side.
d) Discuss the distribution of Business Graduates in contrast to that of Non-Business Graduates.
e) It is suggested that the results may be influenced by extreme values.  Using the appropriate measure of central tendency (location), which group typically earns higher incomes?
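
As a rough check on parts b) and c), the coefficient of variation (standard deviation divided by the mean) and the five values needed for each box-and-whisker plot can be computed directly from the summary table. This is only a sketch: the names `SUMMARY`, `coefficient_of_variation` and `five_number_summary` are illustrative, and the figures are the rounded values from the table, so results are approximate.

```python
# Summary statistics copied from the descriptive statistics table (incomes in $000s).
SUMMARY = {
    "Non-Business": {"mean": 81, "sd": 5, "min": 72, "q1": 76,
                     "median": 82, "q3": 84, "max": 89},
    "Business": {"mean": 69, "sd": 15, "min": 31, "q1": 51,
                 "median": 60, "q3": 71, "max": 85},
}


def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


def five_number_summary(s):
    """Minimum, Q1, median, Q3 and maximum: the values a box plot is drawn from."""
    return (s["min"], s["q1"], s["median"], s["q3"], s["max"])


for group, s in SUMMARY.items():
    cv = coefficient_of_variation(s["mean"], s["sd"])
    iqr = s["q3"] - s["q1"]  # width of the box
    print(f"{group}: CV = {cv:.1f}%, "
          f"five-number summary = {five_number_summary(s)}, IQR = {iqr}")
```

A larger coefficient of variation indicates more spread relative to the mean, so the two groups can be compared even though their means differ.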

QUESTION-2
a) You have been asked to verify a claim that the average amount of money spent by visitors to Melbourne Zoo on food and beverages is \$30. A random sample of 50 students indicates that the average amount spent is \$31.50 and the population standard deviation is \$4.65. Using a 0.10 level of significance, conduct a hypothesis test to determine whether the population mean amount spent is more than \$30. Interpret your answer.
b) Calculate the p-value and interpret its meaning.
c) A pharmaceutical company is testing a new drug for side effects. One in three people tested experienced side effects. If the drug is administered to 100 people, find the probability that fewer than 30 people experience side effects.
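
The calculations in this question can be sketched with the Python standard library. The block below assumes an upper-tailed z-test (H0: mean = 30 against H1: mean > 30) because the population standard deviation is given, and treats part c) as a binomial problem with n = 100 and p = 1/3, shown both exactly and with a continuity-corrected normal approximation. All variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# Parts a) and b): upper-tailed z-test, H0: mu = 30 against H1: mu > 30.
mu0, xbar, sigma, n, alpha = 30.0, 31.50, 4.65, 50, 0.10
z_stat = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit = Z.inv_cdf(1 - alpha)               # critical value for alpha = 0.10
p_value = 1 - Z.cdf(z_stat)                 # upper-tail area beyond z_stat
reject_h0 = z_stat > z_crit
print(f"z = {z_stat:.3f}, critical z = {z_crit:.3f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")

# Part c): X ~ Binomial(100, 1/3); "fewer than 30" means P(X <= 29).
trials, p = 100, 1 / 3
exact = sum(comb(trials, k) * p**k * (1 - p)**(trials - k) for k in range(30))

# Normal approximation with continuity correction: P(X <= 29.5).
mean, sd = trials * p, sqrt(trials * p * (1 - p))
approx = Z.cdf((29.5 - mean) / sd)
print(f"P(X < 30): exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The p-value is the probability of a sample mean at least as large as \$31.50 if the true mean were \$30, so a value below 0.10 leads to rejecting the claim at that level of significance.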

QUESTION-3

According to Government statistics, the average length of stay of tourists visiting Melbourne was 2.5 days.  Suppose the standard deviation of the length of stay for all tourists is 1 day.  Assuming length of stay is normally distributed, answer the following (remember to show all workings and interpret your answer):

a) What is the probability that the length of stay will be between 1.5 and 3.5 days?

b) What is the average length of stay for the top 15% of tourists who stay the longest (assume a sample size of 100 was taken)?

c)

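| Item | Price per kg, 2000 | Price per kg, 2005 | Price per kg, 2010 | Quantity (kg), 2000 | Quantity (kg), 2005 | Quantity (kg), 2010 |
|---|---|---|---|---|---|---|
| Sultanas | 3.99 | 4.5 | 4.98 | 20 | 22 | 24 |
| Bananas | 2.99 | 3.98 | 13.49 | 13 | 11 | 10 |
| Rice | 12.85 | 13.99 | 18.99 | 10 | 15 | 12 |
| Lentils | 4.05 | 4.98 | 6.95 | 17 | 18 | 19 |
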
Table 1: Price and Consumption Data of an average Australian Family
Table 1 lists the average price and quantity consumed for a selection of food products. Calculate the Paasche price indexes for 2005 and 2010, using 2000 as the base year. Remember to interpret your answers.
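
The Paasche price index weights prices by current-year quantities: the cost of the current-year basket at current prices is divided by the cost of the same basket at base-year prices, then multiplied by 100. A minimal sketch using the Table 1 figures follows; the dictionary layout and the function name `paasche_price_index` are assumptions made for illustration.

```python
# Prices ($ per kg) and quantities (kg) from Table 1, keyed by year.
PRICES = {
    "Sultanas": {2000: 3.99, 2005: 4.50, 2010: 4.98},
    "Bananas": {2000: 2.99, 2005: 3.98, 2010: 13.49},
    "Rice": {2000: 12.85, 2005: 13.99, 2010: 18.99},
    "Lentils": {2000: 4.05, 2005: 4.98, 2010: 6.95},
}
QUANTITIES = {
    "Sultanas": {2000: 20, 2005: 22, 2010: 24},
    "Bananas": {2000: 13, 2005: 11, 2010: 10},
    "Rice": {2000: 10, 2005: 15, 2010: 12},
    "Lentils": {2000: 17, 2005: 18, 2010: 19},
}


def paasche_price_index(year, base_year=2000):
    """Paasche index: sum(p_t * q_t) / sum(p_0 * q_t) * 100."""
    current_cost = sum(PRICES[item][year] * QUANTITIES[item][year]
                       for item in PRICES)
    base_cost = sum(PRICES[item][base_year] * QUANTITIES[item][year]
                    for item in PRICES)
    return current_cost / base_cost * 100


for year in (2005, 2010):
    print(f"Paasche price index for {year} (2000 = 100): "
          f"{paasche_price_index(year):.2f}")
```

An index of, say, 114 would mean that the current-year basket costs about 14% more at current prices than it would have at 2000 prices.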

QUESTION-4

In recent times many global commentators have suggested that Greece will eventually be forced out of the European Union.  Suppose you have been commissioned by the United Nations to survey the views of the various central bank heads.  Accepting this commission, you ask the heads of central banks belonging to the world’s 20 largest economies:

Is Greece’s involvement in the European Union untenable?

Of the 20 people surveyed, 12 believed Greece’s involvement in the European Union is untenable.

a) Describe the distribution of sample proportions (assuming p=0.60).  (Remember to discuss whether it is normally distributed.)

b) Calculate the 75% confidence interval for the proportion of central bank heads who think Greece’s involvement in the European Union is untenable.

c) Interpret the confidence interval calculated above.
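
The interval in part b) can be sketched with the usual normal-approximation formula, sample proportion plus or minus z times the standard error. The block below assumes that formula applies and uses the common rule of thumb that np and n(1 - p) should both be at least 5 (some textbooks require 10); variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes, confidence = 20, 12, 0.75
p_hat = successes / n  # sample proportion, 0.60

# Rule-of-thumb check for approximate normality of the sample proportion.
# The threshold of 5 is an assumption; some texts use 10.
normal_ok = n * p_hat >= 5 and n * (1 - p_hat) >= 5

std_error = sqrt(p_hat * (1 - p_hat) / n)
# Two-sided interval: leave (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * std_error
lower, upper = p_hat - margin, p_hat + margin

print(f"normal approximation reasonable: {normal_ok}")
print(f"z = {z:.4f}, standard error = {std_error:.4f}")
print(f"75% confidence interval: ({lower:.3f}, {upper:.3f})")
```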

QUESTION-5

A land developer wants to determine how much a piece of undeveloped land is worth.  The variables they use are distance from CBD, land size, distance from nearest shopping centre and the number of trees currently on the property. A regression was estimated as follows:

SUMMARY OUTPUT

| Regression Statistics | Value |
|---|---|
| Multiple R | 0.56396 |
| R Square | 0.31805 |
| Adjusted R Square | 0.30942 |
| Standard Error | 35919.47 |
| Observations | 321 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 4 | 1.9015E+11 | 4.75E+10 | 36.84429 | 2.77E-25 |
| Residual | 316 | 4.0771E+11 | 1.29E+09 | | |
| Total | 320 | 5.9785E+11 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | -23775.40 | 15145.16 | -1.570 | 0.1175 | -53573.5 | 6022.69 |
| Number of trees | -347.92 | 64.72 | -5.376 | 0.0000 | -475.25 | -220.59 |
| Land area (size) | 0.092 | 0.052 | 1.768 | 0.0780 | -0.010 | 0.195 |
| Distance from CBD | 16409.51 | 2562.04 | 6.405 | 0.0000 | 11368.70 | 21450.33 |
| Distance from nearest shopping centre | 17386.98 | 6397.81 | 2.718 | 0.0069 | 4799.30 | 29974.68 |

a) Write down the regression model.

b) Interpret the intercept and the coefficients, remembering to comment on the direction and significance of each value.

c) Does the model explain the variation in land value well?

d) When applying the linear regression technique, certain assumptions are made regarding the residuals. List each assumption and briefly explain what it means.
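
The goodness-of-fit figures in the regression output can be recomputed from the ANOVA table, which helps when answering part c). The sketch below uses the rounded values printed in the output, so small differences in the last digit are expected; the 5% significance threshold and all names are assumptions for illustration.

```python
# Figures copied from the regression output (rounded as printed there).
n_obs, k = 321, 4  # observations and number of explanatory variables
ss_regression, ss_residual, ss_total = 1.9015e11, 4.0771e11, 5.9785e11

r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k - 1)
f_stat = (ss_regression / k) / (ss_residual / (n_obs - k - 1))
print(f"R^2 = {r_squared:.4f}, adjusted R^2 = {adj_r_squared:.4f}, F = {f_stat:.2f}")

# (coefficient, standard error, p-value) for each term in the output.
COEFFICIENTS = {
    "Intercept": (-23775.40, 15145.16, 0.1175),
    "Number of trees": (-347.92, 64.72, 0.0000),
    "Land area (size)": (0.092, 0.052, 0.0780),
    "Distance from CBD": (16409.51, 2562.04, 0.0000),
    "Distance from nearest shopping centre": (17386.98, 6397.81, 0.0069),
}

ALPHA = 0.05  # assumed significance level
significant = {}
for name, (coef, se, p_value) in COEFFICIENTS.items():
    t_stat = coef / se  # t statistic is the coefficient over its standard error
    significant[name] = p_value < ALPHA
    direction = "positive" if coef > 0 else "negative"
    print(f"{name}: {direction}, t = {t_stat:.3f}, "
          f"significant at 5%: {significant[name]}")
```

An R Square of about 0.32 means that roughly 32% of the variation in land value is explained by the four variables together.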

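The mode is defined as the value that occurs most frequently in the data set. The mode for non-business graduates is 82, while for business graduates it is only 59, so the most common income (in \$000s) is higher among non-business graduates.

The p-value can be calculated from the z table. The test statistic is z = (31.50 - 30)/(4.65/√50) ≈ 2.28, and the upper-tail area corresponding to z = 2.28 is 0.0113. This means that if the population mean were \$30, there would be only about a 1.1% chance of observing a sample mean of \$31.50 or more. Because 0.0113 is less than the 0.10 level of significance, the null hypothesis is rejected and the mean amount spent appears to be more than \$30.

The coefficient of number of trees is negative, which means that more trees decrease the value of the land. A positive coefficient means that the larger the variable, the higher the land value. The land area coefficient (0.092) is very small compared with the distance coefficients and is not significant at the 5% level (p = 0.078), so land area does not have much measurable effect on the value of the land. Distance from the CBD and distance from the nearest shopping centre both have large positive coefficients and are significant (p < 0.01). Because the size of a coefficient depends on the units of its variable, significance should be judged from the p-values rather than from the coefficient values alone.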

