Question 1.

A group of 44 students was asked to guess, to the nearest metre, the width of the lecture hall in which they were sitting. The true width of the hall was 13.1 metres. The students’ guesses are reported in the first Excel spreadsheet in the file “Coursework dataset”. Answer the following questions:

a) Create a frequency table of the values of the guesses. Include relative frequencies (percentages) and cumulative percentages in the table. Comment on your findings.

b) Calculate the sample mean, median and standard deviation of the guesses. Comment on the characteristics of the sample of guesses.

c) Compute the sample skewness and kurtosis of the guesses and comment on what these statistics tell you about the characteristics of the sample.
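
As an illustration of the calculations asked for in Question 1, the following is a minimal Python sketch rather than a model answer. The list `guesses` is made-up data, not the coursework sample, and the helper names are hypothetical. The skewness and kurtosis functions use the plain moment-based definitions without a small-sample correction, so spreadsheet functions that apply such a correction may return slightly different values.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical guesses in metres (NOT the coursework data).
guesses = [10, 12, 12, 13, 15, 15, 15, 18, 20, 25]


def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value],
                     100 * counts[value] / n, 100 * running / n))
    return rows


def central_moment(values, k):
    """k-th central moment, dividing by n."""
    m = mean(values)
    return sum((x - m) ** k for x in values) / len(values)


def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (no bias correction)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis m4 / m2**2 - 3 (no bias correction)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


for row in frequency_table(guesses):
    print("%5d %5d %7.1f%% %7.1f%%" % row)
print("mean:", mean(guesses))
print("median:", median(guesses))
print("sample sd:", stdev(guesses))  # divides by n - 1
print("skewness:", skewness(guesses))
print("excess kurtosis:", excess_kurtosis(guesses))
```

A positive skewness indicates a longer right tail, and a positive excess kurtosis indicates heavier tails than a normal distribution.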

Question 2.

The second spreadsheet of the “Coursework dataset” file contains a sample of wealth data for the UK population. The data is presented in a grouped format. Please answer the following questions regarding this sample:

a) Discuss what would be the best way to present the data graphically.

b) Construct a histogram of wealth and comment on wealth distribution.

c) Compute the relevant measures of central tendency (hint: you’ll need to make an assumption regarding the size of the final class) and a measure of variation for this data.

d) Discuss and justify which of the measures of central tendency above would be most appropriate to describe the data.

e) Compute the first and third quartiles of wealth and construct a boxplot. Comment on your results.

f) Compute the sample skewness and kurtosis and comment on the features of the sample distribution. Is it symmetric?

g) Calculate the proportion of the population with wealth above £100,000.

h) Calculate the probability that an individual selected at random will have wealth of no more than £50,000.
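
The grouped-data calculations in Question 2 can be sketched along the following lines. This is only an illustration under stated assumptions: the class boundaries and frequencies in `classes` are invented, the upper bound of the final class (500) is an arbitrary assumption of the kind the hint in part c) refers to, and the function names are hypothetical. The mean and standard deviation treat every observation as sitting at its class midpoint, and quantiles and shares assume that values are spread evenly within each class.

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency),
# wealth in thousands of pounds. NOT the coursework data.
# The upper bound of the final class (500) is an assumption.
classes = [(0, 50, 40), (50, 100, 30), (100, 200, 20), (200, 500, 10)]


def total(classes):
    """Total number of observations."""
    return sum(f for _, _, f in classes)


def grouped_mean(classes):
    """Mean using class midpoints as representative values."""
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total(classes)


def grouped_sd(classes):
    """Sample standard deviation (n - 1 divisor) using class midpoints."""
    m = grouped_mean(classes)
    ss = sum(f * ((lo + hi) / 2 - m) ** 2 for lo, hi, f in classes)
    return (ss / (total(classes) - 1)) ** 0.5


def grouped_quantile(classes, p):
    """Quantile by linear interpolation inside the class that contains it."""
    target = p * total(classes)
    cumulative = 0
    for lo, hi, f in classes:
        if f > 0 and cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    return classes[-1][1]


def share_at_most(classes, x):
    """Estimated proportion of observations with a value of at most x."""
    count = 0.0
    for lo, hi, f in classes:
        if x >= hi:
            count += f
        elif x > lo:
            count += f * (x - lo) / (hi - lo)
    return count / total(classes)


print("mean:", grouped_mean(classes))
print("sd:", grouped_sd(classes))
print("Q1, median, Q3:",
      [grouped_quantile(classes, p) for p in (0.25, 0.5, 0.75)])
print("share above 100:", 1 - share_at_most(classes, 100))
print("share at most 50:", share_at_most(classes, 50))
```

Changing the assumed upper bound of the final class shifts the mean and standard deviation but leaves the median and quartiles unchanged as long as they fall in earlier classes, which is worth keeping in mind when comparing measures of central tendency.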

Question 3.

The third spreadsheet of the “Coursework dataset” file contains data on the salaries of employees in a bank, as well as some other variables which might have an impact on employees’ salaries. Answer all the following questions:

a) Draw a histogram of SALARY. What does it tell you?

b) For each variable in the data set, explain which type of variable it is (categorical, numerical, etc.).

c) Calculate the sample mean, median, standard deviation, and first and third quartiles of all the variables in columns B to F. Also, draw boxplots of the variables side by side. What can you infer?

d) Draw a scatterplot with SALARY on the horizontal (X) axis and one of the other variables on the vertical (Y) axis. Do this for all the variables in columns B to G. Comment on your findings.

e) Prepare a summary table with the correlations between the variables and discuss the extent of these correlations.
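
For the correlation table in part e), one possible approach is sketched below in Python. The values are invented, and the column names `EDUC` and `EXPER` are hypothetical placeholders (only SALARY is named in the question), so the output says nothing about the real data. The function computes the Pearson correlation coefficient directly from its definition.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data; EDUC and EXPER are placeholder column names.
data = {
    "SALARY": [30, 35, 42, 50, 61],
    "EDUC": [12, 12, 14, 16, 18],
    "EXPER": [10, 3, 8, 5, 7],
}

names = list(data)
# Nested dictionary: table[a][b] is the correlation between columns a and b.
table = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Print the matrix with one row and one column per variable.
print(" " * 8 + "".join("%8s" % name for name in names))
for a in names:
    print("%-8s" % a + "".join("%8.2f" % table[a][b] for b in names))
```

The matrix is symmetric with ones on the diagonal, so only the entries above or below the diagonal need discussing.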

