Division of Mathematical Statistics
EXAM IN SF1901 PROBABILITY AND STATISTICS, TUESDAY MARCH 14 2017, 08.00–13.00.
Examiner: Thomas Önskog, 08 – 790 84 55.
Permitted aids: Formel- och tabellsamling i Matematisk statistik (collection of formulas and tables), Mathematics Handbook (Beta), calculator guide, calculator.
Problem 1
Consider a binary repetition code: bits X1, X2, ... are transmitted from a source with equal probability for 0 and 1, and pass through a communications channel with error probability p < 1/2 (with probability p a 1 is changed to a 0 and vice versa). To correct for any errors that might arise, every bit Xi is repeated an odd number of times N (a zero is transmitted as N zeros, a one as N ones) and a majority vote decides how the received sequence is interpreted (decoded): if N = 3, the sequence 001 is interpreted as 0, 011 as 1, and so on. Errors arise independently of each other and independently of what is being transmitted.
a) Compute the probability that a sequence is decoded erroneously, that is, that a transmitted 1 is interpreted as a 0 or the reverse, in the case N = 3. (4 p)
b) Compute the probability that it was a 1 that was transmitted if you decode the three bits as a 1. (4 p)
c) Find an expression for the probability that a received sequence is decoded erroneously for an arbitrary odd N > 3. (2 p)
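The majority-vote rule described in the problem can be sketched numerically as follows. This is a minimal illustration, assuming that a block is decoded erroneously exactly when more than half of its N bits are flipped; the value p = 0.1 and the function names are hypothetical and not part of the problem.

```python
from math import comb


def decode_majority(bits):
    """Decode a received block (odd length) by majority vote."""
    return 1 if sum(bits) > len(bits) / 2 else 0


def block_error_probability(n, p):
    """Probability that more than half of n independent bits are flipped."""
    if n % 2 == 0:
        raise ValueError("n must be odd for a majority vote to be decisive")
    first_majority = (n + 1) // 2  # smallest number of flips that changes the vote
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(first_majority, n + 1)
    )


p = 0.1  # hypothetical channel error probability, chosen only for illustration
print(decode_majority([0, 0, 1]))  # the sequence 001 from the problem text
print(decode_majority([0, 1, 1]))  # the sequence 011 from the problem text
print(block_error_probability(3, p))
```

Because the channel treats zeros and ones symmetrically, the same sum applies whether a 0 or a 1 was transmitted.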
Problem 2
The faces of an icosahedron are numbered 0, 1, ..., 9. You suspect that the icosahedron is not fair (the outcomes of a roll are not equally likely) and therefore want to investigate the probability p that a 9 comes up in a single roll. Find the maximum likelihood estimate of p in the following two cases:
a) 20 independent rolls, in which a 9 is obtained in five rolls. (5 p)
b) Independent rolls are made until 9 has come up four times. This happened in rolls 3, 9, 15 and 18. (5 p)
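The two sampling schemes can be compared with a small numerical sketch. It assumes that in both cases the likelihood is proportional to p^s (1 - p)^f, where s is the number of rolls showing 9 and f the number of other rolls (the combinatorial factor does not depend on p and is dropped). The grid search and the function names are illustrative choices, not a prescribed method.

```python
from math import log


def log_likelihood(p, successes, failures):
    """Log-likelihood up to an additive constant that does not depend on p."""
    return successes * log(p) + failures * log(1 - p)


def grid_argmax(successes, failures, steps=9999):
    """Maximize the log-likelihood over an evenly spaced grid inside (0, 1)."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(grid, key=lambda p: log_likelihood(p, successes, failures))


# Case a): 20 rolls, five of which show a 9.
p_hat_a = grid_argmax(successes=5, failures=15)

# Case b): rolling stops at roll 18, when the fourth 9 appears.
p_hat_b = grid_argmax(successes=4, failures=14)

print(p_hat_a, p_hat_b)
```

The grid maximizer only approximates the estimate to the grid resolution; an exact answer follows from differentiating the log-likelihood.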
Problem 3
A scale is subject not only to measurement error but also to random interference. The result of weighing an object with weight µ can therefore be described by a random variable X such that
X = weight + measurement error + interference = µ + ε + δ,
where the measurement error ε is N(0, σ) and the interference term δ is N(µ_δ, σ_δ) and independent of ε. When weighing in an environment with no interference the result is µ + ε. To assess the impact of the interference, five measurements are made on objects with weights µ1, ..., µ5, both in an environment with no interference and in an environment where the interference affects the outcome. The results were as follows:
Interference      48.47  51.39  46.87  45.52  53.87
No interference   47.85  52.07  47.47  47.50  55.10
All measurements can be viewed as outcomes of independent random variables.
a) Construct a 95% confidence interval for µ_δ. (7 p)
b) Test the null hypothesis H0 : µ_δ = 0 against the alternative hypothesis H1 : µ_δ > 0 at the 5% level. Conclusions regarding H0 should be stated and thoroughly justified. (3 p)
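One way to explore the data is to treat the two rows as paired samples, since each object is weighed once in each environment and its unknown weight cancels in the difference. The sketch below assumes this pairing and a t-based interval; the quantile 2.776 is the 0.975 quantile of the t distribution with 4 degrees of freedom taken from a standard table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

interference = [48.47, 51.39, 46.87, 45.52, 53.87]
no_interference = [47.85, 52.07, 47.47, 47.50, 55.10]

# Pairwise differences: the weight of each object cancels, leaving the
# interference term plus the difference of two measurement errors.
differences = [x - y for x, y in zip(interference, no_interference)]

n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation (divides by n - 1)

T_QUANTILE = 2.776  # t table value for 4 degrees of freedom, two-sided 95%
half_width = T_QUANTILE * s_d / sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)

# Test statistic for H0: mean interference equals 0.
t_stat = d_bar / (s_d / sqrt(n))

print(differences)
print(d_bar, s_d, ci, t_stat)
```

For the one-sided test in b), the statistic would be compared with the one-sided table value for 4 degrees of freedom rather than with 2.776.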
Problem 4
In a large study, scientists want to investigate whether climate can have an effect on the occurrence of asthma. Two large cities, A and B, with different climates but otherwise similar populations (age, ethnicity etc.) were chosen. In total, 200 000 people in city A and 100 000 in city B were tested. The number of people with asthma was 13 800 in A and 8 400 in B; the total populations of the two cities were ten million in A and five million in B.
a) Construct a confidence interval with approximate confidence level 99% for the difference in incidence of asthma in the two cities. State and motivate any approximations you make. (7 p)
b) Conduct a hypothesis test at the approximate level 1% to check if climate has a statistically significant effect on the incidence of asthma. Please state your hypotheses and conclusions clearly. (3 p)
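The quantities involved can be sketched numerically as follows. The sketch assumes that the number of asthma cases in each sample is approximately binomial (each sample is a small fraction of its city's population) and that the difference of the sample proportions is approximately normal; these are assumptions of the sketch, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_a, cases_a = 200_000, 13_800
n_b, cases_b = 100_000, 8_400

p_a = cases_a / n_a
p_b = cases_b / n_b
diff = p_a - p_b

# Standard error of the difference under the normal approximation.
se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

# Two-sided 99% interval: use the 0.995 quantile of the standard normal.
z = NormalDist().inv_cdf(0.995)
ci = (diff - z * se, diff + z * se)

print(p_a, p_b, diff)
print(se, z, ci)
```

A test at the approximate 1% level can be read off the same interval by checking whether it contains 0.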
Problem 5
In American football a “fumble” occurs when a player drops the ball during a “play”; a game contains roughly 60 to 70 “plays” per team. The following table gives the number of fumbles in 55 games (recorded per team).
2 1 2 2 3 1 3 4 3 4 5 5 2 1 3 2 5 2 4 1 2 2
1 0 4 2 4 1 2 0 2 0 3 0 1 2 0 1 2 2 3 5 1 3
2 3 4 5 4 3 6 0 3 1 2 1 2 2 1 2 1 3 2 4 2 4
4 2 0 5 4 3 6 5 3 5 1 3 1 1 3 1 4 3 1 5 1 2
1 3 4 4 4 2 7 4 2 5 3 1 3 6 2 1 1 4 1 2 3 0
The data can be summarized by the following frequency table:
Number of fumbles 0 1 2 3 4 5 6 7
Number of occurrences 8 24 27 20 17 10 3 1
The observed average number of fumbles in a game (per team) was 2.55. To simplify matters, the observations can be viewed as outcomes of independent and identically distributed random variables.
Moreover, it can be assumed that the probability of a fumble in a “play” is constant.
a) Propose a suitable statistical model, with only one parameter, for the number of fumbles in a game for a team. Please motivate your choice. Hint: Asymptotic results for the binomial distribution may be useful. (2 p)
b) Conduct a statistical test at the 5% level of how well the proposed model fits the observed data. Please state your hypotheses and conclusions clearly. (8 p)
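A goodness-of-fit computation on the frequency table might be sketched as below. The sketch assumes that the one-parameter model is a Poisson distribution with its mean estimated by the sample average, which is one natural reading of the hint in a), and that the classes 6 and 7 are pooled so that every expected count is about 5 or more. Both the pooling choice and the names are assumptions of the sketch; the critical value 11.07 is the 5% point of the chi-square distribution with 5 degrees of freedom from a standard table.

```python
from math import exp, factorial

observed = {0: 8, 1: 24, 2: 27, 3: 20, 4: 17, 5: 10, 6: 3, 7: 1}

n = sum(observed.values())
lam = sum(k * count for k, count in observed.items()) / n  # sample mean


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


POOL_FROM = 6  # assumption: pool "6 or more" into one class

# Observed counts per class, with the last class pooled.
obs = [observed[k] for k in range(POOL_FROM)]
obs.append(sum(count for k, count in observed.items() if k >= POOL_FROM))

# Expected counts under the fitted Poisson model; the last class gets
# the remaining tail probability.
probs = [poisson_pmf(k, lam) for k in range(POOL_FROM)]
probs.append(1 - sum(probs))
expected = [n * q for q in probs]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, expected))
dof = len(obs) - 1 - 1  # classes minus 1, minus one estimated parameter

CRITICAL_VALUE = 11.07  # chi-square table, 5 degrees of freedom, 5% level
print(lam, expected)
print(chi2, dof, chi2 > CRITICAL_VALUE)
```

A different pooling of the sparse classes changes both the statistic and the degrees of freedom, so the table value must be chosen to match.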
Problem 6
In a processor for acoustic signals one observes a random variable Y that is the absolute value of X ∈ N(0, σ), σ > 0, i.e. Y = |X|. The observations of X are not available. The standard deviation σ is unknown and is to be estimated on the basis of n independent observations y1, ..., yn of Y.
a) An intuitively appealing estimate of σ² is
$$ s^* = \frac{1}{n} \sum_{i=1}^{n} y_i^2, $$
where y1, ..., yn are independent observations of Y. Decide whether or not s∗ is an unbiased estimate of σ². (4 p)
b) Derive the density function fY of Y . (6 p)
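The behaviour of the estimate in a) can be explored by simulation before attempting a proof. The sketch below draws X from a normal distribution, keeps only Y = |X|, and averages the mean of squares over many repeated samples. The value σ = 2, the sample size 20, the number of repetitions and the seed are arbitrary choices made for illustration.

```python
import random


def mean_square_of_abs(sigma, n, rng):
    """Mean of y_i^2 for n simulated observations y_i = |x_i|, x_i ~ N(0, sigma)."""
    return sum(abs(rng.gauss(0.0, sigma)) ** 2 for _ in range(n)) / n


rng = random.Random(2017)  # fixed seed so that the run is reproducible
sigma = 2.0                # hypothetical true standard deviation
sample_size = 20
repetitions = 20_000

estimates = [mean_square_of_abs(sigma, sample_size, rng) for _ in range(repetitions)]
average_estimate = sum(estimates) / len(estimates)

print(average_estimate, sigma**2)
```

A simulation of this kind only suggests an answer; the exam question asks for the expectation to be computed exactly.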
Good luck!