Division of Mathematical Statistics

EXAM IN SF1901 PROBABILITY AND STATISTICS, TUESDAY MARCH 14, 2017, 08.00–13.00.

Examiner: Thomas Önskog, 08 – 790 84 55.

Permitted aids: Formel- och tabellsamling i Matematisk statistik (collection of formulas and tables in mathematical statistics), Mathematics Handbook (Beta), reference guide for the calculator, calculator.

Problem 1

Consider a binary repetition code: Bits X_{1}, X_{2}, . . . are transmitted from a source with equal probability for 0 and 1, and pass through a communications channel with error probability p < 1/2 (with probability p a 1 is changed to a 0 and vice versa). To correct for any errors that might arise, every bit X_{i} is repeated an odd number of times N (a zero is transmitted as N zeros, a one as N ones), and a majority vote decides how the received sequence is interpreted (decoded): if N = 3, the sequence 001 is interpreted as 0, 011 as 1, and so on. Errors arise independently of each other and independently of what is being transmitted.

a) Compute the probability that a sequence is decoded erroneously, that is, that a transmitted 1 is interpreted as a 0 or the reverse, in the case N = 3. (4 p)

b) Compute the probability that it was a 1 that was transmitted if you decode the three bits as a 1. (4 p)

c) Find an expression for the probability that a received sequence is decoded erroneously for an arbitrary odd N > 3. (2 p)
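
The channel and the majority-vote decoder described above can be explored numerically. The following is a minimal sketch, not part of the exam: the function names, the value p = 0.1, the number of trials and the seed are all illustrative assumptions. It simulates the channel and compares the observed error rate with the binomial tail probability that more than half of the N repeated bits are flipped.

```python
import random
from math import comb

def majority_decode(bits):
    """Decode a received block by majority vote (block length assumed odd)."""
    return 1 if sum(bits) > len(bits) // 2 else 0

def simulate_error_rate(p, n, trials=100_000, seed=0):
    """Monte Carlo estimate of the probability that a block is decoded wrongly."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        sent = rng.randint(0, 1)                      # source bit, 0 or 1 equally likely
        # Each of the n copies is flipped independently with probability p.
        received = [sent ^ (rng.random() < p) for _ in range(n)]
        if majority_decode(received) != sent:
            errors += 1
    return errors / trials

def binomial_tail_error_rate(p, n):
    """Probability that at least (n + 1) / 2 of the n copies are flipped."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range((n + 1) // 2, n + 1))

# Hypothetical channel: p = 0.1 is an assumed value, not given in the problem.
estimate = simulate_error_rate(p=0.1, n=3)
tail = binomial_tail_error_rate(p=0.1, n=3)
```

Comparing `estimate` and `tail` for a few odd values of `n` gives a quick sanity check on a hand-derived expression.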

Problem 2

The sides of an icosahedron are numbered 0, 1, . . . , 9. You suspect that the icosahedron is not fair (the different outcomes of a roll are not equally likely) and therefore want to investigate the probability p of having 9 come up in a single roll. Find the maximum likelihood estimate of p in the following two cases:

a) 20 independent rolls, in which a 9 is obtained in five rolls. (5 p)

b) Independent rolls are made until 9 has come up four times. This happened in rolls 3, 9, 15 and 18. (5 p)
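
In both sampling schemes the rolls are independent, so the likelihood is proportional to p raised to the number of nines times (1 − p) raised to the number of other rolls; in b) the experiment stops at roll 18 with four nines and fourteen other outcomes. The sketch below maximizes the log-likelihood on a grid as a numerical check of a hand calculation. The helper names and the grid resolution are assumptions made for illustration only.

```python
from math import log

def log_likelihood(p, nines, others):
    """Log-likelihood, up to an additive constant that does not depend on p,
    for `nines` rolls showing 9 and `others` rolls showing anything else."""
    return nines * log(p) + others * log(1 - p)

def grid_maximizer(nines, others, steps=10_000):
    """Return the grid point in (0, 1) with the largest log-likelihood."""
    grid = [k / steps for k in range(1, steps)]
    return max(grid, key=lambda p: log_likelihood(p, nines, others))

# Case a): 20 rolls, five of them nines.
p_hat_a = grid_maximizer(nines=5, others=15)

# Case b): rolling stops at roll 18, when the fourth nine appears.
p_hat_b = grid_maximizer(nines=4, others=14)
```

A grid search is crude but makes no use of calculus, so it serves as an independent check on the estimate obtained by differentiating the log-likelihood.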

Problem 3

A scale is subject not only to measurement error but also to a random interference. The result of a weighing of an object with weight µ can therefore be described by a random variable X such that

X = weight + measurement error + interference = µ + ε + δ,

where the measurement error ε is N(0, σ) and the interference term δ is N(µ_{δ}, σ_{δ}) and independent of ε. At a weighing in an environment with no interference the result is µ + ε. To get a handle on the impact of the interference, five measurements are made on objects with weights µ_{1}, . . . , µ_{5} in both an environment with no interference and in an environment where the interference affects the outcome. The results were as follows:

Object            1      2      3      4      5
Interference      48.47  51.39  46.87  45.52  53.87
No interference   47.85  52.07  47.47  47.50  55.10

All measurements can be viewed as outcomes of independent random variables.

a) Construct a 95% confidence interval for µ_{δ}. (7 p)

b) Test the null hypothesis H_{0} : µ_{δ} = 0 against the alternative hypothesis H_{1} : µ_{δ} > 0 at the
level 5%. Conclusions regarding H_{0} should be stated and thoroughly motivated. (3 p)
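
Because each object is weighed once with and once without interference, the unknown weight cancels in the difference of the two readings, which under the stated model is normally distributed with mean µ_{δ}. One possible approach is therefore an interval based on the five paired differences and the t-distribution with 4 degrees of freedom. The sketch below follows that approach; the variable names are illustrative, and the quantile 2.776 is the tabulated value t_{0.025}(4), hard-coded here as an assumption rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

interference = [48.47, 51.39, 46.87, 45.52, 53.87]
no_interference = [47.85, 52.07, 47.47, 47.50, 55.10]

# Paired differences: the unknown weight of each object cancels.
diffs = [a - b for a, b in zip(interference, no_interference)]

n = len(diffs)
d_bar = mean(diffs)          # point estimate of the interference mean
s_d = stdev(diffs)           # sample standard deviation (divisor n - 1)

T_QUANTILE = 2.776           # t_{0.025}(4) from a t-table (assumed two-sided 95% interval)
half_width = T_QUANTILE * s_d / sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)
```

The same differences and standard error can be reused for the test in b), with the quantile chosen to match the one-sided alternative.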

Problem 4

In a large study, scientists want to investigate whether climate can have an effect on the occurrence of asthma. Two large cities, A and B, with different climates but otherwise similar populations (age, ethnicity etc.) were chosen. In total 200 000 people in city A were tested and 100 000 in city B. The number of people with asthma was 13 800 in A and 8 400 in B; the total populations of the two cities were ten million in A and five million in B.

a) Construct a confidence interval with approximate confidence level 99% for the difference in incidence of asthma in the two cities. State and motivate any approximations you make. (7 p)

b) Conduct a hypothesis test at the approximate level 1% to check if climate has a statistically significant effect on the incidence of asthma. Please state your hypotheses and conclusions clearly. (3 p)
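
With samples this large, a common approach is to treat each sample proportion as approximately normal and to ignore the finite-population correction, since only 2% of each city was sampled. The sketch below computes the resulting interval for the difference of two proportions. The function name and its signature are illustrative assumptions; `NormalDist().inv_cdf` from the Python standard library supplies the normal quantile.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(x_a, n_a, x_b, n_b, level):
    """Normal-approximation interval for p_a - p_b from two independent samples."""
    p_a, p_b = x_a / n_a, x_b / n_b
    # Standard error of the difference, using the estimated proportions.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)     # two-sided quantile
    diff = p_a - p_b
    return diff, (diff - z * se, diff + z * se)

diff, (lower, upper) = diff_proportion_ci(13_800, 200_000, 8_400, 100_000, level=0.99)
```

Whether the interval covers zero is the quantity of interest for the test in b); the approximations used still need to be stated and justified in a written solution.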

Problem 5

In American football a “fumble” is when a player drops the ball during a “play”; a game contains roughly 60 to 70 “plays” per team. The following table gives the number of fumbles in 55 games (recorded per team).

2 1 2 2 3 1 3 4 3 4 5 5 2 1 3 2 5 2 4 1 2 2
1 0 4 2 4 1 2 0 2 0 3 0 1 2 0 1 2 2 3 5 1 3
2 3 4 5 4 3 6 0 3 1 2 1 2 2 1 2 1 3 2 4 2 4
4 2 0 5 4 3 6 5 3 5 1 3 1 1 3 1 4 3 1 5 1 2
1 3 4 4 4 2 7 4 2 5 3 1 3 6 2 1 1 4 1 2 3 0

The table can be summarized by the following frequency table:

Number of fumbles       0   1   2   3   4   5   6   7
Number of occurrences   8  24  27  20  17  10   3   1

The observed average number of fumbles in a game (per team) was 2.55. To simplify things, the observations can be viewed as outcomes of independent and identically distributed random variables.

Moreover, it can be assumed that the probability of a fumble in a “play” is constant.

a) Propose a suitable statistical model, with only one parameter, for the number of fumbles in a game for a team. Please motivate your choice. Hint: Asymptotic results for the binomial distribution may be useful. (2 p)

b) Conduct a statistical test at the 5% level of how well the proposed model fits the observed data. Please state your hypotheses and conclusions clearly. (8 p)
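
The frequency table can be checked against the stated average, and expected counts under a candidate model can be tabulated for a goodness-of-fit comparison. The sketch below assumes a Poisson model with the sample mean as its parameter, which is one natural reading of the hint but remains an assumption here; the helper names are likewise illustrative. The last cell collects "7 or more" so that the expected counts add up to the number of observations.

```python
from math import exp, factorial

# Frequency table from the problem: index = number of fumbles, value = occurrences.
observed = [8, 24, 27, 20, 17, 10, 3, 1]

n = sum(observed)                                            # number of team-games
sample_mean = sum(k * c for k, c in enumerate(observed)) / n

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def expected_counts(lam, n, cells):
    """Expected counts for 0, 1, ..., cells - 2 and a final 'cells - 1 or more' cell."""
    probs = [poisson_pmf(k, lam) for k in range(cells - 1)]
    probs.append(1 - sum(probs))                             # tail probability
    return [n * q for q in probs]

expected = expected_counts(sample_mean, n, cells=len(observed))
```

Cells with small expected counts are usually merged before a chi-square statistic is formed; how to pool them, and the resulting degrees of freedom, are left to the written solution.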

Problem 6

In a processor for acoustic signals one observes a random variable Y that is the absolute value of X ∈ N(0, σ), σ > 0, i.e. Y = |X|. The observations of X are not available. The standard deviation σ is not known and should be estimated on the basis of n independent observations y_{1}, . . . , y_{n} of Y.

a) An intuitively appealing estimate of σ^{2} is

s^{∗} = \frac{1}{n} \sum_{i=1}^{n} y_{i}^{2},

where y_{1}, . . . , y_{n} are independent observations of Y . Decide whether or not s^{∗} is an unbiased
estimate. (4 p)

b) Derive the density function f_{Y} of Y . (6 p)
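
A simulation can give a rough indication of how s^{∗} behaves before any calculation is attempted. The sketch below draws repeated samples of Y = |X| and averages s^{∗} over the replications, so that the average can be compared with σ^{2}. The values σ = 2 and n = 5, the number of replications, the seed and the function names are all assumptions chosen for illustration; a simulation of this kind suggests an answer but does not replace the derivation asked for.

```python
import random

def s_star(ys):
    """The estimate from part a): the mean of the squared observations."""
    return sum(y * y for y in ys) / len(ys)

def average_s_star(sigma, n, replications=20_000, seed=1):
    """Average of s_star over many simulated samples of size n from Y = |X|."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        # X is normal with mean 0 and standard deviation sigma; only |X| is observed.
        ys = [abs(rng.gauss(0.0, sigma)) for _ in range(n)]
        total += s_star(ys)
    return total / replications

# Hypothetical setting: sigma = 2 (so sigma squared = 4) and n = 5.
avg = average_s_star(sigma=2.0, n=5)
```

A histogram of the simulated values of Y can be used in the same way to check a derived density in b).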

Good luck!