statistics questions

Upload: neelabh-mishra

Post on 08-Sep-2015

DESCRIPTION

stats

TRANSCRIPT

  • Examiners Report

  • Mark Scheme

  • Final Mark Scheme 2616/01 June 2004

    General Instructions

    Some marks in the mark scheme are explicitly designated as M, A, B or E.

    M marks (method) are for an attempt to use a correct method (not merely for stating the method).

    A marks (accuracy) are for accurate answers and can only be earned if the corresponding M mark(s) have been earned. Candidates are expected to give answers to a sensible level of accuracy in the context of the problem in hand. The level of accuracy quoted in the mark scheme will sometimes deliberately be greater than is required, when this facilitates marking.

    B marks (explanation) are for explanation and/or interpretation. These will frequently be subdividable depending on the thoroughness of the candidate's answer.

    Follow-through marking should normally be used wherever possible; there will, however, be an occasional designation of c.a.o. for "correct answer only".

    Full credit MUST be given when correct alternative methods of solution are used. If errors occur in such methods, the marks awarded should correspond as nearly as possible to equivalent work using the method in the mark scheme.

    All queries about the marking should have been resolved at the standardising meeting. Assistant Examiners should telephone the Principal Examiner (or Team Leader if appropriate) if further queries arise during the marking.

    Assistant Examiners may find it helpful to use shorthand symbols as follows:
      FT   Follow-through marking
           Correct work after error
           Incorrect work after error
      C    Condonation of a minor slip
      BOD  Benefit of doubt
      NOS  Not on scheme (to be used sparingly)
           Work of no value

  • Final Mark Scheme 2616/01 June 2004

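    Q1 $X \sim N(\mu, \sigma^2)$, $Y \sim N(2\mu, 4\sigma^2)$; $T = aX + bY$

    (i) We want $\mu = E[aX + bY]$ [M1]
    $= a\mu + b \cdot 2\mu$ [1]
    $\therefore 2b = 1 - a$, i.e. $b = \tfrac{1}{2}(1 - a)$ [1] Beware printed answer
    Then $\mathrm{Var}(T) = a^2\sigma^2 + \{\tfrac{1}{2}(1 - a)\}^2 (4\sigma^2)$ [M1] Substitution of $b = \tfrac{1}{2}(1 - a)$ reqd
    $= \sigma^2\{a^2 + (1 - a)^2\} = \sigma^2\{2a^2 - 2a + 1\}$ [1]
    Total: 5

    (ii) Consider $\frac{d}{da}(2a^2 - 2a + 1) = 0$ [M1]
    i.e. $0 = 4a - 2$ [1]
    $\therefore a = \tfrac{1}{2}$ [1] Beware printed answer
    Verification that this is a minimum (e.g. trivially by $\frac{d^2}{da^2}$) [1]
    $T = \tfrac{1}{2}X + \tfrac{1}{4}Y \sim N(\mu, \tfrac{1}{2}\sigma^2)$ [2 if all three items are correct; award 1 if any two are correct]
    [Both $X$ and $\tfrac{1}{2}Y$ are unbiased for $\mu$ and both are Normally distributed, all of which is also true for $T$; but] $T$ has smaller variance ($\mathrm{Var}(X) = \sigma^2$, $\mathrm{Var}(\tfrac{1}{2}Y) = \sigma^2$) [E2]
    Total: 8

    (iii) $\bar t = 7.48$ [B1] FT if wrong
    One-sided CI is given by
    $7.48 - 1.645\sqrt{\tfrac{1}{2} \times \tfrac{3}{10}}$ [M1 (use of $\tfrac{1}{2}\sigma^2$ as Var(T))] [M1 M1 B1 M1]
    $= 7.48 - 0.63(71) = 6.84(29)$ [A1] C.A.O.
    Total: 7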

  • Final Mark Scheme 2616/01 June 2004

    Q2
    A: 237 249 213 233 227 236
    B: 203 222 214 216 230

    (i) Wilcoxon rank sum test (or Mann-Whitney form thereof).
    Ranks are
    A: 10 11 2 8 6 9 [M1 for attempt]
    B: 1 5 3 4 7 [A1 if all correct]
    Rank sum is 20 (from B, otherwise the tables can't be used) (Mann-Whitney is 5) [1]
    Refer to tables of Wilcoxon rank sum (or Mann-Whitney) statistics. [1]
    Lower 2½% tail is needed. [1]
    Value for (5, 6) is 18 (or 3 for Mann-Whitney). [1]
    Result is not significant. [1]
    Seems medians are the same. [1]
    Total: 8

    (ii) Normality of both underlying populations/distributions. [1]
    $n_1 = 6$, $\bar x = 232.5$, $s_{n-1}^2 = 143.1$, $s_{n-1} = 11.9624$ ($s_n^2 = 119.25$, $s_n = 10.9202$)
    $n_2 = 5$, $\bar y = 217.0$, $s_{n-1}^2 = 100.0$, $s_{n-1} = 10.0$ ($s_n^2 = 80.0$, $s_n = 8.9443$)
    Pooled $s^2 = \frac{5 \times 143.1 + 4 \times 100.0}{9} = 123.94$ ($s = 11.1330$) [M1 for any reasonable attempt at pooling (and FT into test); A1 if correct]
    Test statistic is
    $\frac{(232.5 - 217.0) - 0}{\sqrt{123.94}\,\sqrt{\frac{1}{6} + \frac{1}{5}}} = \frac{15.5}{6.7414} = 2.29(92)$ [M1 A1] FT reasonable attempt
    Refer to $t_9$. [1] May be awarded even if test statistic is wrong. No FT if wrong.
    Double-tailed 5% point is 2.262. [1] No FT if wrong
    Significant, seems means differ. [1]
    Total: 8

    (iii) If the assumptions for the t procedure are satisfied, it is better (more sensitive/powerful), [E2]
    but if not it might be seriously misleading and the non-parametric procedure safer. [E2]
    Total: 4
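The arithmetic in Q2 parts (i) and (ii) can be checked with a short sketch. It is only an illustration using the standard library: the helper names `rank_sum` and `pooled_t` are hypothetical, and the ranking helper assumes there are no tied observations (true for these data).

```python
from math import sqrt

a = [237, 249, 213, 233, 227, 236]
b = [203, 222, 214, 216, 230]

def rank_sum(sample, other):
    """Sum of the ranks of `sample` in the pooled data (assumes no ties)."""
    pooled = sorted(sample + other)
    return sum(pooled.index(v) + 1 for v in sample)

def pooled_t(x, y):
    """Pooled variance estimate and two-sample t statistic for H0: equal means."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    s2 = (ss1 + ss2) / (n1 + n2 - 2)
    t = (m1 - m2) / sqrt(s2 * (1 / n1 + 1 / n2))
    return s2, t

w = rank_sum(b, a)                      # Wilcoxon rank sum for the smaller sample B
u = w - len(b) * (len(b) + 1) // 2      # equivalent Mann-Whitney statistic
s2, t = pooled_t(a, b)

print(w, u)                             # rank sum and Mann-Whitney value
print(round(s2, 2), round(t, 4))        # pooled variance and t statistic
```

The critical values (18 for the rank sum, 2.262 for $t_9$) still come from tables, as in the mark scheme.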

  • Final Mark Scheme 2616/01 June 2004

    Q3 (i) H0: $\mu_D = 0$ (or $\mu_{AFTER} = \mu_{BEFORE}$) [1]
    H1: $\mu_D < 0$ (or $\mu_{AFTER} < \mu_{BEFORE}$) [1]
    Where $\mu_D$ is the population mean difference "after − before" [1 for verbal defn of $\mu_D$]
    [NOTE: candidate might of course define $\mu_D$ as "before − after"; take care that H1 agrees]
    Requires Normality of population [1] of differences [1]. "Population" must be clear, or clearly implied.
    The test procedure, and the CI in (ii), MUST be PAIRED COMPARISON t.
    Differences are [as "after − before"; candidate might use "before − after"]
    −6 −19 13 −31 −22 2 8 −44 −11 −14
    $\bar d = -12.4$, $s_{n-1} = 17.621$ ($s_{n-1}^2 = 310.49$) [A1] Accept $s_n = 16.716(5)$ ($s_n^2 = 279.44$) ONLY if correctly used in sequel.
    Test statistic is
    $\frac{-12.4 - 0}{17.621 / \sqrt{10}} = -2.22(535)$ [M1, M1, M1 (don't FT to 2nd M1); A1]
    Refer to $t_9$ [1] May be awarded even if test statistic is wrong. No FT if wrong
    Lower s.t. 5% pt is −1.833 [1] Sign must agree with H1/test statistic, unless a clear argument based on modulus is used. No FT if wrong.
    Significant [1]. Seems mean afterwards is lower. [1]
    Total: 14

    (ii) CI is given by
    $-12.4 \pm 2.262 \times \frac{17.621}{\sqrt{10}} = -12.4 \pm 12.60(4) = (-25.00(4), 0.20(4))$ [M1 B1 M1 A1 c.a.o.]
    Total: 4
    Zero out of 4 if not same dist as for test. Some wrong dist can score max M1 B0 M1 A0. Recovery to $t_9$ is ok.

    (iii) Any non-parametric procedure [1]
    Paired Wilcoxon [1] [allow sign test]
    Total: 2
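As a rough check of the paired-comparison working in Q3, the sketch below recomputes the summary statistics, the test statistic and the interval. The signs of the ten differences are an assumption: the positive values are taken to be 13, 2 and 8, which is the only assignment consistent with the printed mean of −12.4. The critical value 2.262 is the $t_9$ table value quoted in the mark scheme.

```python
from math import sqrt
from statistics import mean, stdev

# Differences "after - before"; the signs are an assumption (see lead-in).
d = [-6, -19, 13, -31, -22, 2, 8, -44, -11, -14]

n = len(d)
d_bar = mean(d)
s = stdev(d)                 # sample standard deviation, divisor n - 1
se = s / sqrt(n)             # standard error of the mean difference

t_stat = (d_bar - 0) / se    # paired t statistic for H0: mean difference = 0

t_crit = 2.262               # two-tailed 5% point of t with 9 d.f. (from tables)
ci = (d_bar - t_crit * se, d_bar + t_crit * se)

print(round(d_bar, 1), round(s, 3), round(t_stat, 4))
print(tuple(round(v, 3) for v in ci))
```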

  • Final Mark Scheme 2616/01 June 2004

    Q4 (i) H0: no association between age and level of interest. [B1]
    H1: association between age and level of interest. [B1]

    Observed $o_i$                Expected $e_i$
     49   216 | 265                60.84   204.16
    145   435 | 580               133.16   446.84
    194   651 | 845
    [A2] Award A1 if any one is correct. But deduct 1 if not at least 2 dp.

    $|o_i - e_i| = 11.84$, or 11.34 with Yates' correction.

    $X^2 = \sum \frac{(|o_i - e_i| - 0.5)^2}{e_i} = 3.99(71)$ with Yates
    $X^2 = \sum \frac{(o_i - e_i)^2}{e_i} = 4.35(73)$ without Yates
    [M1 for either, near-enough correct; A1 if Yates used]

    Refer to $\chi^2_1$ [1] [FT if 2 or 3 df averred]
    Upper 5% point is 3.84 [1]
    Significant [1]
    Seems there is association [1*]
    Seems under-30s have less interest than would be expected, and over-30s more, than if there were no association. [2*]
    * These 3 marks are not available if H0 and H1 are interchanged.

    (ii)
                                   Level of interest
                                   Great   Little   Total
    Directly-elected mayor  Yes     118     314      432
                            No       49     216      265
                            Total   167     530      697
    M1 for table with correctly labelled rows and columns. M1 if all margins correctly add up from the individual values. A1, A1, A1, A1 for each individual cell (118, 314, 49, 216).
    Total: 6

    (iii) We do not [at least prima facie] have a random sample of 697 people who were classified over the 4 cells. The usual $\chi^2$ approach requires such an assumption. [E2]
    Total: 2
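The two values of the test statistic in Q4 (i), with and without Yates' correction, can be reproduced from the observed 2×2 table. The sketch below is illustrative only: the function name `chi_squared_2x2` is hypothetical, and the results differ from the printed 3.99(71) and 4.35(73) in the third decimal place because the mark scheme works with expected frequencies rounded to 2 dp.

```python
def chi_squared_2x2(table, yates=False):
    """Pearson X^2 for a 2x2 table of observed counts, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff -= 0.5          # continuity correction: shrink |o - e| by 0.5
            stat += diff ** 2 / expected
    return stat

observed = [[49, 216], [145, 435]]

x2_plain = chi_squared_2x2(observed)
x2_yates = chi_squared_2x2(observed, yates=True)
print(round(x2_plain, 2), round(x2_yates, 2))
```

Both values exceed the upper 5% point of 3.84 for one degree of freedom, which matches the "Significant" conclusion.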

  • Examiners Report

  • 2616 Statistics 4

    General Comments

    Most candidates appeared to be well prepared for this examination and there was no evidence that candidates had insufficient time to complete the paper. In fact, some candidates gave full answers to all four questions.

    As in previous years, candidates performed much more strongly when carrying out the numerical parts of questions than they did when discussing assumptions or analysing results. There were two common examples of this weakness. The first concerned the assumptions required for the various t-tests to be valid: many candidates were not clear about whether parent populations, samples, means or data had to be Normally distributed, or whether they were looking at one distribution, two distributions or the difference between two distributions. The second was the contextualisation of the results of a hypothesis test. Many candidates did not make any statement beyond "reject $H_0$", whilst at the other end of the scale candidates were too definitive, making statements such as "reject $H_0$, hence the median strength using process A is greater than the median strength using process B".

    Once again, Question 1 on estimation was by far the least popular question. However, most candidates who attempted Question 1 scored well.

    Comments on Individual Questions

    Q.1 This question was attempted by only about 20% of candidates.

    Virtually all candidates knew what they had to do in part (i) and were able to verify the value of b. Most were also able to calculate the variance of T, although poor algebra let down some candidates.

    In part (ii) most candidates used calculus to show that the variance was minimised when a = 0.5, although some showed only that the variance had a stationary value. A few candidates used a method involving completing the square.

    Candidates who got this far were almost all able to state the distribution of T and explain why it was a better estimator of $\mu$ than either $X$ or $\tfrac{1}{2}Y$.

    Most candidates who attempted part (iii) knew what they were doing, but a number failed to realise that $\mathrm{Var}(T) = \tfrac{1}{2}\sigma^2$, and a number also did not realise that, because the value of $\sigma^2$ was known, the Normal distribution should be used; indeed, one candidate used the t distribution specifically because the sample was small.

    Q.2 This was the most popular question on the paper, being attempted by all but 2 candidates.

    Part (i) was obviously familiar ground for most candidates and most scored very well here. The method of choice for most candidates was to calculate the Wilcoxon rank sum statistic, convert to the Mann-Whitney statistic and then use the Mann-Whitney tables. Only a small minority of candidates calculated a statistic (Wilcoxon or Mann-Whitney) and then moved directly to the relevant statistical table. However, this part of the question was answered better than any other part of the paper.

    Part (ii) was not answered as well, with many candidates not realising that Normality of both underlying populations was required. The pooled variance also caused some confusion, with some candidates trying to pool standard deviations, some adding variances and others being confused about the use of $s_n^2$ and/or $s_{n-1}^2$.

    Once a variance had been obtained, most candidates were then able to calculate the test statistic correctly and compared it with the two-tailed value of $t_9$.

    In both parts (i) and (ii) a significant number of candidates were too definitive in their interpretation of the rejection, or otherwise, of the null hypothesis.

    Answers to part (iii) tended to be too vague, with very few candidates mentioning the fact that the t-test is a more powerful, or sensitive, test than the non-parametric alternatives, as long as the assumptions are satisfied. However, if the assumptions are not satisfied, results can be seriously misleading.

    Q.3 In part (i) many candidates lost a significant number of marks because they did not carefully state their hypotheses or take sufficient care with the distributional assumption. Hypotheses such as "the intensity remains the same" and "the intensity reduces" were common. What is required are explicit statements about either the mean of the population of differences, or about the means of the populations before and after. In addition, all terms used should be defined. The required distributional assumption was the Normality of the population of differences.

    As with other questions, most candidates were able to carry out the calculations competently and most used the correct value of t.

    Part (ii) was very well done by the majority of candidates, although a few did use the Normal distribution.

    Virtually all candidates correctly named the paired Wilcoxon test in part (iii).

    Q.4 Most candidates were obviously on comfortable ground here and tended to score well.

    In part (i) most candidates were able to state the hypotheses correctly, although some got the hypotheses the wrong way round and some talked about correlation.

    Calculations were inevitably done correctly, but a few candidates only gave the expected values to 1 decimal place or even to the nearest integer.

    Many candidates obviously realised that it would be appropriate to use Yates' correction, but few actually did. Of those that did, some were unsure whether to add or subtract 0.5.

    Most candidates correctly used 1 degree of freedom for the $\chi^2$ test and were able to give the correct critical value. A small minority used 2 or 3 degrees of freedom.

    There was a definite improvement on previous years in the discussion of the results of the hypothesis test, with many candidates considering the contributions to the $X^2$ statistic, or at the very least considering the differences between observed and expected values.

    Most candidates scored full marks in part (ii).

    Candidates struggled with part (iii), with the most common suggestion being about different sample sizes. The actual reason was that we do not have a random sample of people who were classified over the 4 cells.

  • Mark Scheme

  • MEI STATISTICS 4 (2616) JANUARY 2005 SOLUTIONS

    Question 1

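    (i) We have:
    $X_1 \sim \text{Poisson}(\theta)$, $X_2 \sim \text{Poisson}(4\theta)$, $X_3 \sim \text{Poisson}(10\theta)$
    $\hat\theta = \tfrac{1}{15}(X_1 + X_2 + X_3)$ [M1, might be implicit in sequel]
    $E(\hat\theta) = \tfrac{1}{15}(\theta + 4\theta + 10\theta)$ [M1 for any attempt to find $E(\hat\theta)$; M1 for use of Poisson means]
    $= \theta$ [A1]
    $\therefore \hat\theta$ is unbiased [1]
    $\mathrm{Var}(\hat\theta) = \tfrac{1}{15^2}\mathrm{Var}(X_1 + X_2 + X_3)$ [M1 for any (reasonable) attempt to find Var]
    $= \tfrac{1}{15^2}(\theta + 4\theta + 10\theta)$ [M1 for use of Poisson variances]
    $= \tfrac{\theta}{15}$ [A1 - beware printed answer]
    Total: 8

    (ii) Now $Y \sim \text{Poisson}(10\theta)$
    $E(\tfrac{1}{10}\bar Y) = \tfrac{1}{10}E(\bar Y) = \tfrac{1}{10}E(Y) = \tfrac{1}{10} \cdot 10\theta = \theta$ [M1, A1]
    i.e. unbiased [1]
    $\mathrm{Var}(\tfrac{1}{10}\bar Y) = \tfrac{1}{100}\mathrm{Var}(\bar Y) = \tfrac{1}{100}\frac{\mathrm{Var}(Y)}{n} = \tfrac{1}{100} \cdot \tfrac{10\theta}{n} = \tfrac{\theta}{10n}$ [M1 M1 M1, A1]
    Total: 7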

  • MEI STATISTICS 4 (2616) JANUARY 2005 SOLUTIONS

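    (iii) $\frac{\theta}{10n} < \frac{\theta}{15}$ for $n \ge 2$, i.e. $\tfrac{1}{10}\bar Y$ is better [M1 E1 1]
    For $Z \sim \text{Poisson}(\theta)$, we have $\mathrm{Var}(\bar Z) = \frac{\theta}{n}$
    So would need $n \ge 16$ to be better than $\hat\theta$ [E2] Allow 1 for $n \ge 15$
    Total: 5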

  • MEI STATISTICS 4 (2616) JANUARY 2005 SOLUTIONS

    Question 2

    (a) MUST be N(0,1) test and CI for comparing means
    $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ [1 if both correct. DO NOT allow $\bar X_1 = \bar X_2$ or similar. Allow verbal statement]
    [1 if $\mu_1$, $\mu_2$ are adequately defined in words ("population mean times")]
    Test statistic is
    $\frac{12.6 - 13.9}{\sqrt{\frac{(2.4)^2}{80} + \frac{(3.5)^2}{90}}} = \frac{-1.3}{\sqrt{0.2081}} = \frac{-1.3}{0.4562} = -2.84(97)$ [M1 A1]
    Refer to N(0,1) [1] No FT if wrong
    1% critical point (two-sided) is 2.576 [1] No FT if wrong
    Significant [1]
    Seems mean waiting times differ [1]
    CI is given by $-1.3 \pm 1.96 \times 0.4562 = -1.3 \pm 0.894 = (-2.194, -0.406)$ [M1 B1 M1 A1] accept (−2.2, −0.4)
    Total: 12

    (b) MUST be Wilcoxon rank-sum test (or Mann-Whitney form thereof).
    It is convenient, and natural, to rank "top down".
    Use of ranks [M1]
    Ranks are:
    I:  4 5 2 7 1 10
    II: 6 8 11 13 3 9 12
    [A1]
    Rank sum (for I) is 29 (Mann-Whitney is 8) [1]
    Refer to tables of Wilcoxon (or M-W) statistic [1]
    Lower 5% tail is needed [1]
    [For "bottom-up" rankings W = 55, MW = 34; upper 5% tail is needed]
    Value for (6, 7) is 29 (or 8 if M-W used) [1]
    Result is significant [1]
    Seems on the whole there are differences in satisfaction scores [1]
    Total: 8
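The large-sample test and confidence interval in Question 2 (a) can be sketched as follows. The summary figures used here (means 12.6 and 13.9, standard deviations 2.4 and 3.5, sample sizes 80 and 90) are an assumption: they are the values consistent with the printed difference −1.3 and standard error 0.4562, not figures confirmed by the question paper.

```python
from math import sqrt

# Assumed summary statistics (see lead-in); not confirmed by the question paper.
mean1, sd1, n1 = 12.6, 2.4, 80
mean2, sd2, n2 = 13.9, 3.5, 90

diff = mean1 - mean2
se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # standard error of the difference in means

z = diff / se                              # N(0,1) test statistic for H0: equal means

z_crit_95 = 1.96                           # two-sided 5% point of N(0,1)
ci = (diff - z_crit_95 * se, diff + z_crit_95 * se)

print(round(se, 4), round(z, 4))
print(tuple(round(v, 3) for v in ci))
```

Since |z| exceeds the two-sided 1% point 2.576, the result is significant, in line with the solution.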

  • MEI STATISTICS 4 (2616) JANUARY 2005 SOLUTIONS

    Question 3

    (a) Differences (after − before):
    6 11 22 −5 −1 4 28 −2 7 3 9 8
    MUST be PAIRED WILCOXON test.
    Ranks of |d| are
    6 10 11 5 1 4 12 2 7 3 9 8 [M1 A1]
    Test statistic is 5 + 1 + 2 = 8 [or 70] [FT if wrong]
    Refer to paired Wilcoxon table with n = 12 [1]
    Lower 5% point is 17 [upper is 61] [1]
    $\therefore$ the observed 8 [or 70] is significant [1]
    Seems coaching programme has improved short-term visual memory [1]
    Total: 7

    (b) MUST be PAIRED COMPARISON t test
    Normality of differences [1]
    $\bar d = 7.5$, $s_{n-1} = 9.5299$ ($s_{n-1}^2 = 90.8182$) [M1 for use of differences; B1] Accept $s_n = 9.1241$ ($s_n^2 = 83.25$) ONLY if correctly used in sequel
    Test statistic (for test of $\mu_D = 0$ against $\mu_D > 0$) is
    $\frac{7.5 - 0}{9.5299 / \sqrt{12}} = 2.72(62)$ [M1 A1]
    Refer to $t_{11}$ [1] No FT if wrong
    Upper 5% pt is 1.796 [1] No FT if wrong
    Significant [1]
    Seems coaching programme has improved short-term visual memory [1]
    Total: 9

    Look at differences [M1]
    Consider e.g. dotplot [M1, or for any other relevant display/discussion of the data]
    Bulk of data appear OK [assuming no concern about being integers], but the two large upper outliers cast doubt [E2 (E0, E1, E2)]
    Total: 4

  • MEI STATISTICS 4 (2616) JANUARY 2005 SOLUTIONS

    24

    Question 4

    (i) H0: no association (between success of transmission and type of destination) [1]
    H1: association [1]
    Total: 2

    (ii) $O_i$ (with margins):
    100   57   23 | 180
     21   14   13 |  48
     31   21   20 |  72
    152   92   56 | 300

    $E_i$:
    91.2(0)   55.2(0)   33.6(0)
    24.32     14.72      8.96
    36.48     22.08     13.44
    [A4 - deduct 1 per error. Must be to this level of accuracy]

    Contributions to $X^2$:
    0.8491   0.0587   3.3440
    0.4532   0.0352   1.8216
    0.8232   0.0528   3.2019
    [M1]

    $X^2 = 10.63(985)$, awrt 10.64 [A2; give A1 if in (10.5, 10.8)]
    Refer to $\chi^2_4$ [2, or zero; FT if wrong, unless 300]
    Upper 10% point is 7.779 [1]
    Significant [1]
    Seems there is association [1] ZERO if H0 and H1 are interchanged
    Total: 12

    (iii) The key feature is the behaviour of transmission when intended destinations are universities. There are many more "more than one attempt" transmissions, and many more "not successful at all" transmissions, than would be expected if there were no association, and many fewer "successful at first attempt" transmissions. There is little or no suggestion of any other associations. [E6 (divisible)]
    Total: 6
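The expected frequencies, the cell-by-cell contributions and the value of $X^2$ in Question 4 (ii) follow mechanically from the observed table, as the sketch below illustrates. The arrangement of the nine observed counts into rows and columns is an assumption (it is the arrangement under which the margins and printed contributions agree), and the function name `chi_squared_contributions` is hypothetical.

```python
def chi_squared_contributions(table):
    """Return (expected frequencies, contributions, X^2, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    x2 = sum(sum(row) for row in contributions)
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, contributions, x2, dof

# Assumed layout of the observed counts (see lead-in).
observed = [[100, 57, 23], [21, 14, 13], [31, 21, 20]]

expected, contributions, x2, dof = chi_squared_contributions(observed)
for row in contributions:
    print([round(v, 4) for v in row])
print(round(x2, 4), dof)
```

The largest contributions all sit in the third column, which is what part (iii) comments on.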

  • Examiners Report

  • 2616 Mark Scheme June 2005

    Mark Scheme 2616 June 2005

  • 2616 Mark Scheme June 2005

    2616 Statistics 4

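    Q1 $X_1, \ldots, X_n \sim$ ind $N(\mu, \sigma^2)$, $Y = \sum (X_i - \bar X)^2$
    $E(Y) = (n - 1)\sigma^2$, $\mathrm{Var}(Y) = 2(n - 1)\sigma^4$, $T = kY$

    (i) $E(T) = k(n - 1)\sigma^2$ [B1]
    $\mathrm{Var}(T) = 2k^2(n - 1)\sigma^4$ [B1]
    Total: 2

    (ii) Bias $= E(T) - \sigma^2$ [M1]
    $= k(n - 1)\sigma^2 - \sigma^2$ [A1] Allow M1A0 if $\sigma^2 - E(T)$.
    Total: 2

    (iii) MSE(T) = Variance + bias$^2$ [M1] If both terms present, even if wrong.
    $= 2k^2(n - 1)\sigma^4 + \{k(n - 1)\sigma^2 - \sigma^2\}^2$ [A1] If both correct.
    $= 2k^2(n - 1)\sigma^4 + \{k^2(n - 1)^2 - 2k(n - 1) + 1\}\sigma^4$
    $= \sigma^4[2(n - 1) + (n - 1)^2]k^2 - 2\sigma^4(n - 1)k + \sigma^4$ [A2] Divisible for algebra. BEWARE printed answer.
    Total: 4

    (iv) Consider $\frac{d\,\mathrm{MSE}(T)}{dk} = 0$ [M1] To include "= 0", possibly implied.
    $\frac{d\,\mathrm{MSE}(T)}{dk} = 2k\sigma^4[2(n - 1) + (n - 1)^2] - 2\sigma^4(n - 1)$ [A1] Correct derivative.
    $= 0 \Rightarrow k = \frac{n - 1}{2(n - 1) + (n - 1)^2} = \frac{1}{n + 1}$ [A1 A1] Isolate k. BEWARE printed answer.
    Check minimum by considering
    $\frac{d^2\,\mathrm{MSE}(T)}{dk^2} = 2\sigma^4[2(n - 1) + (n - 1)^2] > 0 \Rightarrow$ min [M1 A1] Or other methods. (Since n > 1).
    Total: 6

    (v) With $k = \frac{1}{n + 1}$,
    $\mathrm{MSE}(T) = \sigma^4\left\{\frac{2(n - 1) + (n - 1)^2}{(n + 1)^2} - \frac{2(n - 1)}{n + 1} + 1\right\}$
    $= \frac{\sigma^4}{(n + 1)^2}\{2n - 2 + n^2 - 2n + 1 - 2n^2 + 2 + n^2 + 2n + 1\}$
    $= \frac{\sigma^4(2n + 2)}{(n + 1)^2} = \frac{2\sigma^4}{n + 1}$ [B2] Divisible for algebra. Answer not printed.
    Total: 2

    (vi) From (ii), we want $k(n - 1)\sigma^2 - \sigma^2 = 0$ [M1]
    $\therefore k = \frac{1}{n - 1}$ [A1] For the converse argument, with no support of "only if", award SC B1.
    In this case, MSE(T) = Var(T) [M1] Or substitute in expression for MSE in (iii); this is not difficult.
    $= \frac{2\sigma^4}{n - 1}$ [A1]
    Total: 4

    Question total: 20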
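The algebra in Q1 (iii) to (vi) can be spot-checked numerically. The sketch below is only a check, not part of the scheme: it works with MSE(T)/σ⁴ so that σ drops out, uses exact rational arithmetic, and the function name `mse_over_sigma4` is hypothetical.

```python
from fractions import Fraction

def mse_over_sigma4(k, n):
    """MSE(T) / sigma^4 for T = kY: variance term 2k^2(n-1) plus squared bias (k(n-1) - 1)^2."""
    return 2 * k ** 2 * (n - 1) + (k * (n - 1) - 1) ** 2

eps = Fraction(1, 1000)
for n in range(2, 12):
    k_min = Fraction(1, n + 1)          # minimising value from part (iv)
    best = mse_over_sigma4(k_min, n)
    assert best == Fraction(2, n + 1)   # part (v): MSE = 2 sigma^4 / (n + 1)
    # Nudging k either way increases the MSE, consistent with a minimum.
    assert mse_over_sigma4(k_min - eps, n) > best
    assert mse_over_sigma4(k_min + eps, n) > best

    k_unbiased = Fraction(1, n - 1)     # unbiased choice from part (vi)
    assert mse_over_sigma4(k_unbiased, n) == Fraction(2, n - 1)

print("checks passed for n = 2, ..., 11")
```

The comparison 2/(n + 1) < 2/(n − 1) shows the minimum-MSE estimator beating the unbiased one for every n > 1.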

  • 2616 Mark Scheme June 2005

    Q2 (i) H0: $\mu_A = \mu_B$, H1: $\mu_A \ne \mu_B$ [B1] Both hypotheses. Do not allow any other symbols, including, e.g., $\bar X_A = \bar X_B$ or similar, unless they are clearly and explicitly stated to be population means. Allow statements in words (see below).
    Where $\mu_A$, $\mu_B$ are the population mean strengths for processes A and B. [B1] For adequate verbal definitions of $\mu_A$, $\mu_B$. Must indicate "mean"; condone "average". Allow absence of "population" if correct notation is used, otherwise insist on "population".
    Normality of both populations. [B1]
    Same variance. [B1]
    Total: 4

    (ii) $n_1 = 9$, $\bar x = 114.6667$, $s_{n-1}^2 = 87.25$ ($s_{n-1} = 9.3408$)
    $n_2 = 8$, $\bar y = 123.75$, $s_{n-1}^2 = 109.07$ ($s_{n-1} = 10.4437$)
    [B1] If all means and variances correct. Accept $s_n$ ONLY if correctly used in sequel:
    $s_n^2 = 77.5(6)$, $s_n = 8.8066$; $s_n^2 = 95.4375$, $s_n = 9.7692$

    Pooled $s^2 = \frac{698 + 763.5}{15} = 97.43$ [M1 for any reasonable attempt at pooling (and ft into test and CI); A1 if correct]

    Test statistic is
    $\frac{114.6667 - 123.75}{\sqrt{97.43}\,\sqrt{\frac{1}{9} + \frac{1}{8}}} = \frac{-9.0833}{\sqrt{23.0051}} = \frac{-9.0833}{4.7964} = -1.89(38)$
    [M1 Overall structure. Allow c's pooled s. M1 for $\sqrt{\frac{1}{9} + \frac{1}{8}}$. A1 ft c's pooled $s^2$.]

    Refer to $t_{15}$. [M1] No ft from here if wrong.
    Double tail 5% point is 2.131. [A1] No ft from here if wrong.
    Not significant. [E1] ft only c's test statistic.
    Seems mean strengths are the same for both processes. [E1] ft only c's test statistic. Expect reference to means and context.
    Total: 10

    (iii) CI is given by
    $-9.0833 \pm 2.947 \times 4.7964$
    [M1 for $-9.0833 \pm \ldots$: must be c's $(\bar x - \bar y) \pm \ldots$; B1 for 2.947, from $t_{15}$; M1 for 4.7964, allow c's pooled s]
    $= -9.0833 \pm 14.1349 = (-23.21(8), 5.05(2))$ [A1] c.a.o. Must be written as an interval.
    Total: 4

    (iv) Wilcoxon [B1] rank sum test [B1] Or Mann-Whitney scores B2.
    Total: 2

    Question total: 20

  • 2616 Mark Scheme June 2005

    Q3 (a) H0: $\mu_D = 0$ or $\mu_E = \mu_S$
    H1: $\mu_D \ne 0$ or $\mu_E \ne \mu_S$ [B1]
    Both hypotheses. Do not allow any other symbols, including, e.g., $\bar X_E = \bar X_S$ or similar, unless they are clearly and explicitly stated to be population means. Allow statements in words (see below).

    Where $\mu_D$ is "population mean for Experimental fertilizer − population mean for Standard fertilizer". [B1] For adequate verbal definition of $\mu_D$. Must indicate "mean"; condone "average". Allow absence of "population" if correct notation is used, otherwise insist on "population".

    Normality of differences is required. [B1] Must be explicit about the population.

    MUST be PAIRED COMPARISON t test. Differences are

    06 23 08 06 09 15 14 08 01 02 M1

    $\bar d = 0.46$, $s_{n-1} = 1.0668(75)$, $s_{n-1}^2 = 1.1382$ [B1] Accept $s_n = 1.0121$, $s_n^2 = 1.0244$ ONLY if correctly used in sequel.

    Test statistic is
    $\frac{0.46 - 0}{1.0668(75) / \sqrt{10}}$ [M1] Allow c's $\bar d$ and/or $s_{n-1}$.
    Allow alternative: $0 \pm$ (c's 2.262) $\times \frac{1.0668(75)}{\sqrt{10}}$ (= 0.7631) for subsequent comparison with $\bar d$.
    (Or $\bar d \pm$ (c's 2.262) $\times \frac{1.0668(75)}{\sqrt{10}}$ (= −0.303, 1.2231) for comparison with 0.)
    $= 1.36(35)$ [A1] c.a.o. (but ft from here if this is wrong.) Use of $\mu_D - \bar d$ scores M1A0, but next 4 marks still available.

    Refer to $t_9$. [M1] No ft from here if wrong.
    Double tail 5% point is 2.262. [A1] No ft from here if wrong.
    Not significant. [E1] ft only c's test statistic.
    Seems mean yield using experimental fertilizer is same as for standard. [E1] ft only c's test statistic. Expect reference to mean(s) and context.
    Total: 11

    (b) Now need Normality for yields using experimental fertilizer. [B1]
    For these yields, $\bar x = 20.43$, $s_{n-1} = 4.0803$, $s_{n-1}^2 = 16.649$ [B1] Accept $s_n = 3.8709$, $s_n^2 = 14.9841$ ONLY if correctly used in sequel.

    One-sided CI (lower confidence bound) is given by
    $20.43 - 1.833 \times \frac{4.0803}{\sqrt{10}}$
    [M1 Mean. Allow c's $\bar x$. M1 Minus. B1 for 1.833, from $t_9$. M1 for $\frac{4.0803}{\sqrt{10}}$: allow c's $s_{n-1}$, or $s_n / \sqrt{9}$ (see above).]
    $= 20.43 - 2.36(51) = 18.06(49)$ [A1] Depends on all 4 preceding marks.

    In repeated sampling, lower confidence bounds obtained in this way would fall below the true mean on 95% of occasions. [E2 (E0, E1, E2)] Comment should refer to lower bound rather than just the confidence interval.
    Total: 9

    Question total: 20

  • 2616 Mark Scheme June 2005

    Q4 (a)
    Data:            29   32   34   38   40   46   51   52   59   63   71   95
    Median 60
    Difference:     −31  −28  −26  −22  −20  −14   −9   −8   −1    3   11   35
    Rank of |diff|:  11   10    9    8    7    6    4    3    1    2    5   12

    [M1] For differences. ZERO in this section if differences not used.
    [M1] For ranks of |difference|.
    [A1] All correct. ft from here if ranks wrong.

    T = 2 + 5 + 12 = 19 [B1] Or 1 + 3 + 4 + 6 + 7 + 8 + 9 + 10 + 11 = 59
    Refer to tables of Wilcoxon single sample (/paired) statistic. [M1] No ft from here if wrong.
    Lower (or upper if 59 used) 2½% tail is needed. [M1] No ft from here if wrong.
    Value for n = 12 is 13 (or 65 if 59 used). [A1] No ft from here if wrong.
    Result is not significant. [E1] ft only c's test statistic.
    No real evidence that median is not 60. [E1] ft only c's test statistic.
    Total: 9

    (b) (i) $P(80 < N(62, 27.3^2) \le 100) = P(0.6593(4) < N(0, 1) \le 1.3919(4)) = 0.9180 - 0.7452 = 0.1728$
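The ranking step in Q4 (a) is easy to get wrong by hand, so here is a small illustrative sketch of the single-sample Wilcoxon calculation for a hypothesised median of 60. It assumes there are no zero differences and no tied absolute differences, which holds for these data; the variable names are hypothetical.

```python
data = [29, 32, 34, 38, 40, 46, 51, 52, 59, 63, 71, 95]
median0 = 60                       # hypothesised median under H0

diffs = [x - median0 for x in data]

# Rank the absolute differences from smallest (rank 1) to largest.
# Assumes no zeros and no ties among |diff|.
ordered = sorted(abs(d) for d in diffs)
ranks = [ordered.index(abs(d)) + 1 for d in diffs]

t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)    # ranks of positive differences
t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)   # ranks of negative differences

print(ranks)
print(t_plus, t_minus)
```

The smaller sum is the one compared with the lower-tail table value of 13 for n = 12.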

  • 2616 Mark Scheme June 2005

    2616 - Statistics 4

    General Comments

    There were 93 candidates from 20 centres (June 2004: 82 from 20). The overall standard of the scripts seen was pleasing: many candidates were clearly well prepared for this paper. Routine calculations were carried out well, but the candidates' ability to comment and interpret was a little disappointing at this level.

    Question 1 was by far the least popular question, with only about 15 candidates attempting it. Every candidate attempted Question 2; Questions 3 and 4 were equally popular.

    Comments on Individual Questions

    1) Estimation theory

    Although this was the least popular question, it seemed to have the highest mean mark, with most of those attempting it scoring full or almost full marks. Those who were prepared to try it were likely to be successful as long as their algebra was up to the task. Sometimes the algebra arrived at the correct destination by brute force rather than elegance. There were just two places where marks seemed likely to be lost: part (iv), where some neglected to verify that the required value of k did indeed give a minimum, and part (vi), where there was a temptation for some to use the converse argument.

    2) Two sample t test and confidence interval; the strengths of steel rods

    This was the most popular question, being attempted by all candidates. It was also a very high scoring question: about half of the entry scored full or almost full marks.

    (i) The hypotheses were usually stated correctly but there was rather less care in providing verbal definitions of the population means. Similarly, the required assumptions were sometimes less than ideal.

    (ii) Most candidates carried out the test competently. There was rarely any problem over finding and using the pooled variance. The critical value was almost always correct but on a number of occasions the conclusion was badly expressed.

    (iii) As in part (ii), most candidates had little difficulty here. Just occasionally the standard error (which had been correctly constructed in part (ii)) became pooled $s \times \sqrt{\tfrac{1}{17}}$.

    (iv) This part was almost always correct.

  • 2616 Mark Scheme June 2005

    3) Paired sample t test and one-sided confidence interval; comparing fertilizers

    (a) The hypotheses were usually stated correctly but candidates were not as careful about defining the symbol $\mu_D$. Nor were they sufficiently careful when it came to the distributional assumption.

    However, there were only a very few candidates who did not realise that they should carry out a paired test. The vast majority made good progress with the test itself, and only the final conclusion left room for improvement.

    (b) As above, most realised what to do here and the correct value for the lower bound was usually found. A small minority tried to construct the confidence interval using the information from the paired test. There was some uncertainty again with the distributional assumption.

    The main area of difficulty was with the interpretation of the interval. Very many comments revealed a flawed understanding of a confidence interval, to quite a worrying extent.

    4) Wilcoxon rank sum test for the median; Chi-squared test for goodness of fit; waiting times in an airport

    (a) This part of the question was almost always answered well. Many fully correct solutions were seen.

    (b) (i) This part was frequently done correctly.

    (ii) Most candidates calculated a correct value of $X^2$ (with or without grouping) but relatively few were able to identify the correct Chi-squared distribution to look up. Most of those who got this second aspect wrong made no allowance for estimated parameters, while a few thought that there were 200 degrees of freedom. Hardly any commented on the fact that the test statistic was significant at any level available to them in the tables.

    Disappointingly few candidates took the trouble to comment at all on the reasons for the poor quality of fit.

    (iii) In this part of the question very few candidates realised that they could refer back to the previous part for evidence that the assumption of background Normality was not viable. They knew that Normality was required, but often chose to look at the sample data in part (a), sometimes with the aid of a dot plot. Hardly any candidates included in their discussion the small sample size which might prompt the use of a t test.

    No more than a handful of candidates picked up on the fact that a t test examines the population mean whereas the Wilcoxon test in part (a) examined the median.
