STA301 final quiz by Sarfraz

http://vuattach.ning.com/


FINALTERM EXAMINATION

FALL 2006

STA301 - STATISTICS AND PROBABILITY (Session - 1 )

Marks: 50

Time: 120min

StudentID/LoginID:______________________________

Student Name:______________________________

Center Name/Code:______________________________

Exam Date:Tuesday, February 06, 2007

Please read the following instructions carefully before attempting any of the questions:

1 . Attempt all questions. Marks are written adjacent to each question.

2. Do not ask any question about the contents of this examination from anyone.

a. If you think that there is something wrong with any of the questions, attempt it to the best of your understanding.

b. If you believe that some essential piece of information is missing, make an appropriate assumption and use it to solve the problem.

c. Write all steps, missing steps may lead to deduction of marks.

3. You are allowed to use a calculator and statistical tables in order to solve the questions.

4. For your convenience we are providing you the following symbols,

∑, ∩, X̄ or write Mean; s, σ or sd for standard deviation; s², σ² or sd² or variance for variance; log, ∑x, √ for square root or whole square root.

**WARNING: Please note that Virtual University takes serious note of unfair means. Anyone found involved in cheating will get an `F` grade in this course.

For Teacher's use only

Question 1 2 3 4 5 6 7 8 9 10 Total

Marks

Question 11 12 13 14

Marks

Question No: 1 ( Marks: 4 )

Into which two parts is statistics, as a subject, divided? Explain both parts briefly.


Differentiate simple and composite hypothesis.


Correct the following:

$\mu \pm \sigma$ contains approximately 50% area.

$\mu \pm 2\sigma$ contains approximately 90% area.

$\mu \pm 3\sigma$ contains approximately 90.88% area.

Question No: 4 ( Marks: 1 ) - Please choose one


The heights in centimeters of 5 students are:

165, 175, 176, 159, 170.

The sample median and sample mean are respectively:

►

170, 169

►

170, 170

►

169, 170

►

176, 169


The characteristic which can not be measured numerically is called:


►

Quantitative variable

►

Qualitative variable

►

Discrete variable

►

Continuous variable


The expected value of the normal distribution is

►

0

►

1

►

µ

►

σ


Normal distribution is

►

Uni-modal

►

Bi-modal

►

Multi-modal

►

None of these


One sided and two sided critical regions are based on:

►

Level of significance

►

Sample size

►

Null hypothesis

►

Alternative hypothesis


The rule or formula that is used to estimate a population parameter is called:

►

Estimate

►

Estimator

►

Denominator

►

None of these



The probability of rejecting a true null hypothesis is called:

►

Level of significance

►

Type-I error

►

Type-II error

►

None of above


The value of chi-square can never be


►

Zero

►

Negative

►

Greater than 1

►

None of these


The grade-point averages of college seniors selected at random from the graduating class are as follows:

3.2 1.9 2.7 2.4

2.8 2.9 3.8 3.0

2.5 3.3 1.8 2.5

3.7 2.8 2.0 3.2

2.3 2.1 2.5 1.0

Calculate the standard deviation.
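A minimal Python sketch of this calculation, assuming the 20 grade-point averages are read row by row as one-decimal values and that the sample (n - 1) standard deviation is the one wanted; the population version is shown alongside for comparison. The variable names are illustrative only.

```python
import statistics

# Grade-point averages as read from the table (one decimal place each).
gpas = [3.2, 1.9, 2.7, 2.4,
        2.8, 2.9, 3.8, 3.0,
        2.5, 3.3, 1.8, 2.5,
        3.7, 2.8, 2.0, 3.2,
        2.3, 2.1, 2.5, 1.0]

mean_gpa = statistics.mean(gpas)
sample_sd = statistics.stdev(gpas)       # divides by n - 1
population_sd = statistics.pstdev(gpas)  # divides by n

print(f"n = {len(gpas)}, mean = {mean_gpa:.3f}")
print(f"sample sd = {sample_sd:.4f}, population sd = {population_sd:.4f}")
```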


The mean lifetime of electric light bulbs produced by a company has in the past been 1120 hours with a standard deviation of 125 hours. A sample of 8 electric bulbs recently chosen from a supply of newly manufactured bulbs showed a mean lifetime of 1070 hours. Test the hypothesis that the mean lifetime of the bulbs has not changed, using a level of significance of 0.05.
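One way to check the working is the sketch below. It assumes a two-sided Z-test with the historical standard deviation (125 hours) treated as the known population value; that choice of test is an assumption, not something the paper states.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 1120, 125      # hypothesised mean and known standard deviation
n, xbar, alpha = 8, 1070, 0.05

# Test statistic: z = (xbar - mu0) / (sigma / sqrt(n))
z = (xbar - mu0) / (sigma / sqrt(n))

# Two-sided critical value and p-value from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * NormalDist().cdf(-abs(z))
reject_h0 = abs(z) > z_crit

print(f"z = {z:.4f}, critical value = ±{z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```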


A random sample of 200 voters is selected and 120 are found to support an annexation suit. Find the 96% confidence interval for the fraction of the voting population favoring the suit.
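The interval can be sketched as follows, assuming the usual large-sample normal approximation for a proportion, p̂ ± z·sqrt(p̂(1 - p̂)/n). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes, confidence = 200, 120, 0.96

p_hat = successes / n
# Two-sided interval: put (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.2f}, z = {z:.3f}")
print(f"{confidence:.0%} CI: ({lower:.4f}, {upper:.4f})")
```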

FINALTERM EXAMINATION Fall 2009

STA301- Statistics and Probability (Session - 1)

Time: 120 min

Marks: 70

Student Info

StudentID:

Center:

ExamDate: 2/24/2010 12:00:00 AM

For Teacher's Use Only

Q No. 1 2 3 4 5 6 7 8 Total

Marks

Q No. 9 10 11 12 13 14 15 16

Marks

Q No. 17 18 19 20 21 22 23 24

Marks

Q No. 25 26 27 28 29 30 31

Marks

Question No: 1 ( Marks: 1 ) - Please choose one

10! = ………….

► 362880 ► 3628800 ► 362280 ► 362800

Question No: 2 ( Marks: 1 ) - Please choose one

When E is an impossible event, then P(E) is:

► 2 ► 0 ► 0.5 ► 1


The value of χ² can never be:

► Zero ► Less than 1 ► Greater than 1 ► Negative

The curve of the F-distribution depends upon:

► Degrees of freedom ► Sample size ► Mean ► Variance

Question No: 5 ( Marks: 1 ) - Please choose one

If X and Y are random variables, then $E(X - Y)$ is equal to:

► $E(X) + E(Y)$

► $E(X) - E(Y)$

► $X - E(Y)$

► $E(X) - Y$

Question No: 6 ( Marks: 1 ) - Please choose one

In testing hypothesis, we always begin it with assuming that:

► Null hypothesis is true ► Alternative hypothesis is true ► Sample size is large ► Population is normal

Question No: 7 ( Marks: 1 ) - Please choose one

For the Poisson distribution $P(x) = \dfrac{e^{-0.135}\,(0.135)^{1}}{1!}$ the mean value is:

► 2 ► 5 ► 10 ► 0.135


When two coins are tossed simultaneously, P(one head) is:

► 1/4 ► 1/2 ► 3/4 ► 1

Question No: 9 ( Marks: 1 ) - Please choose one

From point estimation, we always get:

► Single value

► Two values

► Range of values

► Zero


The sample variance $S^2 = \dfrac{\sum (x - \bar{x})^2}{n}$ is:

► Unbiased estimator of $\sigma^2$

► Biased estimator of $\sigma^2$

► Unbiased estimator of $\mu$

► None of these


Var(4X + 5) = __________

► 16 Var(X) ► 16 Var(X) + 5 ► 4 Var(X) + 5 ► 12 Var(X)

Question No: 12 ( Marks: 1 ) - Please choose one

When f(x, y) is the bivariate probability density function of continuous r.v.'s X and Y, then

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy$

is equal to:

► 1 ► 0 ► -1 ► ∞

Question No: 13 ( Marks: 1 ) - Please choose one

The area under a normal curve between 0 and -1.75 is:

► .0401 ► .5500 ► .4599 ► .9599

Question No: 14 ( Marks: 1 ) - Please choose one

When a fair die is rolled, the sample space consists of:

► 2 outcomes ► 6 outcomes ► 36 outcomes ► 16 outcomes

Question No: 15 ( Marks: 1 ) - Please choose one

When testing for independence in a contingency table with 3 rows and 4 columns, there are ________ degrees of freedom.

► 5 ► 6 ► 7 ► 12

Question No: 16 ( Marks: 1 ) - Please choose one

The F-test statistic in one-way ANOVA is:

► SSW / SSE ► MSW / MSE ► SSE / SSW ► MSE / MSW

Question No: 17 ( Marks: 1 ) - Please choose one

The continuity correction factor is used when:

► The sample size is at least 5

► Both nP and n (1-P) are at least 30

► A continuous distribution is used to approximate a discrete distribution

► The standard normal distribution is applied


A uniform distribution is defined by:

► Its largest and smallest value ► Smallest value ► Largest value ► Mid value

Question No: 19 ( Marks: 1 ) - Please choose one

Which graph is made by plotting the midpoints and frequencies?

► Frequency polygon

► Ogive

► Histogram

► Frequency curve



In a set of 20 values all the values are 10. What is the value of the median?

► 2 ► 5 ► 10 ► 20

Question No: 21 ( Marks: 1 )

If $P(X = 0) = \frac{1}{8}$, $P(X = 1) = \frac{3}{8}$, $P(X = 2) = \frac{3}{8}$ and $P(X = 3) = \frac{1}{8}$, then find F(1).
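A short sketch of how F(1) follows from the probabilities, taking F(x) = P(X ≤ x) as the cumulative distribution function. The helper name `cdf` is an illustrative choice.

```python
from fractions import Fraction

# Probability mass function given in the question.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x):
    """F(x) = P(X <= x): add up the probabilities of all values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

print("F(1) =", cdf(1))
```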


Question No: 22 ( Marks: 2 )

Write down the formula of mathematical expectation.

e=(w * p) + (-v *1). e

Question No: 23 ( Marks: 3 )

Discuss the statistical independence of two discrete random variables.

Question No: 24 ( Marks: 3 )

For the given data, calculate the mean and standard deviation of the sampling distribution of the mean if the sampling is done without replacement.

$N = 1000,\ n = 25,\ \mu = 68.5,\ \sigma = 2.7$
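A sketch of the calculation, assuming the standard finite population correction sqrt((N - n)/(N - 1)) is what is intended for sampling without replacement.

```python
from math import sqrt

N, n, mu, sigma = 1000, 25, 68.5, 2.7

# The mean of the sampling distribution of the mean equals the population mean.
mean_xbar = mu

# Without replacement, the standard error carries the finite population correction.
fpc = sqrt((N - n) / (N - 1))
sd_xbar = (sigma / sqrt(n)) * fpc

print(f"mean of x-bar = {mean_xbar}")
print(f"standard deviation of x-bar = {sd_xbar:.4f} (correction factor {fpc:.4f})")
```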


Elaborate the Least Significant Difference (LSD) Test.

Question No: 26 ( Marks: 3 )

State the Bayes' Theorem.

Question No: 27 ( Marks: 5 )

The means and variances of the weekly incomes in rupees of two samples of workers are given in the following table, the samples being randomly drawn from two different factories:

Factory   Sample Size   Mean    Variance
A         160           12.80   64
B         220           11.25   47

Calculate the 90% confidence interval for the real difference in the incomes of the workers from the two factories.

From the given data $n = 1340$, $x = 723$, $p = 0.54$ and $H_0: P_0 = 0.5$ against $H_1: P_0 \neq 0.5$, carry out the significance test for the stated hypothesis.

Question No: 29 ( Marks: 5 )

Given the probability density function

$f(x) = \begin{cases} \dfrac{x}{2}, & \text{for } 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$

compute the distribution function F(x).
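As a check on the distribution function, the sketch below assumes the density is f(x) = x/2 on 0 ≤ x ≤ 2 and 0 elsewhere, which integrates to F(x) = x²/4 on that interval. It compares the closed form with a simple midpoint-rule integration; the function names are illustrative.

```python
def pdf(x):
    """Assumed density: f(x) = x/2 on [0, 2], 0 elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0

def cdf(x):
    """Closed form: F(x) = 0 for x < 0, x^2/4 on [0, 2], 1 for x > 2."""
    if x < 0:
        return 0.0
    if x <= 2:
        return x * x / 4
    return 1.0

def numeric_cdf(x, steps=1000):
    """Midpoint-rule integral of the density from 0 up to x (clipped to [0, 2])."""
    upper = min(max(x, 0.0), 2.0)
    h = upper / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h

for x in (-1, 0.5, 1, 1.3, 2, 3):
    print(x, cdf(x), round(numeric_cdf(x), 6))
```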


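Question No: 30 ( Marks: 10 )

$f(x, y) = \begin{cases} \dfrac{1}{8}(6 - x - y), & 0 \le x \le 2;\ 2 \le y \le 4 \\ 0, & \text{elsewhere} \end{cases}$

a) Verify that f(x,y) is a joint density function.

b) Calculate $P\left(X \le \dfrac{3}{2},\ Y \le \dfrac{5}{2}\right)$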
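Both parts can be checked numerically. The sketch below assumes the density is f(x, y) = (6 - x - y)/8 on 0 ≤ x ≤ 2, 2 ≤ y ≤ 4 and that part (b) asks for P(X ≤ 3/2, Y ≤ 5/2); it uses a plain midpoint rule, which is exact (up to rounding) for a function that is linear in x and y.

```python
def f(x, y):
    """Assumed joint density: (6 - x - y)/8 on 0<=x<=2, 2<=y<=4, else 0."""
    if 0 <= x <= 2 and 2 <= y <= 4:
        return (6 - x - y) / 8
    return 0.0

def integrate(x_lo, x_hi, y_lo, y_hi, steps=200):
    """Two-dimensional midpoint rule over the rectangle [x_lo, x_hi] x [y_lo, y_hi]."""
    hx = (x_hi - x_lo) / steps
    hy = (y_hi - y_lo) / steps
    total = 0.0
    for i in range(steps):
        x = x_lo + (i + 0.5) * hx
        for j in range(steps):
            y = y_lo + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

# (a) f is non-negative on its support (smallest value is at x = 2, y = 4)
#     and the total probability should be 1.
smallest_value = f(2, 4)
total_mass = integrate(0, 2, 2, 4)

# (b) P(X <= 3/2, Y <= 5/2): integrate over 0<=x<=1.5, 2<=y<=2.5.
prob = integrate(0, 1.5, 2, 2.5)

print(f"min f = {smallest_value}, total = {total_mass:.6f}, P = {prob:.6f}")
```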


Let $X_1, X_2, X_3$ be a random sample of size 3 from a population with mean $\mu$ and variance $\sigma^2$.

Consider the following two estimators of the mean:

$T_1 = \dfrac{X_1 + X_2 + X_3}{3}$

$T_2 = \dfrac{X_1 + 2X_2 + X_3}{4}$

Which estimator should be preferred?
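A sketch of the comparison, assuming the two estimators are T1 = (X1 + X2 + X3)/3 and T2 = (X1 + 2X2 + X3)/4 and that the observations are independent. For a linear estimator with weights w_i, E(T) = (Σ w_i)·µ and Var(T) = (Σ w_i²)·σ², so only the weights are needed.

```python
from fractions import Fraction

def summarize(weights):
    """Return (multiplier of mu in E(T), multiplier of sigma^2 in Var(T))."""
    mean_factor = sum(weights)
    var_factor = sum(w * w for w in weights)
    return mean_factor, var_factor

t1_weights = [Fraction(1, 3)] * 3
t2_weights = [Fraction(1, 4), Fraction(2, 4), Fraction(1, 4)]  # assumed form of T2

t1_mean, t1_var = summarize(t1_weights)
t2_mean, t2_var = summarize(t2_weights)

print(f"T1: E = {t1_mean}*mu, Var = {t1_var}*sigma^2")
print(f"T2: E = {t2_mean}*mu, Var = {t2_var}*sigma^2")
print("Smaller variance:", "T1" if t1_var < t2_var else "T2")
```

Under these assumptions both estimators are unbiased, so the one with the smaller variance would be the more efficient choice.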




Stat301 final term papers


For a particular data set, the value of Pearson's coefficient of skewness is greater than zero. What will be the shape of the distribution?

► Negatively skewed

► J-shaped

► Symmetrical

► Positively skewed


In measures of relative dispersion unit of measurement is:

► Changed ► Vanish ► Does not change ► Dependent


The F-distribution always ranges from:

► 0 to 1 ► 0 to -∞ ► -∞ to +∞ ► 0 to +∞

Question No: 4 ( Marks: 1 ) - Please choose one

In chi-square test of independence the degrees of freedom are:

► n - p ► n - p - 1 ► n - p - 2 ► n - 2


The Chi-Square distribution is a continuous distribution ranging from:

► -∞ ≤ χ² ≤ ∞ ► -∞ ≤ χ² ≤ 1 ► -∞ ≤ χ² ≤ 0 ► 0 ≤ χ² ≤ ∞

Question No: 6 ( Marks: 1 ) - Please choose one

If X and Y


► ( )) (E X E Y+

► ( )) (E X E Y−

► ( )X E Y−

► ( )E X Y− answr


If ŷ is the predicted value for a given x-value and b is the y-intercept, then the equation of a regression line for an independent variable x and a dependent variable y is:

► ŷ = mx + b, where m = slope ► x = ŷ + mb, where m = slope ► ŷ = x/m + b, where m = slope ► ŷ = x + mb, where m = slope

Question No: 8 ( Marks: 1 ) - Please choose one

The location of the critical region depends upon:

► Null hypothesis ► Alternative hypothesis ► Value of alpha ► Value of test-statistic

Question No: 9 ( Marks: 1 ) - Please choose one

The variance of the t-distribution is given by the formula:

► 22

−=

ννσ

► 2

22

−=

ννσ


► 12

−=

ννσ

► 22

−=

ννσ


Which one

is the correct formula for finding desired sample size?

►

2

2.Z

ne

α σ =

►

2

2.Z

ne

α σ =

►

2

2.Z X

ne

α =

► 2.Z

ne

α σ=


A discrete probability function f(x) is always:

► Non-negative ► Negative ► One ► Zero

Question No: 12 ( Marks: 1 ) - Please choose one

E(4X + 5) = __________

► 12 E(X) ► 4 E(X) + 5 ► 16 E(X) + 5 ► 16 E(X)

Question No: 13 ( Marks: 1 ) - Please choose one

How can P(X + Y < 1) be found?

► f(0, 0) + f(0, 1) + f(1, 2) ► f(2, 0) + f(0, 1) + f(1, 0) ► f(0, 0) + f(1, 1) + f(1, 0) ► f(0, 0) + f(0, 1) + f(1, 0)

Question No: 14 ( Marks: 1 ) - Please choose one

The $f(x \mid 1) =$ __________:

► $f(1, 1)$

► $f(x, 1)$

► $\dfrac{f(x, 1)}{h(1)}$

► $\dfrac{f(x, 1)}{h(x)}$


The area


► .0401

► .5500


In normal distribution M.D. =

► 0.5σ ► 0.75σ ► 0.7979σ ► 0.6445σ

Question No: 17 ( Marks: 1 ) - Please choose one

In an ANOVA test there are 5 observations in each of three treatments. The degrees of freedom in the numerator and denominator respectively are:

► 2, 4 ► 3, 15 ► 3, 12 ► 2, 12

Question No: 18 ( Marks: 1 ) - Please choose one

A set that contains all possible outcomes of a system is known as:

► Finite Set ► Infinite Set ► Universal Set ► None of these

Question No: 19 ( Marks: 1 ) - Please choose one

Stem and leaf is more informative when data is:

► Equal to 100 ► Greater than 100 ► Less than 100 ► In all situations

A population that can be defined as the aggregate of all the conceivable ways in which a specified event can happen is known as:

► Infinite population ► Finite population ► Concrete population ► Hypothetical population

Question No: 21 ( Marks: 1 )

If $E(T) \neq \theta$, what do you say about the estimator T, where θ is a parameter?


What is a binomial experiment?

Formulate the null and alternative hypothesis in each of the following.

(1) Average domestic consumption of electricity is 50 units per month.

(2) Not more than 30% people pay Zakat (tax).

What is the mathematical expectation of a discrete random variable?

Question No: 25 ( Marks: 3 )

Why we prefer to use pooled estimator $\hat{p}_c$

$f(x) = \begin{cases} \dfrac{x}{2}, & \text{for } 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$



Differentiate between grouped and ungrouped data.


A population 2, 4, 6, 8, 10, 12

N = 6, n = 2

After drawing all possible samples, we have calculated the sampling mean $\mu_{\bar{x}} = 7$ and the sampling variance $\sigma^2_{\bar{x}} = 5.833$. Verify

a) $\mu_{\bar{x}} = \mu$, b) $\sigma^2_{\bar{x}} = \dfrac{\sigma^2}{n}$
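The verification can be sketched by enumerating every sample. The code assumes sampling with replacement (36 ordered samples of size 2), since that is the scheme under which the variance of the sample mean equals σ²/n without a correction factor; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8, 10, 12]
n = 2
N = len(population)

# Population mean and variance.
mu = Fraction(sum(population), N)
sigma2 = sum((Fraction(x) - mu) ** 2 for x in population) / N

# All ordered samples of size n drawn with replacement, and their means.
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

print(f"mu = {mu}, mean of sample means = {mu_xbar}")
print(f"sigma^2 / n = {float(sigma2 / n):.3f}, variance of sample means = {float(var_xbar):.3f}")
```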


A random sample of size n is drawn from a normal population with mean 5 and variance $\sigma^2$. Answer the following:

If s = 15, $\bar{x} = 14$ and t = 3, what is the value of n?
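A small sketch of the algebra, assuming the one-sample statistic t = (x̄ - µ)/(s/√n), which rearranges to n = (t·s/(x̄ - µ))².

```python
mu, s, xbar, t = 5, 15, 14, 3

# From t = (xbar - mu) / (s / sqrt(n)):  sqrt(n) = t * s / (xbar - mu)
sqrt_n = t * s / (xbar - mu)
n = sqrt_n ** 2

print(f"sqrt(n) = {sqrt_n}, n = {n}")
```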


Given the




An urn contains nine balls; five of them are red and four blue. Three balls are drawn without replacement. Find the distribution of X = number of red balls drawn.
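Since the balls are drawn without replacement, X can be treated as hypergeometric. A sketch using P(X = k) = C(5, k)·C(4, 3 - k)/C(9, 3):

```python
from fractions import Fraction
from math import comb

red, blue, draws = 5, 4, 3
total_ways = comb(red + blue, draws)

# P(X = k): choose k red balls and (draws - k) blue balls.
distribution = {
    k: Fraction(comb(red, k) * comb(blue, draws - k), total_ways)
    for k in range(draws + 1)
}

for k, p in distribution.items():
    print(f"P(X = {k}) = {p}")
```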


A research worker wishes to estimate the mean of a population using a sample sufficiently large that the probability will be 0.95 that the sample mean will not differ from the true mean by more than 25 percent of the standard deviation. How large a sample should be taken?
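A sketch of the sample-size calculation, assuming the usual formula n = (z·σ/e)² with the allowed error e = 0.25σ, so that σ cancels; the result is rounded up to the next whole observation.

```python
from math import ceil
from statistics import NormalDist

confidence = 0.95
error_in_sd_units = 0.25  # allowed error as a fraction of sigma

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
n_exact = (z / error_in_sd_units) ** 2
n = ceil(n_exact)

print(f"z = {z:.3f}, n = {n_exact:.2f}, rounded up to {n}")
```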



Stat final information

Total questions: 31. 21 were MCQs and 10 were subjective questions. 2 were of 10 marks each, 2 were of 5 marks each, 4 were of 3 marks each (these questions were about properties, and 1 was about the confidence interval), and 2 were of 1 mark each (these questions were only about definitions).

1) 1 question from confidence interval; the question was of 3 marks: find the confidence interval for the difference between two population means µ1, µ2. This question came straight from the handouts; I think it was from lecture no. 35.

2) 1 question from hypothesis testing (Z-test), marks 10.

3) 2 questions were about properties: one was to write the properties of the binomial distribution, and the other was, what is a good point estimator?

4) 1 question was from lecture no. 23; this was of 3 marks. On page no. 172, the 1st example was exactly the same: find the F(x) of {1, 2} x, and f(x) was given.

Definitions of estimate and estimator. X is a Poisson random variable with µ = 2; find P(X = 0), P(X = 1), P(X = 2).
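For the Poisson item, a quick sketch assuming the question asks for P(X = 0), P(X = 1) and P(X = 2) with mean 2, using P(X = k) = e^(-µ)·µ^k / k!:

```python
from math import exp, factorial

mu = 2

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return exp(-mean) * mean ** k / factorial(k)

probs = {k: poisson_pmf(k, mu) for k in range(3)}
for k, p in probs.items():
    print(f"P(X = {k}) = {p:.4f}")
```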

Q: One was on the joint probability distribution; a very easy table was given: find P(X = 0 | Y = 1).

Q: One was on the hypergeometric distribution.

Q: One was on the confidence interval level.

And the rest were small things, like why we use the t-value, and how s^2 is approximately equal to S^2.

STA 301 All Definitions

(Muhammad Rashid Chishti)

• Statistics - a set of concepts, rules, and procedures that help us to:

o organize numerical information in the form of tables, graphs, and charts;

o understand statistical techniques underlying decisions that affect our lives and well-being; and

o make informed decisions.

• Data - facts, observations, and information that come from investigations.

o Measurement data sometimes called quantitative data -- the result of using some instrument to measure something (e.g., test score, weight);

o Categorical data also referred to as frequency or qualitative data. Things are grouped according to some common property(ies) and the number of members of the group are recorded (e.g., males/females, vehicle type).

• Variable - property of an object or event that can take on different values. For example, college major is a variable that takes on values like mathematics, computer science, English, psychology, etc.

o Discrete Variable - a variable with a limited number of values (e.g., gender (male/female), college class (freshman/sophomore/junior/senior)).

o Continuous Variable - a variable that can take on many different values, in theory, any value between the lowest and highest points on the measurement scale.

o Independent Variable - a variable that is manipulated, measured, or selected by the researcher as an antecedent condition to an observed behavior. In a hypothesized cause-and-effect relationship, the independent variable is the cause and the dependent variable is the outcome or effect.

o Dependent Variable - a variable that is not under the experimenter's control -- the data. It is the variable that is observed and measured in response to the independent variable.

o Qualitative Variable - a variable based on categorical data.

o Quantitative Variable - a variable based on quantitative data.

• Graphs - visual display of data used to present frequency distributions so that the shape of the distribution can easily be seen.

o Bar graph - a form of graph that uses bars separated by an arbitrary amount of space to represent how often elements within a category occur. The higher the bar, the higher the frequency of occurrence. The underlying measurement scale is discrete (nominal or ordinal-scale data), not continuous.

o Histogram - a form of a bar graph used with interval or ratio-scaled data. Unlike the bar graph, bars in a histogram touch with the width of the bars defined by the upper and lower limits of the interval. The measurement scale is continuous, so the lower limit of any one interval is also the upper limit of the previous interval.

o Boxplot - a graphical representation of dispersions and extreme scores. Represented in this graphic are minimum, maximum, and quartile scores in the form of a box with "whiskers." The box includes the range of scores falling into the middle 50% of the distribution (InterQuartile Range = 75th percentile - 25th percentile) and the whiskers are lines extended to the minimum and maximum scores in the distribution or to mathematically defined (+/-1.5*IQR) upper and lower fences.

o Scatterplot - a form of graph that presents information from a bivariate distribution. In a scatterplot, each subject in an experimental study is represented by a single point in two-dimensional space. The underlying scale of measurement for both variables is continuous (measurement data). This is one of the most useful techniques for gaining insight into the relationship between two variables.

• Measures of Center - Plotting data in a frequency distribution shows the general shape of the distribution and gives a general sense of how the numbers are bunched. Several statistics can be used to represent the "center" of the distribution. These statistics are commonly referred to as measures of central tendency .

o Mode - The mode of a distribution is simply defined as the most frequent or common score in the distribution. The mode is the point or value of X that corresponds to the highest point on the distribution. If the highest frequency is shared by more than one value, the distribution is said to be multimodal. It is not uncommon to see distributions that are bimodal, reflecting peaks in scoring at two different points in the distribution.

o Median - The median is the score that divides the distribution into halves; half of the scores are above the median and half are below it when the data are arranged in numerical order. The median is also referred to as the score at the 50th percentile in the distribution. The median location of N numbers can be found by the formula (N + 1) / 2. When N is an odd number, the formula yields an integer that represents the value in a numerically ordered distribution corresponding to the median location. For example, in the distribution of numbers (3 1 5 4 9 9 8) the median location is (7 + 1) / 2 = 4. When applied to the ordered distribution (1 3 4 5 8 9 9), the value 5 is the median; three scores are above 5 and three are below 5. If there were only 6 values (1 3 4 5 8 9), the median location is (6 + 1) / 2 = 3.5. In this case the median is half-way between the 3rd and 4th scores (4 and 5), or 4.5.

o Mean - The mean is the most common measure of central tendency and the one that can be mathematically manipulated. It is defined as the average of a distribution and is equal to Σ X / N. Simply, the mean is computed by summing all the scores in the distribution (Σ X) and dividing that sum by the total number of scores (N). The mean is the balance point in a distribution such that if you subtract the mean from each value in the distribution and sum all of these deviation scores, the result will be zero.
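The measures of center described above can be sketched in Python using the same example scores. The helper `median_by_location` is a hypothetical name that follows the (N + 1) / 2 location rule; `statistics.median` and `statistics.mode` are shown for comparison.

```python
import math
import statistics

def median_by_location(values):
    """Median via the (N + 1) / 2 location rule on the ordered scores."""
    ordered = sorted(values)
    location = (len(ordered) + 1) / 2          # 1-based position, may end in .5
    lower = ordered[math.floor(location) - 1]  # score at or just below the location
    upper = ordered[math.ceil(location) - 1]   # score at or just above the location
    return (lower + upper) / 2

odd_scores = [3, 1, 5, 4, 9, 9, 8]
even_scores = [1, 3, 4, 5, 8, 9]

print(median_by_location(odd_scores), statistics.median(odd_scores))
print(median_by_location(even_scores), statistics.median(even_scores))
print("mode:", statistics.mode(odd_scores))

# The mean is the balance point: deviations from it sum to zero.
mean_score = statistics.mean(odd_scores)
deviation_sum = sum(x - mean_score for x in odd_scores)
print(f"mean = {mean_score:.4f}, sum of deviations = {deviation_sum:.10f}")
```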

• Measures of Spread - Although the average value in a distribution is informative about how scores are centered in the distribution, the mean, median, and mode lack context for interpreting those statistics. Measures of variability provide information about the degree to which individual scores are clustered about or deviate from the average value in a distribution.

o Range - The simplest measure of variability to compute and understand is the range. The range is the difference between the highest and lowest score in a distribution. Although it is easy to compute, it is not often used as the sole measure of variability due to its instability. Because it is based solely on the most extreme scores in the distribution and does not fully reflect the pattern of variation within a distribution, the range is a very limited measure of variability.

o Interquartile Range (IQR) - Provides a measure of the spread of the middle 50% of the scores. The IQR is defined as the 75th percentile - the 25th percentile. The interquartile range plays an important role in the graphical method known as the boxplot. The advantage of using the IQR is that it is easy to compute and extreme scores in the distribution have much less impact, but its strength is also a weakness in that it suffers as a measure of variability because it discards too much data. Researchers want to study variability while eliminating scores that are likely to be accidents. The boxplot allows for this distinction and is an important tool for exploring data.

o Variance - The variance is a measure based on the deviations of individual scores from the mean. As noted in the definition of the mean, however, simply summing the deviations will result in a value of 0. To get around this problem the variance is based on squared deviations of scores about the mean. When the deviations are squared, the rank order and relative distance of scores in the distribution is preserved while negative values are eliminated. Then to control for the number of subjects in the distribution, the sum of the squared deviations, Σ (X - X̄)², is divided by N (population) or by N - 1 (sample). The result is the average of the squared deviations and it is called the variance.

o Standard deviation - The standard deviation ( s or σ ) is defined as the positive square root of the variance. The variance is a measure in squared units and has little meaning with respect to the data. Thus, the standard deviation is a measure of variability expressed in the same units as the data. The standard deviation is very much like a mean or an "average" of these deviations. In a normal (symmetric and mound-shaped) distribution, about two-thirds of the scores fall between +1 and -1 standard deviations from the mean and the standard deviation is approximately 1/4 of the range in small samples ( N < 30) and 1/5 to 1/6 of the range in large samples ( N > 100).
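A short sketch of the measures of spread, reusing the example scores from the median definition. Note that quartile conventions differ between textbooks and software; `statistics.quantiles` with its default method is only one such convention, so the IQR it gives may not match a hand calculation exactly.

```python
import math
import statistics

scores = [1, 3, 4, 5, 8, 9, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Interquartile range: 75th percentile minus 25th percentile.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Variance: sum of squared deviations divided by N (population) or N - 1 (sample).
population_variance = statistics.pvariance(scores)
sample_variance = statistics.variance(scores)

# Standard deviation: positive square root of the variance.
sample_sd = statistics.stdev(scores)

print(f"range = {score_range}, IQR = {iqr}")
print(f"population variance = {population_variance:.4f}")
print(f"sample variance = {sample_variance:.4f}, sample sd = {sample_sd:.4f}")
```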

• Measures of Shape - For distributions summarizing data from continuous measurement scales, statistics can be used to describe how the distribution rises and drops.

o Symmetric - Distributions that have the same shape on both sides of the center are called symmetric. The normal distribution is the best-known example of a symmetric distribution with only one peak.

o Skewness - Refers to the degree of asymmetry in a distribution. Asymmetry often reflects extreme scores in a distribution.

Positively skewed - A distribution is positively skewed when it has a tail extending out to the right (larger numbers). When a distribution is positively skewed, the mean is greater than the median, reflecting the fact that the mean is sensitive to each score in the distribution and is subject to large shifts when the sample is small and contains extreme scores.

Negatively skewed - A negatively skewed distribution has an extended tail pointing to the left (smaller numbers) and reflects bunching of numbers in the upper part of the distribution with fewer scores at the lower end of the measurement scale.

o Kurtosis - Like skewness, kurtosis has a specific mathematical definition, but generally it refers to how scores are concentrated in the center of the distribution, the upper and lower tails (ends), and the shoulders (between the center and tails) of a distribution.

Mesokurtic - A normal distribution is called mesokurtic. The tails of a mesokurtic distribution are neither too thin nor too thick, and there are neither too many nor too few scores in the center of the distribution.

Platykurtic - Starting with a mesokurtic distribution and moving scores from both the center and tails into the shoulders, the distribution flattens out and is referred to as platykurtic.

Leptokurtic - If you move scores from the shoulders of a mesokurtic distribution into the center and tails of a distribution, the result is a peaked distribution with thick tails. This shape is referred to as leptokurtic.

Discrete Data

A set of data is said to be discrete if the values / observations belonging to it are distinct and separate, i.e. they can be counted (1,2,3,....). Examples might include the number of kittens in a litter; the number of patients in a doctor's surgery; the number of flaws in one metre of cloth; gender (male, female); blood group (O, A, B, AB).

Compare continuous data .

Categorical Data

A set of data is said to be categorical if the values or observations belonging to it can be sorted according to category. Each value is chosen from a set of non-overlapping categories. For example, shoes in a cupboard can be sorted according to colour: the characteristic 'colour' can have non-overlapping categories 'black', 'brown', 'red' and 'other'. People have the characteristic of 'gender' with categories 'male' and 'female'.

Categories should be chosen carefully since a bad choice can prejudice the outcome of an investigation. Every value should belong to one and only one category, and there should be no doubt as to which one.

Nominal Data

A set of data is said to be nominal if the values / observations belonging to it can be assigned a code in the form of a number where the numbers are simply labels. You can count but not order or measure nominal data. For example, in a data set males could be coded as 0, females as 1; marital status of an individual could be coded as Y if married, N if single.

Ordinal Data

A set of data is said to be ordinal if the values / observations belonging to it can be ranked (put in order) or have a rating scale attached. You can count and order, but not measure, ordinal data.

The categories for an ordinal set of data have a natural order, for example, suppose a group of people were asked to taste varieties of biscuit and classify each biscuit on a rating scale of 1 to 5, representing strongly dislike, dislike, neutral, like, strongly like. A rating of 5 indicates more enjoyment than a rating of 4, for example, so such data are ordinal.

However, the distinction between neighbouring points on the scale is not necessarily always the same. For instance, the difference in enjoyment expressed by giving a rating of 2 rather than 1 might be much less than the difference in enjoyment expressed by giving a rating of 4 rather than 3.

Interval Scale

An interval scale is a scale of measurement where the distance between any two adjacent units of measurement (or 'intervals') is the same but the zero point is arbitrary. Scores on an interval scale can be added and subtracted but cannot be meaningfully multiplied or divided. For example, the time interval between the starts of years 1981 and 1982 is the same as that between 1983 and 1984, namely 365 days. The zero point, year 1 AD, is arbitrary; time did not begin then. Other examples of interval scales include the heights of tides, and the measurement of longitude.

Continuous Data

A set of data is said to be continuous if the values / observations belonging to it may take on any value within a finite or infinite interval. You can count, order and measure continuous data. For example height, weight, temperature, the amount of sugar in an orange, the time required to run a mile.

Compare discrete data.

Frequency Table

A frequency table is a way of summarising a set of data. It is a record of how often each value (or set of values) of the variable in question occurs. It may be enhanced by the addition of percentages that fall into each category.

A frequency table is used to summarise categorical, nominal, and ordinal data. It may also be used to summarise continuous data once the data set has been divided up into sensible groups.

When we have more than one categorical variable in our data set, a frequency table is sometimes called a contingency table because the figures found in the rows are contingent upon (dependent upon) those found in the columns.

Example Suppose that in thirty shots at a target, a marksman makes the following scores:

5 2 2 3 4 4 3 2 0 3 0 3 2 1 5 1 3 1 5 5 2 4 0 0 4 5 4 4 5 5

The frequencies of the different scores can be summarised as:

Score   Frequency   Frequency (%)
0       4           13%
1       3           10%
2       5           17%
3       5           17%
4       6           20%
5       7           23%
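
As a rough illustration, the frequency table for the marksman example can be rebuilt in a few lines of Python. This is only a sketch: it assumes the data row holds thirty single-digit scores, and the helper name frequency_table is made up for this example.

```python
from collections import Counter

# Scores from the thirty-shot example (assumed to be thirty single-digit values).
scores = [5, 2, 2, 3, 4, 4, 3, 2, 0, 3, 0, 3, 2, 1, 5,
          1, 3, 1, 5, 5, 2, 4, 0, 0, 4, 5, 4, 4, 5, 5]


def frequency_table(values):
    """Return (value, count, rounded percent) rows sorted by value."""
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], round(100 * counts[v] / total)) for v in sorted(counts)]


table = frequency_table(scores)
for value, count, percent in table:
    print(f"{value}  {count}  {percent}%")
```

The percentages are rounded to whole numbers, which is why they need not sum to exactly 100.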

Pie Chart

A pie chart is a way of summarising a set of categorical data. It is a circle which is divided into segments. Each segment represents a particular category. The area of each segment is proportional to the number of cases in that category.

Example

Suppose that, in the last year, a sportswear manufacturer has spent 6.5 million pounds on advertising its products: 3 million has been spent on television adverts, 2 million on sponsorship, 1 million on newspaper adverts, and half a million on posters. This spending can be summarised using a pie chart.

Bar Chart

A bar chart is a way of summarising a set of categorical data. It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient form. It displays the data using a number of rectangles, of the same width, each of which represents a particular category. The length (and hence area) of each rectangle is proportional to the number of cases in the category it represents, for example, age group, religious affiliation.

Bar charts are used to summarise nominal or ordinal data.

Bar charts can be displayed horizontally or vertically and they are usually drawn with a gap between the bars (rectangles), whereas the bars of a histogram are drawn immediately next to each other.

Dot Plot

A dot plot is a way of summarising data, often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient form.

For nominal or ordinal data, a dot plot is similar to a bar chart, with the bars replaced by a series of dots. Each dot represents a fixed number of individuals. For continuous data, the dot plot is similar to a histogram, with the rectangles replaced by dots.

A dot plot can also help detect any unusual observations (outliers), or any gaps in the data set.

Histogram

A histogram is a way of summarising data that are measured on an interval scale (either discrete or continuous). It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient form. It divides up the range of possible values in a data set into classes or groups. For each group, a rectangle is constructed with a base length equal to the range of values in that specific group, and an area proportional to the number of observations falling into that group. This means that the rectangles might be drawn of non-uniform height.

The histogram is only appropriate for variables whose values are numerical and measured on an interval scale. It is generally used when dealing with large data sets (>100 observations), when stem and leaf plots become tedious to construct. A histogram can also help detect any unusual observations (outliers), or any gaps in the data set.

Compare bar chart.

Stem and Leaf Plot

A stem and leaf plot is a way of summarising a set of data measured on an interval scale. It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient and easily drawn form.

A stem and leaf plot is similar to a histogram but is usually a more informative display for relatively small data sets (<100 data points). It provides a table as well as a picture of the data and from it we can readily write down the data in order of magnitude, which is useful for many statistical procedures, e.g. in the skinfold thickness example.

We can compare more than one data set by the use of multiple stem and leaf plots. By using a back-to-back stem and leaf plot, we are able to compare the same characteristic in two different groups, for example, pulse rate after exercise of smokers and non-smokers.

Box and Whisker Plot (or Boxplot)

A box and whisker plot is a way of summarising a set of data measured on an interval scale. It is often used in exploratory data analysis. It is a type of graph which is used to show the shape of the distribution, its central value, and variability. The picture produced consists of the most extreme values in the data set (maximum and minimum values), the lower and upper quartiles, and the median.

A box plot (as it is often called) is especially helpful for indicating whether a distribution is skewed and whether there are any unusual observations (outliers) in the data set.

Box and whisker plots are also very useful when large numbers of observations are involved and when two or more data sets are being compared.

See also 5-Number Summary.

5-Number Summary

A 5-number summary is especially useful when we have so many data that it is sufficient to present a summary of the data rather than the whole data set. It consists of 5 values: the most extreme values in the data set (maximum and minimum values), the lower and upper quartiles, and the median.

A 5-number summary can be represented in a diagram known as a box and whisker plot. In cases where we have more than one data set to analyse, a 5-number summary is constructed for each, with corresponding multiple box and whisker plots.

Outlier

An outlier is an observation in a data set which is far removed in value from the others in the data set. It is an unusually large or an unusually small value compared to the others.

An outlier might be the result of an error in measurement, in which case it will distort the interpretation of the data, having undue influence on many summary statistics, for example, the mean.

If an outlier is a genuine result, it is important because it might indicate an extreme of behaviour of the process under study. For this reason, all outliers must be examined carefully before embarking on any formal analysis. Outliers should not routinely be removed without further justification.

Symmetry

Symmetry is implied when data values are distributed in the same way above and below the middle of the sample.

Symmetrical data sets:

a. are easily interpreted;

b. allow a balanced attitude to outliers, that is, those above and below the middle value (median) can be considered by the same criteria;

c. allow comparisons of spread or dispersion with similar data sets.

Many standard statistical techniques are appropriate only for a symmetric distributional form. For this reason, attempts are often made to transform skewed data so that they become roughly symmetric.

Skewness

Skewness is defined as asymmetry in the distribution of the sample data values. Values on one side of the distribution tend to be further from the 'middle' than values on the other side.

For skewed data, the usual measures of location will give different values, for example, mode<median<mean would indicate positive (or right) skewness.

Positive (or right) skewness is more common than negative (or left) skewness.

If there is evidence of skewness in the data, we can apply transformations, for example, taking logarithms of positive skew data.

Compare symmetry.

Transformation to Normality

If there is evidence of marked non-normality then we may be able to remedy this by applying suitable transformations.

The more commonly used transformations which are appropriate for data which are skewed to the right (positive skew) are, in order of increasing strength, sqrt(x), log(x) and 1/x, where the x's are the data values.

The more commonly used transformations which are appropriate for data which are skewed to the left with increasing strength (negative skew) are squaring, cubing, and exp(x).
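
The effect of such transformations on positive skew can be checked numerically. The sketch below is only an illustration: the sample raw is made up, and skewness is measured with the moment coefficient m3 / m2^1.5, which is one of several possible definitions.

```python
import math


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample, invented purely for illustration.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]

skew_raw = skewness(raw)
skew_sqrt = skewness([math.sqrt(v) for v in raw])
skew_log = skewness([math.log(v) for v in raw])

print(round(skew_raw, 2), round(skew_sqrt, 2), round(skew_log, 2))
```

For this sample the square root reduces the skewness and the logarithm reduces it further, which is consistent with the logarithm being the stronger of the two transformations.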

Scatter Plot

A scatterplot is a useful summary of a set of bivariate data (two variables), usually drawn before working out a linear correlation coefficient or fitting a regression line. It gives a good visual picture of the relationship between the two variables, and aids the interpretation of the correlation coefficient or regression model.

Each unit contributes one point to the scatterplot, on which points are plotted but not joined. The resulting pattern indicates the type and strength of the relationship between the two variables.



Illustrations

a. The more the points tend to cluster around a straight line, the stronger the linear relationship between the two variables (the higher the correlation).

b. If the line around which the points tend to cluster runs from lower left to upper right, the relationship between the two variables is positive (direct).

c. If the line around which the points tend to cluster runs from upper left to lower right, the relationship between the two variables is negative (inverse).

d. If there exists a random scatter of points, there is no relationship between the two variables (very low or zero correlation).

e. Very low or zero correlation could result from a non-linear relationship between the variables. If the relationship is in fact non-linear (points clustering around a curve, not a straight line), the correlation coefficient will not be a good measure of the strength.

A scatterplot will also show up a non-linear relationship between the two variables and whether or not there exist any outliers in the data.

More information can be added to a two-dimensional scatterplot - for example, we might label points with a code to indicate the level of a third variable.

If we are dealing with many variables in a data set, a way of presenting all possible scatter plots of two variables at a time is in a scatterplot matrix.

Sample Mean

The sample mean is an estimator available for estimating the population mean. It is a measure of location, commonly called the average, often symbolised $\bar{x}$.

Its value depends equally on all of the data which may include outliers. It may not appear representative of the central region for skewed data sets.

It is especially useful as being representative of the whole sample for use in subsequent calculations.

Example Let's say our data set is: 5 3 54 93 83 22 17 19. The sample mean is calculated by taking the sum of all the data values and dividing by the total number of data values:

$$\bar{x} = \frac{5 + 3 + 54 + 93 + 83 + 22 + 17 + 19}{8} = \frac{296}{8} = 37$$
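
The same calculation can be sketched in Python with the standard library; the variable names are chosen for this example only.

```python
from statistics import mean

# Data set from the sample mean example.
data = [5, 3, 54, 93, 83, 22, 17, 19]

total = sum(data)          # sum of all data values
count = len(data)          # number of data values
sample_mean = mean(data)   # same as total / count

print(total, count, sample_mean)
```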


See also expected value.

Median

The median is the value halfway through the ordered data set, below and above which there lies an equal number of data values.

It is generally a good descriptive measure of the location which works well for skewed data, or data with outliers.

The median is the 0.5 quantile.

Example With an odd number of data values, for example 21, we have:

Data: 96 48 27 72 39 70 7 68 99 36 95 4 6 13 34 74 65 42 28 54 69

Ordered Data: 4 6 7 13 27 28 34 36 39 42 48 54 65 68 69 70 72 74 95 96 99

Median: 48, leaving ten values below and ten values above

With an even number of data values, for example 20, we have:

Data: 57 55 85 24 33 49 94 2 8 51 71 30 91 6 47 50 65 43 41 7

Ordered Data: 2 6 7 8 24 30 33 41 43 47 49 50 51 55 57 65 71 85 91 94

Median: Halfway between the two 'middle' data points - in this case halfway between 47 and 49, and so the median is 48
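
Both median examples can be checked with a short Python sketch. It relies on statistics.median, which sorts the data and, for an even number of values, averages the two middle ones.

```python
from statistics import median

# Odd number of values (21) from the first example.
odd_data = [96, 48, 27, 72, 39, 70, 7, 68, 99, 36, 95,
            4, 6, 13, 34, 74, 65, 42, 28, 54, 69]

# Even number of values (20) from the second example.
even_data = [57, 55, 85, 24, 33, 49, 94, 2, 8, 51,
             71, 30, 91, 6, 47, 50, 65, 43, 41, 7]

median_odd = median(odd_data)    # the middle (11th) ordered value
median_even = median(even_data)  # halfway between the 10th and 11th ordered values

print(median_odd, median_even)
```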

Mode

The mode is the most frequently occurring value in a set of discrete data. There can be more than one mode if two or more values are equally common.

Example Suppose the results of an end of term Statistics exam were distributed as follows:

Student   Score
1         94
2         81
3         56
4         90
5         70
6         65
7         90
8         90
9         30

Then the mode (most common score) is 90, and the median (middle score) is 81.
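
A brief Python sketch of the same example, assuming the nine scores are 94, 81, 56, 90, 70, 65, 90, 90 and 30. multimode is used alongside mode because a data set may have more than one mode.

```python
from statistics import median, mode, multimode

# Exam scores for the nine students in the example.
scores = [94, 81, 56, 90, 70, 65, 90, 90, 30]

most_common = mode(scores)      # single most frequent value
all_modes = multimode(scores)   # list of every value tied for most frequent
middle = median(scores)         # 5th value of the ordered scores

print(most_common, all_modes, middle)
```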

Dispersion

The data values in a sample are not all the same. This variation between values is called dispersion.

When the dispersion is large, the values are widely scattered; when it is small they are tightly clustered. The width of diagrams such as dot plots, box plots, stem and leaf plots is greater for samples with more dispersion and vice versa.

There are several measures of dispersion, the most common being the standard deviation. These measures indicate to what degree the individual observations of a data set are dispersed or 'spread out' around their mean.

In manufacturing or measurement, high precision is associated with low dispersion.

Range

The range of a sample (or a data set) is a measure of the spread or the dispersion of the observations. It is the difference between the largest and the smallest observed value of some quantitative characteristic and is very easy to calculate.

A great deal of information is ignored when computing the range since only the largest and the smallest data values are considered; the remaining data are ignored.

The range value of a data set is greatly influenced by the presence of just one unusually large or small value in the sample (outlier).

Examples

1. The range of 65,73,89,56,73,52,47 is 89-47 = 42.

2. If the highest score in a 1st year statistics exam was 98 and the lowest 48, then the range would be 98-48 = 50.
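
Both range examples can be reproduced with a small Python sketch; sample_range is a hypothetical helper name used only here.

```python
def sample_range(values):
    """Range = largest observed value minus smallest observed value."""
    return max(values) - min(values)


# Example 1: the seven listed observations.
range_1 = sample_range([65, 73, 89, 56, 73, 52, 47])

# Example 2: only the extremes are known, which is all the range needs.
range_2 = sample_range([98, 48])

print(range_1, range_2)
```

The second call shows why so much information is ignored: the result depends only on the two extreme values.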

Inter-Quartile Range (IQR)

The inter-quartile range is a measure of the spread of or dispersion within a data set.

It is calculated by taking the difference between the upper and the lower quartiles. For example:

Data: 2 3 4 5 6 6 6 7 7 8 9
Upper quartile: 7
Lower quartile: 4
IQR: 7 - 4 = 3

The IQR is the width of an interval which contains the middle 50% of the sample, so it is smaller than the range and its value is less affected by outliers.
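
The IQR example can be sketched in Python as follows. Several conventions exist for computing quartiles; this sketch assumes the "exclusive" method of statistics.quantiles (Python 3.8 or later), which happens to reproduce the quartiles quoted in the example. Other conventions may give slightly different values.

```python
from statistics import quantiles

# Data from the inter-quartile range example.
data = [2, 3, 4, 5, 6, 6, 6, 7, 7, 8, 9]

# n=4 asks for the three cut points that split the data into quarters.
lower_q, med, upper_q = quantiles(data, n=4, method="exclusive")
iqr = upper_q - lower_q

print(lower_q, med, upper_q, iqr)
```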

Quantile

Quantiles are a set of 'cut points' that divide a sample of data into groups containing (as far as possible) equal numbers of observations.

Examples of quantiles include quartile, quintile, percentile.


Percentile

Percentiles are values that divide a sample of data into one hundred groups containing (as far as possible) equal numbers of observations. For example, 30% of the data values lie below the 30th percentile.

See quantile. Compare quintile, quartile.

Quartile

Quartiles are values that divide a sample of data into four groups containing (as far as possible) equal numbers of observations.

A data set has three quartiles. References to quartiles often relate to just the outer two, the upper and the lower quartiles; the second quartile being equal to the median. The lower quartile is the data value a quarter way up through the ordered data set; the upper quartile is the data value a quarter way down through the ordered data set.

Example
Data: 6 47 49 15 43 41 7 39 43 41 36
Ordered Data: 6 7 15 36 39 41 41 43 43 47 49
Median: 41
Upper quartile: 43
Lower quartile: 15

See quantile. Compare percentile, quintile.

Quintile

Quintiles are values that divide a sample of data into five groups containing (as far as possible) equal numbers of observations.

See quantile. Compare quartile, percentile.

Sample Variance

Sample variance is a measure of the spread of or dispersion within a set of sample data.

The sample variance is the sum of the squared deviations from their average divided by one less than the number of observations in the data set. For example, for n observations $x_1, x_2, x_3, \ldots, x_n$ with sample mean

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

the sample variance is given by

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

See also variance.

Standard Deviation

Standard deviation is a measure of the spread or dispersion of a set of data.

It is calculated by taking the square root of the variance and is symbolised by s.d. or s. In other words

$$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

The more widely the values are spread out, the larger the standard deviation. For example, say we have two separate lists of exam results from a class of 30 students; one ranges from 31% to 98%, the other from 82% to 93%, then the standard deviation would be larger for the results of the first exam.
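
A minimal Python sketch of the sample variance and standard deviation follows. It reuses the eight values from the sample mean example as an assumed data set; the function sample_variance is written out by hand and cross-checked against the standard library.

```python
import math
from statistics import stdev, variance

# Assumed data set: the eight values from the sample mean example.
data = [5, 3, 54, 93, 83, 22, 17, 19]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


s2 = sample_variance(data)  # sample variance
s = math.sqrt(s2)           # standard deviation is its square root

print(round(s2, 2), round(s, 2))
print(round(variance(data), 2), round(stdev(data), 2))  # library cross-check
```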

Coefficient of Variation

The coefficient of variation measures the spread of a set of data as a proportion of its mean. It is often expressed as a percentage.

It is the ratio of the sample standard deviation to the sample mean:

$$\text{CV} = \frac{s}{\bar{x}}$$

There is an equivalent definition for the coefficient of variation of a population, which is based on the expected value and the standard deviation of a random variable.
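
For the sample version, a short Python sketch is given below. The data set is an assumption (the eight values from the sample mean example), and coefficient_of_variation is a name invented for this illustration.

```python
import math
from statistics import mean, stdev

# Assumed data set: the eight values from the sample mean example.
data = [5, 3, 54, 93, 83, 22, 17, 19]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return stdev(values) / mean(values)


cv = coefficient_of_variation(data)
cv_percent = 100 * cv  # often quoted as a percentage

print(round(cv, 3), round(cv_percent, 1))
```

The ratio is only meaningful when the mean is not zero and the data are measured on a scale with a true zero.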


STA301- Statistics and Probability (Session - 4)


1 2 3 4 5 6 7 8 Total

Marks

Q No. 9 10 11 12 13 14 15 16

Marks

Q No. 17 18 19 20 21 22 23 24

Marks

Q No. 25 26 27 28 29 30 31

Marks



Mean deviation is always:

► Less than S.D
► Greater than S.D
► Greater or equal to S.D
► Less or equal to S.D

Question No: 2 ( Marks: 1 ) - Please choose one

The value




The mean of the F-distribution is:

► $\frac{v_1}{v_1 - 2}$ for $v_1 > 2$

► $\frac{v_2}{v_2 - 2}$ for $v_2 > 2$

► $\frac{v_1}{v_1 - 2}$ for $v_1 \geq 2$

► $\frac{v_1}{v_2 - 2}$ for $v_2 \leq 2$


If X and Y


► E(X) + E(Y)

► E(X) − E(Y)

► X − E(Y)

► E(X) − Y


Evaluate: (9-4)!

► 362880
► 120
► 24
► 6

Question No: 6 ( Marks: 1 ) - Please choose one

Which formula represents the probability of the complement of event A:

► 1 + P ( A )
► 1 - P ( A )
► P ( A )
► P ( A ) -1

Question No: 7 ( Marks: 1 ) - Please choose one

Ideally the width of confidence interval should be:


If the sampling distribution of $\bar{X}$ is normal, the interval $\mu_{\bar{x}} \pm 3\sigma_{\bar{x}}$ includes:

► 99% of the sample means

► 99.73% of the sample means




The probability distribution of a statistic is called the:

► Population distribution
► Frequency distribution
► Sampling distribution
► Sample distribution


An estimator T is said to be an unbiased estimator of θ if


► E (T) =θ

► E (T) =T

► E (T) =0

► E (T) =1


If the following is a probability distribution, then what is the value of 'a':

X      1     2     3
P(X)   0.1   a     0.1

► 0.6
► 0.8
► 0.2
► 0.4

Question No: 12 ( Marks: 1 ) - Please choose one

A discrete probability function f(x) is always:

► Non-negative
► Negative
► One
► Zero

Question No: 13 ( Marks: 1 ) - Please choose one

An expected value of a random variable is equal to:

► Variance
► Mean
► Standard deviation
► Covariance

Question No: 14 ( Marks: 1 ) - Please choose one

The $f(x \mid 1) =$ __________:

► $f(1, 1)$

► $f(x, 1)$

► $\frac{f(x, 1)}{h(1)}$

► $\frac{f(x, 1)}{h(x)}$


The area


► .0401

► .5500


The







Which of the following is impossible in sampling:

► Destructive tests
► Heterogeneous
► To make voters list
► None of these


Which of the following is a systematic arrangement of data into rows and columns?

► Classification
► Tabulation
► Bar chart
► Component bar chart

Question No: 19 ( Marks: 1 ) - Please choose one

Which one of the following statements is true regarding a sample?

► It is a part of population
► It must contain at least five observations
► It refers to descriptive statistics
► It produces True value

Question No: 20 ( Marks: 1 ) - Please choose one

The data for an ogive is found in which distribution?

► A relative frequency distribution
► A frequency distribution
► A joint frequency distribution
► A cumulative frequency distribution

Question No: 21 ( Marks: 1 )

Write down the formula for binomial distribution.

Ans:


Write down the formula for testing the equality of two population proportions.


Define moment ratios. In which units are they expressed?


Elaborate the Poisson distribution:


Explain the symmetrical form of the Normal distribution.


Elaborate



The experience of a house-agent indicates that he can provide suitable accommodation for 75 percent of the clients who come to him. If, on a particular occasion, 6 clients approach him independently, calculate the probability that fewer than 4 clients will get satisfactory accommodation.
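
One way to approach this question, sketched here in Python rather than offered as an official solution, is to treat the number of satisfied clients as binomial with n = 6 and p = 0.75 and sum the probabilities for 0, 1, 2 and 3 clients. The function name binomial_pmf is an assumption for this example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) random variable."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 6, 0.75

# "Less than 4" means X = 0, 1, 2 or 3.
prob_less_than_4 = sum(binomial_pmf(k, n, p) for k in range(4))

print(round(prob_less_than_4, 4))
```

Written as a fraction, the sum is 694/4096, which is roughly 0.1694.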


Why do we use a confidence interval?


Write a short note on the chi-square test of independence.


Draw all possible samples of two letters each without replacement from the letters of the word “STATISTICS”. Find the proportion of the letter “S” in each sample.
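
A possible way to enumerate the samples is sketched below in Python. It assumes that the ten letters are treated as distinct positions, so repeated letters give repeated samples, and that order within a sample does not matter; under those assumptions there are 45 samples.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

letters = list("STATISTICS")  # 10 letters, of which 3 are "S"

# All unordered samples of size 2 drawn without replacement.
samples = list(combinations(letters, 2))

# Proportion of "S" in each sample: 0, 1/2 or 1.
proportions = [Fraction(sample.count("S"), 2) for sample in samples]

# How many samples give each proportion.
distribution = Counter(proportions)
mean_proportion = sum(proportions) / len(proportions)

print(len(samples), dict(distribution), mean_proportion)
```

The mean of the sample proportions equals the proportion of "S" in the word itself, 3/10.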


A random variable X has the following probability distribution:

X      P(X)
-2     0.1
-1     k
0      0.2
1      2k
2      0.3
3      3k

Find k and P(X<2).
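
A sketch of one way to work this out in Python, using exact fractions. It assumes the table reads P(-2) = 0.1, P(-1) = k, P(0) = 0.2, P(1) = 2k, P(2) = 0.3 and P(3) = 3k, and uses the fact that the probabilities must sum to 1.

```python
from fractions import Fraction

# Known probabilities, and the multiples of k for the remaining values of X.
known = {-2: Fraction(1, 10), 0: Fraction(1, 5), 2: Fraction(3, 10)}
k_multiples = {-1: 1, 1: 2, 3: 3}

# Probabilities sum to 1, so (1 + 2 + 3) * k = 1 - (0.1 + 0.2 + 0.3).
k = (1 - sum(known.values())) / sum(k_multiples.values())

# Full probability distribution once k is known.
pmf = {**known, **{x: m * k for x, m in k_multiples.items()}}

p_less_than_2 = sum(prob for x, prob in pmf.items() if x < 2)

print(k, p_less_than_2)
```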

FINALTERM EXAMINATION Spring 2010

STA301- Statistics and Probability (Session - 4)

Question No: 1 ( Marks: 1 ) - Please choose one

When each outcome of a sample space has equal chance to occur as any other, the outcomes are called:

► Mutually exclusive
► Equally likely
► Not mutually exclusive
► Exhaustive


The mean of the F-distribution is:

► $\frac{v_1}{v_1 - 2}$ for $v_1 > 2$

► $\frac{v_2}{v_2 - 2}$ for $v_2 > 2$

► $\frac{v_1}{v_1 - 2}$ for $v_1 \geq 2$

► $\frac{v_1}{v_2 - 2}$ for $v_2 \leq 2$


The LSD test is applied only if the null hypothesis is:

► Rejected
► Accepted
► No conclusion
► Acknowledged

Question No: 4 ( Marks: 1 ) - Please choose one

Analysis of variance is a procedure that enables us to test the equality of several:

► Variances
► Means
► Proportions
► Groups

Question No: 5 ( Marks: 1 ) - Please choose one

ANOVA was introduced by:

► Helmert
► Pearson
► R.A Fisher
► Francis

Question No: 6 ( Marks: 1 ) - Please choose one

For testing of hypothesis about population proportion, we use:

► Z-test (PROPORTIONS ARE TESTED AND MEAN)
► t-Test (MEAN IS TESTED)
► Both Z & T-test
► F test (VARIANCE AND STANDARD DEVIATION)



If a random variable X denotes the number of heads when three distinct coins are tossed, then X assumes the values:

► 0,1,2,3
► 1,3,3,1
► 1, 2, 3
► 3, 2

Question No: 8 ( Marks: 1 ) - Please choose one

If X and Y are independent variables, then E (XY) is:

► E(XX)
► E(X).E(Y)
► X.E(Y)
► Y.E(X)


The parameters of the binomial distribution b(x; n, p) are:

► x & n
► x & p
► n & p
► x, n & p


If P (E) is the probability that an event will occur, which of the following must be false:

► P(E)= - 1 (PROBABILITY SHOULD NEVER BE NEGATIVE AND NOT BE GREATER THAN ONE)
► P(E)=1
► P(E)=1/2
► P(E)=1/3

Question No: 11 ( Marks: 1 ) - Please choose one

An estimator T is said to be an unbiased estimator of θ if

► E (T) = θ (IF THE EXPECTATION OF A STATISTIC IS EQUAL TO THE PARAMETER BEING ESTIMATED, THE STATISTIC IS CALLED UNBIASED, OTHERWISE BIASED.)

► E (T) =T

► E (T) =0

► E (T) =1



The best unbiased estimator for population variance $\sigma^2$ is:

► Sample mean

► Sample median

► Sample proportion

► Sample variance


The sample variance $S^2 = \frac{\sum (x - \bar{x})^2}{n}$ is:



(IF IT IS DIVIDED BY N-1 THEN IT IS CALLED UNBIASED, OTHERWISE BIASED)


► None of these


When c is a constant, then E(c) is:

► 0
► 1
► c (THE EXPECTATION OF A CONSTANT IS ALWAYS THE CONSTANT ITSELF)
► -c

Question No: 15 ( Marks: 1 ) - Please choose one


If f (x, y) is bivariate probability density function of continuous r.v.'s X and Y then $g(x)$ is:

► $\int_{-\infty}^{\infty} f(x, y)\, dx$

► $\int_{-\infty}^{\infty} f(x, y)\, dy$

► $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, dx\, dy$

► $\int_{a}^{b} \int_{c}^{d} f(x, y)\, dy\, dx$

Question No: 16 ( Marks: 1 ) - Please choose one

The analysis of variance technique is a method for:

► Comparing F distributions
► Comparing three or more means
► Measuring sampling error
► Comparing variances

Question No: 17 ( Marks: 1 ) - Please choose one

The




► A continuous distribution is used to approximate a discrete distribution ► The standard normal distribution is applied


Stem and leaf is more informative when data is:

► Equal to 100
► Greater Than 100
► Less than 100
► In all situations


The branch of Statistics that is concerned with the procedures and methodology for obtaining valid conclusions is called:

► Descriptive Statistics
► Advance Statistics
► Inferential Statistics
► Sampled Statistics

Question No: 20 ( Marks: 1 ) - Please choose one

Which of the following is a systematic arrangement of data into rows and columns?

► Classification
► Tabulation
► Bar chart
► Component bar chart

Question No: 21 ( Marks: 1 ) - Please choose one

In normal distribution Q.D =

► 0.5σ
► 0.75σ
► 0.7979σ
► 0.6745σ

Question No: 22 ( Marks: 1 ) - Please choose one

In normal distribution $\beta_2$ =

► 1
► 2
► 3
► 0


If you connect the mid-points of rectangles in a histogram by a series of lines that also touches the x-axis from both ends, what will you get?

► Ogive
► Frequency polygon
► Frequency curve
► Historigram

Question No: 24 ( Marks: 1 ) - Please choose one

Which one of the following statements is true regarding a population?

► It must be a large number of values
► It must refer to people
► It is a collection of individuals, objects, or measurements
► It is small part of whole

Question No: 25 ( Marks: 1 ) - Please choose one

When $Q_1 = 2$ and $Q_3 = 4$, what is the value of Median, if the distribution is symmetrical:


In a simple linear regression model, if it is assumed that the intercept parameter is equal to zero, then:

► The regression line will pass through the origin
► The regression line will pass through the point (0,10).
► The regression line will pass through the point (0,-10).
► The slope of the line will also be equal to 0.


The degrees of freedom for a t-test with sample size 10 is:

► 5
► 8
► 9 (n-1)
► 10

Question No: 28 ( Marks: 1 ) - Please choose one

In testing of hypothesis, we always begin it with assuming that:

► Null hypothesis is true (It is denoted by H0, and H0 is our initial assumption)
► Alternative hypothesis is true
► Sample size is large
► Population is normal

Question No: 29 ( Marks: 1 ) - Please choose one

A failing student being passed by an examiner is an example of:

► Type I error

► Type II error

► Correct decision

► No information regarding student exams


How to find $P(X + Y \leq 1)$?

► f(0, 0) + f(0, 1) + f(1, 2)
► f(2, 0) + f(0, 1) + f(1, 0)
► f(0, 0) + f(1, 1) + f(1, 0)
► f(0, 0) + f(0, 1) + f(1, 0)

Question No: 31 ( Marks: 2 )

How many parameters are involved in hypergeometric distribution?

Three: N, n, k

Poisson mean is np, and variance and mean are equal


If an automobile is driven on the average no more than 16000 Km per year, then formulate the null and alternative hypothesis.

$H_0: \mu \leq 16000$

$H_1: \mu > 16000$


Write down the test statistic when chi-square goodness of fit test is performed.


$P_1 = 0.30$, $P_2 = 0.20$.


Find the value of F (table value), when $n_1 = 7$, $n_2 = 10$ and α = 0.05

3.37


If X = 327, n = 634, $p_0$ = 0.50 then find the z-test statistic for proportion.
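
A possible calculation is sketched below in Python. It assumes the usual one-sample z statistic for a proportion, with the standard error computed under the hypothesised value; the function name is made up for this example.

```python
import math


def z_statistic_for_proportion(x, n, p0):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n), with p_hat = x / n."""
    p_hat = x / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


z = z_statistic_for_proportion(327, 634, 0.50)

print(round(z, 4))
```

With these inputs the statistic comes out a little below 0.8.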


If population proportions are given as:

Find $\sigma^2_{\hat{p}_1 - \hat{p}_2}$, where n = 10

$\sigma^2_{\hat{p}_1 - \hat{p}_2} = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}$
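
The formula can be evaluated with a short Python sketch. Two assumptions are made here because the question text is incomplete: the proportions are taken to be 0.30 and 0.20 (the values that appear on a stray line earlier in the paper), and both sample sizes are taken to be 10.

```python
def variance_of_difference(p1, n1, p2, n2):
    """Variance of (p1_hat - p2_hat): p1*q1/n1 + p2*q2/n2, where q = 1 - p."""
    return p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2


# Assumed inputs: P1 = 0.30, P2 = 0.20, n1 = n2 = 10.
var_diff = variance_of_difference(0.30, 10, 0.20, 10)

print(round(var_diff, 4))
```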


A candidate for mayor in a large city hires the services of a poll-taking organization, and they found that 62 of 100 educated voters interviewed support the candidate, and 69 of 150 uneducated voters support him. At the 0.05 significance level, test the following

$H_0: P_1 - P_2 \geq 0.05$

$H_1: P_1 - P_2 < 0.05$

Book Example # 16.17 on Page 155, Professor Sher Muhammad Chaudhry


If we have RCBD with MSE = 3.19, no. of treatments = 4, no. of blocks = 5; then find the value of LSD (least significant difference) for treatments by using α = 0.05 and error degrees of freedom is 12.


Find the mean and variance for the sampling distribution given below.

p̂      No. of Samples   Probability f(p̂)
0      1                1/20
1/3    9                9/20
2/3    9                9/20
1      1                1/20
Σ      20               1

p̂      f(p̂)    p̂ f(p̂)    p̂² f(p̂)
0      1/20
1/3    9/20
2/3    9/20
1      1/20
∑      1

Mean = $\mu_{\hat{p}} = \sum \hat{p}\, f(\hat{p})$

Variance = $\sigma^2_{\hat{p}} = \sum \hat{p}^2 f(\hat{p}) - \left[\sum \hat{p}\, f(\hat{p})\right]^2$
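
The two sums can be evaluated exactly with a small Python sketch. It assumes the sampling distribution assigns probabilities 1/20, 9/20, 9/20 and 1/20 to the sample proportions 0, 1/3, 2/3 and 1, and it uses the variance in the form E(p̂²) minus the square of the mean.

```python
from fractions import Fraction as F

# Sampling distribution of the sample proportion: value -> probability.
dist = {F(0): F(1, 20), F(1, 3): F(9, 20), F(2, 3): F(9, 20), F(1): F(1, 20)}

# Mean: sum of value * probability.
mean_p = sum(p * f for p, f in dist.items())

# Variance: sum of value^2 * probability, minus the squared mean.
var_p = sum(p ** 2 * f for p, f in dist.items()) - mean_p ** 2

print(mean_p, var_p)
```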
