probability theory presentation 02

Upload: xing-qiu

Post on 10-Apr-2018


  • 8/8/2019 Probability Theory Presentation 02

    BST 401 Probability Theory

    Xing Qiu, Ha Youn Lee

    Department of Biostatistics and Computational Biology, University of Rochester

    September 9, 2010

    Qiu, Lee BST 401

    Outline

    1 Review of Calculus


    The many faces of continuity

    A real function $f(x) : \mathbb{R} \to \mathbb{R}$ is continuous at the point $x = a$ iff

    Cauchy: For any number $\epsilon > 0$, there exists $\delta > 0$ such that:

    \[ |f(x) - f(a)| < \epsilon, \quad \forall x \in (a - \delta, a + \delta). \tag{1} \]

    This is also called the $\epsilon$-$\delta$ definition of limit.

    Heine: For any sequence of real numbers $(x_n)$ which converges to $a$, we have

    \[ \lim_{n \to \infty} f(x_n) = f(a). \tag{2} \]

    For those of you who have taken general topology, this type of continuity is called sequential continuity.
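
The two definitions can be probed numerically. The sketch below is only an illustration under stated assumptions: the function f(x) = x^2 is an example chosen here (it is not in the slides), the helper names `delta_for`, `cauchy_check`, and `heine_gap` are hypothetical, and checking a finite grid suggests the inequality but does not prove it.

```python
def f(x):
    # Example function chosen for illustration only.
    return x * x


def delta_for(eps, a):
    # For f(x) = x^2 and |x - a| < 1 we have |x + a| < 2|a| + 1, so
    # |f(x) - f(a)| = |x - a| * |x + a| < eps once |x - a| < eps / (2|a| + 1).
    return min(1.0, eps / (2 * abs(a) + 1))


def cauchy_check(a, eps, samples=1001):
    # Sample points strictly inside (a - delta, a + delta) and test inequality (1).
    d = delta_for(eps, a)
    xs = [a - d + 2 * d * (k + 1) / (samples + 1) for k in range(samples)]
    return all(abs(f(x) - f(a)) < eps for x in xs)


def heine_gap(a, n):
    # Follow the particular sequence x_n = a + 1/n, which converges to a,
    # and report |f(x_n) - f(a)|, which should shrink as n grows.
    x_n = a + 1.0 / n
    return abs(f(x_n) - f(a))
```

For instance, `cauchy_check(2.0, 1e-3)` samples the interval around a = 2 with the delta returned by `delta_for`, while `heine_gap(2.0, n)` for growing `n` shows the sequential version at work along one sequence.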


    Reduce complexity of a function

    $f(x) : \{a, b, c, d\} \to \mathbb{R}$. It is determined by four numbers: $f(a)$, $f(b)$, $f(c)$, $f(d)$, so the set of such functions is equivalent to $\mathbb{R}^4$.

    Suppose $f(x)$ has this property: $f(a) = f(b)$, $f(c) = f(d)$. Then it can be determined by just two numbers: $f(a)$ and $f(d)$, or say $\mathbb{R}^2$.

    In general, the candidate functions form the set $B^A$, where $A$ is the domain of $f$ and $B$ is the range (codomain) of $f$.

    In general, we certainly want to reduce the complexity of a function by exploiting its mathematical properties.
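
The counting argument can be checked by brute force when both sets are finite. This is a minimal sketch, assuming a three-element set `B` as a hypothetical stand-in for the real line (which cannot be enumerated); the variable names are illustrative.

```python
from itertools import product

# Hypothetical finite stand-ins: the slide uses codomain R, which cannot be enumerated.
A = ["a", "b", "c", "d"]
B = [0, 1, 2]

# Each function A -> B is one assignment of a value in B to every element of A.
all_functions = [dict(zip(A, values)) for values in product(B, repeat=len(A))]

# Keep only the functions with f(a) = f(b) and f(c) = f(d).
constrained = [f for f in all_functions if f["a"] == f["b"] and f["c"] == f["d"]]

print(len(all_functions))  # |B| ** |A| candidates in total
print(len(constrained))    # |B| ** 2 candidates once the property is imposed
```

The constraint collapses four free values into two, which is the finite analogue of going from R^4 to R^2.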


    Real functions

    Notation: countable infinity ($\aleph_0$); continuum infinity ($\aleph_1$).

    Number of possible real valued functions: $\mathbb{R}^{\mathbb{R}}$, or equivalently, $\aleph_1$ different points are required to determine a real function.

    A simple line $f(x) = b_0 + b_1 x$ is a real function.

    It is determined by just two points. So lines are much easier objects than arbitrary real functions.

    What about continuous functions?

    Claim: the number of continuous real functions is $\mathbb{R}^{\mathbb{Q}}$. I.e., $\aleph_0$ points ($\mathbb{Q}$) are enough to determine a continuous real function $f$.
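
To make "a line is determined by just two points" concrete, here is a small sketch that recovers the two coefficients from two points. The function name `line_through` and the sample points are assumptions made for this illustration.

```python
def line_through(p, q):
    # Recover (b0, b1) of f(x) = b0 + b1 * x from two points with distinct x values.
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("the two points must have different x coordinates")
    b1 = (y2 - y1) / (x2 - x1)  # slope
    b0 = y1 - b1 * x1           # intercept
    return b0, b1


b0, b1 = line_through((0, 1), (2, 5))
```

Once `b0` and `b1` are known, every other value of the line follows, whereas an arbitrary real function gives no such shortcut.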


    Continuity and Approximation

    Sketch of proof:

    For any $a \in \mathbb{R}$, there exists a sequence of rational numbers $q_1, q_2, \dots$ such that $\lim_{n \to \infty} q_n = a$.

    By sequential continuity, $f(a) = \lim_{n \to \infty} f(q_n)$. In other words, $f(q_1), f(q_2), \dots$ determine $f(a)$.

    Since every $f(a)$, $a \in \mathbb{R}$, is determined by some sequence of rational numbers, collectively, $\{f(q)\}$, $q \in \mathbb{Q}$, determines all possible values of $f$.

    This theorem, together with Cauchy's continuity principle, allows us to approximate a real function with finitely many steps:

    \[ \forall \epsilon > 0, \ \exists n \in \mathbb{N}, \ \text{s.t.} \ |f(q_n) - f(a)| < \epsilon. \tag{3} \]
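
Statement (3) can be tried out with exact rationals. The sketch below is an illustration under assumptions: decimal truncation is just one convenient rational sequence converging to a, the choices f(x) = x^2 and a = sqrt(2) are examples made here, the names `rational_approx` and `first_n_within` are hypothetical, and floating point stands in for the real numbers.

```python
import math
from fractions import Fraction


def rational_approx(a, n):
    # q_n: the value a truncated to n decimal digits, stored as an exact rational.
    return Fraction(math.floor(a * 10**n), 10**n)


def first_n_within(f, a, eps, max_n=15):
    # Search for the first n with |f(q_n) - f(a)| < eps, as promised by (3).
    for n in range(1, max_n + 1):
        q = rational_approx(a, n)
        if abs(f(float(q)) - f(a)) < eps:
            return n, q
    return None


n, q = first_n_within(lambda x: x * x, math.sqrt(2), 1e-3)
```

Here the search stops after finitely many steps with a rational `q` whose image is within the requested tolerance of f(a).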


    The Squeeze (or Sandwich) theorem

    Functional version. If $f(x) \le g(x) \le h(x)$ in a neighborhood of $x_0$ (what does that mean?) with possible exception at $x_0$, and

    \[ \lim_{x \to x_0} f(x) = \lim_{x \to x_0} h(x) = L, \]

    then we have

    \[ \lim_{x \to x_0} g(x) = L. \]

    Sequence version. Replace functions by three sequences and the functional convergence when $x$ approaches $x_0$ by sequence convergence when $n \to \infty$.
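
A standard textbook instance, not taken from the slides, is g(x) = x^2 sin(1/x) near x_0 = 0, squeezed between -x^2 and x^2 with L = 0. The following sketch evaluates the three functions at points approaching 0; the names `lower`, `middle`, and `upper` are chosen for this illustration.

```python
import math


def lower(x):
    return -x * x


def middle(x):
    # Undefined at x = 0 itself, which the theorem allows ("possible exception at x0").
    return x * x * math.sin(1.0 / x)


def upper(x):
    return x * x


# Points approaching x0 = 0 from the right.
points = [10.0 ** (-k) for k in range(1, 8)]

# The sandwich inequality holds at every sampled point.
sandwiched = all(lower(x) <= middle(x) <= upper(x) for x in points)

# |g(x)| is forced toward L = 0 because both bounds shrink to 0.
gaps = [abs(middle(x)) for x in points]
```

The oscillating factor sin(1/x) has no limit at 0 on its own, yet the bounds leave g no room to do anything but converge.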


    Differentiation and linear approximation

    The tangent problem. Algebraic definition:

    \[ f'(x_0) := \left. \frac{df(x)}{dx} \right|_{x = x_0} = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]

    Differentiability requires continuity, but not the other way around. Show students three examples: the step function, the absolute value function, and a differentiable function.

    Linear approximation: In a neighborhood of $x_0$, we have

    \[ f(x_0 + \Delta x) \approx f(x_0) + f'(x_0) \Delta x. \]

    Example: linear approximation of $18^2$ near 20. With $f(x) = x^2$, $x_0 = 20$, and $\Delta x = -2$, we get $400 + 40 \cdot (-2) = 320$.

    Approximation: 320. True value: 324.
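
The worked example can be reproduced in a few lines. This is a minimal sketch; the helper name `linear_approx` is hypothetical and the derivative is supplied analytically rather than estimated.

```python
def linear_approx(f, f_prime, x0, dx):
    # First order approximation: f(x0 + dx) is roughly f(x0) + f'(x0) * dx.
    return f(x0) + f_prime(x0) * dx


def square(x):
    return x * x


def square_prime(x):
    return 2 * x


approx = linear_approx(square, square_prime, 20, -2)  # 400 + 40 * (-2)
true_value = square(18)
error = true_value - approx  # equals dx ** 2, the neglected second order term
```

The gap between the two values is exactly the squared step, which is why the approximation improves quickly as the step shrinks.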


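    Power series of real numbers

    A series is a sequence of partial sums of a sequence: $(b_n)_{n=1}^{\infty}$, $b_n = \sum_{k=1}^{n} a_k$. Alternative notation: $\sum_{n=1}^{\infty} a^n = \lim_{n \to \infty} (a + a^2 + \dots + a^n)$, i.e. here $a_k = a^k$. For simplicity, we assume $-1 < a < 1$.

    \[ (1 - a)(1 + a + a^2 + \dots + a^n) = 1 - a^{n+1} \]

    \[ \Longrightarrow \quad b_n = a + a^2 + \dots + a^n = \frac{1 - a^{n+1}}{1 - a} - 1 \]

    \[ \lim_{n \to \infty} b_n = \frac{1}{1 - a} - 1 = \frac{a}{1 - a}. \]

    If $a \ge 1$ or $a \le -1$: $a^n$ doesn't converge to zero, so $b_n$ must diverge. Generalization to complex numbers: $|a| < 1 \iff$ convergence of the power series $b_n$.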
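
The partial sums and their limit can be compared numerically. The sketch below assumes the geometric case a + a^2 + ... + a^n discussed on this slide; the function names are illustrative.

```python
def partial_sum(a, n):
    # b_n = a + a^2 + ... + a^n, accumulated term by term.
    total = 0.0
    term = 1.0
    for _ in range(n):
        term *= a
        total += term
    return total


def closed_form_partial(a, n):
    # (1 - a^(n+1)) / (1 - a) - 1, valid for a != 1.
    return (1 - a ** (n + 1)) / (1 - a) - 1


def limit_value(a):
    # a / (1 - a), the limit of b_n when |a| < 1.
    return a / (1 - a)
```

For |a| < 1 the direct sum, the closed form, and the limit agree to within floating point error once n is large enough.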


    Approximation and Taylor Expansion

    Taylor expansion is another way of approximating a smooth function with finitely many steps. Rather than approximating $f(a)$ by information provided by nearby points, we approximate it with the knowledge of its derivatives. It is closely related to the concept of the moment generating function and the characteristic function, which Dr. Lee will cover later.
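
As a small illustration of approximating with derivatives, the sketch below sums the Taylor polynomial of the exponential function at 0, whose k-th derivative at 0 is always 1. The choice of the exponential and the name `taylor_exp` are assumptions made for this example.

```python
import math


def taylor_exp(x, order):
    # Sum of x^k / k! for k = 0, ..., order: the Taylor polynomial of exp at 0.
    total = 0.0
    term = 1.0  # x^0 / 0!
    for k in range(order + 1):
        total += term
        term *= x / (k + 1)  # turn x^k / k! into x^(k+1) / (k+1)!
    return total


# Approximation errors at x = 1 for increasing polynomial order.
errors = [abs(taylor_exp(1.0, n) - math.e) for n in (1, 2, 4, 8, 15)]
```

Each additional derivative used buys a smaller error, all computed in finitely many arithmetic steps.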


    Approximations in the real world

    Finite step approximation is more than approximation. It is pretty much the only way we, as animals equipped with finite step logic calculation ability, can deal with the real world, which is infinitely complex.

    Philosophical implications. Almost all engineering solutions assume continuity of the real world. Think: why do you even dare to drive a car? Predictability.

    No perfect predictability in this world. In fact there is no perfect measurement of any sort: time, length, force, thickness of your car, smoothness of the road, etc.

    Tolerance of small errors is of crucial importance.

    Ever closer precision.


    Qualitative properties of a real function (I)

    For a given differentiable function $f(x)$, many qualitative properties can be obtained by computing its first and second order derivatives.

    $f'(a) > 0$: $f(x)$ is increasing at point $x = a$.

    $f'(a) < 0$: $f(x)$ is decreasing at point $x = a$.

    $f'(a) = 0$: $a$ is a critical point of $f(x)$.

    For a critical point $a$:

    If $f''(a) > 0$, $a$ is a local minimizer of $f(x)$.

    If $f''(a) < 0$, $a$ is a local maximizer of $f(x)$.

    If $f''(a) = 0$, higher derivatives are needed to determine the property of this point.
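
These rules translate directly into a small decision procedure. The sketch below is an illustration with analytically supplied derivatives; the function `classify`, the tolerance, and the example f(x) = x^3 - 3x are assumptions introduced here.

```python
def classify(f_prime, f_second, a, tol=1e-12):
    # Apply the first derivative test, then the second derivative test at a.
    slope = f_prime(a)
    if slope > tol:
        return "increasing"
    if slope < -tol:
        return "decreasing"
    curvature = f_second(a)
    if curvature > tol:
        return "local minimizer"
    if curvature < -tol:
        return "local maximizer"
    return "undetermined by second derivative"


# Example: f(x) = x^3 - 3x has f'(x) = 3x^2 - 3 and f''(x) = 6x.
def cubic_prime(x):
    return 3 * x**2 - 3


def cubic_second(x):
    return 6 * x


labels = {a: classify(cubic_prime, cubic_second, a) for a in (-1.0, 0.0, 1.0, 2.0)}
```

The last branch corresponds to the case where higher derivatives are needed, for example x^3 at 0.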


    Qualitative properties of a real function (II)

    Alternative interpretation through Taylor expansion:

    \[ f(x) = b_0 + b_1 (x - a) + b_2 (x - a)^2 + \dots = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2 + O\left((x - a)^3\right) \]

    $T_1(x) = f(a) + f'(a)(x - a)$ is the best first order approximation of $f(x)$ near $x = a$ (which is the tangent line). $f'(a)$ is the slope of this line. Clearly, positive/negative slope implies $T_1(x)$ being increasing/decreasing near $x = a$.

    $T_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2$ is the best second order approximation of $f(x)$ near $x = a$.
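
The difference between the first and second order approximations can be seen numerically. The sketch below assumes f = sin, a = 0.5, and x = 0.6 purely as example values; the helper names `taylor_1` and `taylor_2` are hypothetical.

```python
import math


def taylor_1(f_a, fp_a, a, x):
    # T1(x) = f(a) + f'(a) (x - a): the tangent line.
    return f_a + fp_a * (x - a)


def taylor_2(f_a, fp_a, fpp_a, a, x):
    # T2(x) = T1(x) + f''(a) / 2 * (x - a)^2.
    return taylor_1(f_a, fp_a, a, x) + 0.5 * fpp_a * (x - a) ** 2


a, x = 0.5, 0.6
f_a, fp_a, fpp_a = math.sin(a), math.cos(a), -math.sin(a)

err1 = abs(math.sin(x) - taylor_1(f_a, fp_a, a, x))
err2 = abs(math.sin(x) - taylor_2(f_a, fp_a, fpp_a, a, x))
```

Adding the quadratic term reduces the error by roughly an order of magnitude at this distance from a.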


    Qualitative properties of a real function (III)

    Usually the first order term dominates the second order term when $x \to a$, since then $(x - a)^2 \ll |x - a|$.

    One notable exception: $f'(a) = 0$, since this nukes the first order term completely.

    When the first order term is absent ($f'(a) = 0$), the second order term becomes important. That's why we need to know the sign of $\frac{f''(a)}{2}$ in order to classify critical points.

    The sign of $\frac{f''(x)}{2}$ determines the concavity of $f(x)$. Draw two figures: $b_0 + (x - a)^2$ and $b_0 - (x - a)^2$.

    Back to the quiz: Taylor expansion of $\sin(2x)$ at: a) $\frac{\pi}{2}$; b) 0; c) $\frac{\pi}{4}$.
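
For the quiz, the first three Taylor coefficients of sin(2x) can be computed directly from its derivatives. This sketch assumes the three expansion points are pi/2, 0, and pi/4, which is a reading of the garbled slide text rather than something it states cleanly; the function name `taylor_coeffs_sin2x` is hypothetical.

```python
import math


def taylor_coeffs_sin2x(a):
    # For f(x) = sin(2x): f'(x) = 2 cos(2x) and f''(x) = -4 sin(2x).
    b0 = math.sin(2 * a)           # f(a)
    b1 = 2 * math.cos(2 * a)       # f'(a)
    b2 = -4 * math.sin(2 * a) / 2  # f''(a) / 2
    return b0, b1, b2


at_half_pi = taylor_coeffs_sin2x(math.pi / 2)
at_zero = taylor_coeffs_sin2x(0.0)
at_quarter_pi = taylor_coeffs_sin2x(math.pi / 4)
```

Under that assumption the three points show the three regimes: a negative slope at pi/2, a positive slope at 0, and a vanishing first order term with negative second order coefficient at pi/4, which marks a local maximizer.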


    Compact/Non compact set

    Whether a continuous function $f(x)$ has a global min/max depends on its domain.

    If a continuous function $f(x)$ is defined on a bounded, closed interval $[a, b]$, then there exist $x_1$ and $x_2$ in $[a, b]$ such that $f(x_1)$ is the global maximum and $f(x_2)$ is the global minimum.

    Generalization: the same holds as long as its domain is a) bounded; b) closed. Or simply put: a compact set.

    Counterexamples:

    1. Domain is $[a, \infty)$. Escape to $\infty$.

    2. Domain is $(a, b)$. Stat. example: the MLE of the variance problem.

    3. Domain is compact, but $f(x)$ is not continuous.
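
    A minimal sketch of the second counterexample, under one possible reading of the "MLE of the variance" example: the variance of a normal model lives on the open set (0, infinity), so a maximizer of the likelihood is not guaranteed. The function name `normal_loglik` and both data sets are hypothetical, chosen only for illustration.

```python
import math

def normal_loglik(sigma2, data, mu):
    """Log-likelihood of i.i.d. N(mu, sigma2) data as a function of sigma2 > 0."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

# Typical sample: the maximum over (0, inf) is attained at ss / n.
data = [0.5, 1.5, 1.0, 2.0]
mu = sum(data) / len(data)
s2_hat = sum((x - mu) ** 2 for x in data) / len(data)
grid = [s2_hat * k / 100 for k in range(1, 1001)]
best = max(grid, key=lambda s2: normal_loglik(s2, data, mu))

# Degenerate sample: all observations equal mu, so ss = 0 and the
# log-likelihood keeps growing as sigma2 -> 0. The supremum is not
# attained anywhere in the open domain (0, inf).
flat = [1.0, 1.0, 1.0]
small = [1.0, 1e-2, 1e-4, 1e-8]
growth = [normal_loglik(s2, flat, 1.0) for s2 in small]

print(s2_hat, best)
print(growth)
```

    In the first sample the grid search lands on the usual estimate ss / n, so the open domain causes no trouble. In the degenerate sample the log-likelihood increases without bound as the variance shrinks, which is the kind of failure the compactness condition rules out.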

    Homework

    Homework will be sent to you via email.

    Homework is always due on the next Thursday.
