arXiv:math/0310351v6 [math.GM] 5 Oct 2010

NONSTANDARD ANALYSIS

- A SIMPLIFIED APPROACH -

ROBERT A. HERRMANN


Nonstandard Analysis Simplified

Copyright © 2003 by Robert A. Herrmann


CONTENTS

Chapter 1
Filters: Ultrafilters, Cofinite Filter, Principal Ultrafilters, Free Ultrafilters

Chapter 2
A Simple Nonstandard Model for Analysis: Equivalence Classes of Sequences, Totally Ordered Field of Equivalence Classes, The Hyper-extension of Sets and Relations, The Standard Object Generator

Chapter 3
Hyper-set Algebra. Infinite and Infinitesimal Numbers: The Behavior of the Hyper and Standard Object Generators; Infinitesimals µ(0), The Infinite and Finite Numbers, *-Transform Process, Maximum Ideal µ(0)

Chapter 4
Basic Sequential Convergence: Bounded Sequences, Convergent Sequences, Accumulation Point, Subsequences, Cauchy Criterion, Nonstandard Characteristics and Examples

Chapter 5
Advanced Sequential Convergence: Double Sequences, Iterated Limits, Upper and Lower Limits, Nonstandard Characteristics and Examples

Chapter 6
Basic Infinite Series: Hyperfinite Summation, Standard Results, Nonstandard Characteristics and Examples

Chapter 7
An Advanced Infinite Series Concept: Multiplying of Infinite Series, Nonstandard Characteristics and Examples

Chapter 8
Additional Real Number Properties: Interior, Closure, Cluster, Accumulation, Isolated Points, Boundedness, Compactness, Nonstandard Characteristics

Chapter 9
Basic Continuous Function Concepts: All Notions Generalized to Cluster Points, One-sided Limits and Continuity, Sum, Product and Composition of Continuous Functions, Extreme and Intermediate Value Theorems, Nonstandard Characteristics

Chapter 10
Slightly Advanced Continuous Function Concepts: Nonstandard Analysis and Bolzano’s Product Theorem, Inverse Images of Open Sets, Additive Functions, Uniform Continuity, Extensions of Continuous Functions

Chapter 11
Basic Derivative Concepts: Nonstandard Characteristics for Finite and Infinite Derivatives at Cluster Points, The Infinitesimal Differential, The Fundamental Theorem of Differentials, Order Ideals, Basic Theorems, Generalized Mean Value Theorem, L’Hospital’s Rule

Chapter 12
Some Advanced Derivative Concepts: Nonstandard Analysis and the nth-Order Increments, nth-Order Ideals, Continuous Differentiability, Uniformly Differentiable, the Darboux Property, and Inverse Function Theorems

Chapter 13
Riemann Integration: The Simple Partition, Fine Partitions, Upper and Lower Sums, Upper and Lower Hyperfinite Sums, The Simple Integral, The Equivalence of The Simple Integral and The Riemann Integral, The Basic Integral Theorems and How The Generalized and Lebesgue Integral Relate to Fine Partitions

Chapter 14
What Does the Integral Measure? Additive Functions and The Rectangular Property

Chapter 15
Generalizations: Metric and Normed Linear Spaces

Appendix
Existence of Free Ultrafilters, Proof of *-Transform Process

References


Although the material in this book is copyrighted by the author, it may be reproduced in whole or in part by any method, without the payment of any fees, as long as proper credit and identification is given to the author of the material reproduced.

Disclaimer

The material in this monograph has not been independently edited nor independently verified for correct content. It may contain various typographical or content errors. Corrections will be made only if such errors are significant. A few results, due to the simplicity of this approach, may not appear to be convincingly established. However, a change in our language or in our structure would remove any doubts that the results can be established rigorously.


1. FILTERS

For over three hundred years, a basic question about the calculus remained unanswered. Do the infinitesimals, as conceptually understood by Leibniz and Newton, exist as formal mathematical objects? This question was answered affirmatively by Robinson (1961), and the subject termed “Nonstandard Analysis” (Robinson, 1966) was introduced to the scientific world. As part of this book, the mathematical existence of the infinitesimals is established and their properties investigated and applied to basic real analysis notions.

I intend to write this “book” informally, and it’s certainly about time that technical books be presented in a more “friendly” style. What I’m not going to do is present an introduction filled with various historical facts and self-serving statements; statements that indicate what an enormous advancement in mathematics has been achieved by the use of Nonstandard Analysis. Rather, let’s proceed directly to this simplified approach, an approach that’s correct but that cannot be used to analyze certain areas of mathematics that are not classified as elementary in character. These areas can be analyzed, but doing so requires one to consider additional specialized mathematical objects. Such specialized objects need only be considered after an individual becomes accustomed to the basic methods used within this simple approach. There are numerous exciting and thrilling new concepts and results that cannot be presented using the simple approach discussed. The “internal” objects, objects that “bound” sets that represent “concurrent” relations, and saturated models are for your future consideration. The main goal is to present some of the basic nonstandard results that can be obtained without investigating such specialized objects.

I’ll present a complete “Proof” for a stated result. However, one only needs to have confidence that the stated “Theorem” has been acceptably established. Indeed, if you are simply interested in how these results parallel the original notions of the “infinitesimal” and the like, you need not bother to read the proofs at all.

There’s an immediate need for a few set-theoretic notions. We let ℕ be the set of all natural numbers, which includes zero as the first one. I assume that you understand some basic set-theoretic notation. Further, throughout this first chapter, X will always denote a nonempty set. Recall that for a given set X the set of all subsets of X exists and is called the power set. It’s usually denoted by the symbol 𝒫(X). For example, let X = {0, 1, 2}. The power set of X contains 8 sets. In particular, 𝒫(X) = {∅, X, {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}}, where ∅ denotes the empty set, which can be thought of as a set which contains “no members.”

Definition 1.1. (The Filter.) (The symbol ⊂ means “subset” and includes the possible equality of sets.) A nonempty ℱ ⊂ 𝒫(X) (that is, ∅ ≠ ℱ) is called a (proper) filter on X if and only if

(i) for each A, B ∈ ℱ, A ∩ B ∈ ℱ;

(ii) if A ⊂ B ⊂ X and A ∈ ℱ, then B ∈ ℱ;

(iii) ∅ ∉ ℱ.

Example 1.2. (i) Let ∅ ≠ A ⊂ X. Then [A]↑ is the set of all subsets of X that contain A, or, more formally, [A]↑ = {x | x ⊂ X and A ⊂ x}. It is a filter on X called the principal filter.
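For a small finite X, the power set, the three filter conditions, and the principal filter can all be checked by brute force. The following is only an illustrative sketch; the helper names power_set, is_filter, and principal_filter are hypothetical and not part of the text.

```python
from itertools import combinations

def power_set(X):
    """Return every subset of X as a frozenset."""
    items = sorted(X)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_filter(F, X):
    """Check conditions (i)-(iii) of Definition 1.1 for a family F of subsets of X."""
    P = power_set(X)
    if not F or not F <= P:          # F must be a nonempty family of subsets of X
        return False
    meets = all(A & B in F for A in F for B in F)          # (i) closed under intersection
    upward = all(B in F for A in F for B in P if A <= B)   # (ii) closed under supersets
    proper = frozenset() not in F                          # (iii) empty set excluded
    return meets and upward and proper

def principal_filter(A, X):
    """All subsets of X that contain A, as in Example 1.2."""
    A = frozenset(A)
    return {x for x in power_set(X) if A <= x}

X = {0, 1, 2}
P = power_set(X)
F = principal_filter({0}, X)
print(len(P))            # number of subsets of X
print(is_filter(F, X))   # does the principal filter satisfy (i)-(iii)?
print(is_filter(P, X))   # the whole power set contains the empty set
```

The check is exponential in the size of X, so it is only meant for toy examples like the three-element set used in the text.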

How to properly define what one means by a “finite” set has a long history. But as Suppes states, “The common sense notion is that a set is finite just when it has “m” members for some non-negative integer m. [You can use our set ℕ to get such an “m.”] This common sense idea is technically sound . . . .” (1960, p. 98). The notion of finite can also be related to constants that name members of a set and the formal expression that characterizes such a set in terms of the symbols “=” and “∨” (i.e., “or”). Notice that the empty set is a “finite set.” This association to the natural numbers is denoted by subscripts that actually represent the range values of a function. I also assume certain elementary properties of the finite sets and those sets that are not finite.

For X, let ∅ ≠ ℬ ⊂ 𝒫(X) have the finite intersection property. This means that each B_i ∈ ℬ is nonempty and that the “intersection” of all of the members of any nonempty finite subset of ℬ is not the empty set. Given such a ℬ ≠ 𝒫(X), we can generate the “smallest” filter that contains ℬ. This is done by first letting ℬ′ be the set of all subsets of X formed by taking the intersection of the members of each nonempty finite subset of ℬ, where the intersection of the members of a set that contains but one member is that one member. Let’s consider some basic mathematical abbreviations. The formal symbol ∧ means “and” and the formal “quantifier” ∃ means “there exists such and such.” Now take a member B of ℬ′ and build a set composed of all subsets of X that contain B. Now do this for all members of ℬ′ and gather them together in a set to get the set ⟨ℬ⟩. Hence, ⟨ℬ⟩ is the set of all subsets x of X such that there exists some set B such that B is a member of ℬ′ and B is a subset of x. Formally, ⟨ℬ⟩ = {x | (x ⊂ X) ∧ (∃B((B ∈ ℬ′) ∧ (B ⊂ x)))}. I have “forced” ⟨ℬ⟩ to contain ℬ and to have the necessary properties that make it a filter.
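The two-step construction just described (first close the family under finite intersections, then take all supersets) can be traced on a small finite example. This is a sketch under the assumption that X is finite; the names subsets, has_fip, and generated_filter are hypothetical.

```python
from itertools import combinations

def subsets(X):
    """All subsets of X as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def finite_intersections(B):
    """Intersections of the members of every nonempty finite subfamily of B."""
    B = [frozenset(b) for b in B]
    return {frozenset.intersection(*sub)
            for r in range(1, len(B) + 1)
            for sub in combinations(B, r)}

def has_fip(B):
    """Finite intersection property: no finite intersection is empty."""
    return all(finite_intersections(B))

def generated_filter(B, X):
    """All subsets of X that contain some finite intersection of members of B."""
    B_prime = finite_intersections(B)
    return {x for x in subsets(X) if any(b <= x for b in B_prime)}

X = {0, 1, 2, 3}
B = [{0, 1, 2}, {1, 2, 3}]
G = generated_filter(B, X)
print(has_fip(B))
print(sorted(sorted(x) for x in G))
```

In this example the finite intersections are the two given sets together with {1, 2}, so the generated filter is the family of all supersets of {1, 2}.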

There exists a very significant “filter” 𝒞 on infinite X defined by the notion of not being finite. This object, once we have shown that it is a filter on X, is called the cofinite filter. [Cofinite means that the relative complement is a finite set.]

Definition 1.3. (QED and iff.) Let 𝒞 = {x | (x ⊂ X) ∧ (X − x is finite)}. Also, I will use an end-of-proof symbol for the statement “QED,” which indicates the end of the proof. Then “iff” is an abbreviation for the phrase “if and only if.”

Theorem 1.4. The set 𝒞 is a filter on infinite X and the intersection of all members of 𝒞 is empty; that is, ⋂{F | F ∈ 𝒞} = ∅.

Proof. Since X ≠ ∅, there is some a ∈ X. Further, X − (X − {a}) = {a} is finite, so X − {a} ∈ 𝒞 and 𝒞 ≠ ∅. So, assume that A, B ∈ 𝒞. Then since X − (A ∩ B) = (X − A) ∪ (X − B), X − (A ∩ B) is a finite subset of X. Thus, since X − (X − (A ∩ B)) = A ∩ B, then A ∩ B ∈ 𝒞. Now suppose that A ∈ 𝒞 and A ⊂ C ⊂ X. Then X − C ⊂ X − A. Thus, because X − A is finite, X − C is finite. Hence, C ∈ 𝒞. Also, since X is infinite, X − ∅ = X implies that ∅ ∉ 𝒞. Consequently, 𝒞 is a filter on X.

Now observe that K = X − ⋂{F | F ∈ 𝒞} = ⋃{X − F | F ∈ 𝒞} = X, for if a ∈ X, then X − {a} ∈ 𝒞 implies that X − (X − {a}) = {a} ⊂ K. Hence, we must have that ⋂{F | F ∈ 𝒞} = ∅.
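Although an infinite set cannot be enumerated, a cofinite subset of the natural numbers is determined by its finite complement, so the steps of the proof can be mimicked on that finite data. The class below is a hypothetical illustration, assuming X is the set of natural numbers; it is not a construction from the text.

```python
class Cofinite:
    """A cofinite subset of the natural numbers, stored by its finite complement."""

    def __init__(self, missing=()):
        self.missing = frozenset(missing)

    def __contains__(self, n):
        return n not in self.missing

    def __and__(self, other):
        # X - (A ∩ B) = (X - A) ∪ (X - B): a union of two finite sets is finite.
        return Cofinite(self.missing | other.missing)

    def is_subset_of(self, other):
        # A ⊂ C exactly when X - C ⊂ X - A.
        return other.missing <= self.missing

A = Cofinite({0, 1})
B = Cofinite({1, 5})
AB = A & B
print(sorted(AB.missing))
# Every natural number a is excluded by the cofinite set X - {a}, so no number
# lies in all members of the cofinite filter; here this is checked for a < 100.
print(all(a not in Cofinite({a}) for a in range(100)))
```

The second print only samples finitely many a; the general statement is the one proved in Theorem 1.4.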

I mention that 𝒞 is also called the Fréchet filter. It turns out that we are mostly interested in a maximal filter that contains 𝒞.

Definition 1.5. (Ultrafilter.) A filter 𝒰 on X is called an ultrafilter iff whenever there’s a filter ℱ on X such that 𝒰 ⊂ ℱ, then 𝒰 = ℱ.

Prior to showing that ultrafilters exist, let’s see if they have any additional useful properties.

Theorem 1.6. Suppose that 𝒰 is an ultrafilter on X. If A ∪ B ∈ 𝒰, then A ∈ 𝒰 or B ∈ 𝒰.

Proof. Let A ∉ 𝒰 and B ∉ 𝒰 but A ∪ B ∈ 𝒰. Let 𝒢 = {x | (x ⊂ X) ∧ (A ∪ x ∈ 𝒰)}. We show that 𝒢 is a filter on X. [Note: We now begin to use variables such as x, y, z etc. as mathematical variables representing members of sets. These symbols are used in two contexts, however. The other context is as a variable in our formal logical expressions.] First, B ∈ 𝒢, so 𝒢 is nonempty. Let x, y ∈ 𝒢. Then A ∪ x, A ∪ y ∈ 𝒰. Hence, (A ∪ x) ∩ (A ∪ y) = A ∪ (x ∩ y) ∈ 𝒰 implies that x ∩ y ∈ 𝒢. Now suppose that x ∈ 𝒢 and x ⊂ y ⊂ X. Then A ∪ x ⊂ A ∪ y implies that A ∪ y ∈ 𝒰. Hence, y ∈ 𝒢. Also, A ∪ ∅ = A ∉ 𝒰 implies that ∅ ∉ 𝒢. Thus 𝒢 is a filter on X.

Let C ∈ 𝒰. Then C ⊂ A ∪ C ⊂ X implies that A ∪ C ∈ 𝒰, and so C ∈ 𝒢. Therefore, 𝒰 ⊂ 𝒢. But B ∈ 𝒢 and B ∉ 𝒰 imply that 𝒰 ≠ 𝒢. This contradicts the maximality of 𝒰.

Theorem 1.7. Let ℱ be a filter on X. Then ℱ is an ultrafilter iff for each A ⊂ X, either A ∈ ℱ or X − A ∈ ℱ, not both.

Proof. Assume ℱ is an ultrafilter. Then X = A ∪ (X − A) ∈ ℱ implies, by Theorem 1.6, that either A ∈ ℱ or X − A ∈ ℱ. Both A and X − A cannot be members of ℱ, for if they were, then A ∩ (X − A) = ∅ ∈ ℱ; a contradiction.

Conversely, suppose that for each A ⊂ X, either A ∈ ℱ or X − A ∈ ℱ. Let 𝒢 be a filter on X such that ℱ ⊂ 𝒢. Let A ∈ 𝒢. Then X − A ∉ 𝒢 since 𝒢 is a filter. Thus, X − A ∉ ℱ. Hence, A ∈ ℱ. Thus 𝒢 ⊂ ℱ implies that 𝒢 = ℱ.
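On a finite set the equivalence in Theorem 1.7 can be confirmed exhaustively: list every filter, keep the maximal ones, and compare them with the filters that contain exactly one of A and X − A for each A. This brute-force sketch assumes a three-element X; the helper names are hypothetical.

```python
from itertools import combinations

def subsets(X):
    """All subsets of X as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, P):
    """Definition 1.1, checked against the list P of all subsets of X."""
    return (bool(F) and frozenset() not in F
            and all(A & B in F for A in F for B in F)
            and all(B in F for A in F for B in P if A <= B))

X = frozenset({0, 1, 2})
P = subsets(X)

# Every nonempty family of subsets of X, then the ones that are filters.
families = [frozenset(c) for r in range(1, len(P) + 1)
            for c in combinations(P, r)]
filters = [F for F in families if is_filter(F, P)]

# Ultrafilters: filters not properly contained in another filter (Definition 1.5).
ultrafilters = [U for U in filters if not any(U < G for G in filters)]

def dichotomy(F):
    """For each A, exactly one of A and X - A belongs to F."""
    return all((A in F) != (X - A in F) for A in P)

print(len(filters), len(ultrafilters))
print(all(dichotomy(U) for U in ultrafilters))
```

On a finite set every ultrafilter found this way is principal; free ultrafilters only appear when X is infinite, where no such enumeration is possible.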

Given any filter ℱ on X, a major question is whether there exists an ultrafilter 𝒰 on X such that ℱ ⊂ 𝒰. The answer to this question can take on, at least, two forms. The next result states that such ultrafilters always exist. The proof in the appendix uses a result, Zorn’s Lemma, that is equivalent to the Axiom of Choice. The Axiom of Choice, although it’s consistent with the other axioms of set theory, may not be “liked” by some. There’s an axiom that is also consistent with the other usual axioms of set theory that is weaker than the Axiom of Choice. What it states is that such an ultrafilter always exists. So, you can take your pick.

Theorem 1.8. Let ℱ be a filter on X. Then there exists an ultrafilter 𝒰 on X such that ℱ ⊂ 𝒰.

Proof. See the appendix.

A natural study is to see if we can partition the set of all ultrafilters defined on X into different categories. And, why don’t we use the symbol ℱ_X [resp. 𝒰_X] to always denote a filter [resp. ultrafilter] on X. It turns out there are two basic types of 𝒰_X, the principal ones and those that contain 𝒞_X.

Theorem 1.9. Let p ∈ X. Then [p]↑ is a 𝒰_X.

Proof. Let nonempty A ⊂ X. Then either p ∈ A or p ∈ (X − A) and not both. Thus A ∈ [p]↑ or (X − A) ∈ [p]↑. Hence, by Theorem 1.7, [p]↑ is a 𝒰_X.

Theorem 1.10. Assume that 𝒰_X is not a principal ultrafilter. Then 𝒞_X ⊂ 𝒰_X.

Proof. Let {p_0, …, p_k} ⊂ X be an arbitrary nonempty finite set. Since 𝒰_X is non-principal, 𝒰_X ≠ [p_i]↑, i = 0, …, k. Hence, for each i = 0, …, k there exists some A_i ⊂ X such that A_i ∈ 𝒰_X and p_i ∉ A_i. For, otherwise, if p_i ∈ A for every A ∈ 𝒰_X, then 𝒰_X ⊂ [p_i]↑ and, since 𝒰_X is an ultrafilter, they are equal. Consequently, {p_0, …, p_k} ∩ (A_0 ∩ ⋯ ∩ A_k) = ∅. However, (A_0 ∩ ⋯ ∩ A_k) ∈ 𝒰_X. Therefore, {p_0, …, p_k} ∉ 𝒰_X. Theorem 1.7 implies that X − {p_0, …, p_k} ∈ 𝒰_X. Thus, 𝒞_X ⊂ 𝒰_X.

Non-principal ultrafilters are also called free ultrafilters. This comes from Theorems 1.4 and 1.10, which imply that 𝒰_X is free iff ⋂{F | F ∈ 𝒰_X} = ∅. Also, another characterization is that 𝒰_X is free iff there does not exist a nonempty finite F ⊂ X such that F ∈ 𝒰_X. If we let X = ℕ, then there are a lot of free 𝒰_X. Unless otherwise stated, the free ultrafilter that’s used will not affect any of the stated results.


2. A SIMPLE NONSTANDARD MODEL FOR ANALYSIS

We let ℝ denote the real numbers. The set ℝ uses various operators and relations to obtain results within analysis. For this simplified approach, most of what we need is defined from the basic addition +, multiplication ·, and total order ≤ properties and a few other ones accorded ℝ. For convenience, I denote this fact by the structure notation ⟨ℝ, +, ·, ≤, Φ_i⟩, where the Φ_i are any other relations one might consider for ℝ, whether definable from the basic relations or not. Further, it is always understood that each structure includes the = relation, which is but set-theoretic equality or identity for members of ℝ. I’ll have a little more to write about how these “numbers” should be viewed later.

But, first to our construction. Let ℝ^ℕ represent the set of all sequences with domain ℕ and range values (images) in ℝ. Of course, sequences are functions (maps, mappings, etc.) that are often displayed as a type of “ordered” set in the form {s_0, s_1, s_2, …}. You can define binary operators + and ·, among others, for sequences by simply taking any two f, g ∈ ℝ^ℕ and defining f + g = h to be the sequence h where the values of h are h(n) = f(n) + g(n), and f · g = fg = k to be the sequence k where the values of k are k(n) = f(n)g(n), for each n ∈ ℕ. This forms, at the very least, what is called a ring with unity. What I’ll do later is show that there’s a subset of ℝ^ℕ that “behaves” like the real numbers, with respect to the defined relations, and we’ll use this subset as if it is the real numbers. In all that follows, 𝒰 = 𝒰_ℕ will always be a free ultrafilter and the symbol U is used to represent members of 𝒰. Now, to make things symbolically simple, capital letters from the beginning of the alphabet A, B, C, … will always denote members of ℝ^ℕ. Also, we usually use the subscript notation for the images. Now let us begin our construction of a nonstandard model for real analysis.
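The pointwise operations on sequences described above translate directly into functions on the natural numbers. The sketch below models a sequence as a Python callable; the names add, mul, and const and the sample sequences are illustrative assumptions, not notation from the text.

```python
def add(f, g):
    """Pointwise sum: (f + g)(n) = f(n) + g(n)."""
    return lambda n: f(n) + g(n)

def mul(f, g):
    """Pointwise product: (f · g)(n) = f(n) g(n)."""
    return lambda n: f(n) * g(n)

def const(x):
    """The constant sequence with every value equal to x."""
    return lambda n: x

f = lambda n: n + 1      # sample sequence with values 1, 2, 3, ...
g = lambda n: 2 * n      # sample sequence with values 0, 2, 4, ...
h = add(f, g)
k = mul(f, g)

print([h(n) for n in range(4)])
print([k(n) for n in range(4)])
# The constant sequence const(1) acts as the unity of the ring on every index tried.
print(all(mul(f, const(1))(n) == f(n) for n in range(50)))
```

Only finitely many indices can ever be inspected this way, so the code illustrates the definitions rather than proving anything about the whole sequence ring.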

Definition 2.1. (Equality in 𝒰.) Let A, B ∈ ℝ^ℕ. Define A =_𝒰 B iff {n | A_n = B_n} = U ∈ 𝒰. (This is the set of all n ∈ ℕ such that the values of the sequences A and B are equal.)

It has been said that the most important binary relation within mathematics is the equivalence relation. This relation, in general, behaves like = except that you may not be allowed to “substitute” one equivalent object for another. Recall that for a set X a binary relation R is an equivalence relation on X iff it has the following properties. For each x, y, z ∈ X, (i) xRx (reflexive property); (ii) if xRy, then yRx (symmetric property); [Note that if this holds, then xRy iff yRx.] (iii) if xRy and yRz, then xRz (transitive property). Hence, it is almost an “equality.”

Theorem 2.2. The relation =_𝒰 is an equivalence relation on ℝ^ℕ.

Proof. Of course, properties of the = for members of ℝ are used. First, notice that {n | A_n = A_n} = ℕ ∈ 𝒰 for any A ∈ ℝ^ℕ. Thus, the relation is reflexive.

Clearly, for any A, B ∈ ℝ^ℕ, if {n | A_n = B_n} ∈ 𝒰, then {n | B_n = A_n} ∈ 𝒰. Thus, the relation is symmetric.

Finally, suppose that A, B, C ∈ ℝ^ℕ and A =_𝒰 B and B =_𝒰 C. Hence, {n | A_n = B_n} ∈ 𝒰 and {n | B_n = C_n} ∈ 𝒰. The word “and” implies, since 𝒰 is a filter, that {n | A_n = B_n} ∩ {n | B_n = C_n} ∈ 𝒰. Of course, this “intersection” need not give all the values of ℕ that these three sequences have in common, but that does not matter since the “superset” property for a filter implies from the result

{n | A_n = B_n} ∩ {n | B_n = C_n} ⊂ {n | A_n = C_n}

that {n | A_n = C_n} ∈ 𝒰.

[Note: In the above “proof,” the two-step process of getting the common members by the “intersection” and using the superset property is a major proof method.]
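The inclusion used in the transitivity step can be seen concretely by computing agreement sets on a finite window of indices. The three sequences below are made-up examples chosen so that the inclusion is strict; membership in a free ultrafilter itself is not computable, so only the set inclusion is checked.

```python
def agreement(f, g, window):
    """Indices n in the window at which f(n) == g(n)."""
    return {n for n in window if f(n) == g(n)}

window = range(30)
A = lambda n: n % 6
B = lambda n: n % 6 if n % 2 == 0 else -1   # agrees with A exactly on even n
C = lambda n: n % 6 if n % 3 == 0 else -2   # agrees with A exactly on multiples of 3

AB = agreement(A, B, window)
BC = agreement(B, C, window)
AC = agreement(A, C, window)

print(sorted(AB & BC))   # common indices obtained by intersection
print(sorted(AC))        # the larger set reached by the superset property
print(AB & BC <= AC)
```

Here the intersection consists of the multiples of 6 while A and C agree on all multiples of 3, which matches the remark that the intersection need not capture every common index.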


Definition 2.3. (Equivalence classes.) We now use the relation =_𝒰 to define actual subsets of ℝ^ℕ. For each A ∈ ℝ^ℕ, let the set [A] = {x | (x ∈ ℝ^ℕ) ∧ (x =_𝒰 A)}.

It is easy to show that for each A, B ∈ ℝ^ℕ, either [A] = [B] or [A] ∩ [B] = ∅. (The “=” here is the set-theoretic equality.) Further, ℝ^ℕ = ⋃{[x] | x ∈ ℝ^ℕ}. That is, the set ℝ^ℕ is completely partitioned into (separated into, broken up into) these non-overlapping nonempty sets. Because of these properties, we can use any member of the set [A] to generate the set. That is, if B, C ∈ [A], then [A] = [B] = [C]. As to notation, when I’m not particularly interested in a sequence that generates the equivalence class, I’ll denote them by lower case letters a, b, c, ….

Denote the set of all of these equivalence classes by ∗ℝ and call this set the set of all hyperreal numbers. (The ∗ is often translated as “hyper.”) Consequently, ∗ℝ = {[A] | A ∈ ℝ^ℕ}. After various relations are defined on ∗ℝ, the resulting “structure” is generally called an ultrapower. Indeed, it’s this ultrapower that will act as our nonstandard model for portions of real analysis. There’s still a lot of work to do to turn ∗ℝ into such a model, but to motivate this work I’ll simply mention that if you take a sequence s that converges in the normal calculus sense to 0, then [s] is one of our infinitesimals. What will be done, after the ultrapower model is constructed, is to “embed” ⟨ℝ, +, ·, ≤, Φ_i⟩ into the ultrapower so that comparisons can be easily made between the “standard” objects that represent the properties of the actual real numbers and other objects in the ultrapower. The notation ⟨ℝ, +, ·, ≤, Φ_i⟩ identifies the carrier, ℝ, as well as certain specialized relations defined for (on) the carrier.

There are two approaches to analyzing this ultrapower: a direct and tedious method, and a method that uses notions from Mathematical Logic. Once everything is constructed and the embedding is secured, then the embedded objects become our standard objects. The set of nonstandard objects is the remainder of the ultrapower.

Definition 2.4. (Addition and multiplication for ∗ℝ.) Consider any a, b, c ∈ ∗ℝ. Define a ∗+ b = c iff {n | A_n + B_n = C_n} ∈ 𝒰. [Note: such definitions assume that you have selected some sequences A ∈ a, B ∈ b, C ∈ c.] Now define a ∗· b = c iff {n | (A_n) · (B_n) = C_n} ∈ 𝒰.

Whenever such definitions are made by taking members of a set that contains more than one member, it is always necessary to show that they are well-defined in that the result is not dependent upon the member one chooses. The next result shows how this is done and gives insight as to how it will be done later in complete generality.

Theorem 2.5. The operations defined in Definition 2.4 are well-defined.

Proof. Let A, D ∈ a and B, F ∈ b. Notice that {n | A_n = D_n} ∈ 𝒰 and {n | B_n = F_n} ∈ 𝒰 imply that {n | A_n = D_n} ∩ {n | B_n = F_n} ∈ 𝒰, and {n | A_n = D_n} ∩ {n | B_n = F_n} ⊂ {n | A_n + B_n = D_n + F_n} implies by the superset property that {n | A_n + B_n = D_n + F_n} ∈ 𝒰. Thus the ∗+ is well-defined. (Note: Processes of this type that use filter properties that imply something is a member of a filter will be abbreviated.) In like manner, for the ∗·.

Thus far, the fact that 𝒰 is an ultrafilter has not been used. But, for the structure ⟨∗ℝ, ∗+, ∗·⟩ to have all the necessary mathematical “field” properties, this ultrafilter property is significant. That is, it is needed so that the ∗+, ∗· arithmetic behaves for ∗ℝ like +, · behave for real number arithmetic.


Theorem 2.6. For the structure ⟨∗ℝ, ∗+, ∗·⟩

(i) [0] is the additive identity;

(ii) for each a = [A] ∈ ∗ℝ, −a = [−A] is the additive inverse;

(iii) [1] is the multiplicative identity;

(iv) if a ≠ [0], then there exists b = [B] ∈ ∗ℝ such that a ∗· b = [1];

(v) if, for each n ∈ ℕ, D_n = A_n + B_n and E_n = A_nB_n, then [A] ∗+ [B] = [D] and [A] ∗· [B] = [E]. That is, our definitions for addition and multiplication of sequences and the hyper-operators ∗+, ∗· are compatible.

Proof. (i) Let [A] ∗+ [0] = [C]. Considering that {n | A_n + 0_n = C_n} ∈ 𝒰 and {n | A_n + 0_n = C_n} ⊂ {n | A_n = C_n} ∈ 𝒰, then [A] = [C].

(ii) Let [−A] = [B]. Then once again {n | A_n + (−A_n) = 0 = 0_n} = ℕ ∈ 𝒰 and thus [A] ∗+ [−A] = [0].

(iii) This follows in the same manner as (i).

(iv) Let [A] ≠ [0]. Then {n | A_n = 0 = 0_n} = U ∉ 𝒰. Hence, ℕ − U = {n | A_n ≠ 0} ∈ 𝒰 since 𝒰 is an ultrafilter. Define

B_n = A_n⁻¹ if A_n ≠ 0; B_n = 0 if A_n = 0.

Notice that {n | A_n · B_n = 1 = 1_n} = {n | A_n ≠ 0} ∈ 𝒰. Hence, [A] ∗· [B] = [1].

(v) By definition, [A] ∗+ [B] = [C] iff {n | A_n + B_n = C_n} ∈ 𝒰. However, {n | A_n + B_n = D_n} = ℕ ∈ 𝒰. Hence, {n | A_n + B_n = C_n} ∩ {n | A_n + B_n = D_n} = {n | C_n = D_n} ∈ 𝒰. Thus, [C] = [D]. In like manner, the result holds for “multiplication.”
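The sequence B built in part (iv) can be written out explicitly, and the set identity used there can be checked on a finite window. The sample sequence below, with a single zero at n = 3, is an assumption chosen so that its set of nonzero indices is cofinite and therefore belongs to every free ultrafilter by Theorem 1.10. Exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def pointwise_inverse(A):
    """The sequence B of part (iv): B_n = 1/A_n where A_n != 0, and 0 elsewhere."""
    def B(n):
        a = A(n)
        return Fraction(1) / a if a != 0 else Fraction(0)
    return B

A = lambda n: Fraction(n - 3)     # sample sequence; its only zero is at n = 3
B = pointwise_inverse(A)

window = range(12)
product_is_one = {n for n in window if A(n) * B(n) == 1}
nonzero = {n for n in window if A(n) != 0}

print(sorted(product_is_one))
print(product_is_one == nonzero)
```

The product sequence equals 1 exactly where A is nonzero, which is the set that the ultrafilter property places in the ultrafilter.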

Clearly, one can continue Theorem 2.6 and show that ⟨∗ℝ, ∗+, ∗·⟩ satisfies all of the “field” axioms. It should be obvious, by now, how the “order” relation for ∗ℝ is defined.

Definition 2.7. (Order.) For each a = [A], b = [B] ∈ ∗ℝ define a ∗≤ b iff {n | A_n ≤ B_n} ∈ 𝒰.

I won’t show that this relation is well-defined at this time since I’ll do it later for all such relations. But, we might as well show that this ∗≤ is, indeed, a total order and that, considered as a binary relation only, ∗≤ behaves for ⟨∗ℝ, ∗+, ∗·, ∗≤⟩ like ≤ behaves for ℝ.

Theorem 2.8. The structure ⟨∗ℝ, ∗+, ∗·, ∗≤⟩ is a totally ordered field.

Proof. First, notice that {n | A_n ≤ A_n} = ℕ ∈ 𝒰. Thus, ∗≤ is reflexive.

Next, this relation needs to be anti-symmetric. So, assume that [A] ∗≤ [B], [B] ∗≤ [A]. Then {n | A_n ≤ B_n} ∩ {n | B_n ≤ A_n} ⊂ {n | A_n = B_n} ∈ 𝒰. Hence, [A] = [B].

For transitivity, consider [A] ∗≤ [B], [B] ∗≤ [C]. Then {n | A_n ≤ B_n} ∩ {n | B_n ≤ C_n} ⊂ {n | A_n ≤ C_n} ∈ 𝒰. Thus, [A] ∗≤ [C]. (Notice that the same processes seem to be used each time. They are that 𝒰 is closed under finite intersection and supersets.)

Next to the notion of “totally.” Let [A], [B] ∈ ∗ℝ. Suppose that [A] ∗≰ [B]. Then {n | A_n ≤ B_n} ∉ 𝒰 and, since 𝒰 is an ultrafilter, the trichotomy law for ℝ gives {n | A_n > B_n} ∈ 𝒰. Hence, [A] ∗> [B] or [A] ∗< [B] or [A] = [B]. To show that it is a totally ordered “field” all that’s really needed is to show that it satisfies two properties related to this order and the ∗+, ∗· operators. So, let [A], [B], [C] ∈ ∗ℝ. Let [A] ∗≤ [B]. Then {n | A_n ≤ B_n} ⊂ {n | A_n + C_n ≤ B_n + C_n} ∈ 𝒰. Thus [A] ∗+ [C] ∗≤ [B] ∗+ [C]. Now suppose that [0] ∗≤ [A], [B]. Then {n | 0 ≤ A_n} ∩ {n | 0 ≤ B_n} ⊂ {n | 0 ≤ A_nB_n} ∈ 𝒰. Thus [0] ∗≤ [A] ∗· [B].


By the way, using repeatedly the ultrafilter properties to establish the above results is actually unnecessary when a more general result from Mathematical Logic is used. It’s this general result that gives me complete confidence that these theorems can be established directly. Indeed, the very definition for the ∗ operators comes from this more powerful approach. I’ll introduce a major one of these Mathematical Logic results shortly.

There is often introduced into this subject certain concepts from abstract algebra and abstract model theory. I’ve decided to avoid this as much as possible for this simplified version. But, now and then, I need to simply state that something holds due to results from these two areas and you need to have confidence that such statements are fact.

What happens next is to “embed” the structure ⟨ℝ, +, ·, ≤⟩ into ⟨∗ℝ, ∗+, ∗·, ∗≤⟩ so that the relations +, ·, ≤ can be considered as but the relations ∗+, ∗·, ∗≤ restricted to ℝ. All one does is define a function f that takes each x ∈ ℝ and gives the unique [R], where {n | R_n = x} ∈ 𝒰. Notice that one such representation for [R] is the sequence X_n = x for each n ∈ ℕ. Then {n | X_n = x} = ℕ ∈ 𝒰. This is called the constant sequence representation for x in ∗ℝ. This function determines what is called a model theoretic isomorphism when the relations ∗+, ∗·, ∗≤ are restricted to the [X] and is what is used to embed ⟨ℝ, +, ·, ≤⟩ into ⟨∗ℝ, ∗+, ∗·, ∗≤⟩. One of the big results from abstract model theory states that if one expresses the properties of ⟨ℝ, +, ·, ≤⟩ in the customary mathematicians’ way (as a first-order predicate statement with constants), then every theorem that holds true in ⟨ℝ, +, ·, ≤⟩ will hold true when interpreted within this embedding. It’s important to note that the real numbers ℝ are constructed within our basic set theory. Hence, the object ℝ has a lot of properties. It’s assumed that all such properties that can be properly expressed using our present or future defined operations or relations also hold for the structure ⟨ℝ, +, ·, ≤⟩. Thus, simply consider ⟨ℝ, +, ·, ≤⟩ as a piece (a substructure) of the structure ⟨∗ℝ, ∗+, ∗·, ∗≤⟩. Under this embedding, the notation can be simplified somewhat by dropping the ∗ from the relations ∗+, ∗·, ∗≤, always keeping in mind that the structure ⟨ℝ, +, ·, ≤⟩ is formed by simply restricting these relations to members of the embedded ℝ. As mentioned, each object with which we work and that becomes part of this embedding will be called a standard object. All other objects discussed are nonstandard objects.

At this point, I could go on to some abstract algebra and show without any doubt that the structures ⟨ℝ, +, ·, ≤⟩ and ⟨∗ℝ, +, ·, ≤⟩, although they are both totally ordered fields, are not the same. But, let’s just show that there is a property that ⟨ℝ, +, ·, ≤⟩ has that ⟨∗ℝ, +, ·, ≤⟩ does not have.

Theorem 2.9. A field property holds for ⟨ℝ, +, ·, ≤⟩ that does not hold for ⟨∗ℝ, +, ·, ≤⟩.

Proof. There is a property of ⟨ℝ, +, ·, ≤⟩ that states that for each 0 ≤ r ∈ ℝ there exists an n ∈ ℕ such that r < n. Now the set ℕ is a subset of ℝ and in the embedded form (not yet introduced) ℕ ⊂ ∗ℝ. Consider the sequence A_n = n. Then [A] ∈ ∗ℝ. The ultrafilter 𝒰 is free and does not contain any finite sets. Thus, for each m ∈ ℕ, {n | A_n ≤ m} ∉ 𝒰. Hence, {n | A_n > m} ∈ 𝒰. This means that [A] > [M] = m, where M is the constant sequence M_n = m. Since m is arbitrary, [A] > [M] for each m ∈ ℕ. Hence, at least for the ordinary embedded ℕ, this field property for ⟨ℝ, +, ·, ≤⟩ does not hold for ⟨∗ℝ, +, ·, ≤⟩. For those that understand the terminology, the field ∗ℝ is also not complete.

From our definition, the A_n = n used to establish Theorem 2.9 would be a nonstandard object. Now let’s add a vast number of additional relations Φ_i to our structures. This will allow us to apply these notions to analysis. The next idea is to “carve out” from our set theory some of the important set-theoretic objects used throughout nonstandard analysis.


Definition 2.10. (Hyper (∗) Extensions of standard objects.) Let 𝒰 be a free ultrafilter. For any C ⊂ ℝ (a 1-ary relation), let b = [B] ∈ ∗C iff {n | B_n ∈ C} ∈ 𝒰. Let Φ be any k-ary (k > 1) relation. Then (a_1, …, a_k) = ([A_1], …, [A_k]) ∈ ∗Φ iff {n | (A_1(n), …, A_k(n)) ∈ Φ} ∈ 𝒰. This extension process can be continued for other mathematical entities as required.

Now if it’s shown, in general, that these definitions are well-defined, then we can add to our

structure additional n-ary relations Φi and get the structures 〈IR,+, ·,≤,Φi〉 and 〈∗IR,+, ·,≤, ∗Φi〉. In which case, as before, we would have that Φ ⊂ ∗Φ because all of these relations are actually

defined in terms of members taken from IR.

Theorem 2.11. The hyper-extensions defined in 2.10 are well-defined.

Proof. In general, for any $[B] \in {}^*\mathbb{R}$, let $[B] = [B']$. That is, let $B' \in \mathbb{R}^{\mathbb{N}}$ be any other member of the equivalence class. Let $C \subset \mathbb{R}$. Then

$$\{n \mid B_n = B'_n\} \subset \{n \mid (B_n \in C) \text{ if and only if } (B'_n \in C)\},$$

$$\{n \mid B_n \in C\} \cap \{n \mid (B_n \in C) \text{ if and only if } (B'_n \in C)\} \subset \{n \mid B'_n \in C\},$$

$$\{n \mid B'_n \in C\} \cap \{n \mid (B_n \in C) \text{ if and only if } (B'_n \in C)\} \subset \{n \mid B_n \in C\}.$$

The result for this case follows.

For the other k-ary relations, proceed as just done but alter the proof by starting with

$$\{n \mid B_1(n) = B'_1(n)\} \cap \cdots \cap \{n \mid B_k(n) = B'_k(n)\} \subset \{n \mid (B_1(n),\dots,B_k(n)) \in \Phi \text{ if and only if } (B'_1(n),\dots,B'_k(n)) \in \Phi\}.$$

This completes the proof.

Definition 2.12. (Standard objects operator σ.) I’m using symbols such as x, y, z, w to

represent members of IR or for n > 1 as members of IRn = IR × · · · × IR, with “n” factors. Later,

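the Roman font for “variables” in formal expressions is used. For each $x \in \mathbb{R}$, let ${}^*x = [X] \in {}^*\mathbb{R}$, where $\{n \mid X_n = x\} = \mathbb{N}$ (the constant sequence). Then for $X \subset \mathbb{R}$, let ${}^\sigma X = \{{}^*x \mid x \in X\} \subset {}^*\mathbb{R}$. For $n > 1$ and each $x = (x_1,\dots,x_n) \in \mathbb{R}^n$, let ${}^*x = ({}^*x_1,\dots,{}^*x_n) \in {}^*(\mathbb{R}^n)$. For $X \subset \mathbb{R}^n$, ${}^\sigma X = \{{}^*x \mid x \in X\} \subset {}^*(\mathbb{R}^n)$. Each such ${}^*x$ and ${}^\sigma X$ is called a standard object. Thus, ${}^\sigma\mathbb{R}$ is the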

set of embedded real numbers.

What Definition 2.12 does is to identify within 〈∗IR,+, ·,≤, ∗Φi〉 the embedded 〈IR,+, ·,≤,Φi〉 objects. For this structure, it’s significant that not all useful objects can be hyper-extended by the

above, actually necessary, ultrafilter defined extension process. Indeed, because we are only using

sequences with range values in IR, various members of P(P(IR)) cannot be extended. Further, there’s

a problem if the membership relation ∈ is extended. Nonstandard analysis exists as a discipline only

because the structures 〈IR,+, ·,≤,Φi〉 and 〈∗IR, ∗+, ∗·, ∗≤, ∗Φi〉 can be analyzed externally since

they exist as objects in the model of the set theory being used for their construction. In formal set

theory as it might appear in Jech (1971), you find that the natural numbers have the property that

0 ∈ 1 ∈ 2 ∈ 3 ∈ 4 · · · and n ∉ n. The ∈ relation is said to be well founded because there are no sequences of members of this set theory that have this process reversed. There are no objects such that · · · a ∈ b ∈ c ∈ d. If, however, the ∈ is extended to ∗∈ for members of IR, then this ∗∈ is not well founded.


Example 2.13 Suppose that we do define the ∗∈ for appropriate members of IR using definition

2.10 and using a set theory like Jech (1971). Thus ∗∈ is defined for each n ∈ IN as the IN is defined

within this set theory. Now let’s define a collection of sequences from $\mathbb{N}$ into $\mathbb{R}$ as follows: for each $n \in \mathbb{N}$, let

$$f_n(i) = \begin{cases} 0, & \text{if } i \in \mathbb{N} \text{ and } i \le n,\\ i - n, & \text{if } i > n. \end{cases}$$

Here is what some of these sequences look like.

$f_0(0) = 0,\ f_0(1) = 1,\ f_0(2) = 2,\ f_0(3) = 3,\ f_0(4) = 4, \dots;$

$f_1(0) = 0,\ f_1(1) = 0,\ f_1(2) = 1,\ f_1(3) = 2,\ f_1(4) = 3, \dots;$

$f_2(0) = 0,\ f_2(1) = 0,\ f_2(2) = 0,\ f_2(3) = 1,\ f_2(4) = 2,\ f_2(5) = 3, \dots.$

Thus, the sequences after the “0” values have “shifting” range values. From the definition of ${}^*\!\in$, it follows that $\cdots [f_2]\ {}^*\!\in [f_1]\ {}^*\!\in [f_0]$. To see this, take, say, $[f_2], [f_1]$. Then $f_2(0) = 0 \notin f_1(0) = 0$, $f_2(1) = 0 \notin f_1(1) = 0$, $f_2(2) = 0 \in f_1(2) = 1$, $f_2(3) = 1 \in f_1(3) = 2, \dots$. Hence, $\{n \mid f_2(n) \in f_1(n)\} = \{n \mid n > 1\} \in \mathcal{C} \subset \mathcal{U}$.
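The table of values and the membership set in Example 2.13 can be checked mechanically on a finite window. This is a minimal sketch under one stated assumption: the natural numbers are the von Neumann naturals of the set theory, so that a ∈ b is the same as a < b. The function names `f` and `membership_indices` are chosen here for illustration.

```python
def f(n, i):
    """The sequence f_n of Example 2.13 evaluated at index i."""
    return 0 if i <= n else i - n


def membership_indices(k, window):
    """Indices i < window with f_{k+1}(i) in f_k(i).

    Assumption: for von Neumann naturals, a in b means a < b.
    """
    return [i for i in range(window) if f(k + 1, i) < f(k, i)]


print([f(0, i) for i in range(5)])   # values of f_0
print([f(1, i) for i in range(5)])   # values of f_1
print([f(2, i) for i in range(6)])   # values of f_2
print(membership_indices(1, 10))     # indices where f_2(i) is in f_1(i)
```

On any window the set of indices where membership fails is finite, so the set where it holds is cofinite, which is what places it in every free ultrafilter.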

Thus, when viewed from the external set theory, ∗∈ is not well founded and does not behave

in the same manner as does ∈ . Further, the ∈ is used to define the “hyper” objects. In order to

avoid this problem for the most basic level, the set IR is considered a set of atoms (Jech, 1971) or

urelements or individuals (Suppes, 1960). This means that each member of IR is not considered

as a set and a statement such as x ∈ y where x, y ∈ IR, has no meaning for our set theory.


3. HYPER-SET ALGEBRA

INFINITE AND INFINITESIMAL NUMBERS

Usually, it’s assumed that we are working with one specific free ultrafilter. Is this of any

significance for our embedding?

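Theorem 3.1. Let infinite $X \subset \mathbb{N}$. Then there exists a free ultrafilter $\mathcal{U}$ such that $X \in \mathcal{U}$. Let $[A], [B] \in {}^*\mathbb{R}$. Then $[A] = [B]$ for all free ultrafilters iff $\{n \mid A_n = B_n\} \in \mathcal{C}$.

Proof. Let infinite $X \subset \mathbb{N}$. Suppose that $A \in \mathcal{C}$ and $A \cap X = \emptyset$. Then $X \subset \mathbb{N} - A$, which is a finite set. Since $\mathcal{C}$ has the finite intersection property, this contradiction implies that $\mathcal{C} \cup \{X\}$ has the finite intersection property. Hence, there is an ultrafilter $\mathcal{U}$ such that $\mathcal{C} \cup \{X\} \subset \mathcal{U}$, and $\mathcal{U}$ is free since it contains $\mathcal{C}$. Obviously, if $\{n \mid A_n = B_n\} \in \mathcal{C}$, then $[A] = [B]$ for all free ultrafilters. Conversely, suppose that $[A] = [B]$ for every free ultrafilter and that $\{n \mid A_n = B_n\} \notin \mathcal{C}$. Then $X = \{n \mid A_n \ne B_n\}$ is infinite. Hence there is some free ultrafilter $\mathcal{U}_1$ with $X \in \mathcal{U}_1$. Thus for this ultrafilter $[A] \ne [B]$, a contradiction, and the proof is complete.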

Later, for Theorem 3.11, I’ll use this result to show that nonstandard objects contained in the

same defined set may be considerably different if different free ultrafilters are used. However, the

actual results obtained when this material is applied to real analysis are, unless otherwise stated,

free ultrafilter independent.

The objects that appear in each structure are also objects that can be discussed by means of

the set theory of which these objects are members. It’s possible to extend these structures to include

other objects from this set theory. Shortly, the *-transform process is introduced and the structures

will be slightly extended to use this process in a technically correct manner.

Theorem 3.2. ∗-Algebra.

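(i) ${}^*\emptyset = \emptyset$.

(ii) If $X \subset \mathbb{R}$ [resp. $\mathbb{R}^n$], then ${}^\sigma X \subset {}^*\mathbb{R}$ [resp. ${}^*(\mathbb{R}^n)$].

(iii) If $X \subset \mathbb{R}$, then ${}^*x \in {}^\sigma X$ iff $x \in X$ iff ${}^*x \in {}^*X$.

(iv) Let $X, Y \subset \mathbb{R}$. Then $X \subset Y$ iff ${}^*X \subset {}^*Y$.

(v) Let $X, Y \subset \mathbb{R}$. Then ${}^*(X - Y) = {}^*X - {}^*Y$.

(vi) Let $X, Y \subset \mathbb{R}$. Then ${}^*(X \cup Y) = {}^*X \cup {}^*Y$. Also, ${}^*(X \cap Y) = {}^*X \cap {}^*Y$.

(vii) Let $X \subset \mathbb{R}$. Then $X$ is nonempty and finite iff ${}^*X = {}^\sigma X$.

(viii) Let $X_1, \dots, X_n \subset \mathbb{R}$. For the customarily defined $n$-ary relations, $\Phi = (X_1 \times \cdots \times X_{n-1}) \times X_n$ iff ${}^*\Phi = {}^*(X_1 \times \cdots \times X_{n-1}) \times {}^*X_n = ({}^*X_1 \times \cdots \times {}^*X_{n-1}) \times {}^*X_n$. Thus, ${}^*(\mathbb{R}^n) = ({}^*\mathbb{R})^n$.

(ix) The statements (iii), (iv), (v), (vi) and (vii) hold for $\mathbb{R}^n$, $n > 1$.

(x) For $i > 1$ and $\emptyset \ne \Phi \subset \mathbb{R}^n$, let $P_i$ denote the set-theoretic $i$’th projection map. Then ${}^*(P_i(\Phi)) = P_i({}^*\Phi)$.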

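Proof. (i) If $S = \emptyset$, then for any $A \in \mathbb{R}^{\mathbb{N}}$, $\{n \mid A_n \in S\} = \emptyset \notin \mathcal{U}$. Thus our hyper-set algebra yields that ${}^*\emptyset = \emptyset$.

(ii) This is simply a repeat of Definition 2.12.

(iii) By definition, ${}^*x \in {}^\sigma X$ iff $x \in X$. Now assume that $x \in X$. Then, by definition, ${}^*x = [X_n]$, $X_n = x$ for each $n \in \mathbb{N}$. Hence, $\{n \mid X_n \in X\} = \mathbb{N} \in \mathcal{U}$. Thus, ${}^*x \in {}^*X$. Conversely, assume that ${}^*x \in {}^*X$. By definition, ${}^*x = [X_n]$ and $X_n = x$ for all $n \in \mathbb{N}$. Thus $\{n \mid X_n \in X\} \in \mathcal{U}$ is nonempty and, since $X_n = x$ for all $n$, $\{n \mid X_n \in X\} = \mathbb{N}$. Hence, $x \in X$.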


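(iv) Let $X \subset Y \subset \mathbb{R}$. Then $X \subset \mathbb{R}$ and $a \in {}^*X$ iff $\{n \mid A_n \in X\} \in \mathcal{U}$. But, $\{n \mid A_n \in X\} \subset \{n \mid A_n \in Y\}$. Thus, $\{n \mid A_n \in Y\} \in \mathcal{U}$, and so $a \in {}^*Y$. Now assume that ${}^*X \subset {}^*Y$. Then for each $x \in X$, ${}^*x \in {}^*X$ by (iii). Thus ${}^*x \in {}^*Y$. Again by (iii), $x \in Y$. Thus $X \subset Y$.

(v) First, notice that $X - Y \subset \mathbb{R}$. Let $a \in {}^*(X - Y)$. Then $\{n \mid A_n \in (X - Y)\} = U \in \mathcal{U}$. But, this implies that $U \subset \{n \mid A_n \in X\} \in \mathcal{U}$ and $U \subset \{n \mid A_n \notin Y\}$. Thus, $\{n \mid A_n \notin Y\} \in \mathcal{U}$. Hence, $a \notin {}^*Y$. Consequently, $a \in {}^*X - {}^*Y$. I’m sure you can establish the converse that $a \in {}^*X - {}^*Y$ implies that $a \in {}^*(X - Y)$.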
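The converse direction in (v) can be sketched as follows. It uses only two ultrafilter properties: a set or its complement belongs to $\mathcal{U}$, and $\mathcal{U}$ is closed under finite intersection. Here $a = [A]$, as in the rest of the proof.

```latex
\begin{align*}
a \in {}^*X - {}^*Y
 &\implies \{n \mid A_n \in X\} \in \mathcal{U}
   \ \text{and}\ \{n \mid A_n \in Y\} \notin \mathcal{U} \\
 &\implies \{n \mid A_n \in X\} \in \mathcal{U}
   \ \text{and}\ \{n \mid A_n \notin Y\} \in \mathcal{U} \\
 &\implies \{n \mid A_n \in X\} \cap \{n \mid A_n \notin Y\}
   = \{n \mid A_n \in (X - Y)\} \in \mathcal{U} \\
 &\implies a \in {}^*(X - Y).
\end{align*}
```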

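(vi) The sets $X \cup Y$ and $X \cap Y$ are subsets of $\mathbb{R}$. Now simply notice that the following identity characterizes the intersection operator: $C = X \cap Y = X - (X - Y)$. Thus, by (v), ${}^*C = {}^*(X \cap Y) = {}^*X - ({}^*X - {}^*Y) = {}^*X \cap {}^*Y$. Next, $a \in {}^*(X \cup Y)$ iff $\{n \mid A_n \in (X \cup Y)\} = \{n \mid A_n \in X\} \cup \{n \mid A_n \in Y\} \in \mathcal{U}$. Hence, if $\{n \mid A_n \in (X \cup Y)\} \in \mathcal{U}$, then, since $\mathcal{U}$ is an ultrafilter, either $\{n \mid A_n \in X\} \in \mathcal{U}$ or $\{n \mid A_n \in Y\} \in \mathcal{U}$. Thus, ${}^*(X \cup Y) \subset {}^*X \cup {}^*Y$. Since $X \subset (X \cup Y)$ and $Y \subset (X \cup Y)$, it follows from (iv) that ${}^*X \cup {}^*Y \subset {}^*(X \cup Y)$ and the result follows.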

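(vii) The first part is established by induction. Let $X = \{x\}$. Then $\{x\} \subset \mathbb{R}$. By definition, $b = [B] \in {}^*X$ iff $\{n \mid B_n \in \{x\}\} = \{n \mid B_n = x\} \in \mathcal{U}$. Now ${}^*x = [X]$ and $\{n \mid X_n = x\} = \mathbb{N} \in \mathcal{U}$. Thus, $\{n \mid X_n = B_n\} = \{n \mid X_n = x\} \cap \{n \mid B_n = x\} \in \mathcal{U}$ implies that $[X] = [B] = {}^*x$. Hence ${}^*\{x\} = \{{}^*x\}$. Assume the result holds for a set with $k$ members. Then ${}^*\{x_1, \dots, x_{k+1}\} = {}^*(\{x_1, \dots, x_k\} \cup \{x_{k+1}\}) = {}^*\{x_1, \dots, x_k\} \cup {}^*\{x_{k+1}\} = \{{}^*x_1, \dots, {}^*x_{k+1}\}$ by the induction hypothesis and (vi). Thus the result holds for any $k \ge 1$.

For the converse, let infinite $X \subset \mathbb{R}$ and assume that ${}^\sigma X = {}^*X$. There exists an injection $B\colon \mathbb{N} \to X$. Hence $\{B_n \mid n \in \mathbb{N}\}$ is an infinite subset of $X$. Let ${}^*x = [X] \in {}^\sigma X$. Then $X_n = x \in X$ for each $n \in \mathbb{N}$. But, $\{n \mid X_n = B_n\}$ is finite. Hence, $[X] \ne [B]$ since $\{n \mid X_n \ne B_n\} \in \mathcal{C}$. Also $\{n \mid B_n \in X\} = \mathbb{N} \in \mathcal{U}$ implies that $b \in {}^*X$. That there is no $x \in X$ such that ${}^*x = b \in {}^*X$ implies ${}^\sigma X \ne {}^*X$.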

(viii) The customarily defined notion of an n-ary relation can be found in Jech (1971). The first idea is that a 1-ary relation is but a subset of $\mathbb{R}$, and this case has been established in (iii). For the other cases, $n > 1$, with $x_i \in X_i \subset \mathbb{R}$, $1 \le i \le n$, the Cartesian product is characterized by the statement that $(x_1, \dots, x_n) \in (X_1 \times \cdots \times X_{n-1}) \times X_n$ iff $x_i \in X_i$, $1 \le i \le n$, where the actual “Cartesian product” is defined by induction. That is, $X_1 \times X_2 \times X_3 = (X_1 \times X_2) \times X_3$, etc. (There are other ways to define the Cartesian product more formally just using 2-tuples and finite sequences.) Note that for any $k > 1$, $\{n \mid (A_1(n), \dots, A_k(n)) \in (X_1 \times \cdots \times X_{k-1}) \times X_k\} = \{n \mid A_1(n) \in X_1\} \cap \cdots \cap \{n \mid A_k(n) \in X_k\}$. The result follows from basic filter properties.

(ix) Statements (iii), (iv), (v) are proved in the exact same manner for (∗IR)n. Statements (vi),

(vii) are proved by application of the method used in (vii) coupled with the characterization used

to establish (viii).

(x) Let $a \in {}^*(P_i(\Phi))$. Then $\{n \mid A_n \in P_i(\Phi)\} \in \mathcal{U}$ iff $\{n \mid \text{there exist } B_1, \dots, B_{i-1}, B_{i+1}, \dots, B_m \text{ such that } (B_1(n), \dots, B_{i-1}(n), A(n), B_{i+1}(n), \dots, B_m(n)) \in \Phi\} \in \mathcal{U}$. Hence, there exist $b_j$, $1 \le j \le m$, $i \ne j$, such that $(b_1, \dots, b_{i-1}, a, b_{i+1}, \dots, b_m) \in {}^*\Phi$. But from the definition of such a projection this gives that $a \in P_i({}^*\Phi)$. Thus ${}^*(P_i(\Phi)) = P_i({}^*\Phi)$ because of the equivalence of

the two set-theoretic statements. (At present, I have not introduced a more formal way of writing

definitions for such sets.)

Important. Theorem 3.2 involves properties about our original structure 〈IR,+, ·,≤,Φi〉 and the 〈∗IR,+, ·,≤, ∗Φi〉 and the embedded objects. Although the embedded objects “behave” like


the original objects, they’re still different from these. This observation will come into play when I

discuss the notion of *-transform.

It’s about time that I demonstrated that the “ideal” numbers used by Leibniz, that did not

really exist mathematically until 1961, exist within ∗IR. These are the infinitesimals which solve this

three hundred year old problem.

Definition 3.3. (Infinite and infinitesimal numbers.) As usual, define the absolute value function (i.e. binary relation) for members $a \in {}^*\mathbb{R}$ by requiring that ${}^*|a| = |a| = b$ iff $\{n \mid |A_n| = B_n\} \in \mathcal{U}$. Although I won’t show it here (later I’ll show how to establish the fact), this function ${}^*|\cdot|$ has the same mathematical properties as does $|\cdot|$ for members of $\mathbb{R}$. So, I have written it as if it’s a restriction to our embedded ${}^\sigma\mathbb{R}$ of the usual absolute value function. An $a \in {}^*\mathbb{R}$ is infinitely large, or simply an infinite number, or shorter still infinite, iff ${}^*x < |a|$ for each ${}^*x \in {}^\sigma\mathbb{R}$. (Some might go back to the pre-embedded $\mathbb{R}$ for these definitions, but remember we are thinking of ${}^\sigma\mathbb{R}$ as our actual set of real numbers.) A $b \in {}^*\mathbb{R}$ is an infinitesimal or, as Newton stated, infinitely small iff $0 \le |b| < {}^*x$ for each $x \in \mathbb{R}^+$, the set of all positive real numbers.

Now that these “new” types of numbers are defined, do any exist?

Example 3.4. Let $A$ be the member of $\mathbb{R}^{\mathbb{N}}$ with the property that $A_k = k$ for each $k \in \mathbb{N}$, and let $a = [A]$. Let ${}^*x \in {}^\sigma\mathbb{R}$. Then there exists some $m \in \mathbb{N}$ such that $|x| < m$. Hence, $|x| = |X_n| < A_m = m$ for each $n \in \mathbb{N}$ implies that $\{n \mid A_n > |X_n|\} \supset \{m, m+1, \dots\} \in \mathcal{C} \subset \mathcal{U}$. Thus, $a$ is an infinite number. There are a lot more.

Note that ${}^*0$ is the trivial infinitesimal. Consider the sequence $G_n = 1/n$, $n \in \mathbb{N} - \{0\}$, and $G_0 = 0$, and let $g = [G]$. Then $g \ne {}^*0$. Now for each $x \in \mathbb{R}^+$ there is some $m \in \mathbb{N}$, $m \ne 0$, such that $0 < 1/m < x$. Thus ${}^*0 < {}^*1/{}^*m < {}^*x$. (Note: So far we would need to establish such statements by ultrafilter properties. But, this is really trivial because by definition each ${}^*x$ is the constant sequence representation.) Now $\mathbb{N} - \{n \mid 0 < G_n < X_n\}$ is a finite subset of $\mathbb{N}$. Hence, $\{n \mid 0 < G_n < X_n\} \in \mathcal{C} \subset \mathcal{U}$. Thus, $g$ is an infinitesimal. Indeed, once we get one nonzero infinitesimal, we can generate infinitely many.
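The finiteness claim for the sequence $G$ can be illustrated on a finite window. This is a minimal sketch, assuming exact rational arithmetic through Python's `fractions.Fraction`; the helper name `exceptions` and the sample values of $x$ are chosen here for illustration and are not from the text.

```python
from fractions import Fraction


def G(n):
    """The sequence G_0 = 0, G_n = 1/n for n >= 1."""
    return Fraction(0) if n == 0 else Fraction(1, n)


def exceptions(x, window):
    """Indices n < window where the condition 0 < G_n < x fails."""
    return [n for n in range(window) if not (0 < G(n) < x)]


for x in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)):
    bad = exceptions(x, window=5000)
    # The exceptional set stops growing once n exceeds 1/x.
    print(x, len(bad), max(bad))
```

For each positive $x$ the exceptional indices form an initial segment of $\mathbb{N}$, so their complement is cofinite and lies in every free ultrafilter.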

Definition 3.5. An $a \in {}^*\mathbb{R}$ is finite or limited iff it’s not infinite. That is, if there is some ${}^*x \in {}^\sigma\mathbb{R}^+$ (positive embedded reals) such that $|a| \le {}^*x$. The set of all finite numbers is denoted by $G(0)$, the galaxy within our universe ${}^*\mathbb{R}$ in which ${}^\sigma\mathbb{R}$ resides. (Note: If $a \in {}^*\mathbb{R}$, then $a \in G(0)$ iff there is some ${}^*y$ such that $|a| < {}^*y$.) The set of all infinitesimals is denoted by $\mu(0)$.

What is the algebra of the infinitesimals and does this algebra display the exact algebra used by Newton and Leibniz? It’s customary to let lower case Greek letters represent nonzero infinitesimals. Then, as one would expect, capital Greek letters represent infinite numbers. I’ll show later that many real valued functions defined on open intervals about zero preserve infinitesimals. Indeed, if x > 0 and f : (−x, x) → IR is continuous at 0 and f(0) = 0, then ∗f(ε) = λ or 0.

(Notation: “f is a function that takes each and every member of (−x, x) and yields members of IR.”)

We know that ∗IR is a totally ordered field and µ(0), G(0) ⊂ ∗IR. Also, 0 ∈ µ(0) ∩ G(0). The next theorem, 3.8, gives exactly how the relations +, ·, ≤ behave when they are restricted to µ(0) and G(0). It will turn out that both of these sets are totally ordered rings with no zero divisors. What does this mean for the relations ∗+, ∗·, ∗≤? This means that µ(0) and G(0) behave for these binary relations exactly like the integers · · · ,−3,−2,−1, 0, 1, 2, 3, · · · behave, with the one exception that ∗1 ∉ µ(0). By the way, if you’re interested, a ring has no zero divisors iff xy = 0 implies that x = 0 or y = 0.


Certainly, establishing results by using the basic properties of the ultrafilters is getting a bit tedious. There must be a better way. And, there is. If a statement about 〈IR,+, ·,≤,Φi〉 is expressed in a special way and that statement holds, then there is a process that’s used to show that an altered statement holds in 〈∗IR,+, ·,≤, ∗Φi〉. The process is called *-transform. However, to do

this properly it’s necessary to extend the structure considerably.

(It’s not really necessary that you fully understand the contents of my new extended structure. You could just go immediately to Definition 3.6 and simply restate Theorem 3.7 only in

terms of the notation M and ∗M and without the structures being specified.) Because of the

way I have defined the *-extension operator in Definition 2.10, the structure I technically need is

〈IR, . . . , IRn, . . . ,P(IR), . . . ,P(IRn), . . . ,+, ·,≤〉. Although not actually necessary I have identified the

three indicated binary relations used previously. The actual n used in practice is rather small,

usually. Let each element of each of the objects in IR, . . . , IRn, . . . ,P(IR), . . . ,P(IRn), . . . have a

“constant” name and we use +, ·,≤ and the like as the “names” for these specific objects. Also the

constant that “names” a mathematical object itself will not be differentiated from the mathematical

object itself. Let Cn be this set of all of these constants. Notice that “constants” that are members of an n-ary relation, n > 1, like (x, y, z), use the constants x, y, z from IR. These n-tuple forms

(x1, . . . , xn) are part of our language.

Definition 3.6. (*-transform) Consider any properly formed statement (formally a first-order

formula with equality and constants using the atomic formulas in the appendix) with bounded quantifiers and only using members of Cn. Then the *-transform of this statement is obtained by writing

a ∗ to the left as a superscript of each constant. Also, there is the reverse process where a statement

in terms of the Cn is obtained by removing the ∗.
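The purely syntactic nature of Definition 3.6 can be mimicked on a tokenized statement. This is only a toy sketch: the token list, the set `constants`, and the function names `star_transform` and `remove_stars` are all hypothetical, and nothing here checks that a statement is in bounded form. Following the notational simplification used in the text, only the set constant is starred in the example, not the operation symbols.

```python
def star_transform(tokens, constants):
    """Prefix every token that names a constant with '*'; leave the rest alone."""
    return ["*" + t if t in constants else t for t in tokens]


def remove_stars(tokens):
    """Reverse process: drop a leading '*' from every starred token."""
    return [t[1:] if t.startswith("*") and len(t) > 1 else t for t in tokens]


# Hypothetical tokenization of: for all x, y in R, |x + y| <= |x| + |y|.
statement = ["forall", "x", "forall", "y", "(", "x", "in", "R", "and",
             "y", "in", "R", "->", "|x+y|", "<=", "|x|+|y|", ")"]
constants = {"R"}  # assumption: only the set constant is starred here

transformed = star_transform(statement, constants)
print(" ".join(transformed))
print(remove_stars(transformed) == statement)
```

The round trip through `remove_stars` reflects the remark that the process is reversible.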

I’m not going to present a course in first-order logic in this monograph. So, you’ll simply need

to assume that I’ve expressed the “formal” statements in the proper bounded form. This means

that the “variable” that appears to the right of a quantifier, the universal “for each,” ∀x, and the

existential “there exists some,” ∃x, must vary over one of the sets in the standard structure. Of

course, it turns out that mathematicians seem to always write their informal sentences in forms

that are logically equivalent to these bounded forms. The reason for the bounded form is that in

the appendix Theorem A3 establishes the following without using the Axiom of Choice. Only some

previous results obtained using ultrafilters are needed. Notice in what follows two new symbols are

introduced for the respective structures and the structures are now extended slightly.

Theorem 3.7. Let S be any sentence in bounded form that uses only constants in Cn. Then

S holds for M = 〈IR, . . . , IRn, . . . ,P(IR), . . . ,P(IRn), . . . ,+, ·,≤〉 iff the *-transform of S holds in ∗M = 〈∗IR, . . . , ∗IRn, . . . ,P(∗IR), . . . ,P(∗IRn), . . . ,+, ·,≤〉.

WARNING If someone who has experience with nonstandard analysis reads Theorem 3.7, they

might state that the theorem is in error, since I have written P(∗IRn), etc., in the structure. However,

it is correct as shown in the appendix for the language being used. For example, an expression such

as ∃x(x ∈ P(∗IR)) is not in the proper *-transfer form. This theorem only applies to ∃x(x ∈ ∗P(∗IR)).

I don’t suppose that you noticed that *-transform is a reversible process that relates our original

structure M and the nonstandard structure ∗M and does not technically mention the embedded

objects. Here’s one place where there is a different notational approach. In some work, the original

structure and the embedded structure are considered as identical. I will probably not do this if there’s


any possible confusion. Also there are actually certain properties of IR that can’t be expressed in our

formal language. The language could be enriched. But, if this is done one might as well go all the

way to the object called a superstructure. The idea is to see what can be accomplished without

such an enriched language and the additional complications this would produce.

Theorem 3.8. The sets µ(0), G(0) are totally ordered subrings of ∗IR with no zero divisors and

G(0) has an identity.

Proof. I start with G(0) since it contains µ(0). Since G(0) ⊂ ∗IR, to show that it is a totally ordered ring with no zero divisors, all that is needed is to show that it is closed under the operations +, ·. To do this efficiently, Theorem 3.7 is used. Informally, we know that if we are given any two real numbers x, y, then |x + y| ≤ |x| + |y|. The formal bounded statement of this fact is

∀x∀y((x ∈ IR) ∧ (y ∈ IR) → |x + y| ≤ |x|+ |y|)

holds in M, and, hence, its *-transform holds in ∗M. Thus,

∀x∀y((x ∈ ∗IR) ∧ (y ∈ ∗IR) → |x + y| ≤ |x| + |y|)

is a fact about ∗M. (Note: You could have written | · | as ∗ | · |. I also point out that this is actually

considered as written in the form for a particular Φj , where we have in our language the ordered

n-tuple notation. Define Φj = {(w, y, z) | |w| ≤ |y| + |z|}. Then |x + y| ≤ |x| + |y| is equivalent to ((w, x, y) ∈ Φj) ∧ (w = x + y).)

Thus, the triangle inequality holds in ∗IR. So, let a, b ∈ G(0). Then there are standard ∗x, ∗y ∈ σIR such that |a| < ∗x and |b| < ∗y. But |a + b| ≤ |a| + |b| < ∗x + ∗y = ∗(x + y) from our definitions and the order properties of ∗IR. This gives that a + b ∈ G(0), which gives us closure under +, and ∗0 ∈ G(0). In like manner, one gets that ab ∈ G(0). Of course, since G(0) ⊂ ∗IR, the members have the usual associative, commutative and distributive properties and ∗0 is its zero. Now either by *-transform or filter properties ∗1 is also an identity in G(0). It now follows immediately that since ∗IR is a totally ordered field, then G(0) is a totally ordered ring. Further, since ∗IR has no zero divisors, neither does G(0).
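The product step passed over with “in like manner” can be spelled out. This is a sketch that assumes the *-transform of the identity $|xy| = |x|\,|y|$, obtained in the same way as the triangle inequality, together with the bounds $|a| < {}^*x$ and $|b| < {}^*y$ chosen for the sum.

```latex
\[
|ab| = |a|\,|b| \le |a|\,{}^*y < {}^*x\,{}^*y = {}^*(xy),
\qquad \text{so } ab \in G(0).
\]
```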

Next, consider µ(0). We can apply the method used to establish that G(0) is a totally ordered

ring with no zero divisors (but 1 /∈ µ(0)) to show that µ(0) is a totally ordered ring with no zero

divisors. The only difference in the proofs is that instead of writing that there is some ∗x > 0 such that |a| < ∗x, we have that ε ∈ µ(0) iff |ε| < ∗x for every ∗x > 0.

Does µ(0) have any other significant algebraic properties? The answer is yes and it’s this

most remarkable property that’s needed if its members are to mimic the “infinitely small” notion of

Newton. What µ(0) does is to “absorb” via multiplication every member of G(0). It has this “ideal”

property. An I ⊂ G(0) is an ideal iff it is a subring (which µ(0) is) and for each a ∈ G(0) and each b ∈ I, the product ab ∈ I. An ideal I ⊂ G(0) is maximum iff for any other ideal I1 ⊃ I, I1 = I or I1 = G(0).

Theorem 3.9. The set of infinitesimals µ(0) is a proper maximum ideal in G(0).

Proof. Let $a \in G(0)$ and $\epsilon \in \mu(0)$. Then there is some ${}^*x \in {}^\sigma\mathbb{R}$ such that $|a| < {}^*x$. Let $\epsilon = g = [G]$. Consider arbitrary positive ${}^*y$. Then

$$F = \{n \mid |A_n| < X_n = x\} \cap \{n \mid |G_n| < Y_n = y\} \in \mathcal{U}.$$


However, $\{n \mid |A_n G_n| < xy\} \supset F$. But $xy$ is also arbitrary. Hence, $a\epsilon \in \mu(0)$.

Let $I$ be any ideal in $G(0)$ such that $\mu(0) \subset I$. Assume that there is some $b \in I - \mu(0)$. Then $b \ne 0$ and there exists some positive $x \in \mathbb{R}$ such that $\{n \mid |B_n| \ge x\} \in \mathcal{U}$. Hence, $\{n \mid |1/B_n| \le 1/x\} \in \mathcal{U}$. Consequently, $[B^{-1}] = b^{-1} \in G(0)$ implies that ${}^*1 = bb^{-1} \in I$. This last fact will always force $I = G(0)$ since it’s an ideal. Well, take any $0 \ne x \in \mathbb{R}$. Then ${}^*x \ne {}^*0$ and ${}^*x \notin \mu(0)$ implies that $\mu(0) \ne G(0)$. Hence, $\mu(0)$ is a proper maximum ideal.
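The absorption property in the first half of the proof can be illustrated with representatives. This is a minimal sketch on a finite window; the bounded sequence `A`, the helper `product_exceptions`, and the sample tolerances are assumptions chosen for illustration, and exact rationals come from `fractions.Fraction`.

```python
from fractions import Fraction


def A(n):
    # A bounded sequence with |A_n| = 3 for every n, representing a finite number.
    return Fraction(3) * (-1) ** n


def G(n):
    # G_0 = 0, G_n = 1/n: a representative of a nonzero infinitesimal.
    return Fraction(0) if n == 0 else Fraction(1, n)


def product_exceptions(r, window):
    """Indices n < window where |A_n * G_n| < r fails."""
    return [n for n in range(window) if not abs(A(n) * G(n)) < r]


for r in (Fraction(1), Fraction(1, 100)):
    bad = product_exceptions(r, window=2000)
    # Only finitely many indices fail, so the product is again infinitesimal.
    print(r, len(bad), max(bad))
```

The exceptional set is finite for each tolerance, so the set where the product is small is cofinite and belongs to every free ultrafilter.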

I could go into some other abstract algebra material and use the language of quotient rings,

isomorphisms and kernels to show exactly how G(0) and µ(0) are related to σIR, but it’s unnecessary

to do this for this simplified approach. It’s enough to say that the properties of µ(0) exactly match

the “infinitely small” of Newton and the “ideal numbers” of Leibniz.

It has become customary to drop the ∗ from the members of σIR when there is no confusion.

I’ll start doing this in the very important next definition.

Definition 3.10 (Monads of standard numbers.) Let $x \in {}^\sigma\mathbb{R}$. Then the monad of (about) $x$ is the set $\mu(x) = \{x + \epsilon \mid \epsilon \in \mu(0)\}$. The only standard object in $\mu(x)$ is $x$. (Recall that when there’s no confusion, I might use $x$ in place of ${}^*x$.)

Before showing a remarkable relation between the monads and G(0), I need the next theorem.

Theorem 3.11. Let $A_n$ be a sequence of real numbers. Then $[A] \in \mu(x)$ for every free ultrafilter iff $\lim_{n\to\infty} A_n = x$.

Proof. First, note that, for a fixed free ultrafilter $\mathcal{U}$ and its monad $\mu(x)$, $[A] \in \mu(x)$ iff there is some $\epsilon \in \mu(0)$ such that $[A] = x + \epsilon$ iff $[A] - x \in \mu(0)$ iff $\{n \mid |A_n - X_n| < r\} \in \mathcal{U}$ for any arbitrary positive $r$. Let $\mathcal{U}$ be any free ultrafilter and assume that $A_n \to x$. Then for arbitrary positive $r$, we have that $|A_n - x| < r$ for all but a finite number of $n$. Thus, $\{n \mid |A_n - X_n| < r\} \in \mathcal{C} \subset \mathcal{U}$. But $r$ is arbitrary, which implies that $[A] \in \mu(x)$.

Conversely, assume that $A_n \not\to x$. Then there is a positive $r$ such that $X = \{n \mid |A_n - x| \ge r\}$ is an infinite set. Any infinite subset of $\mathbb{N}$ is a member of some free ultrafilter $\mathcal{U}_1$ by Theorem 3.1. Thus, for this $\mathcal{U}_1$, $[A] \notin \mu_1(x)$ since the complement of $X$ is not a member of $\mathcal{U}_1$.

Theorem 3.12 The collection $\{\mu(x) \mid x \in {}^\sigma\mathbb{R}\}$ is a partition for $G(0)$.

Proof. Technically, to be a partition of $G(0)$, one must have that $\mu(x) \cap \mu(y) \ne \emptyset$ implies that $\mu(x) = \mu(y)$ and that $\bigcup\{\mu(x) \mid x \in {}^\sigma\mathbb{R}\} = G(0)$. For the first part, assume that there exists some $a \in \mu(x) \cap \mu(y)$. Then $a = \epsilon + x$, $a = \lambda + y$. But, $\epsilon + x = \lambda + y$ implies that $\epsilon - \lambda = y - x$. This is only possible if $\epsilon - \lambda = 0$ since $y - x \in {}^\sigma\mathbb{R}$ and $0$ is the only standard infinitesimal. Thus $x = y$. Let $a \in \bigcup\{\mu(x) \mid x \in {}^\sigma\mathbb{R}\}$. Then $a = \epsilon + x$ for some $x \in {}^\sigma\mathbb{R}$. Then $|a| = |\epsilon + x| \le |\epsilon| + |x| < |x| + 1$. Hence $a \in G(0)$. Consequently, $\bigcup\{\mu(x) \mid x \in {}^\sigma\mathbb{R}\} \subset G(0)$.

Now assume that $a \in G(0)$. Rather than continue to use the properties of our free ultrafilter, let’s just consider the properties of $<$. There is some ${}^*x \in {}^\sigma\mathbb{R}^+$ such that $|a| < {}^*x$. So, consider the set $S = \{y \mid {}^*y < a\}$. This set is nonempty since $-x \in S$. Also since $a < {}^*x$, $S$ is a set of real numbers that’s bounded above and as such has a least upper bound $z$. The number $z$ needs to be located. Assume that $|{}^*z - a|$ is not an infinitesimal. Thus there is some positive $w \in \mathbb{R}$ such that $|{}^*z - a| > {}^*w$. Suppose that ${}^*z < a$. Then $a - {}^*z > {}^*w$ implies that ${}^*z + {}^*w = {}^*(z + w) < a$, which implies $z + w \in S$ and $z$ is not the least upper bound. So, let $a < {}^*z$. This implies that $a < {}^*(z - w) < {}^*z$. But then $z - w$ is an upper bound for the set $S$. This contradicts the least upper bound property for $z$. Hence, ${}^*z - a = \epsilon$ implies that $a \in \mu(z)$.


Of course, this implies, in general, that $\bigcup\{\mu(x) \mid x \in {}^\sigma\mathbb{R}\} = G(0)$ is free ultrafilter independent. But, the monads that contain some of the members of $\mathbb{R}^{\mathbb{N}}$ cannot be readily determined.

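Example 3.13 Consider the sequence $A = 1, -1, 1, -1, \dots$ and let $a = [A]$. Then, as done in the proof of Theorem 3.11, let $U_1 = \{n \mid A_n = 1\}$ and $U_2 = \{n \mid A_n = -1\}$, so that $U_1 \cap U_2 = \emptyset$. The set $U_1$ is a member of a free ultrafilter $\mathcal{U}_1$ and $U_2$ is a member of a free ultrafilter $\mathcal{U}_2$, where $\mathcal{U}_1 \ne \mathcal{U}_2$. Further, $a \in \mu_1(1)$ and $a \in \mu_2(-1)$.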
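The two index sets in Example 3.13 can be inspected on a finite window. This is a minimal sketch; the helper name `level_set` and the window size are assumptions for illustration. It shows only that the sets are disjoint and that neither stays bounded, which is what allows Theorem 3.1 to place each one in a different free ultrafilter.

```python
def A(n):
    # The alternating sequence 1, -1, 1, -1, ...
    return 1 if n % 2 == 0 else -1


def level_set(value, window):
    """Indices n < window with A_n == value."""
    return {n for n in range(window) if A(n) == value}


window = 1000
U1 = level_set(1, window)
U2 = level_set(-1, window)
print(U1 & U2)            # the two sets share no index
print(len(U1), len(U2))   # both keep growing as the window grows
```

Since the sets are disjoint, no single ultrafilter can contain both, so the monad in which the class of this sequence lands depends on the ultrafilter chosen.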

There are some other useful properties that relate members of ∗IR, and the sets G(0) and µ(0)

and show that they model the older notions of real number “infinities” and the “infinitely small.”

Let ∗IR−G(0) = IR∞ be the infinite numbers. It’s immediate from the definition that if a, b ∈ IR∞,

then ab ∈ IR∞. If 0 < a [resp a < 0] ∈ IR∞ and a < b [resp. b < a], then b ∈ IR∞ since a > r [resp.

a < r] for each r ∈ IR.

Theorem 3.14

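(i) If $b \in \mathbb{R}_\infty$, then $1/b \in \mu(0)$.

(ii) If $0 \ne \epsilon \in \mu(0)$, then $1/\epsilon \in \mathbb{R}_\infty$.

(iii) Let $\epsilon \in \mu(0)$, $b \in \mu(x)$. Then $\epsilon + b \in \mu(x)$ and $\epsilon b \in \mu(0)$.

(iv) If $b \in \mathbb{R}_\infty$ and ${}^*x \ne {}^*0$, then $b\,{}^*x \in \mathbb{R}_\infty$. If $a \in G(0) - \mu(0)$, then $ba \in \mathbb{R}_\infty$. (The $\mathbb{R}_\infty$ almost has the special property associated with an ideal.)

(v) If ${}^*x < {}^*y$, then ${}^*x + \epsilon < {}^*y + \lambda$ for any $\epsilon, \lambda \in \mu(0)$.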

Proof. (Most mathematicians would consider these proofs as trivial and would “leave them to

the reader.” But, I’ll do most of them.)

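(i) If $b \in \mathbb{R}_\infty$, then for any ${}^*x \in {}^\sigma\mathbb{R}^+$, $1/{}^*x < |b|$. Thus by field properties, $1/|b| < {}^*x$. This says that $1/|b| \in \mu(0)$, and hence $1/b \in \mu(0)$.

(ii) Same method as (i).

(iii) Let $\epsilon \in \mu(0)$ and $b \in \mu(x)$. Then $b = {}^*x + \lambda$ implies that $\epsilon + b = {}^*x + \epsilon + \lambda = {}^*x + \gamma \in \mu(x)$. Then $\epsilon b \in \mu(0)$ from Theorem 3.9, or $\epsilon b = \epsilon\,{}^*x + \epsilon\lambda = \alpha + \beta \in \mu(0)$.

(iv) Using (i), $0 \ne 1/({}^*x b) \in \mu(0)$. Now use (ii). For the second part, use the fact that if $a \in G(0) - \mu(0)$, then if $a > {}^*0$, there is some ${}^*x > 0$ such that ${}^*x < a$, and if $a < 0$, then there is some ${}^*y < 0$ such that $a < {}^*y$. Now apply the remark I made just prior to this theorem.

(v) Assume that $0 \le \epsilon - \lambda \in \mu(0)$. Hence, for $a = {}^*x < b = {}^*y$, $0 \le \epsilon - \lambda < b - a$. Thus $a + \epsilon < b + \lambda$. (If $\epsilon - \lambda < 0$, the inequality is immediate.)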

The fact that $\{\mu({}^*x) \mid {}^*x \in {}^\sigma\mathbb{R}\}$ forms a partition of $G(0)$ immediately defines for all

members of G(0) an equivalence relation of some importance, where this relation is a short hand for

a member of G(0) being in a unique µ(x).

Definition 3.15. (Infinitely close (near) equivalence relation.) Two a, b ∈ G(0) are

infinitely close iff a− b ∈ µ(0). This relation is written as a ≈ b.

We almost have enough of the basic machinery to continue with real analysis. But, there is one

last major procedure that needs to be introduced, the “standard part” operator.

Definition 3.16. (The standard part operator, st.) Using Theorem 3.12, there is a

function st on G(0) into σIR such that, for each µ(x), st(µ(x)) = ∗x ↔ x. Once the properties of

st are obtained, then, usually, one further allows st(µ(x)) = x ∈ IR. The function st is called the

standard part operator.


Most of the results in the next theorem would be what one would expect.

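Theorem 3.16. Let $\mathrm{st}\colon G(0) \to {}^\sigma\mathbb{R}$ ($\mathbb{R}$) be the standard part operator. Then for each $a, b \in G(0)$,

(i) $\mathrm{st}(a \pm b) = \mathrm{st}(a) \pm \mathrm{st}(b)$.

(ii) $\mathrm{st}(ab) = \mathrm{st}(a)\,\mathrm{st}(b)$.

(iii) If $a \le b$, then $\mathrm{st}(a) \le \mathrm{st}(b)$.

(iv) $\mathrm{st}(|a|) = |\mathrm{st}(a)|$, $\mathrm{st}(\max\{a, b\}) = \max\{\mathrm{st}(a), \mathrm{st}(b)\}$, $\mathrm{st}(\min\{a, b\}) = \min\{\mathrm{st}(a), \mathrm{st}(b)\}$.

(v) $\mathrm{st}(a) = 0$ iff $a \in \mu(0)$.

(vi) For any ${}^*x$, $\mathrm{st}({}^*x) = {}^*x$.

(vii) The $\mathrm{st}(a) \ge 0$ iff $|a| \in \mu(\mathrm{st}(a))$.

(viii) $a \approx b$ iff $a - b \in \mu(0)$ iff $\mathrm{st}(a) = \mathrm{st}(b)$.

(ix) If $\mathrm{st}(a) \le \mathrm{st}(b)$, then either $a - b \in \mu(0)$ or $a \le b$.

(x) If ${}^*0 < c$ [resp. $c < {}^*0$] and $c \in {}^*\mathbb{R}_\infty$, then for $a \ge {}^*0$, ${}^*0 < c + a \in {}^*\mathbb{R}_\infty$ [resp. $a \le {}^*0$, $c + a \in {}^*\mathbb{R}_\infty$].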

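Proof. I’ll do (iii) and leave the others to the reader. Let $a, b \in G(0)$, $a \le b$. Then $a \in \mu(\mathrm{st}(a))$, $b \in \mu(\mathrm{st}(b))$ implies that $a = \mathrm{st}(a) + \epsilon$, $b = \mathrm{st}(b) + \gamma$, which implies that $0 \le \mathrm{st}(b) - \mathrm{st}(a) + \gamma - \epsilon = \mathrm{st}(b - a) + \gamma - \epsilon$, which implies that $\mathrm{st}(a) \le \mathrm{st}(b)$ since the monads are disjoint.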
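One of the parts left to the reader, (ii), can be sketched in the same notation. This assumes only Definition 3.10 and the ideal property of Theorem 3.9, with $a = \mathrm{st}(a) + \epsilon$ and $b = \mathrm{st}(b) + \gamma$ for infinitesimals $\epsilon, \gamma$.

```latex
\begin{align*}
ab &= (\mathrm{st}(a) + \epsilon)(\mathrm{st}(b) + \gamma) \\
   &= \mathrm{st}(a)\,\mathrm{st}(b)
      + \bigl(\mathrm{st}(a)\,\gamma + \mathrm{st}(b)\,\epsilon + \epsilon\gamma\bigr).
\end{align*}
```

Each term in the bracket is a product of a member of $G(0)$ with an infinitesimal, so the bracket lies in $\mu(0)$ by Theorem 3.9. Hence $ab \in \mu(\mathrm{st}(a)\,\mathrm{st}(b))$ and, since the monads are disjoint, $\mathrm{st}(ab) = \mathrm{st}(a)\,\mathrm{st}(b)$.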

Is it clear that if $\mathbb{N}_\infty = {}^*\mathbb{N} - {}^\sigma\mathbb{N}$, then $\mathbb{N}_\infty \subset \mathbb{R}_\infty$?

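Theorem 3.17. The set of infinite natural numbers $\mathbb{N}_\infty \subset \mathbb{R}_\infty$ and, for each $n \in \mathbb{N}$, ${}^*n < \Lambda$ for each $\Lambda \in \mathbb{N}_\infty$.

Proof. In Example 3.4, the infinite number defined is actually a member of ${}^*\mathbb{N} \subset {}^*\mathbb{R}$. Thus, ${}^*\mathbb{N} - {}^\sigma\mathbb{N} \ne \emptyset$. For each $m \in {}^\sigma\mathbb{N}$, there is the $m + 1 \in {}^\sigma\mathbb{N} \subset {}^\sigma\mathbb{R}$. Hence, ${}^\sigma\mathbb{N} \subset G(0)$. Let $a \in {}^*\mathbb{N} - {}^\sigma\mathbb{N}$ and $a \in G(0)$. Then since ${}^*0 \le b$ for each $b \in {}^*\mathbb{N}$, there is some $r \in {}^\sigma\mathbb{R}$ such that ${}^*0 \le |a| = a < r$. But, we know there is some $m \in \mathbb{N}$, hence ${}^*m \in {}^\sigma\mathbb{N}$, such that ${}^*0 \le a < {}^*m$. The *-transform of the statement “for each x, for each y, for each z, if x ∈ IN and y ∈ IN and z ∈ IN and x ≤ y, then z ∈ [x, y] iff x ≤ z ≤ y,” or formally

$$\forall x \forall y \forall z((x \in \mathbb{N}) \wedge (y \in \mathbb{N}) \wedge (z \in \mathbb{N}) \wedge (x \le y) \to (z \in [x, y] \leftrightarrow (x \le z \le y))),$$

holds in ${}^*\mathcal{M}$ and characterizes the set ${}^*[{}^*0, {}^*m]$. Thus, $a \in {}^*[{}^*0, {}^*m]$. But, since $[0, m]$ is a finite set, Theorem 3.2 (vii) implies that $a = {}^*n$ for some ${}^*n \in {}^\sigma\mathbb{N}$. This contradiction implies that $\mathbb{N}_\infty \subset \mathbb{R}_\infty$ as one would expect. The second part follows immediately.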


4. BASIC SEQUENTIAL CONVERGENCE

One intuitive statement about sequences of real numbers states something like “all the convergence properties are determined by the behavior of the infinite tails.” In fact, for elementary convergence, we have “Well, the values of the sequence get nearer, and nearer, and nearer and stay near to the limit no matter how far you go out in the series.” Does the nonstandard theory of sequential convergence model both “getting nearer, and nearer” and “staying near” simultaneously?

Indeed, you’ll find out that, for convergence, the infinite tails are all members of G(0). Moreover, each nonstandard characteristic based directly upon a definition is stated with at least one less quantifier. Gödel considered that just removing one quantifier from any characterization is a major achievement within mathematics. By the way, all of the results presented in the remainder of this book are free ultrafilter independent. Also, many of the definitions and proofs presented are easily generalized to the multi-variable calculus. (I’ll use the notation n ∈ ∗IN where there is no confusion as to the location of the n. Usually one might write this as a ∈ ∗IN.)

Theorem 4.1. A sequence S: IN → IR is bounded iff ∗S(n) ∈ G(0) for each n ∈ ∗IN iff

( ∗S[∗IN] ⊂ G(0).)

Proof. Let S be bounded. Recall what this means. There exists some x ∈ IR+ such that

“for each n ∈ IN, |S(n)| < x” or ∀y((y ∈ IN) → (|S(y)| < x)) holds in M. By *-transform,

∀y((y ∈ ∗IN) → ( ∗| ∗S(y) ∗| < ∗x)) holds in ∗M. (Note: We can consider with respect to our embedding

that | · | is but a restriction of ∗ | · ∗ | and we need not use the ∗ there, although this is but a notational

simplification.) Hence, for each n ∈ ∗IN, ∗S(n) ∈ G(0).

Conversely, for each n ∈ ∗IN, let ∗S(n) ∈ G(0). We know that there is a b ∈ ∗IR+∞ ⊂ ∗IR+ such that for each c ∈ G(0), |c| < b. Hence, ∃x((x ∈ ∗IR+) ∧ ∀y((y ∈ ∗IN) → (| ∗S(y)| < x))) holds in ∗M. Thus, the statement ∃x((x ∈ IR+) ∧ ∀y((y ∈ IN) → (|S(y)| < x))), obtained by dropping the ∗, holds in M and the sequence is bounded.

What about the “near to L” and “stays near” intuitive notion and its relation to the “true” infinite part of the tail?

Theorem 4.2 A sequence S: IN → IR converges to L ∈ IR (Sn → L) iff ∗S(Λ)−L ∈ µ(0) for each

Λ ∈ IN∞ iff ∗S(Λ) ∈ µ(L) for each Λ ∈ IN∞ iff st( ∗S(Λ)) = L for each Λ ∈ IN∞ iff ( ∗S[IN∞] ⊂ µ(L).)

Proof. Let S: IN → IR converge to L. Let y ∈ IR+. Then we know that there exists some m ∈ IN such that for each k ∈ IN where k ≥ m, |S(k) − L| < y. Hence, the statement

∀x((x ∈ IN) ∧ (x > m) → (|S(x) − L| < y))

holds in M; and, hence, in ∗M. In particular, by *-transform, since Λ > ∗m for each Λ ∈ IN∞ (Theorem 3.17), | ∗S(Λ) − L| < ∗y. Since y is arbitrary, ∗S(Λ) − L ∈ µ(0) for each Λ ∈ IN∞. Hence, ∗S(Λ) ∈ µ(L) and st( ∗S(Λ)) = L for each Λ ∈ IN∞.

Conversely, assume that ( ∗S(Λ) − L) ∈ µ(0) for each Λ ∈ IN∞. Let y ∈ IR+. Since IN∞ ≠ ∅, then by Theorem 3.17, the sentence

∃z((z ∈ ∗IN) ∧ ∀x((x ∈ ∗IN) ∧ (z < x) → (| ∗S(x) − L| < ∗y)))

holds in ∗M. Thus, it holds in M, by reverse *-transform. But, this is the standard statement that Sn → L. All the remaining “iff” are but restatements of ∗S(Λ) − L ∈ µ(0) for each Λ ∈ IN∞.


Corollary 4.3 All the basic limit theorems for sums, products, etc. follow from Theorem 4.2 and the properties of the “st” operator.

Examples 4.4

(i) (1/n)^p → 0, n, p > 0, p ∈ IN. We know that for each nonzero Λ ∈ IN∞, (1/Λ)^p ∈ µ(0) and the result follows. (If we had the result that the continuous function f(x) = x^p, p > 0, preserves infinitesimals, then we could extend this to any p > 0. But, maybe it’s better to use sequences to motivate continuity.)

(ii) x^n → 0, 0 < |x| < 1. In general, for any n, m ∈ IN such that n < m, we have that (1/|x|)^n < (1/|x|)^m. For any y ∈ IR, there is some n ∈ IN such that |y| < (1/|x|)^n. Hence, for each Λ ∈ IN∞, (1/|x|)^Λ ∈ IR∞. Consequently, x^Λ ∈ µ(0) for each Λ ∈ IN∞.

(iii) Let 0 < x, x ≠ 1. Then x^(1/n) → 1, n > 0. Consider the case that x > 1 and Sn = x^(1/n) − 1. Then x = (Sn + 1)^n. Hence, x > nSn for each n > 0. Thus, by *-transform, x > Λ ∗S(Λ) for each Λ ∈ IN∞. Consequently, 0 < ∗S(Λ) < (x/Λ) ∈ µ(0) for each Λ ∈ IN∞. Thus ∗S(Λ) ∈ µ(0), Λ ∈ IN∞ and the result follows in this case.

Now, if 0 < x < 1, then 1 < 1/x and, as just shown, (1/x)^(1/Λ) ∈ µ(1), Λ ∈ IN∞. Thus (1/x)^(1/Λ) − 1 = ε ∈ µ(0). Hence, 1 − x^(1/Λ) = ε(x^(1/Λ)) ∈ µ(0), since by *-transform 0 < x^(1/Λ) < 1. Therefore, x^(1/Λ) ∈ µ(1) for this case also and the complete result follows.

(iv) n^(1/n) → 1, n > 0. Consider again the sequence Sn = n^(1/n) − 1. Then n = (1 + Sn)^n = ∑_{k=0}^{n} C(n, k) Sn^k ≥ C(n, 2) Sn^2, n > 1, where C(n, k) denotes the binomial coefficient. Thus, 0 ≤ Sn ≤ (2/(n − 1))^(1/2), n > 1. By *-transform, and in particular, 0 ≤ ∗S(Λ) ≤ (2/(Λ − 1))^(1/2), Λ ∈ IN∞. But, (2/(Λ − 1))^(1/2) ∈ µ(0). Hence ∗S(Λ) ∈ µ(0) and the result follows from the definition of Sn.
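As a standard-side sanity check of the inequality used in (iv), the following minimal Python sketch evaluates Sn = n^(1/n) − 1 against the bound (2/(n − 1))^(1/2) for a few finite n. The function names s and bound are illustrative choices, not from the text, and floating-point evaluation only suggests, rather than proves, the behavior at an infinite Λ.

```python
import math


def s(n: int) -> float:
    """S_n = n**(1/n) - 1, the sequence of Example 4.4 (iv)."""
    return n ** (1.0 / n) - 1.0


def bound(n: int) -> float:
    """The upper bound (2/(n - 1))**(1/2), defined for n > 1."""
    return math.sqrt(2.0 / (n - 1))


# Both columns shrink toward 0, with s(n) staying below bound(n).
for n in (2, 10, 1000, 10**6):
    print(n, s(n), bound(n))
```

The bound itself tends to 0, which is the finite shadow of the statement (2/(Λ − 1))^(1/2) ∈ µ(0).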

It seems that some of the above algebraic manipulations are what one might do if these limits

were established without using nonstandard procedures. There are major differences, however, in the

number of quantified statements one needs for the standard proofs as compared to the nonstandard.

Let’s establish a standard result by nonstandard means.

Theorem 4.5. Every convergent sequence of real numbers is bounded.

Proof. Let Sn → L ∈ IR. Then ∗S(Λ) ∈ µ(L) ⊂ G(0) for each Λ ∈ IN∞. Since ∗S[σIN] ⊂ G(0),

the result follows from Theorem 4.1.

Theorem 4.6. A set of real numbers B is bounded iff ∗B ⊂ G(0).

Proof. If B is finite, then it’s immediate that ∗B ⊂ G(0). If B is infinite, then there is some real

number x such that for each y ∈ B, |y| ≤ x. By *-transform of the obvious expression any a ∈ ∗B

has the property that |a| ≤ ∗x. Consequently, ∗B ⊂ G(0).

Conversely, if B is not bounded, then for n ∈ IN there is some x ∈ B such that |x| > n. Hence,

by *-transform, there is some p ∈ ∗B such that |p| ≥ Λ, Λ ∈ IN∞. From the remark made prior to

Theorem 3.14, p /∈ G(0) and the converse follows.

One of the first big results one encounters in sequential convergence theory is a sufficient condition for convergence. Recall that a sequence is monotone iff it is either an increasing or decreasing function. The following characterization is what would be expected, that for monotone sequences only one infinite number is needed for convergence.


Theorem 4.7. If S: IN → IR is monotone and there exists some Λ ∈ IN∞ such that ∗S(Λ) ∈ G(0), then Sn → st( ∗S(Λ)).

Proof. Simply assume that S: IN → IR is increasing since the decreasing case is similar. I first note that ∗y = st( ∗S(Λ)) ∈ σIR. By *-transform, the extension ∗S: ∗IN → ∗IR is increasing. Thus for Λ ∈ IN∞ and for each ∗m ∈ σIN, ∗S( ∗m) ≤ ∗S(Λ) and, since ∗S(Λ) ∈ G(0), Theorem 3.16 (iii) gives st( ∗S( ∗m)) = ∗(S(m)) ≤ ∗y. Consequently, for IR, the following sentence

∀x((x ∈ IN) → (S(x) ≤ y))

holds in M; and, hence, holds in ∗M. So, let Ω ∈ IN∞. Then ∗S(Ω) ≤ ∗y = st( ∗S(Λ)), which, together with ∗S( ∗0) ≤ ∗S(Ω), implies that for each Ω ∈ IN∞, ∗S(Ω) ∈ G(0). Thus st( ∗S(Ω)) ∈ σIR for each such Ω and st( ∗S(Ω)) ≤ ∗y.

Let Ω > Λ. Then ∗S(Λ) ≤ ∗S(Ω), which implies that ∗y ≤ st( ∗S(Ω)). But, since the above statement still holds for such an Ω, ∗S(Ω) ≤ ∗y implies that st( ∗S(Ω)) ≤ ∗y. Let Ω < Λ. Then ∗S(Ω) ≤ ∗S(Λ). Let st( ∗S(Ω)) = z; the above statement holds with z in place of ∗y. Thus, st( ∗S(Λ)) ≤ st( ∗S(Ω)). Hence, st( ∗S(Ω)) = st( ∗S(Λ)) for each Ω ∈ IN∞. Consequently, ∗S(Ω) − ∗S(Λ) ∈ µ(0) for all Ω ∈ IN∞, which implies that ∗S(Ω) ∈ µ(st( ∗S(Λ))) for each Ω ∈ IN∞ and the result follows.

Corollary 4.8. A bounded monotone sequence converges.

Proof. By Theorem 4.1.

Please note that if a − b ∈ µ(0), and b ∈ G(0), then the intuitive statement that a ∈ µ(st(b)) does, indeed, hold. In a slightly more general mode recall that for a sequence S: IN → IR a real number w is an accumulation point or limit point for S iff for each r ∈ IR+ and for each n ∈ IN, there is some m ∈ IN such that m > n and |Sm − w| < r. This definition allows 1 to be an accumulation point of sequences such as 1, 1/2, 1, 1/3, 1, 1/4, 1, . . . where both 1 and 0 are accumulation points. This definition does not correspond to most of the accumulation point definitions for point-sets. However, there will be another term used in Chapter 8 that does so correspond.

Theorem 4.9.

(i) A number w ∈ IR is an accumulation point for S: IN → IR iff there exists some Λ ∈ IN∞ such that ∗S(Λ) ∈ µ( ∗w) = µ(st( ∗S(Λ))).

(ii) A sequence S: IN → IR has an accumulation point iff there exists some Λ ∈ IN∞ such

that ∗S(Λ) ∈ G(0).

Proof (i) Let w ∈ IR be an accumulation point for S. Then the sentence

∀x∀y((x ∈ IR+) ∧ (y ∈ IN) → ∃z((z ∈ IN) ∧ (z > y) ∧ (|S(z)− w| < x)))

holds in ∗M by *-transform. So, let 0 < ε ∈ µ(0) and Ω ∈ IN∞. Then there exists some Λ ∈ ∗IN such that Λ > Ω and | ∗S(Λ) − ∗w| < ε. Hence, ∗S(Λ) ∈ µ( ∗w). Clearly, Λ ∈ IN∞.

Conversely, assume that there exists some Λ ∈ IN∞ such that ∗S(Λ) ∈ µ( ∗w), w ∈ IR. Note that µ( ∗w) ⊂ ∗(w − y, w + y) for each y ∈ IR+ and that Λ > ∗n for all ∗n ∈ σIN. Hence, for given ∗w, ∗y > ∗0 and a given ∗m, we have that

∃x((x ∈ ∗IN) ∧ (x > ∗m) ∧ (| ∗S(x) − ∗w| < ∗y))

holds in ∗M; and, hence, in M by reverse *-transform. The result follows.

(ii) This follows “immediately” from (i).


Theorem 4.10. (i) A sequence S: IN → IR has a subsequence that converges to w ∈ IR iff there

exists some Λ ∈ IN∞ such that ∗S(Λ) ∈ µ(w).

(ii) A sequence has a convergent subsequence iff there is some Λ ∈ IN∞ such that ∗S(Λ) ∈ G(0) iff ( ∗S[IN∞] ∩ G(0) ≠ ∅ ).

Proof. (i) Assume that for Λ ∈ IN∞ that ∗S(Λ) ∈ µ( ∗w). Then w is an accumulation point.

You start with n = 0 and take y = 1. Then you have an Sm such that |Sm − w| < 1. Let S′0 = Sm.

Now take y = 1/2 and consider the next Sk as the one for k > m and |Sk − w| < 1/2. This idea

can be restated in an induction proof for the other 1/n with no great difficulty. This subsequence

obviously converges to w. (Have I used the Axiom of Choice to obtain the Sk?)

On the other hand, if S′: IN → IR is a subsequence of the sequence S and it converges to L, then ∗S′[IN∞] ∩ G(0) ≠ ∅ implies, since ∗S′[IN∞] ⊂ ∗S[IN∞], that L is an accumulation point by Theorem 4.9. (ii) is obvious.
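The subsequence construction in the proof of (i) can be sketched on the standard side. This is a minimal illustration, assuming the sample sequence 1, 1/2, 1, 1/3, 1, 1/4, . . . mentioned earlier; the names S and extract_indices are hypothetical. At step k the sketch takes the least index past the previous one with |S(m) − w| < 1/k, so each selection is made by a definite rule rather than an arbitrary pick.

```python
from fractions import Fraction


def S(n: int) -> Fraction:
    """The sample sequence 1, 1/2, 1, 1/3, 1, 1/4, ..."""
    if n % 2 == 0:
        return Fraction(1)
    return Fraction(1, n // 2 + 2)


def extract_indices(w: Fraction, count: int) -> list:
    """Indices m_1 < m_2 < ... with |S(m_k) - w| < 1/k.

    The inner search stops only if w is an accumulation point of S;
    for any other w it would run forever.
    """
    indices = []
    m = -1
    for k in range(1, count + 1):
        m += 1  # move strictly past the previously chosen index
        while abs(S(m) - w) >= Fraction(1, k):
            m += 1
        indices.append(m)
    return indices


print(extract_indices(Fraction(0), 4))
print(extract_indices(Fraction(1), 4))
```

Taking the least admissible index at each step is one way to see why this particular construction does not need the Axiom of Choice.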

Theorem 4.11. A bounded sequence has a convergent subsequence.

Proof. From Theorems 4.1 and 4.10.

Have I convinced you that the notion of what happens with the truly infinite tail piece of a

sequence does determine all that seems necessary for basic convergence? No. Well, let’s look at

another idea, the special types of divergence written as Sn → +∞ [resp. −∞].

Recall that a sequence Sn → +∞ [resp. −∞] iff for each y > 0 [resp. y < 0] there exists an

m ∈ IN such that for each n ∈ IN such that n ≥ m, Sn ≥ y [resp. Sn ≤ y]. How do we intuitively

state such stuff as this? One might say that S converges to “plus infinity” or converges to

“negative infinity.” But, in basic real analysis, the “numbers” ±∞ do not actually exist.

Theorem 4.12. For a sequence S: IN → IR, Sn → +∞ [resp. −∞] iff for each Λ ∈ IN∞, ∗S(Λ) ∈ IR+∞ [resp. IR−∞], where IR+∞ = {Λ | ∗0 < Λ ∈ IR∞} [resp. IR−∞ = {Λ | ∗0 > Λ ∈ IR∞}], iff ( ∗S[IN∞] ⊂ IR+∞ [resp. IR−∞]).

Proof. Assume that Sn → +∞. We can assume that Sn > 0 for each n ∈ IN since this fails for only finitely many n. Suppose that there exists some Λ ∈ IN∞ such that ∗S(Λ) ∉ IR+∞. Thus, ∗S(Λ) ∈ G(0). Therefore, by Theorem 4.10, there is a subsequence S′: IN → IR of S such that S′n → L, L ∈ IR and L = st( ∗S(Λ)). Thus, there exists an m ∈ IN such that for all n ≥ m, |S′(n) − L| < 1. Hence, for each such n, 0 < S′(n) < L + 1. Thus, considering y = L + 1 there does not exist a p ∈ IN such that for each n ∈ IN, where n ≥ p, Sn ≥ y.

Conversely, suppose that for each Λ ∈ IN∞, ∗S(Λ) ∈ IR+∞. Let y > 0. Consider Ω ∈ IN∞. If

Λ ≥ Ω, then Λ ∈ IN∞ and under the hypothesis, ∗S(Λ) > ∗y. Consequently, the sentence

∃x((x ∈ ∗IN) ∧ ∀z((z ∈ ∗IN) ∧ (z ≥ x) → ( ∗S(z) > ∗y)))

holds in ∗M and, hence, in M. This result for the positive infinite numbers follows by reverse

*-transform. The case for the negative infinite numbers follows in like manner and the proof is

complete.

By the way, notice how easily the next result is established.

Theorem 4.13. If S : IN → IR converges to L ∈ IR, then L is unique.

Proof. If L ≠ M ∈ IR, then µ(L) ∩ µ(M) = ∅.


Let’s recap some of the significant nonstandard characterizations for a sequence S: IN → IR. Notice that they are all quantifier (∀, ∃) free.

Theorem 4.14. Given a sequence S: IN → IR. Then

(i) S is bounded iff ∗S[∗IN] ⊂ G(0),

(ii) Sn → L iff ∗S[IN∞] ⊂ µ(L) ⊂ G(0),

(iii) S has a convergent subsequence iff ∗S[IN∞] ∩ G(0) ≠ ∅.

(iv) Sn → ±∞ iff ∗S[IN∞] ⊂ IR±∞.

I wonder whether S: IN → IR has a subsequence that converges to ±∞ iff ∗S[IN∞] ∩ IR±∞ ≠ ∅?

So far, to show that a specific sequence converges we needed to guess at what the limit might be.

One of the more important notions was considered by Cauchy, the Cauchy Criterion, that for the

real numbers characterizes convergence without having to guess at a limit L. A sequence S is called

a Cauchy sequence iff for each y ∈ IR+, there is some m ∈ IN such that for each pair p, q ∈ IN such

that p, q ≥ m, it follows that |S(p)− S(q)| < y.

Theorem 4.15. (Nonstandard Cauchy Criterion.) A sequence S: IN → IR is Cauchy iff

∗S(Λ)− ∗S(Ω) ∈ µ(0)

for each Λ,Ω ∈ IN∞.

Proof. For the necessity, simply let real y > 0. Then there exists some m_y ∈ IN such that the sentence

∀x∀z((x ∈ IN) ∧ (z ∈ IN) ∧ (x ≥ m_y) ∧ (z ≥ m_y) → (|S(x) − S(z)| < y))

holds in M and, hence, in ∗M. In particular, if Λ, Ω ∈ IN∞, then Λ, Ω > ∗m_y for any such m_y implies that | ∗S(Λ) − ∗S(Ω)| < ∗y for any y > 0. Consequently, ∗S(Λ) − ∗S(Ω) ∈ µ(0).

The sufficiency follows in the usual manner since µ(0) ⊂ ∗(−y, y) for each y > 0 and IN∞ ≠ ∅ imply, by reverse *-transform, that the sentence

∃w((w ∈ IN) ∧ ∀z∀x((z ∈ IN) ∧ (x ∈ IN) ∧ (x ≥ w) ∧ (z ≥ w) → (|S(x) − S(z)| < y)))

holds in M and the proof is complete.

Theorem 4.16. A sequence S: IN → IR converges iff it is Cauchy.

Proof. Suppose that Sn → L ∈ IR. Then for each pair Λ, Ω ∈ IN∞, ∗S(Λ) − L ∈ µ(0), and ∗S(Ω) − L ∈ µ(0). Hence, ∗S(Λ) − ∗S(Ω) ∈ µ(0).

For the converse, let S: IN → IR be Cauchy. Then for Λ, Ω ∈ IN∞, we have that ∗S(Λ) − ∗S(Ω) ∈ µ(0) from Theorem 4.15. Let Λ ∈ IN∞ and ∗S(Λ) ∈ G(0). Then ∗S[IN∞] ⊂ µ(st( ∗S(Λ))) = µ( ∗L) implies that Sn → L. So, assume the other possibility, that ∗S(Ω) ∉ G(0) for any Ω ∈ IN∞. This implies that S is unbounded. Let m ∈ IN and let y = max{|Sm ± 1|, |S0|, . . . , |Sm|}. Then there is some p ∈ IN such that |Sm ± 1| ≤ y < |Sp| and p > m. Thus by *-transform, given any Λ ∈ IN∞, there is some Ω ∈ IN∞ such that | ∗S(Λ) ± ∗1| < | ∗S(Ω)|. Notice that (i) ∗S(Λ) ± ∗1 ∈ IR+∞, in which case, since ∗S(Λ) − ∗S(Ω) ∈ µ(0), it follows that ∗S(Ω) ∈ IR+∞ or (ii) ∗S(Λ) ± ∗1 ∈ IR−∞, in which case ∗S(Ω) ∈ IR−∞. For case (i), consider ∗S(Λ) + 1 < ∗S(Ω); for case (ii), consider ∗S(Ω) < ∗S(Λ) − 1. For these two cases, this yields that 1 < | ∗S(Ω) − ∗S(Λ)| ∉ µ(0). This contradicts the hypothesis that ∗S(Λ) − ∗S(Ω) ∈ µ(0). The proof is now complete.


5. ADVANCED SEQUENTIAL CONVERGENCE

Recall that a double sequence S: IN × IN → IR converges to L ∈ IR iff for each y ∈ IR+, there is some p ∈ IN such that for each pair n, m ∈ IN such that n, m ≥ p, |S(n,m) − L| < y. The same nonstandard characteristics hold for such convergence as in the single sequence case.

Theorem 5.1. A sequence S: IN× IN → IR converges to L ∈ IR iff ∗S(Λ,Ω)−L ∈ µ(0) for each

Λ,Ω ∈ IN∞ iff ∗S(Λ,Ω) ∈ µ(L) for each Λ,Ω ∈ IN∞ iff st( ∗S(Λ,Ω)) = L for each Λ,Ω ∈ IN∞ iff

( ∗S[IN∞ × IN∞] ⊂ µ(L).)

Proof. With but almost trivial alterations, this proof is the same as the one for Theorem 4.2.

Example 5.2 Let S(m,n) = m/(1 + mn^2). Then for each Λ, Ω ∈ IN∞, (1 + ΛΩ^2)/Λ = 1/Λ + Ω^2 ∉ G(0). Hence, Λ/(1 + ΛΩ^2) ∈ µ(0) and, thus, S(m,n) → 0.
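The claim of Example 5.2 can be checked on the standard side with exact rational arithmetic. This is a minimal sketch, assuming the reading S(m,n) = m/(1 + mn^2); the function name S and the tested ranges are illustrative. It confirms that S(m,n) < 1/n^2 for n ≥ 1, which is what drives the values toward 0 when both indices are large, and it shows that the row n = 0 is unbounded.

```python
from fractions import Fraction


def S(m: int, n: int) -> Fraction:
    """S(m, n) = m / (1 + m * n**2), read from Example 5.2."""
    return Fraction(m, 1 + m * n * n)


# For n >= 1 the value is squeezed below 1/n**2, whatever m is.
for m, n in ((1, 1), (10, 10), (1000, 1000)):
    print(m, n, S(m, n), Fraction(1, n * n))

# The row n = 0 is just S(m, 0) = m, which has no finite limit.
print([S(m, 0) for m in (1, 10, 100)])
```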

The following results, and many more, for double sequences follow in the same manner as in

Chapter 4.

Theorem 5.3. Every convergent double sequence is bounded.

Theorem 5.4. (Nonstandard Cauchy Criterion.) The sequence S: IN × IN → IR converges to

L ∈ IR iff for each Λ,Ω,Λ′,Ω′ ∈ IN∞, ∗S(Λ,Ω)− ∗S(Λ′,Ω′) ∈ µ(0).

In the theory of double sequences, one of the interesting questions, at least to most mathematicians, is the role played by the iterated sequences, (in brief limit notation) limn(limm S(n,m)) and limm(limn S(n,m)). What this notation means is that, taking the first iterated limit, you might have that for each n, limm S(n,m) = S′(n) ∈ IR. Then, maybe, limn S′(n) ∈ IR. Now for a convergent double sequence, is it always the case that the iterated sequence converges? In Example 5.2, notice that for n = 0, S(n,m) diverges. Indeed, take any natural number a. Then the sequence S(n,m) = m/(1 + m(n − a)^2) will have this same problem for n = a.

Example 5.5. Consider the sequence S(n,m) = (m + 1)/(m + n + 1). Then for any fixed n ∈ IN, S(n,m) → 1 as m increases, while for a fixed m, S(n,m) → 0 as n increases. This shows that the double sequence does not converge: the n, m ∈ IN are arbitrary pairs and, as such, it should not matter if one is held fixed and the other varies; the limit, being unique as in the single sequence case, must be the same in all cases. As is well known, this behavior for double sequences is simply a reflection of the same problems that occur with multi-variable real valued functions.
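The disagreement between the two ways of letting an index grow can be seen numerically. This is a minimal sketch, assuming the reading S(n,m) = (m + 1)/(m + n + 1); the name S and the cutoff big are illustrative choices. Exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction


def S(n: int, m: int) -> Fraction:
    """S(n, m) = (m + 1) / (m + n + 1), read from Example 5.5."""
    return Fraction(m + 1, m + n + 1)


big = 10**9

# n held fixed, m large: the value is close to 1.
print(float(S(3, big)))

# m held fixed, n large: the value is close to 0.
print(float(S(big, 3)))

# Along the diagonal n = m the value is close to 1/2.
print(float(S(big, big)))
```

Three different directions give three different candidate values, so no single L can serve as a double limit.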

The problem displayed by examples like 5.2, does not occur for members of IN∞ as indicated

by the following rather interesting pure nonstandard result.

Theorem 5.6. Let S(m,n) converge to L. Then for any sequence Ωm ∈ IN∞, limm st( ∗S( ∗m,Ωm)) = L [resp. limn st( ∗S(Ωn, ∗n)) = L].

Proof. Let S(m,n) → L. We know that for y ∈ IR+ there is some p ∈ IN such that for each

pair m,n ∈ IN and m ≥ p and n ≥ p, |S(m,n) − L| < y. Now p may be assumed fixed for the y.

By *-transform, it follows that | ∗S( ∗m, b)− L| < ∗y for each ∗m ≥ ∗p, and b ∈ ∗IN, b ≥ ∗p. Hence,

in particular for any sequence Ωm ∈ IN∞, | ∗S( ∗m,Ωm) − L| < ∗y. Now taking the standard part

operator on each side of this inequality implies that |st( ∗S( ∗m,Ωm))−L| ≤ y for each ∗m ≥ ∗p. This

statement is sufficient to state that st( ∗S( ∗m,Ωm)) → L. (Note: Technically, the m that appears

in the sequence notation Ωm should be considered as a member of σIN. But, this does not come

from the *-transform of any standard sequence or any allowed formal statement using our simple

language.)


The use of a sequence such as Ωm changes the double sequence S(m,n) into a nonstandard

type of ordinary sequence. One of the major concerns for double sequences is their relation to the

iterated limits, where I’ll use abbreviated limit notation. There are, of course, standard theorems

that relate convergence of double sequences to the convergence of iterated limits.

Theorem 5.7. Let S(m,n) → L ∈ IR. Then limm(limn S(m,n)) = L iff limn S(m,n) exists for

each m ∈ IN.

Proof. The necessity is obvious. So, assume that limn S(m,n) = rm ∈ IR for each m ∈ IN. Then, by *-transform for each Ω ∈ IN∞, st( ∗S( ∗m,Ω)) = rm for each m ∈ IN. All we need to do is to consider any sequence Ωm ∈ IN∞, like the constant sequence Ωm = Ω, and obtain limm(st( ∗S( ∗m,Ωm))) = limm rm = L by Theorem 5.6.

A theorem such as Theorem 5.7 holds with an interchange of the n and m symbols. Theorem 5.7 gives a condition under which an iterated limit will converge to the limit of a converging double sequence. But, are there necessary and sufficient conditions that determine completely when the limit of a double sequence corresponds to the limit of both iterated limits? Of course, if there are, they probably are not obvious. We need something special to happen. Consider the limit statement for limm S(m,n), where limm S(m,n) ∈ IR. Then limm S(m,n) converges uniformly in n iff for each y ∈ IR+ there exists some p ∈ IN such that for each n ∈ IN, and m, m′ ∈ IN, where m, m′ ≥ p, it follows that |S(m,n) − S(m′,n)| < y. Thus, the p is such that the sequence S(m,n) seems to behave like an ordinary convergent sequence independent from the actual value of n ∈ IN. Let’s see if this notion has a somewhat simple nonstandard characteristic. Indeed, one that parallels the statement for a sequence being Cauchy.

Theorem 5.8. Let S: IN× IN → IR. Then limm S(m,n) converges uniformly in n iff

∗S(Λ, n)− ∗S(Ω, n) ∈ µ(0)

for each Λ,Ω ∈ IN∞ and for each n ∈ ∗IN.

Proof. For the necessity, simply consider the *-transform. Use the fact that from the definition ∗y is arbitrary, and then select particular Λ, Ω ∈ IN∞.

For the sufficiency, assume that for each n ∈ ∗IN, ∗S(Λ, n) − ∗S(Ω, n) ∈ µ(0) for each pair Λ, Ω ∈ IN∞. Let y ∈ IR+. Notice that, for a particular Γ ∈ IN∞, each pair Λ, Ω ≥ Γ is in IN∞ and, hence, | ∗S(Λ, n) − ∗S(Ω, n)| < ∗y for each n ∈ ∗IN. Thus, the sentence

∃x((x ∈ ∗IN) ∧ ∀u∀v∀w((u ∈ ∗IN) ∧ (v ∈ ∗IN) ∧ (w ∈ ∗IN) ∧ (u ≥ x) ∧ (v ≥ x) → (| ∗S(u,w) − ∗S(v,w)| < ∗y)))

holds in ∗M and, hence, in M by reverse *-transform. This completes the proof.

Corollary 5.9. If for each n ∈ IN, limm S(m,n) = Sn ∈ IR, then limm S(m,n) converges

uniformly in n iff for each n ∈ ∗IN, ∗S(Λ, n)− ∗S(n) ∈ µ(0) for each Λ ∈ IN∞.

Proof. This follows immediately from Theorem 4.15, the Cauchy Criterion for convergence.

There is a standard necessary and sufficient condition for the equality of the limit of the double

sequence and its iterated limits .


Theorem 5.10. Let S: IN × IN → IR. Then limS(m,n) = limm(limn S(m,n)) =

limn(limm S(m,n)) ∈ IR iff

(i) limm S(m,n) converges uniformly in n and

(ii) limn S(m,n) converges for each m ∈ IN.

Proof. For the necessity, it’s clear that (ii) follows from the convergence of the iterated limit.

Then limn(limm S(m,n)) ∈ IR implies that limm S(m,n) ∈ IR for each n ∈ IN. Hence, by the Theorem

4.15, ∗S(Λ, n) − ∗S(Ω, n) ∈ µ(0) for each n ∈ σIN and each Λ,Ω ∈ IN∞. By Theorem 5.4, we also

have that ∗S(Λ,Γ)− ∗S(Ω,Γ) ∈ µ(0) for each Γ ∈ IN∞. Hence, ∗S(Λ, n)− ∗S(Ω, n) ∈ µ(0) for each

n ∈ ∗IN. Therefore, Theorem 5.8 yields that limm S(m,n) converges uniformly in n.

For the sufficiency, let limm S(m,n) − Sn = 0 for each n ∈ IN. Uniformly in n means that this limit is independent from the n used. By *-transform, we have that for any Ω ∈ IN∞, limm(st( ∗S( ∗m,Ω))) = st( ∗S(Ω)). Thus, for each Λ ∈ IN∞, st( ∗S(Λ,Ω)) = st( ∗S(Ω)). From (ii), we have that, in like manner, st( ∗S(Λ,Ω)) = st( ∗S(Λ,Γ)) for each Γ, Ω ∈ IN∞. Hence, ∗S(Γ) − ∗S(Ω) ∈ µ(0). Thus, ∗S(Λ,Γ) − ∗S(Γ) ∈ µ(0), ∗S(∆,Ω) − ∗S(Ω) ∈ µ(0), which implies that ∗S(∆,Ω) − ∗S(Λ,Γ) ∈ µ(0) for all ∆, Ω, Λ, Γ ∈ IN∞. Hence, limS(m,n) = L = limn(limm S(m,n)) ∈ IR by Theorem 5.4. Now apply Theorem 5.7 and the proof is complete.

Although Theorem 5.10 is a necessary and sufficient condition, it’s often difficult to apply from

the knowledge of the iterated limit behavior. There are, as one would expect, special classes of

double sequences where convergence of an iterated limit implies that the double limit converges.

Many double sequences can be put into a form S(m,n): IN × IN → IR, where limm S(m,n) = 0 for each n ∈ IN and for each m ∈ IN, S(m,n) is decreasing [resp. increasing] in n.

Theorem 5.11. Let S(m,n): IN× IN → IR and limm S(m,n) = 0, for each n ∈ IN and S(m,n)

is decreasing [resp. increasing ] in n for each m ∈ IN. Then S(m,n) → 0.

Proof. I show this for the decreasing case since the increasing case is established in like manner.

We have that limm S(m,n) = S′(n) = 0 for each n ∈ IN. Thus, st( ∗S(Λ, ∗n)) = S′(n) = 0 for

each Λ ∈ IN∞. Now limn S′(n) = 0 implies since, S′ is decreasing, that 0 ≤ S′(n) = st( ∗S(Λ, ∗n))

for each n ∈ IN. Thus, in general, either ∗0 < ∗S(Λ,Ω) for Ω ∈ IN∞ or ∗S(Λ,Ω) ∈ µ(0). But, if ∗0 < ∗S(Λ,Ω) ≤ ∗S(Λ, ∗n), then 0 ≤ st( ∗S(Λ,Ω)) ≤ st( ∗S(Λ, ∗n)) = 0. Hence, S(m,n) → 0 and the proof is complete.

The real numbers are complete. Hence, any nonempty set A ⊂ IR that is bounded above (i.e. there is some y ∈ IR such that x ≤ y for each x ∈ A) has a least upper bound that is denoted by supA. This means that supA is an upper bound and if y ∈ IR is an upper bound for A, then supA ≤ y. The greatest lower bound inf B exists for any nonempty B ⊂ IR that is bounded below. These ideas are applied to sequences that have convergent subsequences. Indeed, if S[IN] (the range of S) is bounded above [resp. below], then supS[IN] [resp. inf S[IN]] is an accumulation point and there is a subsequence that converges to this point. (I’ll show in the proof of Theorem 5.14 (ii) a method that you can modify to establish this result.)

Definition 5.12 (lim inf, lim sup.) Given the sequence S: IN → IR. Let y ∈ E iff there is a subsequence S′ of S that converges to y. The lower limit (for S) lim inf Sn = inf E, and the upper limit (for S) lim supSn = supE.

Notice that if Sn has no upper bound [resp. lower bound], then there is a subsequence S′n such that S′n → +∞ [resp. S′n → −∞]. In order to consider subsequences that diverge in this ±∞ special sense, the two symbols −∞, +∞ are included in the set E and if −∞ ∈ E [resp. +∞ ∈ E], then, by symbolic definition, let inf E = −∞ [resp. supE = +∞], and no other cases need to be defined for sequences. We know that S has a subsequence that converges to L ∈ IR iff ∗S[IN∞] ∩ µ(L) ≠ ∅. Further, the answer to the question I asked immediately after Theorem 4.14 is yes. So, because of this, the definition of st can be extended to the case where a subsequence diverges to ±∞. For Λ ∈ IN∞, if ∗S(Λ) ∈ IR±∞, let st( ∗S(Λ)) = ±∞. By Theorem 4.10, the following result clearly holds.

Theorem 5.13 Let S: IN → IR. Then lim inf Sn = inf{st( ∗S(Λ)) | Λ ∈ IN∞} and lim supSn = sup{st( ∗S(Λ)) | Λ ∈ IN∞}.

Theorem 5.14 Let S: IN → IR. Then

(i) lim inf Sn = −∞ [resp. lim supSn = +∞] iff there exists some Λ ∈ IN∞ such that ∗S(Λ) ∈ IR−∞ [resp. IR+∞] iff ∗S[ ∗IN] ∩ IR−∞ ≠ ∅ [resp. IR+∞];

(ii) lim inf Sn = L ∈ IR [resp. lim supSn] iff there exists some Λ ∈ IN∞ such that ∗S(Λ) ∈ µ(L) (i.e. st( ∗S(Λ)) = L) and for each Ω ∈ IN∞, ∗S(Ω) ∈ µ(L) or ∗S(Ω) > ∗S(Λ) [resp. <].

Proof. (i) Let lim inf Sn = −∞. This implies that for each y ∈ IR− and for each m ∈ IN there exists some n ∈ IN such that n ≥ m and Sn < y. It should be obvious by now that such a statement means in our nonstandard structure that for any a ∈ IR−∞ and Λ ∈ IN∞ there is an Ω ∈ ∗IN such that Ω ≥ Λ and, hence, Ω ∈ IN∞ such that ∗S(Ω) < a.

For the sufficiency, let y ∈ IR−, m ∈ IN. Then we know that if a ∈ IR−∞, then a < ∗y. The hypothesis states that the sentence

∃x((x ∈ ∗IN) ∧ (x > ∗m) ∧ ( ∗S(x) < ∗y))

holds in ∗M; and, hence, in M. In like manner, for the sup. Thus (i) is established.

(ii) Since lim inf Sn = L, the set E contains at least one real number. I’ll show that L ∈ E. What we do know is that there is a subsequence of Sn that converges to some number ≥ L. Hence, there exists Λ ∈ IN∞ such that ∗S(Λ) ∈ G(0). Let P = {st( ∗S(a)) | (a ∈ IN∞) ∧ ( ∗S(a) ∈ G(0))} ≠ ∅. Now lim inf Sn = inf P = L. From the definition of “inf,” if real r > L, then there is some p ∈ P such that 0 ≤ p − L < r − L. Thus, let 0 < r − L = 1/(2n), n ∈ IN, n ≠ 0. Then there is some p(n) ∈ P such that 0 ≤ p(n) − L < 1/(2n). Since p(n) is the limit of a subsequence Q, then there exists some m ∈ IN such that |Q(m) − p(n)| < 1/(2n). Since Qm ∈ S[IN] then by defining Q′n = Qm, we have that Q′ is a subsequence of S such that |Q′(n) − L| < 1/n, for each nonzero n ∈ IN. Hence, Q′n → L implies that L ∈ P. Of course, P = E, as E was previously defined.

Now, there exists Λ ∈ IN∞ such that ∗S(Λ) ∈ µ(L). Assume that Ω ∈ IN∞ and ∗S(Ω) ∉ µ(L). Since L ≠ −∞, then (i) implies that ∗S(Ω) ∉ IR−∞. Further, note that ∗S(Ω) ∉ µ(r) for any real r < L, since if this were so then there would be a subsequence of S that converges to r, and this contradicts the notion of “inf.” Thus, in this case, ∗S(Ω) > ∗S(Λ) (recall the monads are disjoint). The “sup” follows in like manner and the proof is complete. (The sufficiency is left to the reader.)

Corollary 5.15. For a given S: IN × IN → IR, let E contain the limits of each converging

subsequence. Then inf E ∈ E [resp. supE ∈ E].

Proof. This is established in the above proof for real valued “inf” and “sup.” Now obviously

by definition, it also follows for the two defined cases of ±∞.


Example 5.16. Let S: IN → IR.

(i) Define Sn = (−1)^n (1 + 1/(n + 1)). Let Λ ∈ IN∞ be a *-odd number. Then ∗S(Λ) = −(1 + 1/(Λ + 1)) ∈ µ(−1). Then taking a *-even Ω, it’s seen that ∗S(Ω) ∈ µ(1). Since each member of IN∞ is either *-odd or *-even, these are the only two possibilities, and we have that lim supSn = 1, lim inf Sn = −1. From Theorem 5.14, we also know that for each Γ ∈ IN∞ that ∗S(Γ) ∈ µ(−1) or ∗S(Γ) ∈ µ(1) or ∗S(Λ) < ∗S(Γ) or ∗S(Γ) < ∗S(Ω).
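The values lim sup Sn = 1 and lim inf Sn = −1 can be approximated on the standard side by looking at the largest and smallest values on tails that start further and further out. This is a minimal sketch; the names S, tail_sup and tail_inf and the window length are illustrative. For this particular sequence |Sn| decreases within each parity class, so a finite window starting at the tail's first index already captures the extreme values of the whole tail.

```python
def S(n: int) -> float:
    """S_n = (-1)**n * (1 + 1/(n + 1)) from Example 5.16 (i)."""
    return (-1) ** n * (1.0 + 1.0 / (n + 1))


def tail_sup(start: int, length: int = 1000) -> float:
    """Largest value of S on the window start, ..., start + length - 1."""
    return max(S(n) for n in range(start, start + length))


def tail_inf(start: int, length: int = 1000) -> float:
    """Smallest value of S on the same window."""
    return min(S(n) for n in range(start, start + length))


# The tail extremes settle toward 1 and -1 as the start index grows.
for start in (0, 10, 1000, 100000):
    print(start, tail_sup(start), tail_inf(start))
```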

(ii) Let Sn be the sequence of all the rational numbers. (Yes, technically there is such

a sequence.) Then simply from noticing that there exist negative and positive infinite rational

numbers, we have that lim inf Sn = −∞, lim supSn = +∞.

(iii) Let Sn and Qn be any two sequences. Then

lim inf Sn + lim inf Qn ≤ lim inf(Sn + Qn) ≤ lim sup(Sn + Qn) ≤ lim supSn + lim supQn.

Proof. Let A = {st( ∗S(Λ)) | Λ ∈ IN∞}, B = {st( ∗Q(Λ)) | Λ ∈ IN∞}. If A and B are both nonempty, then the nonempty set {st( ∗S(Λ) + ∗Q(Λ)) | Λ ∈ IN∞} = {st( ∗S(Λ)) + st( ∗Q(Λ)) | Λ ∈ IN∞} ⊂ A + B, where this “addition” definition is obvious. The result now follows, since the inf and sup of a nonempty subset of A + B lie between inf(A + B) and sup(A + B), from the “well known” result (taking into account the ±∞ possibilities) that inf A + inf B ≤ inf(A + B) ≤ sup(A + B) ≤ supA + supB.

(iv) Let Sn → L ∈ IR and Qn be any sequence. Then lim inf(Sn + Qn) = L +

lim inf Qn, lim sup(Sn +Qn) = L+ lim supQn.

Proof. Let A = {st( ∗Q(Λ)) | Λ ∈ IN∞}. When we use the symbols ±∞, we mean that they correspond to any member of IR±∞, and by a trivial proof it follows symbolically that for any a ∈ G(0), ±∞ + a = ±∞. Recall how the definition of the st operator has been extended to ±∞. For any a ∈ ∗IR, st(a) = ±∞ iff a ∈ IR±∞. Thus under this definition A ≠ ∅. This definition also satisfies the usual extended algebra for ±∞. Now we know that for each Λ ∈ IN∞, ∗S(Λ) ∈ µ(L). Under this extended definition, it follows that for each Λ ∈ IN∞, st( ∗S(Λ) + ∗Q(Λ)) = st( ∗S(Λ)) + st( ∗Q(Λ)) = L + st( ∗Q(Λ)). The result follows as in the proof of (iii): with B = {st( ∗S(Λ) + ∗Q(Λ)) | Λ ∈ IN∞}, inf A + L = inf B. The “sup” part follows in like manner.

(v) Let S(m,n) → L ∈ IR. Then limn(limm inf S(m,n)) = limn(limm supS(m,n)) =

limm(limn inf S(m,n)) = limm(limn supS(m,n)) = L.

Proof. Now this can be established by nonstandard means. But, it’s immediate from the fact

that S(m,n) → L iff every subsequence S′(m,n) → L by just considering n or m as fixed.


6. BASIC INFINITE SERIES CONCEPTS

Sometimes it’s useful to simplify the notation for the finite and infinite series. Let $A(n) = \sum_{k=0}^{n} a_k = \sum_{0}^{n} a_k$, where $k \in \mathbb{N}$. Then by definition this series converges to $L$ iff $A(n) \to L$. Hence, all of our previous nonstandard characteristics for sequential convergence apply. For example, $A(n) \to L$ iff ${}^*A(\Lambda) \in \mu(L)$ for each $\Lambda \in \mathbb{N}_\infty$. Notationally, you will also see this written as $\sum_{0}^{\Lambda} {}^*a_k \approx L = {}^*L$. These are called hyperfinite or *-finite summations. Indeed, any set such as $\{n \mid ({}^*0 \le n \le \Lambda) \wedge (n \in {}^*\mathbb{N})\}$, where $\Lambda \in \mathbb{N}_\infty$, is a *-finite set. The reason it’s termed *-finite is that *-finite sets satisfy any of the finite set properties that can be presented in our formal language. To show that most of the basic manipulations done with a finite series hold for *-finite series, it’s necessary to give a more formal definition for infinite series than is usually presented.

Definition 6.1. Let a: IN → IR. Then the partial sum function A(k) is defined inductively.

(i) Let $A(0) = a_0$;

(ii) then $A(k+1) = A(k) + a_{k+1}$, $k \in \mathbb{N}$.

(iii) Further, define for any $n, m \in \mathbb{N}$, $n < m$, $A(n,m) = A(m) - A(n) = \sum_{k=n+1}^{m} a_k$ and $A(-1,0) = a_0$ and if $m = n \ne 0$, then $A(n,n) = 0$. Notice that $A(n-1,n) = a_n$ in all cases.

Observe $A\colon \mathbb{N} \to \mathbb{R}$. Thus, there’s the nonstandard extension of this function to ${}^*A\colon {}^*\mathbb{N} \to {}^*\mathbb{R}$. Further, we know that for $n \le m$, $n, m \in \mathbb{N}$, $A(m) = A(n) + A(n,m) = A(n,m) + A(n)$. This property also holds for $\Lambda \le \Omega$, $\Lambda, \Omega \in \mathbb{N}_\infty$. But not every ordinary mathematical process that can be done will hold in ${}^*\mathcal{M}$ for ${}^*A$. Whatever holds must be expressible in our formal language. This is not always possible. One thing that cannot be so expressed, generally, is the notion of “any rearrangement” of the members of an infinite series. What is needed is a specifically stated rearrangement. For example, define $\Theta_n(k) = n - k$ for $k \in [0,n]$. Applying this to the finite sequence of terms for our finite series $a_0 + \cdots + a_n$ yields $b_0 + \cdots + b_n$, where $b_0 = a_n, \ldots, b_n = a_0$. This can be viewed as a new sequence, and by *-transform, it has meaning for any $\Lambda \in \mathbb{N}_\infty$.

Because of how the “term generating” function is defined, it may be convenient to assume that the first few terms, say $a_0, a_1, a_2, \ldots, a_k$, $k < n$, all equal zero. I assume that all sequences $a_n$ that are defined for $n \in \mathbb{N}$, where $n > k$, are extended, if necessary, to sequences defined on $\mathbb{N}$ by letting $a_i = 0$, $0 \le i \le k$. (There will be times when this is not done and the notation will indicate this.) Moreover, it’s also clear that removing finitely many terms from a series does not affect whether it converges or not. Thus, given the original $A\colon \mathbb{N} \to \mathbb{R}$, to determine whether $A(n)$ converges you can use a different $B\colon \mathbb{N} \to \mathbb{R}$ obtained by letting $b_n = a_{k+n}$ for any fixed $k > 0$ and then investigate the convergence of $B(n)$. Of course, you would need to adjust the two limits if they do converge. However, mostly, one is interested in the terms of a series, the $a_k$. Further, note that, for $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda < \Omega$, ${}^*A(\Lambda,\Omega) = \sum_{k=\Lambda+1}^{\Omega} {}^*a_k$.

Theorem 6.2. Let $a\colon \mathbb{N} \to \mathbb{R}$ be a bounded sequence. Then there exists ${}^*M \in {}^\sigma\mathbb{R}$ such that for each $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$,

\[ \left| \frac{\sum_{k=\Lambda}^{\Omega} {}^*a_k}{\Omega - \Lambda + {}^*1} \right| \le {}^*M. \]

Proof. Since $a_k$ is bounded, there exists some $M \in \mathbb{R}$ such that $|a_k| \le M$. One of the most significant results for finite series is that for $n \le m$, $\left|\sum_{k=n}^{m} a_k\right| \le \sum_{k=n}^{m} |a_k|$. By defining the sequence $b_k = |a_k|$, then by *-transform it follows that for $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$, $\left|\sum_{k=\Lambda}^{\Omega} {}^*a_k\right| \le \sum_{k=\Lambda}^{\Omega} {}^*b_k = \sum_{k=\Lambda}^{\Omega} |{}^*a_k|$. Now for the standard series, since $|a_k| \le M$, then $\sum_{k=n}^{m} |a_k| \le M(m-n+1)$. By *-transform, it follows that for $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$, $\left|\sum_{k=\Lambda}^{\Omega} {}^*a_k\right| \le {}^*M(\Omega - \Lambda + {}^*1)$ and the result follows.

I restate some of the previous nonstandard sequence results that now characterize convergence

of the series A(n).

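Theorem 6.3. The series $A(n) \to L$ iff ${}^*A(\Lambda) - L \in \mu(0)$ for each $\Lambda \in \mathbb{N}_\infty$ iff ${}^*A(\Lambda) \in \mu(L)$ for each $\Lambda \in \mathbb{N}_\infty$ iff $\mathrm{st}\left(\sum_{k=0}^{\Lambda} {}^*a_k\right) = L$ for each $\Lambda \in \mathbb{N}_\infty$ iff ${}^*A[\mathbb{N}_\infty] \subset \mu(L)$.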

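Theorem 6.4. (i) $A(n) \to L$ iff for each $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$, $\sum_{k=\Lambda}^{\Omega} {}^*a_k \in \mu(0)$ iff ${}^*A(\Lambda,\Omega) \in \mu(0)$.

(ii) If $A(n) \to L$, then ${}^*a_\Omega \in \mu(0)$, for each $\Omega \in \mathbb{N}_\infty$.

Proof. (i) First, note that if $\Lambda = \Omega$, then ${}^*A(\Lambda,\Omega) = 0$ and ${}^*A(\Omega-1,\Omega) = {}^*A(\Omega) - {}^*A(\Omega-1) = \sum_{k=\Omega}^{\Omega} {}^*a_k = {}^*a_\Omega$. If $\Lambda < \Omega$, then ${}^*A(\Lambda,\Omega) = {}^*A(\Omega) - {}^*A(\Lambda) = \sum_{k=\Lambda+1}^{\Omega} {}^*a_k$. If $\Lambda > \Omega$, then ${}^*A(\Omega) - {}^*A(\Lambda) = -\sum_{k=\Omega+1}^{\Lambda} {}^*a_k$. The result, in general, comes from the Cauchy Criterion and, clearly, we may assume that $\Lambda \le \Omega$. (ii) This is immediate.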

Although it’s not required in our investigations, the converse of Theorem 6.4 (ii) holds for certain series. Recall what Gödel wrote, that removing one quantifier from a characterization is significant. So far, the nonstandard characterizations do just this and often remove all quantifiers.

Theorem 6.5. If A(n) → L, then an → 0.

Proof. From Theorem 6.4 (ii).

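Example 6.6.

(i) Let $a_k = \frac{1}{(k+1)(k+2)}$. Let $\Lambda \in \mathbb{N}_\infty$. Then

\[ {}^*A(\Lambda) = \sum_{k=0}^{\Lambda} \frac{1}{(k+1)(k+2)} = \sum_{k=0}^{\Lambda} \frac{1}{k+1} - \sum_{k=0}^{\Lambda} \frac{1}{k+2} = 1 + \sum_{k=1}^{\Lambda} \frac{1}{k+1} - \sum_{k=1}^{\Lambda} \frac{1}{k+1} - \frac{1}{\Lambda+2} = 1 + 0 - \frac{1}{\Lambda+2} \]

by *-transform of the finite case, and I have applied the convention of writing ${}^*x = x$ for ${}^*x \in {}^\sigma\mathbb{R}$. But $\frac{1}{\Lambda+2} \in \mu(0)$. Hence $A(n) \to 1$ or, as is often written, $\sum_{k=0}^{\infty} a_k = 1$.

(ii) Let $A(x) = \sum_{k=0}^{\infty} x^k$, $x \ne 1$. We know that, in general, the partial sums are $a_k(x) = \frac{1 - x^{k+1}}{1-x}$. Hence, ${}^*a_\Lambda(x) = \frac{1 - x^{\Lambda+1}}{1-x} = \frac{1}{1-x} - \frac{x^{\Lambda+1}}{1-x}$ for $x \in {}^\sigma\mathbb{R}$. If $|x| < 1$, then $\frac{x^{\Lambda+1}}{1-x} \in \mu(0)$ implies that ${}^*a_\Lambda(x) \in \mu\left(\frac{1}{1-x}\right)$. On the other hand, if $|x| > 1$, then $\frac{1}{1-x} - \frac{x^{\Lambda+1}}{1-x} \notin G(0)$, for $\Lambda \in \mathbb{N}_\infty$. Thus, the series diverges.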
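The two closed forms used in Example 6.6 can be checked for ordinary finite n; the *-transform then carries them to infinite indices. The following is a minimal sketch in exact rational arithmetic. The function names are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def telescoping_partial_sum(n: int) -> Fraction:
    """A(n) = sum_{k=0}^{n} 1/((k+1)(k+2)), computed exactly."""
    return sum((Fraction(1, (k + 1) * (k + 2)) for k in range(n + 1)), Fraction(0))


def geometric_partial_sum(x: Fraction, n: int) -> Fraction:
    """sum_{k=0}^{n} x^k, computed exactly."""
    return sum((x ** k for k in range(n + 1)), Fraction(0))


# Part (i): A(n) = 1 - 1/(n+2), so the distance from 1 shrinks like 1/(n+2).
for n in (0, 1, 10, 100):
    assert telescoping_partial_sum(n) == 1 - Fraction(1, n + 2)

# Part (ii): the partial sum equals 1/(1-x) - x^(n+1)/(1-x) for x != 1.
x = Fraction(1, 3)
for n in (0, 5, 20):
    closed_form = 1 / (1 - x) - x ** (n + 1) / (1 - x)
    assert geometric_partial_sum(x, n) == closed_form
```

Since the identities hold for every finite n, they are statements about IN that transfer to each Λ ∈ IN∞, which is all the example uses.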

When compared with a general series, it’s often easier to show that a non-negative type series

converges or diverges. A series A(n) is non-negative iff there is some m ∈ IN such that ak ≥ 0 for

each k ≥ m.

Theorem 6.7. A non-negative A(n) converges iff there is some Λ ∈ IN∞ such that ∗A(Λ) ∈G(0) iff ∗A[∗IN] ⊂ G(0).

Proof. There is some m ∈ IN such that ak ≥ 0 for all k ≥ m. Thus we write A(n) in terms of a

new sequence B(n), where A(n) = B+B(n) and B(n) is an increasing sequence. The result follows

from Theorems 4.2 and 4.7 and the fact that ∗B(Λ) ∈ G(0) iff B + ∗B(Λ) ∈ G(0) for any B ∈ IR.

And, an increasing sequence converges iff it is bounded. The result follows for A is bounded iff B is

bounded.

Theorem 6.8. A non-negative A(n) diverges iff there is some Λ ∈ IN∞ such that ∗A(Λ) ∉ G(0) iff ∗A[∗IN] ⊄ G(0).

There are standard arithmetical results that can aid in determining convergence.


Theorem 6.9. Given $A\colon \mathbb{N} \to \mathbb{R}$. Let $f\colon \mathbb{N} \to \mathbb{N}$ have the property that $A(f(n+1)) - A(f(n)) \ge b_n$ for each $n \ge j$ [resp. $\le$]. Then for each $n \in \mathbb{N}$, $n \ge j$,

\[ A(f(n+1)) \ge A(f(j)) + \sum_{k=j}^{n} b_k, \quad [\text{resp. } \le]. \]

Proof. [For $\ge$.] For $n = j$, clearly, the result holds. So, assume it for some $m \ge j$. Then consider $m+1$. Since $A(f(m+2)) - A(f(m+1)) \ge b_{m+1}$, then $A(f(m+2)) \ge A(f(m+1)) + b_{m+1} \ge A(f(j)) + \sum_{k=j}^{m} b_k + b_{m+1} = A(f(j)) + \sum_{k=j}^{m+1} b_k$, and the result holds by induction.

Example 6.10. Let’s determine convergence or divergence directly for two very well known series. In all cases, we extend any series to include the necessary zero terms although they might not be directly mentioned.

(i) Consider $b_k = 1/k$, $k > 0$. We look at the series $a_0 = 0$, $a_k = 1/k$, $k \ge 1$. Define $f\colon \mathbb{N} \to \mathbb{N}$ by letting $f(n) = 2^n$. Then, for $n \ge 1$,

\[ A(2^{n+1}) - A(2^n) = \sum_{k=2^n+1}^{2^{n+1}} \frac{1}{k} = \sum_{k=1}^{2^n} \frac{1}{2^n + k} \ge \frac{2^n}{2^{n+1}} = 1/2. \]

Applying Theorem 6.9, $A(2^{n+1}) \ge A(2) + \sum_{k=1}^{n} (1/2) \ge 1/2 + n/2$. Consequently, for $\Lambda \in \mathbb{N}_\infty$, ${}^*A(2^{\Lambda+1}) \in \mathbb{R}_\infty$ and the series diverges.

(ii) Now let’s look at the famous “p” series, where $b_k = 1/k^p$, $k > 0$. (a) First, let $p > 1$ and look at the series $a_0 = 0$ and $a_k = 1/k^p$, $k \ge 1$. As done for (i), $A(2^{n+1}) - A(2^n) = \sum_{k=1}^{2^n} \frac{1}{(2^n+k)^p}$. I note that each term of this sum is less than $2^{-pn}$ and there are $2^n$ terms. Hence $\sum_{k=1}^{2^n} \frac{1}{(2^n+k)^p} < \frac{2^n}{2^{np}}$ implies from Theorem 6.9 that

\[ 0 \le A(2^{n+1}) \le A(2) + \sum_{k=1}^{n} \frac{2^k}{2^{kp}} = 1 + \frac{1}{2^p} + \sum_{k=1}^{n} \frac{2^k}{2^{kp}}. \]

But, since $p > 1$, the geometric sum on the right is bounded, so ${}^*A(2^{\Lambda+1}) \in G(0)$ and the non-negative series converges.

(b) Now, for $0 \le p \le 1$, each term of the finite sum satisfies $\frac{1}{(2^n+k)^p} \ge \frac{1}{2^{(n+1)p}}$. Hence, as done above,

\[ A(2^{n+1}) - A(2^n) = \sum_{k=1}^{2^n} \frac{1}{(2^n+k)^p} \ge \frac{2^n}{2^{(n+1)p}} \ge \frac{1}{2^p}. \]

Consequently from Theorem 6.9, $A(2^{n+1}) \ge 1/2^p + n/2^p$ and for this non-negative series ${}^*A(2^{\Lambda+1}) \notin G(0)$ and the series diverges. It obviously diverges for all $p < 0$.
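The block estimates behind Example 6.10 can be spot-checked numerically for small n. This is only a floating-point sketch of the standard (finite) inequalities, with a hypothetical helper `block_sum`; it is not a proof and says nothing directly about infinite indices.

```python
def block_sum(p: float, n: int) -> float:
    """A(2^(n+1)) - A(2^n) = sum_{k=1}^{2^n} 1/(2^n + k)^p for the p-series."""
    base = 2 ** n
    return sum(1.0 / (base + k) ** p for k in range(1, base + 1))


TOL = 1e-12  # slack for floating-point rounding in the boundary case n = 0

for n in range(12):
    # Harmonic series (p = 1): every dyadic block contributes at least 1/2.
    assert block_sum(1.0, n) >= 0.5 - TOL
    # p = 2 > 1: each block is below 2^n / 2^(np) = 2^(-n), a geometric bound.
    assert block_sum(2.0, n) < 2.0 ** (-n)
    # p = 1/2 <= 1: each block is at least 1/2^p.
    assert block_sum(0.5, n) >= 2.0 ** (-0.5) - TOL
```

Summing the block bounds is exactly the step that Theorem 6.9 formalizes: a lower bound of 1/2 per block forces unbounded growth, while a geometric upper bound keeps the partial sums finite.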

All of the standard convergence or divergence tests can be translated into appropriate nonstandard statements. However, here is an interesting nonstandard comparison test.

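Theorem 6.11. Let $\sum_{k=0}^{\infty} a_k$ be a non-negative series. If non-negative $B(n)$ converges and there is some $c \in {}^*\mathbb{R}$ such that $0 \le c$ and $c \in G(0)$ and, for each $\Lambda \in \mathbb{N}_\infty$, ${}^*a(\Lambda) \le c\,({}^*b(\Lambda))$, then $A(n)$ converges.

Proof. Assume that $B(n)$ converges. Then for each $\Lambda, \Omega \in \mathbb{N}_\infty$ such that $\Lambda \le \Omega$, $\sum_{k=\Lambda}^{\Omega} {}^*b_k \in \mu(0)$. Also there is some $r \in \mathbb{R}^+$ such that $c \le {}^*r$. Hence ${}^*0 \le {}^*a_\Lambda \le c\,({}^*b(\Lambda)) \le {}^*r\,({}^*b(\Lambda))$, for each $\Lambda \in \mathbb{N}_\infty$. By *-transform of the finite case, this implies that $0 \le \sum_{k=\Lambda}^{\Omega} {}^*a_k \le \sum_{k=\Lambda}^{\Omega} {}^*r\,({}^*b_k) = {}^*r \sum_{k=\Lambda}^{\Omega} {}^*b_k \in \mu(0)$. This completes the proof.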


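Theorem 6.12. If non-negative $\sum_{k=0}^{\infty} b_k$ diverges and there exists $c > 0$, $c \in {}^*\mathbb{R} - \mu(0)$ and ${}^*a(\Lambda) \ge c\,({}^*b(\Lambda))$ for each $\Lambda \in \mathbb{N}_\infty$, then $\sum_{k=0}^{\infty} a_k$ diverges.

Proof. Since $\sum_{k=0}^{\infty} b_k$ diverges, there exist $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$, such that $\sum_{k=\Lambda}^{\Omega} {}^*b_k \notin \mu(0)$. We also know that there exists some $r \in \mathbb{R}^+$ such that ${}^*r < c$. Therefore ${}^*r \sum_{k=\Lambda}^{\Omega} {}^*b_k = \sum_{k=\Lambda}^{\Omega} {}^*r\,{}^*b_k \notin \mu(0)$. Hence, since $\sum_{k=\Lambda}^{\Omega} {}^*a_k \ge \sum_{k=\Lambda}^{\Omega} {}^*r\,{}^*b_k > 0$, then $\sum_{k=\Lambda}^{\Omega} {}^*a_k \notin \mu(0)$ and the result follows.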

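Example 6.13. Assume that $\sum_{k=0}^{\infty} a_k u^k$ converges for $u \ne 0$. Then $\sum_{k=0}^{\infty} a_k x^k$ converges absolutely for each $x$ such that $|x| < |u|$.

Proof. Let $b = |x|/|u| < 1$. Hence, the geometric series $\sum_{k=0}^{\infty} b^k$ converges for such an $x$. Let $\Lambda \in \mathbb{N}_\infty$. Then ${}^*a(\Lambda)u^\Lambda \in \mu(0)$, since $\sum_{k=0}^{\infty} a_k u^k$ converges. Notice that

\[ |{}^*a(\Lambda)x^\Lambda| = \left| {}^*a(\Lambda)u^\Lambda \, \frac{x^\Lambda}{u^\Lambda} \right| = |{}^*a(\Lambda)u^\Lambda|\, b^\Lambda < b^\Lambda, \]

since $|{}^*a(\Lambda)u^\Lambda| < 1$. You can apply Theorem 6.11, where $c = 1$.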

In the Chapter “Series of nonnegative terms” (1964, p. 55) W. Rudin states that “One might

thus be led to the conjecture that there is a limiting situation of some sort, a ‘boundary’ with all

the convergent series on one side, all the divergent series on the other side - at least as far as a series

with monotonic coefficients are concerned. This notion of ‘boundary’ is of course quite vague. The

point we wish to make is this: No Matter how we make this notion precise, the conjecture is false.”

However, Rudin’s statement using the phrase “No matter how” in this section on non-negative series

is itself false. Theorems 6.7 and 6.8 show that G(0) is just such a “boundary.”

Here is another example of the usefulness of the nonstandard methods and direct proofs.

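Example 6.14. Let each $a_k > 0$, $\sum_{k=0}^{\infty} a_k$ converge and $a_{k+1} \le a_k$ for all $k \in \mathbb{N}$. Then $\lim_n n a_n = 0$.

Proof. Let $\Lambda \in \mathbb{N}_\infty$. For each non-negative real number $r$, there exists a unique natural number $[r]$ such that $[r] \le r < [r] + 1$. This statement and property can be written in our formal language. Thus by *-transform, since $\Lambda/2 \in \mathbb{R}_\infty$, there exists $a > 0$, $a \in {}^*\mathbb{N}$ and $a = [\Lambda/2]$. Now $0 \le \Lambda/2 - [\Lambda/2] < 1$. Hence, necessarily, $a = \Omega = [\Lambda/2] \in \mathbb{N}_\infty$. Then ${}^*A(\Lambda) - {}^*A(\Omega) \ge (\Lambda - \Omega)\,{}^*a(\Lambda) \ge (\Lambda/2)\,{}^*a(\Lambda) \ge {}^*0$ since such a statement holds in $\mathbb{R}$. Since ${}^*A(\Lambda) - {}^*A(\Omega) \in \mu(0)$ by Theorem 6.4, it follows that $\Lambda\,{}^*a(\Lambda) \in \mu(0)$ and the result follows.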


7. AN ADVANCED INFINITE SERIES CONCEPT

Some of the most interesting aspects of the nonstandard theory of infinite series are developed when various infinite series product notions are probed. But first we need the following result, Abel’s summation by parts.

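Theorem 7.1. Let series $A\colon \mathbb{N} \to \mathbb{R}$, $B\colon \mathbb{N} \to \mathbb{R}$. Then for each $p, q \in \mathbb{N}$, $p \le q$,

\[ \sum_{k=p}^{q} a_k b_k = \sum_{k=p}^{q} (A(k) - A(k-1)) b_k = \sum_{k=p}^{q} A(k) b_k - \sum_{k=p-1}^{q-1} A(k) b_{k+1} = \sum_{k=p}^{q} A(k)(b_k - b_{k+1}) - A(p-1) b_p + A(q) b_{q+1}, \]

where $A(-1) = 0$.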
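Abel's summation by parts is a finite identity, so it can be verified exactly on sample data. The sketch below uses rational arithmetic; `partial_sums` and `abel_sides` are hypothetical helper names, and the sample sequences are arbitrary.

```python
from fractions import Fraction


def partial_sums(a):
    """Return [A(0), A(1), ...] where A(n) = a[0] + ... + a[n]."""
    out, total = [], Fraction(0)
    for term in a:
        total += term
        out.append(total)
    return out


def abel_sides(a, b, p, q):
    """Return (lhs, rhs) of Theorem 7.1 for 0 <= p <= q <= len(b) - 2."""
    A = partial_sums(a)

    def A_at(k):
        # Convention of the theorem: A(-1) = 0.
        return A[k] if k >= 0 else Fraction(0)

    lhs = sum((a[k] * b[k] for k in range(p, q + 1)), Fraction(0))
    rhs = sum((A_at(k) * (b[k] - b[k + 1]) for k in range(p, q + 1)), Fraction(0))
    rhs += -A_at(p - 1) * b[p] + A_at(q) * b[q + 1]
    return lhs, rhs


a = [Fraction((-1) ** k, k + 1) for k in range(12)]
b = [Fraction(1, k * k + 1) for k in range(12)]
for p in range(10):
    for q in range(p, 10):
        lhs, rhs = abel_sides(a, b, p, q)
        assert lhs == rhs
```

Because the identity holds for all finite p ≤ q, it is expressible in the formal language and transfers to infinite limits of summation.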

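Theorem 7.2. For each $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$,

\[ \sum_{k=\Lambda}^{\Omega} {}^*a_k\,{}^*b_k = \sum_{k=\Lambda}^{\Omega} {}^*A(k)({}^*b_k - {}^*b_{k+1}) - {}^*A(\Lambda-1)\,{}^*b(\Lambda) + {}^*A(\Omega)\,{}^*b(\Omega+1). \]

Proof. By *-transform.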

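One can immediately impose upon the right-hand side of the equation in Theorem 7.2 various requirements that will force it to be an infinitesimal. This will be seen in the proof of Theorem 7.3. But, first, notice that for the collapsing series $\sum_{k=0}^{\infty} (b_k - b_{k+1})$, we have for each $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$, if $\left|\sum_{k=\Lambda}^{\Omega} ({}^*b_k - {}^*b_{k+1})\right| = |{}^*b(\Lambda) - {}^*b(\Omega+1)| \le \sum_{k=\Lambda}^{\Omega} |{}^*b_k - {}^*b_{k+1}| \in \mu(0)$, then ${}^*b(\Lambda) - {}^*b(\Omega+1) \in \mu(0)$, for each such $\Lambda$ and $\Omega$.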

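Theorem 7.3. If $\sum_{k=0}^{\infty} (b_k - b_{k+1})$ converges absolutely and $A\colon \mathbb{N} \to \mathbb{R}$ is bounded, then $\sum_{k=0}^{\infty} a_k b_k$ converges.

Proof. From Theorem 7.2,

\[ \left|\sum_{k=\Lambda}^{\Omega} {}^*a_k\,{}^*b_k\right| \le \sum_{k=\Lambda}^{\Omega} |{}^*A(k)({}^*b_k - {}^*b_{k+1})| + |-{}^*A(\Lambda-1)\,{}^*b(\Lambda) + {}^*A(\Omega)\,{}^*b(\Omega+1)|. \]

Since there is some $r \in \mathbb{R}^+$ such that $|{}^*A(\Gamma)| \le r$ for each $\Gamma \in \mathbb{N}_\infty$, then

\[ \left|\sum_{k=\Lambda}^{\Omega} {}^*a_k\,{}^*b_k\right| \le r\left(\left(\sum_{k=\Lambda}^{\Omega} |{}^*b_k - {}^*b_{k+1}|\right) + |{}^*b(\Omega+1) - {}^*b(\Lambda-1)|\right). \]

Since $\sum_{k=0}^{\infty} (b_k - b_{k+1})$ converges absolutely, then $|{}^*b(\Omega+1) - {}^*b(\Lambda-1)| = \left|\sum_{k=\Lambda-1}^{\Omega} ({}^*b_k - {}^*b_{k+1})\right| \le \sum_{k=\Lambda-1}^{\Omega} |{}^*b_k - {}^*b_{k+1}| \in \mu(0)$, $\sum_{k=\Lambda}^{\Omega} |{}^*b_k - {}^*b_{k+1}| \in \mu(0)$ and the result follows.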

Corollary 7.4. Let $A\colon \mathbb{N} \to \mathbb{R}$ be bounded. If $\sum_{k=0}^{\infty} (b_k - b_{k+1})$ converges and $b_k$ is decreasing, then $\sum_{k=0}^{\infty} a_k b_k$ converges.

Proof. Observe that for each $k \in \mathbb{N}$, $b_k - b_{k+1} \ge 0$ and, hence, $\sum_{k=0}^{\infty} (b_k - b_{k+1})$ is absolutely convergent. The result follows from the previous theorem.

Now let’s complete this chapter by investigating the “Cauchy product” and show how nonstandard methods aid intuition. I’ll “play around” with the “subscript” notation somewhat and one needs to understand what the double summation symbol is actually trying to indicate. The innermost of the two summation symbols will always indicate a “finite” summation where the index limit symbol is considered as fixed. Thus the notation $\sum_{k=0}^{n} \left(\sum_{j=0}^{k} a_j b_{k-j}\right)$ means that you fix each $k$, $0 \le k \le n$, and obtain the value of the finite sum $\sum_{j=0}^{k} a_j b_{k-j}$. Then add all of the $n+1$ results together to get the double summation. I won’t go through what some consider to be an “easy” proof that, for non-trivial $n \ge 1$,

\[ \left(\sum_{k=0}^{n} a_k\right)\left(\sum_{k=0}^{n} b_k\right) = \overbrace{\sum_{k=0}^{n} \sum_{j=0}^{k} a_j b_{k-j}}^{C} + \overbrace{\sum_{k=0}^{n-1} \sum_{j=0}^{k} a_{n-k+j}\, b_{n-j}}^{I_n}. \tag{7.5} \]

In the above expansion, the double sum indicated by the $C$ is the most significant. Indeed, let $c_k = \sum_{j=0}^{k} a_j b_{k-j}$. This is often called the Cauchy product. Then you have the sequence (i.e. series) $C(n) = \sum_{k=0}^{n} c_k = \sum_{k=0}^{n} \left(\sum_{j=0}^{k} a_j b_{k-j}\right)$.
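The splitting in equation (7.5) of a product of partial sums into the Cauchy part C and the remainder I_n can be checked directly on integer data. A minimal sketch follows; `product_split` is a hypothetical helper name and the sample lists are arbitrary.

```python
def product_split(a, b, n):
    """Return (product, cauchy, remainder) for the two sides of (7.5)."""
    product = sum(a[: n + 1]) * sum(b[: n + 1])
    # C: all index pairs (j, k - j) whose indices add up to k <= n.
    cauchy = sum(a[j] * b[k - j] for k in range(n + 1) for j in range(k + 1))
    # I_n: all index pairs (n - k + j, n - j), whose indices add up to 2n - k > n.
    remainder = sum(a[n - k + j] * b[n - j] for k in range(n) for j in range(k + 1))
    return product, cauchy, remainder


a = [1, 2, 3, 4, 5]
b = [2, -1, 0, 3, 7]
for n in range(1, 5):
    product, cauchy, remainder = product_split(a, b, n)
    assert product == cauchy + remainder
```

The check makes the roles visible: C collects the pairs with index sum at most n, and I_n collects the rest, which is the part Theorem 7.6 requires to be infinitesimal at infinite n.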

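Theorem 7.6. Let $A(n) \to L_a$ and $B(n) \to L_b$. Then $C(n) \to L_a L_b$ iff, for any $\Omega \in \mathbb{N}_\infty$,

\[ \sum_{k=0}^{\Omega-1} \left(\sum_{j=0}^{k} {}^*a(\Omega-k+j)\,{}^*b(\Omega-j)\right) \in \mu(0). \]

Proof. From the hypotheses, $\left(\sum_{k=0}^{\Omega} {}^*a_k\right) \in \mu(L_a)$ and $\left(\sum_{k=0}^{\Omega} {}^*b_k\right) \in \mu(L_b)$ for any $\Omega \in \mathbb{N}_\infty$. Hence, $\left(\sum_{k=0}^{\Omega} {}^*a_k\right)\left(\sum_{k=0}^{\Omega} {}^*b_k\right) \in \mu(L_a L_b)$. Now $\sum_{k=0}^{\Omega} {}^*c_k = \sum_{k=0}^{\Omega} \left(\sum_{j=0}^{k} {}^*a_j\,{}^*b_{k-j}\right)$. But, by the *-transform of (7.5), $\sum_{k=0}^{\Omega} \left(\sum_{j=0}^{k} {}^*a_j\,{}^*b_{k-j}\right) \in \mu(L_a L_b)$ iff $\sum_{k=0}^{\Omega-1} \left(\sum_{j=0}^{k} {}^*a(\Omega-k+j)\,{}^*b(\Omega-j)\right) \in \mu(0)$.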

Although Theorem 7.6 indicates what portion of the right-hand side of equation (7.5) must be infinitesimal for the Cauchy product to equal the product of the limits of two converging series, this characterization is not the most useful. Using our previous notation, consider the sequences $A$ and $B$ and $C$. Suppose that $B(n) \to L_b$. You should be able to show that for all $n \in \mathbb{N}$, $C(n) = A(n)L_b + \sum_{k=0}^{n} a_k(B(n-k) - L_b)$. What is needed in the next few theorems is the notion of the maximum member of any nonempty finite set determined by a given sequence $Q\colon \mathbb{N} \to \mathbb{R}$. The following sentence holds in $\mathcal{M}$.

\[ \forall x \forall y((x \in \mathbb{N}) \wedge (y \in \mathbb{N}) \wedge (x \le y) \to \exists z((z \in \mathbb{N}) \wedge (x \le z \le y) \wedge \forall w((w \in \mathbb{N}) \wedge (x \le w \le y) \to Q(w) \le Q(z)))) \tag{7.7} \]

(Recall that $Q(w) \le Q(z)$ is but a short-hand notation for $(Q(w), Q(z))$ being in the $\le$ binary relation.) For any two $i, j \in \mathbb{N}$, $i \le j$, the $Q(z)$ is called the maximum value in the nonempty finite set $\{Q(x) \mid i \le x \le j\}$. It’s denoted by $\max\{Q(x) \mid i \le x \le j\}$. Further, under *-transform such a member of ${}^*\mathbb{R}$ exists for $\Lambda, \Omega \in \mathbb{N}_\infty$, $\Lambda \le \Omega$. We use this to establish the following theorem.
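The rearrangement of C(n) quoted above is again a finite identity; in fact it holds for any constant in place of L_b, which is easy to confirm on integer data. The sketch below uses hypothetical helper names `cauchy_partial_sum` and `shifted_form`.

```python
def cauchy_partial_sum(a, b, n):
    """C(n) = sum_{k=0}^{n} sum_{j=0}^{k} a_j b_{k-j}."""
    return sum(a[j] * b[k - j] for k in range(n + 1) for j in range(k + 1))


def shifted_form(a, b, n, L_b):
    """A(n) L_b + sum_{k=0}^{n} a_k (B(n-k) - L_b), with A, B the partial sums."""
    A_n = sum(a[: n + 1])

    def B(m):
        return sum(b[: m + 1])

    return A_n * L_b + sum(a[k] * (B(n - k) - L_b) for k in range(n + 1))


a = [1, 2, 3, 4, 5]
b = [2, -1, 0, 3, 7]
for n in range(5):
    for L_b in (0, 7, -3):
        assert cauchy_partial_sum(a, b, n) == shifted_form(a, b, n, L_b)
```

The choice of L_b as the limit of B(n) only matters later, where it makes the factors B(n-k) - L_b small for most indices.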

Theorem 7.8. Let $B(n) \to L_b$. If $A(n) \to L_a$ absolutely, then $C(n) \to L_a L_b$.

Proof. Since $B(n) - L_b \to 0$, then ${}^*B(n) - {}^*L_b \in G(0)$ for each $n \in {}^*\mathbb{N}$. Also for each $r \in \mathbb{R}^+$ there is some $m \in \mathbb{N}$ such that for each $n > m$, $n \in {}^*\mathbb{N}$, $|{}^*B(n) - {}^*L_b| < {}^*r$. Let $L_{|a|} = \sum_{k=0}^{\infty} |a_k|$. For any $\Omega \in \mathbb{N}_\infty$, consider (in simplified notation)

\[ A_1 = \left|\sum_{k=0}^{\Omega} {}^*a_k({}^*B(\Omega-k) - L_b)\right| \le \sum_{k=0}^{\Omega-(m+1)} |{}^*a_k|\,|{}^*B(\Omega-k) - L_b| + \sum_{k=\Omega-m}^{\Omega} |{}^*a_k|\,|{}^*B(\Omega-k) - L_b| \]

\[ \le rL_{|a|} + (m+1)\max\{|{}^*a_k|\,|{}^*B(\Omega-k) - L_b| \mid \Omega-m \le k \le \Omega\} = rL_{|a|} + (m+1)\epsilon, \]

where $\epsilon \in \mu(0)$, for $\Omega-m \le k \le \Omega$ implies that $|{}^*a_k| \in \mu(0)$, which implies that $\{|{}^*a_k|\,|{}^*B(\Omega-k) - L_b| \mid \Omega-m \le k \le \Omega\} \subset \mu(0)$. (I have used the *-transform of (7.7).) But, $r$ is an arbitrary member of $\mathbb{R}^+$, which implies that $A_1 \in \mu(0)$, and the result follows from Theorem 7.6.

What if A(n) → La, B(n) → Lb, C(n) → Lc, then does it follow that Lc = LaLb? In order

to establish this, I establish, by nonstandard means, two special theorems that are useful for many

purposes.

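Theorem 7.9. If $S(n) \to L$, then $\lim_n \frac{\sum_{k=1}^{n} s_k}{n} = L = \lim_n \frac{\sum_{k=0}^{n} a_k}{n+1}$, where $a_k = s_{k+1}$.

Proof. Since $S$ converges, it is bounded, so there is some $r \in \mathbb{R}^+$ such that $|S_k - L| \le r$ for each $k \in \mathbb{N}$ and, by *-transform, for each $k \in {}^*\mathbb{N}$. Moreover, for each $\Lambda \in \mathbb{N}_\infty$, ${}^*S(\Lambda) - L \in \mu(0)$. Let $\Omega \in \mathbb{N}_\infty$, $\rho = [\sqrt{\Omega}]$ as defined in Example 6.14. Then $\rho \in \mathbb{N}_\infty$. Moreover, $1/\Omega \le 1/\rho^2$. So, consider

\[ \left|\frac{\sum_{k=1}^{\Omega} {}^*S_k}{\Omega} - L\right| \le \frac{\sum_{k=1}^{\Omega} |{}^*S_k - L|}{\Omega} \le \frac{\sum_{k=1}^{\rho} |{}^*S_k - L|}{\rho} \cdot \frac{1}{\rho} + \frac{\sum_{k=\rho+1}^{\Omega} |{}^*S_k - L|}{\Omega} \le \frac{r}{\rho} + \frac{\Omega - \rho}{\Omega} \max\{|{}^*S_x - L| \mid \rho+1 \le x \le \Omega\}. \]

I apply my previous discussion on the “maximum” object that exists in any such *-finite set. Since $(\Omega - \rho)/\Omega < 1$ and all the objects in $\{|{}^*S_x - L| \mid \rho+1 \le x \le \Omega\}$ are infinitesimals and $r/\rho$ is an infinitesimal, then the result follows.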
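A quick numerical illustration of Theorem 7.9: the arithmetic means of a convergent sequence approach the same limit. This is a floating-point sketch only; the sample sequence `s` and the helper `cesaro_mean` are illustrative assumptions.

```python
def cesaro_mean(s, n):
    """Return (s(1) + ... + s(n)) / n for a sequence given as a function s."""
    return sum(s(k) for k in range(1, n + 1)) / n


def s(k):
    # Sample sequence converging to 2 (an arbitrary choice for illustration).
    return 2.0 + (-1) ** k / k


errors = [abs(cesaro_mean(s, n) - 2.0) for n in (10, 100, 1000, 10000)]

# The distance of the means from the limit shrinks as n grows.
assert all(later < earlier for earlier, later in zip(errors, errors[1:]))
assert errors[-1] < 1e-3
```

The proof's split at ρ = [√Ω] mirrors what happens here: early terms are bounded and get divided by a large n, while late terms are already close to the limit.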

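Theorem 7.10. If $a_n \to A$, $b_n \to B$, then

\[ \lim_n \frac{\sum_{k=0}^{n} a_k b_{n-k}}{n+1} = AB. \]

Proof. For each (i.e. $\forall$) $\Lambda \in \mathbb{N}_\infty$, let

\[ D_\Lambda = \frac{\sum_{k=0}^{\Lambda} {}^*a(k)\,{}^*b(\Lambda-k)}{\Lambda+1} = \frac{\sum_{k=0}^{\Lambda} {}^*b(\Lambda-k)({}^*a(k) - A)}{\Lambda+1} + \frac{A \sum_{k=0}^{\Lambda} {}^*b(\Lambda-k)}{\Lambda+1}. \]

From convergence, for some $M \in \mathbb{R}^+$, $|{}^*b(d-k)| \le M$ for each $d \in {}^*\mathbb{N}$, $d \ge k$. Hence, $|{}^*b(\Lambda-k)({}^*a(k) - A)| \le M|{}^*a(k) - A|$, $\forall \Lambda \in \mathbb{N}_\infty$. From this and *-transform of the finite sum case,

\[ 0 \le E_\Lambda = \frac{\left|\sum_{k=0}^{\Lambda} {}^*b(\Lambda-k)({}^*a(k) - A)\right|}{\Lambda+1} \le M\, \frac{\sum_{k=0}^{\Lambda} |{}^*a(k) - A|}{\Lambda+1}, \quad \forall \Lambda \in \mathbb{N}_\infty. \]

Since $a_n \to A$ implies that $|a_n - A| \to 0$, then from Theorem 7.9,

\[ \frac{\sum_{k=0}^{\Lambda} |{}^*a(k) - A|}{\Lambda+1} \in \mu(0), \quad \forall \Lambda \in \mathbb{N}_\infty. \]

Therefore, $E_\Lambda \in \mu(0)$, $\forall \Lambda \in \mathbb{N}_\infty$. Consequently,

\[ D_\Lambda - A\, \frac{\sum_{k=0}^{\Lambda} {}^*b(\Lambda-k)}{\Lambda+1} \in \mu(0), \quad \forall \Lambda \in \mathbb{N}_\infty. \]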


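Note that, in general, $\sum_{k=0}^{n} b_k = \sum_{k=0}^{n} b_{n-k}$, $\forall n \in \mathbb{N}$. Thus, by Theorem 7.9, $\frac{\sum_{k=0}^{\Lambda} {}^*b(\Lambda-k)}{\Lambda+1} = \frac{\sum_{k=0}^{\Lambda} {}^*b(k)}{\Lambda+1} \in \mu(B)$, $\forall \Lambda \in \mathbb{N}_\infty$. Hence, $D_\Lambda - AB \in \mu(0)$, $\forall \Lambda \in \mathbb{N}_\infty$, and the result follows.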

Theorem 7.11. If $A(n) \to L_a$, $B(n) \to L_b$, $C(n) = \sum_{k=0}^{n} \left(\sum_{j=0}^{k} a_j b_{k-j}\right) \to L_c$, then $L_c = L_a L_b$.

Proof. Recall that $C(n) = A(n)L_b + \sum_{k=0}^{n} a_k(B(n-k) - L_b)$. This can be re-expressed as $C(k) = \sum_{j=0}^{k} a_j B(k-j)$. Then by re-arrangement of the terms, it follows that, in general,

\[ \sum_{k=0}^{n} C(k) = \sum_{k=0}^{n} A(k)B(n-k), \quad \forall n \in \mathbb{N}. \]

Hence, by the previous two theorems,

\[ \lim_n \frac{\sum_{k=0}^{n} C(k)}{n+1} = L_c = \lim_n \frac{\sum_{k=0}^{n} A(k)B(n-k)}{n+1} = L_a L_b \]

and the result follows.
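The rearrangement step in the proof of Theorem 7.11, that the sum of the Cauchy partial sums C(0) + ... + C(n) equals the sum of A(k)B(n-k) over k, can be confirmed on integer data. The helper names below are hypothetical and the sample lists are arbitrary.

```python
def partial(seq, n):
    """Partial sum seq[0] + ... + seq[n]."""
    return sum(seq[: n + 1])


def cauchy_partial(a, b, n):
    """C(n) = sum_{k=0}^{n} sum_{j=0}^{k} a_j b_{k-j}."""
    return sum(a[j] * b[k - j] for k in range(n + 1) for j in range(k + 1))


def sum_of_cauchy_partials(a, b, n):
    """C(0) + C(1) + ... + C(n)."""
    return sum(cauchy_partial(a, b, k) for k in range(n + 1))


def convolution_of_partials(a, b, n):
    """A(0)B(n) + A(1)B(n-1) + ... + A(n)B(0)."""
    return sum(partial(a, k) * partial(b, n - k) for k in range(n + 1))


a = [1, 2, 3, 4, 5]
b = [2, -1, 0, 3, 7]
for n in range(5):
    assert sum_of_cauchy_partials(a, b, n) == convolution_of_partials(a, b, n)
```

Dividing both sides by n+1 gives the two averages that Theorems 7.9 and 7.10 evaluate as L_c and L_a L_b respectively.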


8. ADDITIONAL REAL NUMBER PROPERTIES

Since this is supposed to be a monograph covering some of the basic notions in a first course in

real analysis (i.e. calculus IV), then one should expect that certain additional real number properties

need to be explored. This is especially the case if a slight generalization of the notion of continuity

and the like is investigated. You will discover that, once again, the monad is the nonstandard

“king,” so to speak, in characterizing these concepts. There are slightly different definitions within

the subject of “point-set topology” for the set-theoretic “accumulation point.” I have chosen to use

a definition that makes this notion equivalent to the previous sequence definition.

Definition 8.1. Let A ⊂ IR. Then p ∈ IR is an accumulation point of (for) A iff, for every w ∈ IR+, the open interval (−w + p, p + w) satisfies (−w + p, p + w) ∩ A ≠ ∅. A point p ∈ IR is a cluster point iff for every w ∈ IR+, the deleted open interval (−w + p, p + w)′ = (−w + p, p + w) − {p} satisfies (−w + p, p + w)′ ∩ A ≠ ∅ iff (−w + p, p + w)′ ∩ A is an infinite set.

A cluster point is an accumulation point but not conversely. Consider the set A = [1, 2] ∪ {3}. Then 3 is an accumulation point, and not a cluster point. Also each member of a nonempty A is an accumulation point.
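The distinction in Definition 8.1 can be illustrated for the set A = [1, 2] ∪ {3}. The following is only a rough sketch: it stores A as a list of closed intervals and tests a handful of sampled radii w as a stand-in for "every w ∈ IR+", so it illustrates the definitions rather than decides them. All names are hypothetical.

```python
# A = [1, 2] ∪ {3}, stored as closed intervals (lo, hi); {3} is the degenerate interval (3, 3).
A_PIECES = [(1.0, 2.0), (3.0, 3.0)]
RADII = [1.0, 0.1, 0.001, 1e-6]  # sampled radii w, an assumption of this sketch


def meets(p, w):
    """True if the open interval (p - w, p + w) intersects A."""
    return any(lo < p + w and hi > p - w for lo, hi in A_PIECES)


def meets_deleted(p, w):
    """True if the deleted interval (p - w, p + w) - {p} intersects A."""
    for lo, hi in A_PIECES:
        if lo < p + w and hi > p - w:
            # A nondegenerate piece meets the interval in infinitely many points;
            # a degenerate piece {lo} only counts when it is not p itself.
            if lo < hi or lo != p:
                return True
    return False


def is_accumulation(p):
    return all(meets(p, w) for w in RADII)


def is_cluster(p):
    return all(meets_deleted(p, w) for w in RADII)


assert is_accumulation(3.0) and not is_cluster(3.0)  # 3 is isolated in A
assert is_accumulation(1.5) and is_cluster(1.5)
assert is_cluster(2.0)                               # endpoint of [1, 2]
assert not is_accumulation(2.5)                      # 2.5 is not in the closure
```

In the nonstandard picture the sampled radii are replaced by a single infinitesimal radius, which is what the monad characterizations of this chapter express.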

Definition 8.2. The set of all accumulation points is called the closure of the set A ⊂ IR and is denoted by $\overline{A}$ or clA.

Note that A ⊂ cl(A).

Definition 8.3. A point p ∈ A ⊂ IR is an interior point of A iff there exists some w ∈ IR+

such that (−w + p, p+ w) ⊂ A.

Definition 8.4. A point p ∈ A ⊂ IR is an isolated point of A iff there exists some w ∈ IR+ such that (−w + p, p + w) ∩ A = {p}.

Notice that if S: IN → IR, then p is an accumulation point for the sequence iff p is an accumulation

point for the set S[IN] (i.e the range). It also follows that p is an accumulation point for A iff there’s

a sequence S of members of A such that S(n) → p. Also a point p ∈ IR is isolated iff it is an accumulation point and not a cluster point. This last statement characterizes the difference between the notions of the accumulation point and cluster point. Cluster points are accumulation points that are not isolated. Now how do monads characterize these set-theoretic notions?

Theorem 8.5. Let A ⊂ IR, p ∈ IR. Then

(i) p is an accumulation point iff µ(p) ∩ ∗A ≠ ∅;

(ii) p is an isolated point iff µ(p) ∩ ∗A = {p};

(iii) p is a cluster point iff the deleted monad µ′(p) = µ(p) − {p} satisfies µ′(p) ∩ ∗A ≠ ∅ iff µ(p) ∩ ∗A is an infinite set.

Proof. These are rather easy to establish and, as usual, depend upon *-transform. (i) Let p ∈ IR be an accumulation point for A ⊂ IR. Then the formal sentence, which I’m sure you can obtain from the informal,

∀x((x ∈ IR+) → ∃y((y ∈ A) ∧ (|y − p| < x)))

holds in M; and, hence, in ∗M. So, let 0 < ε ∈ µ(0). Then there exists some a ∈ ∗A such that |a − p| < ε; which implies that a ∈ µ(p).


Conversely, assume that µ(p) ∩ ∗A ≠ ∅. Obviously µ(p) ⊂ ∗(−w + p, p + w), ∀w ∈ IR+. Hence, letting b ∈ µ(p) ∩ ∗A and w ∈ IR+ the sentence

∃y((y ∈ ∗A) ∧ (|y − ∗p| < ∗w))

holds in ∗M; and, hence, in M by reverse *-transform and the conclusion follows.

(ii) The sufficiency follows since p is an accumulation point. For the necessity, there exists some w ∈ IR+ such that (−w + p, p + w) ∩ A = {p}. Hence, ∗(−w + p, p + w) ∩ ∗A = {∗p} = {p}, under our notation simplification, and the result follows for p ∈ µ(p) ⊂ ∗(−w + p, p + w).

(iii) This follows from the observation about accumulation points, cluster points and isolated points and the fact that the only standard number in µ(p) is p. The second iff follows, for if otherwise there would be a “smallest” w1 ∈ IR+ such that ∗(−w1 + p, p + w1)′ ∩ ∗A ≠ ∅.

Corollary 8.6. (i) A point p ∈ IR is an accumulation point for A ⊂ IR iff there exists some

a ∈ ∗A such that st(a) = p.

(ii) A point p ∈ IR is a cluster point for A ⊂ IR iff there exists an a ∈ ∗A − σA such that

st(a) = p.

For B ⊂ ∗IR, define the standard part of B as the set st(B) = {x | (x ∈ IR) ∧ (µ(x) ∩ B ≠ ∅)}.

Of course, you can consider st(B) ⊂ σIR. Notice that for any A ⊂ IR, the standard part operator is

defined, at the least, for all members of σA. Indeed, our definitions and characterizations for these

set-theoretic notions are only in terms of monads about standard points.

Theorem 8.7. Let A ⊂ IR. Then st( ∗A) = clA.

Theorem 8.8. A point p ∈ IR is an interior point iff µ(p) ⊂ ∗A.

Proof. I’m sure you can show that µ(p) = ⋂{∗(−w + p, p + w) | w ∈ IR+}. Hence, the necessity follows.

For the sufficiency, assume that p is not a member of the interior of A. Then for each w ∈ IR+, (−w + p, p + w) ∩ (IR − A) ≠ ∅. Thus, p ∈ cl(IR − A) and µ(p) ∩ ∗(IR − A) = µ(p) ∩ (∗IR − ∗A) ≠ ∅ implies that µ(p) ⊄ ∗A and the proof is complete.

Definition 8.9. Let A ⊂ IR. Then the derived set A′ for A is the set of all cluster points. Notice that the derived set contains no isolated points. Example, let A = (1, 2) ∪ {3}. Then A′ = [1, 2].

Theorem 8.10. For A ⊂ IR, the set A′ = st( ∗A − σA), (using the extended definition for st ).

Proof. Theorem 8.5 (iii).

Theorem 8.11. For A ⊂ IR, the set of all isolated points is A − st( ∗A − σA).

Proof. An isolated point p for A is a member of A, and such a p is isolated iff µ(p) ∩ ∗A = {p} iff µ′(p) ∩ ∗A = ∅ iff p ∈ A − st( ∗A − σA) (or in simplified notation) iff p ∈ A − st( ∗A − A).

A set A ⊂ IR is closed iff A = clA = st( ∗A). The set is open iff µ(p) ⊂ ∗A, ∀ p ∈ A. Please note that ∅, IR are open and closed. (Actually, this is not the standard definition for an open nonempty set. But, I leave it to you to show that this is equivalent to the statement that for each p ∈ A, there exists a $w_p$ ∈ IR+ such that ($-w_p + p$, $p + w_p$) ⊂ A.) Also A is perfect if it is closed and has no isolated points.


Theorem 8.12. A set A ⊂ IR is perfect iff A = A′.

Proof. Please note that clA = A ∪ A′. Hence, a set is closed iff A′ ⊂ A. For the necessity, A′ = st( ∗A − σA) = st( ∗A − σA) ∪ A = st( ∗A − σA) ∪ st(σA) = st(( ∗A − σA) ∪ σA) = st( ∗A) = A. The sufficiency is clear and this completes the proof.

Much of our interest will be restricted to the derived set. The reason for this is that for every p ∈ A′ there is a sequence S: IN → A such that S(n) → p and p ∉ S[IN]. Please consider the following remarkably short proof of the Bolzano-Weierstrass theorem.

Theorem 8.13. If A ⊂ IR is bounded and infinite, then st( ∗A − σA) ≠ ∅ (i.e. A has a cluster point).

Proof. Since A is infinite, then ∗A − σA ≠ ∅. Since A is bounded, then ∗A ⊂ G(0). Thus, ∗A − σA ⊂ G(0) implies that st( ∗A − σA) ≠ ∅ and this completes the proof.

One of the most important topological concepts used throughout analysis is the notion of

“compactness.” Numerous equivalent definitions for this concept exist in the literature. I select the

most important for our purposes. Intuitively, compactness should mean “closely packed” or “close

together” but it’s different from the notion of density since density is usually a comparison between

two different sets. Often this intuitive understanding for “compactness” is not achieved from the

definition. I’ll give a nonstandard definition that yields this intuitive notion and then show that it’s

equivalent to one of the usual definitions.

Definition 8.14. A set A ⊂ IR is compact iff for each b ∈ ∗A there is some p ∈ A such that b ∈ µ(p) (i.e. b ≈ p) iff ∗A ⊂ ⋃{µ(p) | p ∈ A} iff each b ∈ ∗A is near-standard (meaning ≈ to a member p ∈ A). The set ⋃{µ(p) | p ∈ A} is often denoted by ns(A) (the set of all near-standard points).

Our next, and what is a major, result requires what appears to be a rather long proof. I have

not introduced the idea of the δ-incomplete ultrafilter and concurrent relations. For the ultrafilters

I am considering and due to real number property discussed in the next paragraph, the sufficiency

part of the next theorem can be established in but a few lines using a concurrent relation. In general,

this result holds for topological spaces, using a concurrent relation, if a special type of ultrafilter is

used (Herrmann 1991).

A set 𝒢 of nonempty open sets is said to be a cover of (for) A ⊂ IR iff A ⊂ ⋃{G | G ∈ 𝒢}. One standard definition for “compactness” says that A is compact iff for every open cover 𝒢 there exists a finite subset (a subcover) 𝒢f ⊂ 𝒢 such that 𝒢f covers A. A set A is said to be countable iff either A is finite or there exists a one-to-one correspondence from IN onto A. The countably compact sets are those that have this covering property but only for countable open covers. For the real numbers, due mainly to the fact that the rational numbers are dense in the reals and for every real 0 < r < 1 there is a natural number n such that 0 < 1/n < r, if nonempty G is an open set and p ∈ G, then there exists a rational number w ∈ IR+ and a rational number r ∈ IR such that p ∈ I = (−w + r, r + w) ⊂ G. Thus, every open cover 𝒢 of A can be replaced by a countable open cover {Ii} of such open intervals and such that A ⊂ ⋃{Ii} ⊂ ⋃{G | G ∈ 𝒢}, where each member of 𝒢 contains, at least, one member of {Ii}. Hence, the covering definition for compactness can be replaced by one that uses only countable open covers consisting of such open sets.


Theorem 8.15. Let nonempty set A ⊂ IR. Then ∗A ⊂ ⋃{µ(p) | p ∈ A} iff every countable open cover {Ii} for A has a finite subcover.

Proof. Assume that A satisfies the countable covering definition for compactness but that ∗A ⊄ ⋃{µ(p) | p ∈ A}. There exists some a ∈ ∗A such that a ∉ µ(p) for any p ∈ A. Consequently, for each p ∈ A, there is some open interval I(p) with rational end points about some rational number such that a ∉ ∗I(p) and p ∈ I(p). Let 𝒢 be the set of all such intervals I(p). Then 𝒢 is a countable cover of A and there should exist a finite subcover, say I(p1), . . . , I(pn), such that A ⊂ I(p1) ∪ · · · ∪ I(pn). Consequently, ∗A ⊂ ∗I(p1) ∪ · · · ∪ ∗I(pn). Hence, we have the contradiction that a ∈ ∗I(pi) for some i = 1, . . . , n.

For the sufficiency, just assume that there is a countable open cover 𝒢 of A which has no finite subcover. Our basic aim is to construct by induction from 𝒢 another cover and do it in such a manner that a sequence of members of A exists which, when viewed from ∗IR and with respect to any free ultrafilter U, the equivalence class containing this sequence is not near to any member of A. First, consider the nonempty countable set 𝒢′ = {$C_i$ | i = 1, 2, . . .} = {x ∩ A | (x ∈ 𝒢) ∧ (x ∩ A ≠ ∅)}. Let $D_0 = C_1$. Now, let $m_1$ be the smallest natural number greater than 1 such that $C_{m_1} \not\subset C_1$. This unique number exists since $C_1$ cannot be a cover of A, for $C_1$ ⊂ C for some C ∈ 𝒢. Assume that the $D_i$, i = 0, . . . , k, have been defined. Let $m_{k+1}$ be the smallest natural number greater than $m_k$ such that

$C_{m_{k+1}} \not\subset \bigcup\{D_i \mid i = 0, \ldots, k\}$.

These unique natural numbers continue to exist since A is not covered by any finite subset of sets in 𝒢. Now define $D_{k+1} = C_{m_{k+1}}$. The sets $D_n$, ∀n ∈ IN, are defined by induction.

Let G1 = {Dn | n ∈ IN}. Since G is a countable cover of A, then G1 is a countable cover, although
not generally an open cover. Further, G1 has no finite subcover. By definition D0 ≠ ∅ and

Dn − ⋃{Dk | k = 0, . . . , n − 1} ≠ ∅

for each positive n ∈ IN since Dn ⊄ ⋃{Dk | k = 0, . . . , n − 1}. Thus, define p0 to be any point in D0
and for each positive n ∈ IN, define pn to be any point in Dn − ⋃{Dk | k = 0, . . . , n − 1}. (Did I use
the Axiom of Choice or can this be considered an induction definition?) Thus, there is this sequence
P : IN → A such that P(n) = pn. If the natural number m > n, then pm ∉ Di, i = 0, . . . , n. Thus,
if pm ∈ Dk for any k = 0, . . . , n, then m ≤ n. This means that for each n ∈ IN the set of natural
numbers {x | (x ∈ IN) ∧ (P(x) ∈ Dn)} is finite. Hence, for each n ∈ IN, {x | (x ∈ IN) ∧ (P(x) ∉ Dn)} ∈ U
for any free ultrafilter U. This yields, in general, that [P] ∉ ∗Dn for each n ∈ IN and [P] ∈ ∗A. For
Dk ∈ G1, there exists some ck ∈ G such that Dk = A ∩ ck. Let G2 be the set of all such ck. Since
[P] ∉ ∗Dn for n ∈ IN, then [P] ∉ ∗cn. But, the set G2 is an open cover of A. Thus, for each p ∈ A,
there is some ck ∈ G2 such that µ(p) ⊂ ∗ck. Consequently, [P] ∉ ⋃{µ(p) | p ∈ A} and the proof is
complete.
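
To make the inductive construction concrete, here is a small Python sketch for one hypothetical case that is not in the text: A = (0, 1) with the countable open cover Ci = (1/(i + 1), 1), i = 1, 2, . . ., which has no finite subcover. For this cover the construction gives Dn = (1/(n + 2), 1). The choices p0 = 3/4 and pn = 1/(n + 1) are my own assumptions; any other point of Dn that lies in no earlier Dk would do. The sketch only checks, over a finite range of indices, that each Dn contains pn for finitely many n.

```python
from fractions import Fraction

def D(n):
    """Open interval D_n = (1/(n+2), 1), returned as a (left, right) pair."""
    return (Fraction(1, n + 2), Fraction(1))

def in_D(x, n):
    """True when x lies in the open interval D_n."""
    left, right = D(n)
    return left < x < right

def p(n):
    """An assumed choice of a point of D_n that lies in no earlier D_k."""
    return Fraction(3, 4) if n == 0 else Fraction(1, n + 1)

def indices_in(n, bound=1000):
    """All indices x < bound with p(x) in D_n."""
    return [x for x in range(bound) if in_D(p(x), n)]

print(indices_in(3))
```

In this example pn → 0 and 0 ∉ A, which is the standard shadow of the statement that [P] is not near any member of A.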

Next, I present nonstandard proofs of a few additional characteristics for compactness, where

trivially a finite A ⊂ IR is compact.

Theorem 8.16. A nonempty A ⊂ IR is compact iff it is closed and bounded.

Proof. Assume that A is compact. Since ∗A ⊂ ⋃{µ(p) | p ∈ A} ⊂ G(0), then A is bounded.
Now let µ(q) ∩ ∗A ≠ ∅. Then µ(q) ∩ µ(p) ≠ ∅ for some p ∈ A. Hence, q = p. Thus, A = st( ∗A) and
A is closed.


Conversely, let A be closed and bounded. Then ∗A ⊂ ⋃{µ(x) | x ∈ IR} = G(0). Also, A ≠ IR. Let
q be any member of IR − A. Since A is closed, then µ(q) ∩ ∗A = ∅. Thus ∗A ⊂ ⋃{µ(p) | p ∈ A} and
this completes the proof.

Theorem 8.17. Let infinite A ⊂ IR. Then A is compact iff each infinite B ⊂ A has a cluster

point in A.

Proof. Since A is compact, then A is bounded and, hence, B is bounded. Thus, by 8.13, B has

a cluster point p. But, since A is closed, then p ∈ A.

For the sufficiency, assume that A is not compact. Then either A is not bounded or A is not

closed. Assume that ∗A ⊄ G(0). Let r = 1. Consider the case that A is not bounded above.
Then there’s some p1 ∈ A such that p1 > 1. Let r = p1 + 1. Then there exists some p2 ∈ A
such that p2 > p1 + 1. Assume that we have defined pk. Then there is some p_{k+1} ∈ A such that
p_{k+1} > pk + 1 > p_{k−1} + 1 > · · · > 1. Let p0 = 1. Thus, there is a sequence P : IN → IR such that
limn pn = +∞. Hence, this sequence has no accumulation point in A, which in this case is equivalent

to not having a cluster point for the infinite P [IN] ⊂ A. The case where A is not bounded below

follows in like manner.

Now suppose that A is not closed. Then there exists some q ∈ A′−A. Hence, there is an infinite

sequence of distinct members of A that converges to q. Again, q would be the only cluster point of this infinite subset of A, and q ∉ A.

This completes the proof.


9. BASIC CONTINUOUS FUNCTION CONCEPTS

For all that follows in this chapter, D will denote the domain for the real valued

function f . Recall that the notation f :D → IR means that f is a real valued function defined on

D. Of course, in this case, f is also defined on any nonempty subset of D. First, let’s consider the

idea of the limit of f as x → s or lim_{x→s} f(x) or, in abbreviated notation, lims f(x) where I use s

so as not to confuse this with the more general notation for the specific case where we look only at

sequences and use n or m below the lim symbol.

Recall that for f :D → IR, lims f(x) = L iff for every r ∈ IR+, there exists some w ∈ IR+
such that, whenever x ∈ D and 0 < |x − s| < w, then |f(x) − L| < r. Clearly, s must be a

cluster point for D. That is s ∈ D′ for this notion to have a significant unique meaning. This

is one of the first definitions that appears in a calculus book and that often gives students some

difficulty in its application. But, as will be seen, the nonstandard characteristics, especially (i),

for this limit concept are much easier to state and yield the actual intuitive idea. Recall that for
each p ∈ IR, µ′(p) = µ(p) − {p} is the deleted monad about p and if g:B → ∗IR, A ⊂ B, then
g[A] = {g(x) | x ∈ A}.

Theorem 9.1. Let f :D → IR. Then lims f(x) = L iff

(i) ∗f [µ′(s) ∩ ∗D] ⊂ µ(L) iff

(ii) for each q ∈ µ′(s) ∩ ∗D, st( ∗f(q)) = L iff

(iii) for each nonzero ε ∈ µ(0) such that s + ε ∈ ∗D, then ∗f(s + ε) − L ∈ µ(0) iff

(iv) for each ε ∈ µ(0)+ and x ∈ ∗D such that 0 < |x − s| < ε, then ∗f(x) − L ∈ µ(0).

Proof. (i) For the necessity, let lims f(x) = L and r ∈ IR+. Then there exists some w ∈ IR+
such that the following sentence

∀x(((x ∈ D) ∧ (0 < |x − s| < w)) → (|f(x) − L| < r))

holds in M and, hence, its *-transform holds in ∗M. In particular, for each p ∈ µ′(s) ∩ ∗D,
| ∗f(p) − L| < r. Since r is an arbitrary positive real number, and we have that 0 < |p − s| < w
for all w ∈ IR+, it follows that for each p ∈ µ′(s) ∩ ∗D, ∗f(p) − L ∈ µ(0) or that ∗f(p) ∈ µ(L).

For the sufficiency, assume that r ∈ IR+. There exists a q ∈ µ′(s) ∩ ∗D since s is a cluster
point of D. Thus, q ≠ s and, hence, there is some ε ∈ µ′(0) such that q = s + ε. Consequently,
0 < |q − s| = |ε| ∈ µ(0). If p ∈ ∗D such that 0 < |p − s| < |ε|, then p ∈ µ′(s) ∩ ∗D implies that
∗f(p) ∈ µ(L) and, hence, that | ∗f(p) − L| < r. Consequently, the sentence

∃x((x ∈ IR+) ∧ ∀y(((y ∈ D) ∧ (0 < |y − s| < x)) → (|f(y) − L| < r)))

holds in M by reverse *-transform and this first “iff” is established.

All but the last “iff” are immediately equivalent to this first one. The necessity of “iff” (iv)

is clear. The sufficiency of (iv) follows from the above sentence for the sufficiency of (i) and this

completes the proof.
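
As a rough numerical illustration of part (iii), and not as part of the formal development, the following Python sketch uses small rational numbers as stand-ins for infinitesimals. A rational number is of course never a member of µ(0), so this only suggests the behavior the theorem describes. The function f(x) = (x² − 1)/(x − 1), the point s = 1 and the value L = 2 are my own choice of example.

```python
from fractions import Fraction

def f(x):
    """f(x) = (x^2 - 1)/(x - 1), which is undefined at x = 1."""
    return (x * x - 1) / (x - 1)

S, L = Fraction(1), Fraction(2)

def gap(eps):
    """Return f(S + eps) - L for a nonzero rational eps."""
    return f(S + eps) - L

for k in (3, 6, 9):
    eps = Fraction(1, 10 ** k)
    # Exact arithmetic shows the gap is eps on the right and -eps on the left.
    print(k, gap(eps), gap(-eps))
```

For this f the gap f(s + ε) − L equals ε exactly, so it is as small as ε is, which mirrors the requirement that ∗f(s + ε) − L ∈ µ(0).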

Corollary 9.2. If lims f(x) = L, then L is unique.

Corollary 9.3. If T ⊂ D, s ∈ T ′ and lims f(x) = L with respect to D, then lims f(x) = L

with respect to T .


Corollary 9.4. If lims f(x) = L, then there exists a nonempty open set G such that s ∈ G and ∗f[( ∗G − {s}) ∩ ∗D] ⊂ G(0). Note that {s} is not an open set.

Theorem 9.5. lims f(x) = L iff for every sequence S such that for each n ∈ IN, Sn ≠ s, Sn ∈ D, and Sn → s, it follows that limn f(Sn) = L.

Proof. Suppose that lims f(x) = L and that S: IN → D, Sn → s and that for each n ∈ IN, Sn ≠ s. Then for each Λ ∈ IN∞, ∗S(Λ) ∈ µ(s), ∗S(Λ) ≠ s and ∗S(Λ) ∈ ∗D imply that ∗S(Λ) ∈ µ′(s) ∩ ∗D.
Hence, ∗f( ∗S(Λ)) = ∗(fS)(Λ) ∈ µ(L) and the necessity follows.

For the sufficiency, assume that lims f(x) ≠ L. Then there exists some r ∈ IR+ such that
for each w ∈ IR+ there is some x ∈ D such that 0 < |x − s| < w and |f(x) − L| ≥ r. Hence,
for each w = 1/n, 0 ≠ n ∈ IN, there exists some Sn ∈ D such that Sn ≠ s, 0 < |Sn − s| < 1/n
and |f(Sn) − L| ≥ r. Consequently, Sn → s, but f(Sn) does not converge to L and the proof
is complete.

Of course, it’s this “sequence” theorem that gives the major intuitive characteristic for such limits.
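
The sequence characterization is often used to show that a limit does not exist. The following Python sketch, with an example of my own choosing, takes f(x) = sin(1/x) at s = 0 and two assumed sequences S and T that both converge to 0 through nonzero points. The values f(Sn) stay near 0 while the values f(Tn) stay near 1, so no single L can work for both. Floating point arithmetic is used, so the printed values are only approximate.

```python
import math

def f(x):
    """f(x) = sin(1/x), defined for x != 0."""
    return math.sin(1.0 / x)

def S(n):
    """S_n = 1/(n*pi) for n >= 1, so f(S_n) = sin(n*pi), which is 0 up to rounding."""
    return 1.0 / (n * math.pi)

def T(n):
    """T_n = 1/(2*n*pi + pi/2) for n >= 1, so f(T_n) is 1 up to rounding."""
    return 1.0 / (2 * n * math.pi + math.pi / 2)

for n in (1, 10, 50):
    print(n, S(n), f(S(n)), T(n), f(T(n)))
```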

Modifying the definition for lims f(x) = L yields the one-sided limits. Recall that the
modifications are f(s+) = L [resp. f(s−) = L] iff for each r ∈ IR+ there exists a w ∈ IR+ such that,
whenever x ∈ D and 0 < x − s < w [resp. 0 < s − x < w], then |f(x) − L| < r. For these limits, the
monads need to be modified in the obvious manner. For each p ∈ IR, let

µ(p)+ = {x | (x > p) ∧ (x ∈ µ(p))} = {x | (x > p) ∧ (x ≈ p)} = ⋂{∗(p, p + w) | w ∈ IR+},

µ(p)− = {x | (x < p) ∧ (x ∈ µ(p))} = {x | (x < p) ∧ (x ≈ p)} = ⋂{∗(−w + p, p) | w ∈ IR+}.

Using these positive or negative monads our previous theorems and
corollaries all hold with the appropriate modifications.

Theorem 9.6. Let f :D → IR. Then f(s±) = L iff

(i) ∗f [µ′(s)± ∩ ∗D] ⊂ µ(L) iff

(ii) for each q ∈ µ′(s)± ∩ ∗D, st( ∗f(q)) = L iff

(iii) for each ε ∈ µ(0)± such that s + ε ∈ ∗D, then ∗f(s + ε) − L ∈ µ(0) iff

(iv) for each ε ∈ µ(0)+ and x ∈ ∗D such that 0 < x − s < ε [resp. 0 < s − x < ε], then ∗f(x) − L ∈ µ(0).

Corollary 9.7. If f(s±) = L, then L is unique.

Corollary 9.8. If T ⊂ D, s ∈ T ′ and f(s±) = L with respect to D, then f(s±) = L with

respect to T .

Corollary 9.9. If f(s±) = L, then there exists a nonempty open interval I+ = (s, r) [resp. I− = (r, s)]
such that ∗f[ ∗I± ∩ ∗D] ⊂ G(0). Note that {s} is not an open set.

The following is the appropriate modification for Theorem 9.5.

Theorem 9.10. Let f :D → IR. Then f(s+) = L [resp. f(s−) = L] iff for every sequence S such
that for each n ∈ IN, Sn ∈ D, Sn > s [resp. Sn < s], and Sn → s, it follows that limn f(Sn) = L.

Proof. I prove this for f(s−) since f(s+) is done in like manner. Let f(s−) = L and Sn → s, ∀n ∈ IN, Sn ≠ s, Sn < s, Sn ∈ D. Then ∀Λ ∈ IN∞, ∗S(Λ) ∈ µ(s)− and ∗S(Λ) < s. Thus ∗f( ∗S(Λ)) = ∗(fS)(Λ) ∈ µ(L) and the necessity follows.

For the sufficiency, the method is similar to that for Theorem 9.5. Assume that f(s−) ≠ L.
Then there exists some r ∈ IR+ such that for each w ∈ IR+ there is some x ∈ D such that
0 < s − x < w and |f(x) − L| ≥ r. Since µ(s)− = ⋂{∗(−w + s, s) | w ∈ IR+}, by *-transform, for 0 ∈ IN, there is


some S0 ∈ (−1 + s, s) ∩ D and S0 < s. Assume that for k ∈ IN, k ≥ 1, there are
Sk ∈ (−1/(k + 1) + s, s) ∩ D, and Sk < s. Now consider k + 1. Then since (−1/(k + 2) + s, s) ∩ D ≠ ∅,
there is some S_{k+1} ∈ (−1/(k + 2) + s, s) ∩ D and S_{k+1} < s. This yields a sequence S: IN → D such that
∀n ∈ IN, Sn ∈ D, Sn → s and Sn < s, but |f(Sn) − L| ≥ r and the proof is complete.

Theorem 9.11. Let f :D → IR and s ∈ int(D) (the set of all interior points). Then lims f(x) =

L iff f(s±) = L.

Proof. µ′(s) = µ(s)+ ∪ µ(s)−.

Example 9.12. In the usual calculus text, it’s established that lim0 sin(x)/x = 1 by means
of a geometric proof. Although I won’t mention any apparent geometric facts in the following
nonstandard proof, it might be necessary to use the geometric definitions to establish the facts I do
use.

For each r ∈ IR+ such that 0 < r < π/2, since sin(r) < r < tan(r), cos(r) < sin(r)/r < 1. Thus,
for each ε ∈ µ(0)+, by *-transform,

∗cos(ε) < ∗sin(ε)/ε < 1.

But, since | sin(r)| ≤ |r|, ∀ r ∈ IR, then | ∗sin(ε)| ≤ ε implies that ∗sin(ε) ∈ µ(0). This yields

1 − ( ∗cos(ε))² = ( ∗sin(ε))² ∈ µ(0)

which implies that ∗cos(ε) ∈ µ(1). Consequently, 1 ≤ st( ∗cos(ε)) ≤ st( ∗sin(ε)/ε) ≤ 1 for each
ε ∈ µ(0)+. Thus, for f(x) = sin(x)/x, f(0+) = 1. To show that this last equation holds for
ε ∈ µ(0)−, simply notice that ∗sin(−ε)/(−ε) = ∗sin(ε)/ε ∀ ε ∈ µ′(0). Hence, the result follows.
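
The standard inequality cos(r) < sin(r)/r < 1 that gets *-transformed in this example can be spot-checked numerically. The short Python sketch below is only such a check at a few sample values of r that I picked; it uses floating point arithmetic and is no substitute for the argument above.

```python
import math

def squeeze(r):
    """Return (cos r, sin(r)/r, 1.0) for a real r with 0 < r < pi/2."""
    return math.cos(r), math.sin(r) / r, 1.0

for r in (0.5, 0.1, 0.01, 0.001):
    lo, mid, hi = squeeze(r)
    # The outer bounds close in on each other as r shrinks.
    print(r, lo, mid, hi, hi - lo)
```

The width 1 − cos(r) of the squeeze shrinks as r shrinks, which is the real-number counterpart of ∗cos(ε) ∈ µ(1).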

All of the usual limit and one-sided limit algebra for such functions follow from the properties

of the standard part operator. Now let’s establish the Cauchy Criterion for functions.

Theorem 9.13. (Cauchy Criterion.) Let f :D → IR. Then lims f(x) = L iff for each pair

p, q ∈ µ′(s) ∩ ∗D, ∗f(p)− ∗f(q) ∈ µ(0).

Proof. The necessity follows from Theorem 9.1.

For the sufficiency, assume that there does not exist w ∈ IR+ such that f is bounded on
(−w + s, s + w)′ ∩ D. Hence, for r = 1, for each w ∈ IR+ there are, at least, two distinct
x1, x2 ∈ (−w + s, s + w)′ ∩ D, x1, x2 ≠ s and |f(x1) − f(x2)| ≥ 1. Consequently, the sentence

∀x((x ∈ IR+) → ∃y∃z((y ∈ D) ∧ (z ∈ D) ∧ (0 < |s − y| < x) ∧ (0 < |s − z| < x) ∧ (|f(y) − f(z)| ≥ 1)))

holds in ∗M by *-transform. So, let ε ∈ µ(0)+. Then there exist distinct p, q such that 0 < |s − p| < ε
and 0 < |s − q| < ε and | ∗f(p) − ∗f(q)| ≥ 1. But, this contradicts the requirement that
∗f(p) − ∗f(q) ∈ µ(0). Thus, there is some w ∈ IR+ such that f is bounded on (−w + s, s + w)′ ∩ D.
Consequently, ∗f[µ′(s) ∩ ∗D] ⊂ G(0). Letting q ∈ µ′(s) ∩ ∗D, then for each p ∈ µ′(s) ∩ ∗D,
∗f(p) ∈ µ(st( ∗f(q))) and the result follows where st( ∗f(q)) = L.

Corollary 9.14. Let f :D → IR. Then f(s±) = L iff for each pair p, q ∈ µ′(s)± ∩ ∗D, ∗f(p) − ∗f(q) ∈ µ(0).


Theorem 9.15. Let f : (a, b) → IR and a < c < d < b. If f is increasing [resp. decreasing],

then

f(c−) = sup{f(x) | a < x < c} ≤ f(c) ≤ f(c+) = inf{f(x) | c < x < b}.

[resp. f(c+) = sup{f(x) | c < x < b} ≤ f(c) ≤ f(c−) = inf{f(x) | a < x < c}.]

Further, f(c+) ≤ f(d−) [resp. f(d−) ≤ f(c+)].

Proof. I show this only for an increasing function f. Clearly, sup{f(x) | a < x < c} = L, L ≤ f(c).
For any real number r < L, there exists some p ∈ (a, c) such that r < f(p) ≤ L. Thus, let
ε ∈ µ(0)−. Then a < p < c + ε < c implies that r < ∗f(c + ε) ≤ L since ∗f is increasing on ∗(a, b).
Therefore, r < st( ∗f(c + ε)) ≤ L. Since r < L is arbitrary, this implies that st( ∗f(c + ε)) = L for
each such ε, and this first part follows from Theorem 9.6. The inf case follows in like manner.

Now if c < d, then c + ε < d + γ for each ε ∈ µ(0)+ and each γ ∈ µ(0)−. Consequently,
st( ∗f(c + ε)) = f(c+) ≤ f(d−) = st( ∗f(d + γ)) and the proof is complete.
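
Here is a small Python sketch of the chain f(c−) ≤ f(c) ≤ f(c+) for one increasing function of my own choosing, the floor function on (0, 3) at c = 1. The one-sided values are only approximated, by taking the largest sampled value on the left of c and the smallest sampled value on the right of c, at sample points 1 ± 10^(−k) that are my assumption. For a step function such as this one, these finite samples already give the exact sup and inf.

```python
import math
from fractions import Fraction

def f(x):
    """The floor function, which is increasing (nondecreasing) on IR."""
    return math.floor(x)

c = Fraction(1)
left_samples = [c - Fraction(1, 10 ** k) for k in range(1, 8)]
right_samples = [c + Fraction(1, 10 ** k) for k in range(1, 8)]

left_limit = max(f(x) for x in left_samples)     # stands in for sup{f(x) | a < x < c}
right_limit = min(f(x) for x in right_samples)   # stands in for inf{f(x) | c < x < b}

print(left_limit, f(c), right_limit)
```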

I guess I should mention the other ordinary limit of a function notion used when D is not
bounded above or below, the limit at ±∞. Recall that if D is not bounded above, then lim∞ f(x) = L iff
for each r ∈ IR+ there exists some w ∈ IR+ such that for each p ∈ D such that p > w, |f(p) − L| < r.
For D that is not bounded below, this limit notion is defined in the obvious manner.

Theorem 9.16. Suppose that f :D → IR and that D is not bounded above [resp. below]. Then lim∞ f(x) = L
iff ∗f(p) ∈ µ(L) for each p ∈ IR+∞ ∩ ∗D [resp. ∗f(p) ∈ µ(L) for each p ∈ IR−∞ ∩ ∗D].

Proof. Left to the reader.

Theorem 9.17. Suppose that f :D → IR and that D is not bounded above [resp. below]. Then
lim∞ f(x) = L iff for each pair p, q ∈ IR+∞ ∩ ∗D [resp. IR−∞ ∩ ∗D], ∗f(p) − ∗f(q) ∈ µ(0).

Proof. Left to the reader.

Our major interest is to investigate properties of continuous real valued functions defined on

D. For f :D → IR to be continuous at s, all one needs is that lims f(x) = f(s) and, hence, we
need s ∈ D.

Theorem 9.18. Let s ∈ int(D). Then function f :D → IR is continuous at s iff lims f(x) =

f(s) = f(s+) = f(s−).

Proof. Note that µ(s) = µ(s)+ ∪ {s} ∪ µ(s)−.

Each of the previous theorems on the left and right-hand limits, when slightly modified, hold

for continuous functions. Also, each monadic characteristic for continuity holds for isolated points.

Thus, s need not be a cluster point. The changes are made by replacing the deleted monads with

the complete monad and such statements as 0 < |x − s| by |x − s| and the like. The most used result
is that f is continuous at p ∈ D iff ∗f[µ(p) ∩ ∗D] ⊂ µ(f(p)). Now let’s apply these results to obtain
three highly significant continuous function properties.

Theorem 9.19. Let continuous f :D → IR and let D be compact. Then the range, f [D], is

compact.

Proof. Since D is compact, then ∗D ⊂ ⋃{µ(p) | p ∈ D}. But, using a property that holds for
any function, it follows that

∗f[ ∗D] ⊂ ⋃{ ∗f[µ(p) ∩ ∗D] | p ∈ D} ⊂ ⋃{µ(f(p)) | f(p) ∈ f[D]}


and the result follows.

Theorem 9.20. (Extreme Value Theorem.) Let continuous f :D → IR and D be compact. Then

there exists pm, pM ∈ D such that for each p ∈ D, f(pm) ≤ f(p) ≤ f(pM ).

Proof. Since f[D] is compact, then it is closed and bounded. Thus, from boundedness,
sup{f(x) | x ∈ D} = M and inf{f(x) | x ∈ D} = m exist in IR. Since f[D] is closed, m, M ∈ f[D].
Hence, there exist pm, pM ∈ D such that f(pm) = m and f(pM) = M, and the result follows.

I mention that all such standard theorems can be extended to “nonstandard statements” by

*-transform. To establish the intermediate value theorem the notion of connectedness is often

introduced. But, rather than do this, I’ll give a nonstandard proof where connectedness is not

mentioned.

Theorem 9.21. Let continuous f : [a, b] → IR. Then for each d such that f(a) ≤ d ≤ f(b) [resp.

f(b) ≤ d ≤ f(a)], there is some c ∈ [a, b] such that f(c) = d.

Proof. The result is immediate if a = b. So, assume that a < b and consider the case where
f(a) ≤ d ≤ f(b). Let nonzero n ∈ IN and h = (b − a)/n. Then we have a finite partition of
[a, b], a, a + h, a + 2h, . . . , a + nh = b. Thus, there exists some m ∈ IN such that m < n and (i)
f(a) ≤ f(a + mh) ≤ d ≤ f(a + (m + 1)h) ≤ f(b) or (ii) f(a) ≤ f(a + (m + 1)h) ≤ d ≤ f(a + mh) ≤ f(b).
Assume (i). By *-transform, if Λ ∈ IN∞ and h = (b − a)/Λ, then h ∈ µ(0). There exists some m1 ∈ ∗IN
such that m1 < Λ and f(a) ≤ ∗f(a + m1h) ≤ d ≤ ∗f(a + (m1 + 1)h) ≤ f(b). (Note the use of
simplified notation for such things as f(a), where technically this should be written as σf( ∗a).)
Since a ≤ a + m1h ≤ b, then there is a real c = st(a + m1h) and a ≤ c ≤ b. From the continuity
of f, f(c) = f(st(a + m1h)) = st( ∗f(a + m1h)) ≤ d for a + m1h ∈ µ(c) ∩ ∗[a, b]. However,
a + m1h + h = a + (m1 + 1)h ∈ µ(c) implies that f(c) = f(st(a + (m1 + 1)h)) = st( ∗f(a + (m1 + 1)h)) ≥ d.
Therefore, f(c) = d. The other cases follow in a similar manner and the proof is complete.
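
The finite partition step of this proof can be imitated on a computer. The Python sketch below is an illustration under my own assumptions, not the nonstandard argument itself: the helper name bracket, the sample function x³ − x on [0, 2] and the target d = 1 are all hypothetical choices. It scans the partition a, a + h, . . . , a + nh for the first cell whose end point values straddle d, using exact rational arithmetic. As n grows the cell shrinks, which is the finite shadow of taking n = Λ ∈ IN∞.

```python
from fractions import Fraction

def bracket(f, a, b, d, n):
    """Scan the n-cell partition of [a, b] for the first m with
    f(a + m*h) <= d <= f(a + (m + 1)*h), where h = (b - a)/n.
    Returns that cell as (left, right), or None if no cell qualifies."""
    h = (b - a) / n
    for m in range(n):
        left, right = a + m * h, a + (m + 1) * h
        if f(left) <= d <= f(right):
            return left, right
    return None

def cubic(x):
    """Sample continuous function x^3 - x with cubic(0) = 0 and cubic(2) = 6."""
    return x * x * x - x

left, right = bracket(cubic, Fraction(0), Fraction(2), Fraction(1), 1000)
print(left, right, right - left)
```

For this sample the cell found is [331/250, 663/500], of width 1/500, and it contains the point c ≈ 1.3247 where x³ − x = 1.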

The results that the sum and product function and similar processes defined for continuous
functions yield continuous functions follow from the properties of the standard part operator. Our last
result in this chapter is a nonstandard proof of the composition properties for continuous functions.

Theorem 9.22. Let continuous f :D → IR and continuous g:T → IR be such that f[D] ⊂ T.
Then the composition gf :D → IR is continuous.

Proof. Let p ∈ D. Then ∗f[µ(p) ∩ ∗D] ⊂ µ(f(p)) ∩ ∗(f[D]) ⊂ ∗(f[D]) ⊂ ∗T imply that
∗g[ ∗f[µ(p) ∩ ∗D]] ⊂ ∗g[µ(f(p)) ∩ ∗T] ⊂ µ(g(f(p))) and the result follows.


10. SLIGHTLY ADVANCED CONTINUOUS FUNCTION CONCEPTS

Unless otherwise specified, for all that follows in this chapter, D will denote the

domain for the real valued function f . Here is a result, you may never have seen before,

that also implies the intermediate value theorem. The original standard proof and result is due to

Bolzano.

Theorem 10.1. For continuous f : [a, b] → IR, if f(a)f(b) < 0, then there exists some c ∈ (a, b)

such that f(c) = 0.

Proof. First note that the hypotheses require that a ≠ b. Assume that f(c) ≠ 0 for each
c ∈ (a, b). Since f(a) ≠ 0 and f(b) ≠ 0, then f(c) ≠ 0 ∀ c ∈ [a, b]. I now show that for each nonzero
m ∈ IN (i.e. m ∈ IN′) there exist real numbers sm, tm such that tm − sm = (b − a)/m and

a ≤ sm < tm ≤ b, and f(tm)/f(sm) < 0.

For m ∈ IN′, consider the function

g(x) = f(x + (b − a)/m)/f(x); a ≤ x ≤ a + ((m − 1)/m)(b − a).

Consequently, the product Π_{k=0}^{m−1} g(a + k(b − a)/m) = f(b)/f(a) < 0. Thus, there is some k ∈ IN,
0 ≤ k ≤ m − 1, such that g(a + k(b − a)/m) < 0. Hence, f(a + (k + 1)(b − a)/m)/f(a + k(b − a)/m) < 0.
Let sm = a + k(b − a)/m and tm = a + (k + 1)(b − a)/m and the conditions required hold for sm, tm.
Thus, by *-transform, if Λ ∈ IN∞, there exist p, q ∈ ∗IR such that q − p = (b − a)/Λ, a ≤ p < q ≤ b
and ∗f(q)/ ∗f(p) < 0. Since q − p ∈ µ(0), then p ∈ µ(st(q)). Consequently, using the result that
st(q) ≤ b and the continuity of f, ∗f(p) ∈ µ(f(st(q))). Therefore, st( ∗f(p)) = f(st(p)) = f(st(q))
implies that f(st(p))/f(st(q)) = 1 ≤ 0. This contradiction yields the result.
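
The telescoping product in this proof is easy to check by machine for a finite m. In the Python sketch below the function f(x) = x² − 2 on [0, 2] and the value m = 7 are my own assumptions, chosen so that f has no zero at any partition point. Exact rational arithmetic confirms that the product of the ratios equals f(b)/f(a) and locates the index k where the sign changes.

```python
from fractions import Fraction
from math import prod

def f(x):
    """Sample function x^2 - 2, which has no rational zero."""
    return x * x - 2

a, b, m = Fraction(0), Fraction(2), 7
step = (b - a) / m

def g(x):
    """g(x) = f(x + (b - a)/m)/f(x), as in the proof."""
    return f(x + step) / f(x)

ratios = [g(a + k * step) for k in range(m)]
product = prod(ratios)                                   # telescopes to f(b)/f(a)
k = next(i for i, r in enumerate(ratios) if r < 0)       # first cell with a sign change
s_m, t_m = a + k * step, a + (k + 1) * step

print(product, k, s_m, t_m)
```

Here the product is −1 = f(2)/f(0), and the sign change occurs between sm = 8/7 and tm = 10/7, which do straddle the zero of f.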

To obtain the intermediate value theorem from Theorem 10.1, just consider the function
f(x) − d, if f(a) ≤ d ≤ f(b), or d − f(x) if f(b) ≤ d ≤ f(a), for the non-trivial cases f(a) ≠ d and
f(b) ≠ d. A major result characterizes continuity on the entire set D in terms of open sets. The
proof is a little long due to the simplified structure I’m using. Let 𝒢 be a nonempty collection of
open subsets of IR. Then since for each p ∈ ⋃{G | G ∈ 𝒢}, µ(p) ⊂ ∗G for some G ∈ 𝒢, then
µ(p) ⊂ ⋃{ ∗G | G ∈ 𝒢} ⊂ ∗(⋃{G | G ∈ 𝒢}) implies that the arbitrary union of a collection of open
sets is an open set.

Theorem 10.2. Let f :D → IR. Then f is continuous on D iff for each open set G ⊂ IR,

f−1[G] is open in D.

Proof. Note that a set G1 is “open” in D iff there exists an open G2 ⊂ IR such that G1 = G2 ∩ D.
Assume that f is continuous on D. Let G be an open set in IR. If G = ∅, then
f−1[G] = {p | (p ∈ D) ∧ (f(p) ∈ G)} = ∅, which is open in D. The same result would hold if
G ∩ f[D] = ∅. Hence, assume that G ∩ f[D] ≠ ∅. Then let f(p) ∈ G ∩ f[D]. Since G is open, then
µ(f(p)) ⊂ ∗G and, by continuity, ∗f[µ(p) ∩ ∗D] ⊂ µ(f(p)) ∩ ∗(f[D]) ⊂ ∗G. Since µ(p) is the
intersection of all the intervals ∗(−r + p, r + p), r ∈ IR+, then there exists some (−r + p, p + r),
r ∈ IR+, such that p ∈ (−r + p, p + r) ∩ D ⊂ f−1[G]. Since the arbitrary union of open sets is an
open set, then using one of these open intervals for each p ∈ f−1[G], one gets an open set G0 ⊂ IR
such that G0 ∩ D = f−1[G].

For the sufficiency, I’ll use inverse image, f−1, set-algebra. Now
µ(f(p)) ∩ ∗(f[D]) = ⋂{ ∗(−r + f(p), f(p) + r) | r ∈ IR+} ∩ ∗(f[D]) implies that
∗f−1[µ(f(p)) ∩ ∗f[ ∗D]] = ∗f−1[µ(f(p))] ∩ ∗D =


⋂{ ∗(f−1[(−r + f(p), f(p) + r)] ∩ D) | r ∈ IR+}. Now by the hypothesis, each
f−1[(−r + f(p), f(p) + r)] ∩ D is open in D. Hence, for p ∈ f−1[(−r + f(p), f(p) + r)] ∩ D, there is
some s ∈ IR+ such that p ∈ (−s + p, p + s) ∩ D ⊂ f−1[(−r + f(p), f(p) + r)] ∩ D. However,
µ(p) ∩ ∗D ⊂ ∗(−s + p, p + s) ∩ ∗D implies that µ(p) ∩ ∗D ⊂ ∗f−1[µ(f(p))] ∩ ∗D. Therefore,

∗f[µ(p) ∩ ∗D] ⊂ ∗f[ ∗f−1[µ(f(p))]] ∩ ∗f[ ∗D] ⊂ µ(f(p)) ∩ ∗f[ ∗D] ⊂ µ(f(p)).

The proof is complete.

Prior to considering the notion of uniform continuity, here is a nonstandard proof of a rather

interesting result. A real valued function f is additive if for each p, q ∈ IR, f(p + q) = f(p) + f(q).

Recall that I’m using simplified notation in that rather than write a statement such as ∗x ∈ σIR,

this is often written as x ∈ IR.

Theorem 10.3. Let f : IR → IR be additive. If f is bounded on some non-empty interval I, then

f(x) = xf(1) for each x ∈ IR and is a continuous function on IR.

Proof. Clearly, ∗f : ∗IR → ∗IR is additive. Additivity implies that for any rational r and any
x ∈ IR, f(rx) = rf(x). Hence, ∗f(rx) = r ∗f(x) for each x ∈ ∗IR and *-rational r ∈ ∗IR (i.e. r ∈ ∗Q).
Let p be in the interior of I (i.e. µ(p) ⊂ ∗I). Then from boundedness, | ∗f[µ(p)]| ≤ M ∈ σIR.
Consequently, for each ε ∈ µ(0), | ∗f(p + ε)| = | ∗f(p) + ∗f(ε)| ≤ M implies that | ∗f(ε)| ≤ M + |f(p)|.
Now for each n ∈ IN, nε ∈ µ(0) implies, by additivity, that | ∗f(nε)| = n| ∗f(ε)| ≤ M + |f(p)|.
Therefore, for n ∈ IN′, | ∗f(ε)| ≤ (M + |f(p)|)/n. This yields that ∗f(ε) ∈ µ(0), ∀ ε ∈ µ(0). From
the density of the rational numbers in IR, for any r ∈ IR+ and any x ∈ IR, there is some q ∈ Q such
that |x − q| < r. By *-transform, we have that for ε ∈ µ(0)+ and x ∈ IR, there is q ∈ ∗Q such that
|x − q| < ε. Hence, x − q ∈ µ(0) implies that there is some γ ∈ µ(0) such that x = q + γ. Therefore

f(x) = ∗f(q + γ) = ∗f(q) + ∗f(γ) = ∗f(q · 1) + ∗f(γ) = q(f(1)) + ∗f(γ).

Thus, q(f(1)) ∈ µ(f(x)). Finally, f(x) = st(q(f(1))) = (st(q))st(f(1)) = xf(1) and, obviously, f
is continuous.

Theorem 10.4. If r ∈ IR, then there exists a hyperrational q ∈ ∗Q and some ε ∈ µ(0) such
that r = q + ε.

You might try showing from Theorem 10.4 that if f : IR → IR and ∀x, y ∈ IR f(x + y) =

f(x)f(y), ∗f [ ∗Q] ⊂ G(0), lim0 f(x) = 0, then f(x) = 0, ∀x ∈ IR.

Recall that f :D → IR is uniformly continuous on D if for each r ∈ IR+ there exists some

w ∈ IR+ such that whenever x, y ∈ D and |x−y| < w, then |f(x)−f(y)| < r. Of course, this concept

is highly significant in series work and integration theory. The following characteristic follows in the

usual manner, where the big difference between this and Corollary 9.14, extended to continuity, is

that the points are not restricted to a particular µ(s).

Theorem 10.5. The function f :D → IR is uniformly continuous on D iff for each p, q ∈ ∗D

such that q − p ∈ µ(0), ∗f(p)− ∗f(q) ∈ µ(0).
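
A finite experiment suggests why the points in Theorem 10.5 must be allowed to range over all of ∗D and not only over a single monad. In the Python sketch below, which is my own example, f(x) = x² on IR is evaluated at the pairs p = n and q = n + 1/n. The difference q − p shrinks with n while f(q) − f(p) stays above 2. Replacing n by some Λ ∈ IN∞ would give q − p ∈ µ(0) with ∗f(q) − ∗f(p) ∉ µ(0), so x² is not uniformly continuous on IR. Large integers only stand in for infinite numbers here.

```python
from fractions import Fraction

def gaps(n):
    """For p = n and q = n + 1/n, return (q - p, q^2 - p^2) exactly."""
    p = Fraction(n)
    q = p + Fraction(1, n)
    return q - p, q * q - p * p

for n in (10, 1000, 10 ** 6):
    print(n, gaps(n))
```

The second component is exactly 2 + 1/n², so it never gets small.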

Theorem 10.6. Let f :D → IR be continuous on compact D. Then f is uniformly continuous.

Proof. Let p, q ∈ ∗D and p − q ∈ µ(0). Since ∗D ⊂ ⋃{µ(x) | x ∈ D}, then p, q ∈ µ(s) for some

s ∈ D. Thus, from continuity, ∗f(p), ∗f(q) ∈ µ(f(s)) implies that ∗f(p) − ∗f(q) ∈ µ(0) and the

result follows.


Since being uniformly continuous is so important within analysis, I’ll present a few more
pertinent propositions.

Theorem 10.7. For real numbers a < b, let f : (a, b) → IR. If h ∈ µ(b)− ∩ ∗(a, b) ( = µ(b)−)

and ∗f(h) /∈ G(0), then for each r ∈ IR+ and each p ∈ (a, b) there exists some q ∈ (a, b) such that

p < q < b and |f(p)− f(q)| > r. (A similar statement holds for the end point “a.”)

Proof. Let h ∈ µ(b)− and p ∈ (a, b). Then p < h < b. Since ∗f(h) ∉ G(0), then ∀ r ∈ IR, | ∗f(h)| > r and, by reverse *-transform, there exists q ∈ (a, b) such that p < q < b and

|f(q)| > |f(p)|+ r for the |f(p)|+ r ∈ IR. Hence, |f(p)− f(q)| > r.

Theorem 10.8. Let f : (a, b) → IR be uniformly continuous. Then f(b−), f(a+) ∈ IR.

Proof. Let h ∈ µ(b)− ∩ ∗(a, b) = µ(b)−. Then h < b. Assume that ∗f(h) ∉ G(0). Since
h ∈ ∗(a, b), then, by *-transform of the conclusion of Theorem 10.7, there exists q ∈ ∗(a, b) such
that h < q < b and | ∗f(h) − ∗f(q)| > 1. Since q ∈ µ(b)−, this contradicts Theorem 10.5. Hence,
∗f(h) ∈ G(0) for each h ∈ µ(b)− implies that, for a particular h, st( ∗f(h)) = L. Now if k ∈ µ(b)−,
then uniform continuity implies that ∗f(k) ∈ µ(st( ∗f(h))). Thus, f(b−) = L. The result for f(a+)
is obtained in a similar manner with a similar Theorem 10.7 for a and this completes our proof.

Let nonempty E ⊂ D and f :E → IR. A function g:D → IR is called an extension of f iff for

each x ∈ E, g(x) = f(x).

Theorem 10.9. Let f : (a, b) → IR be uniformly continuous on (a, b). Then there exists an

extension g of f such that g: IR → IR is uniformly continuous.

Proof. Simply use the last theorem and define g(x) = f(x), for each x ∈ (a, b) and g(x) = f(a+)

for all x ≤ a, and g(x) = f(b−) for all x ≥ b. It’s clear that g is uniformly continuous on IR.
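
The extension in this proof just holds g constant outside (a, b). A minimal Python sketch of that definition follows. The helper name extend is hypothetical, and the one-sided limits f(a+) and f(b−) are passed in as known numbers since the code has no way to compute them; the sample f(x) = x² on (0, 1) with f(0+) = 0 and f(1−) = 1 is my own choice.

```python
def extend(f, a, b, f_a_plus, f_b_minus):
    """Return g with g = f on (a, b), g = f(a+) for x <= a and g = f(b-) for x >= b.
    The two one-sided limits are supplied by the caller."""
    def g(x):
        if x <= a:
            return f_a_plus
        if x >= b:
            return f_b_minus
        return f(x)
    return g

g = extend(lambda x: x * x, 0.0, 1.0, 0.0, 1.0)
print(g(-3.0), g(0.5), g(7.0))
```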

Theorem 10.10. Let f :D → IR be uniformly continuous on each bounded B ⊂ D. Then f has

a unique continuous extension g: cl(D) → IR.

Proof. A function like f :D → IR is uniformly continuous on each bounded B ⊂ D iff for each
p, q ∈ ∗D ∩ G(0) such that p − q ∈ µ(0), it follows that ∗f(p) − ∗f(q) ∈ µ(0), since
p, q ∈ ∗D ∩ G(0) iff p, q ∈ ∗D ∩ ∗[−a, a] for some a ∈ IR.

Define g: cl(D) → IR as follows: let g(st(x)) = st( ∗f(x)) for each x ∈ ∗D ∩ G(0). This function
is well defined since if st(y) = st(x), then st( ∗f(x)) = st( ∗f(y)) for each x, y ∈ ∗D ∩ G(0) by
uniform continuity. Further, as we know, cl(D) = {x | µ(x) ∩ ∗D ≠ ∅} = {st(y) | y ∈ ∗D ∩ G(0)}.
Now g extends f, for if x ∈ D, then g(x) = g(st(x)) = st(f(x)) = f(x). Now it’s necessary
to show that g is continuous for any p ∈ cl(D). Let B = D ∩ [−1 + p, p + 1] and r ∈ IR+. Then
there exists some w ∈ IR+ such that w < 1 and for each x, y ∈ B such that |x − y| < w it follows
that |f(x) − f(y)| < r/2 by uniform continuity on bounded subsets of D. By *-transform, for each
x, y ∈ ∗B such that |x − y| < w, it follows that | ∗f(x) − ∗f(y)| < r/2. Let b ∈ cl(D) and |b − p| < w.
Then b = st(y), p = st(x) for some x, y ∈ ∗B and |x − y| < w. Consequently | ∗f(x) − ∗f(y)| < r/2
implies that |st( ∗f(x)) − st( ∗f(y))| ≤ r/2. Therefore, in the usual manner, we have that
|g(b) − g(p)| < r. Hence, g is continuous at p. Finally, g is unique for if h: cl(D) → IR is continuous
and extends f, then for p ∈ cl(D), h(p) = h(st(x)) = st( ∗h(x)) = st( ∗f(x)) = g(p), where x ∈ ∗D
and p = st(x).

I need just one more extension result for the next chapter.

Theorem 10.11. Let f :D → IR be continuous and for each p ∈ cl(D) − D, assume that

limp f(x) ∈ IR. Then f has a unique continuous extension g: cl(D) → IR.


Proof. Obviously, for each p ∈ cl(D), we should let g(p) = limp f(x) and g: cl(D) → IR is
unique, since the limits are unique, and extends f for limb f(x) = f(b), b ∈ D. Let p ∈ cl(D).
Then µ(p) ∩ ∗D ≠ ∅. Let W = (−w + g(p), g(p) + w) be an open interval about g(p). Clearly,
µ(g(p)) ⊂ ∗W and there exists r ∈ IR+ such that (−r + g(p), g(p) + r) ⊂ [−r + g(p), g(p) + r] ⊂ W.
Since ∗f[µ(p) ∩ ∗D] ⊂ µ(g(p)), it follows that there exists an r1 > 0 such that for each open interval
Is = (−s + p, p + s), 0 < s ≤ r1, about p, f[Is ∩ D] ⊂ (−r + g(p), g(p) + r) and
Is ∩ D ≠ ∅. Let q ∈ Is ∩ cl(D). Then since µ(q) ∩ ∗D ≠ ∅ and µ(q) ∩ ∗D ⊂ ∗Is ∩ ∗D, it follows
that ∅ ≠ ∗f[µ(q) ∩ ∗D] ⊂ ∗(f[Is ∩ D]) ⊂ ∗(−r + g(p), g(p) + r). From the definition of g, we
have that µ(g(q)) ∩ ∗(−r + g(p), g(p) + r) ≠ ∅ implies that g(q) ∈ [−r + g(p), g(p) + r]. Thus
g(q) ∈ W. This yields that g[Is ∩ cl(D)] ⊂ W. Since W is an arbitrary open interval about g(p),
then µ(g(p)) = ⋂{ ∗(−r + g(p), g(p) + r) | r ∈ IR+} and µ(p) ∩ ∗(cl(D)) ⊂ ∗Is ∩ ∗(cl(D)) imply
that ∗g[µ(p) ∩ ∗(cl(D))] ⊂ µ(g(p)) and the proof is complete.

Corollary 10.12. Let continuous f :D → IR, D be bounded and for each p ∈ cl(D) − D,

limp f(x) ∈ IR. Then f has a unique uniformly continuous extension g: cl(D) → IR.

Proof. Since cl(D) is bounded and closed it is compact. Then the unique continuous extension

g defined on cl(D) by Theorem 10.11 is uniformly continuous.

Example 10.13. Assume that you have defined for x > 1 the exponential function x^r for each
rational r ∈ Q and that you have shown that it is a strictly increasing function of r. Then due to the
fact that Q is dense in IR, it follows that f :Q → IR, where f(r) = x^r = sup{f(q) | (q ∈ Q) ∧ (q ≤ r)},
is a continuous function. But, cl(Q) = IR. Now from the completeness of the real numbers, given
any irrational r, then, in IR, limr f(q) = sup{f(q) | (q ∈ Q) ∧ (q < r)}. Thus, by Theorem 10.11,
there is a unique continuous extension g of f such that for irrational r ∈ IR, g(r) = limr f(q) and
this is the value of this exponential defined at r.
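
As a numerical sketch of this example, with the base 2 and the irrational exponent √2 as my own choices, the Python code below evaluates 2^q at the decimal truncations q of √2, using only roots and integer powers, and compares the results with the library value of 2^√2. The helper names are hypothetical and floating point arithmetic is used, so the agreement is only approximate.

```python
import math
from fractions import Fraction

def rational_power(base, q):
    """base**q for a rational q = n/d, computed as (d-th root of base)**n."""
    root = base ** (1.0 / q.denominator)
    return root ** q.numerator

def truncation(k):
    """The largest rational with denominator 10**k that lies below sqrt(2)."""
    return Fraction(math.isqrt(2 * 10 ** (2 * k)), 10 ** k)

values = [rational_power(2.0, truncation(k)) for k in range(1, 7)]
target = 2.0 ** math.sqrt(2.0)

for k, v in enumerate(values, start=1):
    print(k, truncation(k), v, target - v)
```

The computed values increase toward the target from below, which is the behavior the supremum in the example describes.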


11. BASIC DERIVATIVE CONCEPTS

I now come to the most striking difference between nonstandard analysis and the standard

approach. Although the intuitive notions of the calculus are based upon “infinitesimal modeling,”

it was precisely the logical difficulties that occurred using the intuitive infinitesimal approach that

greatly influenced its abandonment. No such difficulties occur for these nonstandard infinitesimals.

For what follows, please notice that D ∩ D′ is the set of all cluster points that are members of D

and as such µ′(p) ∩ ∗D 6= ∅ and p ∈ D. These are the members of D that are not isolated points.

I’ll denote D ∩D′ = DNI .

Definition 11.1. (The Standard Derivative.) Let p ∈ DNI, p + h ∈ D, h ≠ 0 and
f :D → IR. Then the derivative at p (denoted by f ′(p)) is finite and has value f ′(p) iff
lim_{h→0} (f(p + h) − f(p))/h = f ′(p). The derivative f ′(p) = ±∞ iff
lim_{h→0} (f(p + h) − f(p))/h = ±∞ and has geometric applications to the notion of “vertical”
points of inflection.

In all that follows, I’ll use, as was done originally, the symbol dx to denote a member of µ′(0).

This idea of dx being a special type of number was not carried over by Weierstrass when he refined

the limit concept. The next theorem follows immediately from our characterizations for the limit

notion.

Theorem 11.2. Let f: D → IR. Then for p ∈ DNI, f ′(p) = s ∈ IR [resp. ±∞] iff for each dx ∈ µ′(0) such that p + dx ∈ ∗D,

\[ \frac{{}^*f(p+dx) - f(p)}{dx} \in \mu(s) \quad [\text{resp. } \mathbb{R}_{\pm\infty}]. \]

Note that if you let D = [a, b], a < b and f ′(a) exists, then f ′(a) is but the “right-hand” one-

sided derivative. Clearly, this definition extends slightly the concept as it appears in the usual basic

calculus course. The idea for the derivative is that it is a type of rate of change in infinitesimal values.

In important physical applications, we need to know how infinitesimal rates of change compare with

ordinary real number rates of change. Let f :D → IR, y = f(x), x ∈ D, h ∈ IR such that x+ h ∈ D.

Then usually one writes the increment of (for) y at x, and h as ∆y = f(x+h)− f(x) = ∆f(x, h).

This f generated function (∆f)(p, h) is actually a function that determines a hyperfunction by *-

transform for any q ∈ ∗D, k ∈ ∗IR such that q+ k ∈ ∗D. The *-transform states that ∗(∆f)(q, k) =

∗f(q + k) − ∗f(q) = (∆ ∗f)(q, k) = ∆ ∗f(q, k). Thus, f ′(p) = s [resp. ±∞] iff ∆ ∗f(p, dx)/dx ∈ µ(s)

[resp. IR±∞] for each dx ∈ µ′(0) such that p+ dx ∈ ∗D.

Theorem 11.3. Let f :D → IR. Then f is continuous at p ∈ D iff ∆ ∗f(p, dx) ∈ µ(0), for each

dx ∈ µ(0) such that p+ dx ∈ ∗D.

Theorem 11.4. If f :D → IR and for p ∈ DNI , f′(p) ∈ IR, then f is continuous at p.

Proof. Assume that f ′(p) ∈ IR. Then for each dx ∈ µ′(0) such that p + dx ∈ ∗D,

\[ \frac{{}^*f(p+dx) - f(p)}{dx} \in \mu(f'(p)). \]

Note that there always exists at least one such dx. Since this quotient is finite and dx is infinitesimal, ∗f(p + dx) − f(p) ∈ µ(0). Hence, ∗f[µ(p) ∩ ∗D] ⊂ µ(f(p)) and the result follows.

Our next notion is that of the differential. This is where we return to the time of Newton

and Leibniz, something that could not be done prior to 1961. I mention that there are different

approaches to the notion of the differential, especially for multi-variable functions.


Definition 11.5. (The Differential.) Let f :D → IR, f ′(p) ∈ IR, dx ∈ µ(0), p + dx ∈ ∗D.

Then the differential is df = f ′(p) dx ∈ µ(0).

Theorem 11.6. Let f: D → IR. If p ∈ D, f ′(p) ∈ IR, then f ′(p) = st(df/dx) for each dx ∈ µ′(0) such that p + dx ∈ ∗D.

Proof. Immediate.

We need a better understanding of when the derivative exists and its relation to the differential

and the infinitesimal increment. For this reason, let’s call a function h(p, q) defined on A × B ⊂ ∗IR × ∗IR, where µ(0) ⊂ B, an infinitesimal function at p ∈ A iff h(p, dx) ∈ µ(0), ∀ dx ∈ µ(0).

Theorem 11.7. Let f: D → IR, p ∈ DNI. Then f ′(p) ∈ IR iff there exists a unique t ∈ IR and an infinitesimal function h: {p} × µ(0) → ∗IR such that for each dx ∈ µ′(0), where p + dx ∈ ∗D,

∆ ∗f(p, dx) = ∗f(p + dx) − f(p) = (dx)t + (dx)h(p, dx).

Proof. For the necessity, simply define h(p, dx) = ( ∗f(p + dx) − f(p))/dx − f ′(p), dx ≠ 0 and h(p, 0) = 0. Then let t = f ′(p). It follows that h(p, dx) ∈ µ(0), ∀ dx ∈ µ(0) and that ∗f(p + dx) − f(p) = (dx)t + (dx)h(p, dx), ∀ dx ∈ µ′(0). The fact that t is unique follows from the definition of the derivative and the disjoint nature of the monads.

For the sufficiency, let ∗f(p + dx) − f(p) = (dx)t + (dx)h(p, dx) for each dx ∈ µ′(0) such that p + dx ∈ ∗D. Then ( ∗f(p + dx) − f(p))/dx − t = h(p, dx) ∈ µ(0) implies that t = f ′(p).

Note that Theorem 11.7 holds in all cases including the case that f is constant on some interval

about p. The significance of Theorem 11.7 is that there are collections of infinitesimals called order

ideals that give a type of measure as to how well the differential approximates the infinitesimal

increment. For example, for a fixed dx > 0, say, and f ′(p) ≠ 0, the set o(dx) = {γ dx | γ ∈ µ(0)} generates an ideal that’s a subset of µ(0) with a lot of properties. Obviously, dx h(p, dx) ∈ o(dx). One says that df is a first-order approximation for ∆ ∗f(p, dx) for each dx.
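As a purely numerical illustration of the first-order approximation, the following Python sketch evaluates the quantity h(p, dx) = (f(p + dx) − f(p))/dx − f ′(p) from the proof of Theorem 11.7 for shrinking real increments. Floating-point increments only stand in for infinitesimals, which a computer cannot represent. The sample function f(x) = x^3, the point p = 2 and the names f, fprime and remainder_ratio are assumptions made for this example.

```python
def f(x):
    # Assumed sample function, chosen only for illustration.
    return x ** 3


def fprime(x):
    # Its derivative, supplied by hand.
    return 3 * x ** 2


def remainder_ratio(p, dx):
    """Return h(p, dx) = (f(p + dx) - f(p)) / dx - f'(p)."""
    return (f(p + dx) - f(p)) / dx - fprime(p)


# Increments 10^-1, ..., 10^-5 play the role of smaller and smaller dx.
ratios = [remainder_ratio(2.0, 10.0 ** (-k)) for k in range(1, 6)]
```

The entries of ratios shrink roughly in proportion to dx, which is the finite shadow of the statement that dx h(p, dx) belongs to o(dx).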

This notion of infinitesimal approximation is exactly how “curves” were viewed in the time of

Newton and Leibniz. From Theorem 11.7 we have specifically that ∗f(p+dx) = f(p)+df+dxh(p, dx)

holds ∀ dx ∈ µ(0). Thus within µ(p), the monadic neighborhood about p, the *-line segment g(dx) =

f(p)+dx f ′(p), dx ∈ µ(0) is a first-order approximation for any dx to the *-graph y = ∗f(p+dx). Of

course, this can be phrased in terms of *-range values. One of the original definitions for a curve was

that it is an infinite collection of infinitely small line segments. So, once again, we have a rigorous

formulation for the original intuitive idea. And, yes, under certain circumstances there are “higher

order” approximations.

Although it’s obvious from limit theory that the sum and product of functions f, g that are

differentiable at p (i.e. this means that f ′(p), g′(p) ∈ IR) are differentiable at p, the following two

theorems demonstrate how easily the derivative “formula” and the chain rule are obtained.

Theorem 11.8. Let f, g: D → IR and f ′(p), g′(p) ∈ IR. Then

(i) if u = fg, then u′(p) = f(p)g′(p) + f ′(p)g(p);

(ii) if g(p) ≠ 0 and u = f/g, then

\[ u'(p) = \frac{g(p)f'(p) - g'(p)f(p)}{g(p)^2}. \]

Proof. (i) For dx ∈ µ′(0) such that p + dx ∈ ∗D, Theorem 11.7 applied to f and to g, with infinitesimal functions h and k respectively, gives [ ∗f(p + dx)][ ∗g(p + dx)] = [f(p) + dx f ′(p) + dx h(p, dx)][g(p) + dx g′(p) + dx k(p, dx)] = f(p)g(p) + (f(p)g′(p) + g(p)f ′(p))dx + γ dx, where γ ∈ µ(0), and the result follows.

(ii) For dx ∈ µ′(0) such that p + dx ∈ ∗D,

\[ \Delta\,{}^*u(p,dx) = \frac{{}^*f(p+dx)}{{}^*g(p+dx)} - \frac{f(p)}{g(p)} = \frac{\Delta\,{}^*f(p,dx) + f(p)}{\Delta\,{}^*g(p,dx) + g(p)} - \frac{f(p)}{g(p)} = \frac{g(p)\,\Delta\,{}^*f(p,dx) - f(p)\,\Delta\,{}^*g(p,dx)}{g(p)\,(\Delta\,{}^*g(p,dx) + g(p))}. \]

Thus, for dx ≠ 0,

\[ \frac{\Delta\,{}^*u(p,dx)}{dx} = \frac{g(p)\,\dfrac{\Delta\,{}^*f(p,dx)}{dx} - f(p)\,\dfrac{\Delta\,{}^*g(p,dx)}{dx}}{g(p)\,(\Delta\,{}^*g(p,dx) + g(p))}. \]

The result follows by taking the standard part operator and using the fact that st(∆ ∗g(p, dx)) = 0.
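The two formulas of Theorem 11.8 can be checked numerically. The sketch below compares an ordinary difference quotient with a small real increment against the product and quotient formulas. The choices f = sin, g = exp, p = 0.7 and the helper name diff_quotient are assumptions for this example, and a small real dx is only a stand-in for an infinitesimal.

```python
import math


def diff_quotient(u, p, dx):
    """Ordinary difference quotient (u(p + dx) - u(p)) / dx."""
    return (u(p + dx) - u(p)) / dx


# Assumed sample functions together with their known derivatives.
f, fp = math.sin, math.cos
g, gp = math.exp, math.exp
p, dx = 0.7, 1e-6

# Right-hand sides of Theorem 11.8 (i) and (ii).
product_formula = f(p) * gp(p) + fp(p) * g(p)
quotient_formula = (g(p) * fp(p) - gp(p) * f(p)) / g(p) ** 2

# Distance between each difference quotient and the corresponding formula.
product_error = abs(diff_quotient(lambda x: f(x) * g(x), p, dx) - product_formula)
quotient_error = abs(diff_quotient(lambda x: f(x) / g(x), p, dx) - quotient_formula)
```

Both errors are of the order of dx, in line with the infinitesimal remainder γ dx in the proof of part (i).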

Theorem 11.9. Let f: D → IR, p ∈ DNI, g: f[D] → IR, f(p) ∈ f[D]NI. If f ′(p), g′(f(p)) ∈ IR, then for the composition (gf)(x) = g(f(x)), x ∈ D, (gf)′(p) ∈ IR and (gf)′(p) = g′(u)f ′(p), u = f(p).

Proof. Let dx ∈ µ′(0), p + dx ∈ ∗D. Then ∗f(p + dx) = f(p) + k, k ∈ µ(0), by continuity. Hence, ∗g( ∗f(p + dx)) − g(f(p)) = k g′(f(p)) + k h_g(f(p), k) by Theorem 11.7, which also holds (trivially, with k = 0) if f is constant on an interval about p. Consequently, since k = ∗f(p + dx) − f(p) = f ′(p)dx + dx h_f(p, dx), ∗g( ∗f(p + dx)) − g(f(p)) = ( ∗f(p + dx) − f(p))g′(f(p)) + ( ∗f(p + dx) − f(p))h_g(f(p), k) = f ′(p)g′(f(p))dx + g′(f(p)) dx h_f(p, dx) + f ′(p) dx h_g(f(p), k) + dx h_f(p, dx)h_g(f(p), k). However, g′(f(p))h_f(p, dx) + f ′(p)h_g(f(p), k) + h_f(p, dx)h_g(f(p), k) ∈ µ(0) for each dx ∈ µ′(0). Dividing by dx, the result follows.

Theorem 11.10. Suppose that f: (a, b) → IR, a < b, has a derivative for each p ∈ (a, b) and both f, f ′ are uniformly continuous on (a, b). Then there is a uniformly continuous extension g: IR → IR of f such that g′ is a uniformly continuous extension of f ′.

Proof. We know that f ′(b−), f ′(a+), f(b−), f(a+) exist. The result follows by defining

\[ g(x) = \begin{cases} f(x), & x \in (a,b), \\ f(b-) + f'(b-)(x-b), & x \ge b, \\ f(a+) + f'(a+)(x-a), & x \le a. \end{cases} \]

Let p − q ∈ µ(0). If p, q ∈ ∗(a, b), then the result follows from the hypothesis. If p, q ∈ ∗ [b,+∞),

then ∗g(p)− ∗g(q) = f(b−) + f ′(b−)(p− b)− f(b−)− f ′(b−)(q − b) = f ′(b−)(p− q) ∈ µ(0) and in

like manner if p, q ∈ ∗( − ∞, a]. Let p ∈ ∗(a, b), q ∈ ∗ [b,+∞), q ≈ b. Then q ≈ b implies, since

p ≈ q, that p ≈ b and ∗g(p) = ∗f(p) ≈ f(b−). Now ∗g(q) = f(b−) + f ′(b−)(q − b) ≈ f(b−), since

q− b ∈ µ(0). Hence, ∗g(p)− ∗g(q) ∈ µ(0). In like manner, for (−∞, a] and for g′. The fact that both

g and g′ are uniformly continuous follows from Theorem 10.5 and the proof is complete.
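The extension used in this proof is simple to write down concretely. The following Python sketch builds g from f and the four one-sided values, which must be supplied by the caller. The helper name make_extension and the sample f(x) = x^2 on (0, 1), for which f(0+) = 0, f ′(0+) = 0, f(1−) = 1 and f ′(1−) = 2, are assumptions for this example.

```python
def make_extension(f, a, b, f_a, fp_a, f_b, fp_b):
    """Return g: f on (a, b), continued by tangent lines outside (a, b).

    f_a, fp_a stand for f(a+), f'(a+); f_b, fp_b stand for f(b-), f'(b-).
    """
    def g(x):
        if x <= a:
            return f_a + fp_a * (x - a)
        if x >= b:
            return f_b + fp_b * (x - b)
        return f(x)
    return g


# Assumed sample: f(x) = x^2 on (0, 1).
g = make_extension(lambda x: x * x, 0.0, 1.0, 0.0, 0.0, 1.0, 2.0)
```

For instance, g(2.0) evaluates the tangent line at b = 1, and values of g just to the left and right of 1 differ only by an amount proportional to the distance between the two points.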

The basic calculus I idea of the local (relative) maximum or local minimum point requires in

the definition quantification over the set of all open intervals about p ∈ D. The interior of a set

D denoted by int(D) is the set of all interior points, where by Theorem 8.8, p ∈ D is in int(D)

iff µ(p) ⊂ ∗D. Theorem 8.8 eliminates one quantifier from the basic definition. Does a similar


elimination happen for a local maximum or local minimum? I’m sure you recall the definition

relative to the existence of an interval about p that is contained in D. The quantifier eliminated is

the “there exists.”

Theorem 11.11. Let f :D → IR. A point p ∈ int(D) determines a local maximum [resp.

minimum] iff ∗f(q) ≤ f(p), [resp. ≥ ] ∀ q ∈ µ(p).

Proof. The necessity follows from the definition and Theorem 8.8.

For the sufficiency, assume that for every r ∈ IR+ such that (−r + p, p + r) ⊂ D there exists q_r ∈ (−r + p, p + r) such that f(p) < f(q_r). By *-transform, we have that if r ∈ µ(0)+, there is some q_r ∈ (−r + p, p + r) such that f(p) < ∗f(q_r). However, q_r ∈ µ(p) ⊂ ∗D; a contradiction and this completes the proof for the local maximum. The local minimum is similar and the proof is complete.

Now for the major theorem used to find many of the local maximums or minimums. But, this

theorem does not restrict the derivative in the hypothesis to only finite derivatives, although the

conclusion will do so.

Theorem 11.12. Let f :D → IR and p ∈ int(D). If f is differentiable at p and p is a local

maximum or minimum, then f ′(p) = 0.

Proof. First, let p be a local maximum and f ′(p) ∈ IR. Then for dx ∈ µ(0)+, ∗f(p + dx) ≤ f(p) and ∗f(p − dx) ≤ f(p). Hence,

\[ \frac{{}^*f(p+dx) - f(p)}{dx} \le 0 \le \frac{{}^*f(p-dx) - f(p)}{-dx}. \]

The result follows by taking the standard part of this inequality.

I now show that we cannot have that f ′(p) = ±∞. Suppose that f ′(p) = +∞. Then for each dx ∈ µ′(0)+, ( ∗f(p + dx) − f(p))/dx > 1. This gives that ∗f(p + dx) − f(p) > dx > 0. Therefore, ∗f(p + dx) > f(p) + dx ≥ ∗f(p + dx) + dx from Theorem 11.11. This implies the contradiction that dx < 0. By considering a −dx, it also follows that f ′(p) ≠ −∞. In similar manner, the result holds for the local minimum and the proof is complete.

Prior to a generalization of Rolle’s theorem, we need the notion of the boundary of a set D.

First, recall that if D is bounded, then cl(D) is bounded. The boundary of D, ∂D, is exactly

what you think it should be, ∂D = cl(D) ∩ cl(IR − D). The boundary of a set is a closed set and

a nonstandard characteristic is obvious. For our basic sets, continuity at a boundary may be a

one-sided continuity or even continuity at isolated points. I’ve mostly been giving definitions and

even proofs, that are easily generalized to the multi-variable calculus.

Theorem 11.13. A point p ∈ ∂D iff µ(p) ⊄ ∗D and µ(p) ∩ ∗D ≠ ∅.

Theorem 11.14. Let f: D → IR, D be bounded, int(D) ≠ ∅ and f be differentiable at each p ∈ int(D). Further, assume that if p ∈ int(D) and f ′(p) = ±∞, then f is continuous at p. Finally, assume that lim_{x→a} f(x) = L, for each a ∈ ∂D. Then there exists some q ∈ int(D) such that f ′(q) = 0.

Proof. Clearly, f is continuous on int(D). Assume there does not exist some q ∈ int(D) such that

f ′(q) = 0. First, let D = cl(D) and p ∉ int(D). Since cl(D) = D = int(D) ∪ ∂D, then p ∈ ∂D and p ∈ D. This implies that lim_{x→p} f(x) = L. Now assume, cl(D) ≠ D and that p ∈ cl(D) − D, p ∉ int(D).

Then again p ∈ ∂D. In this case by Corollary 10.12, there exists a unique continuous extension

g: cl(D) → IR. Now cl(D) is bounded and closed and, hence, is compact. Thus, for f [resp. g] there


is an x_m, x_M ∈ cl(D) where f [resp. g] attains its minimum value at x_m and maximum value at x_M. But, since f ′(q) ≠ 0 for each q ∈ int(D), then x_m, x_M ∈ ∂D. However, since f is continuous on int(D) and g is a continuous extension of f on ∂D, then L = g(x_m) ≤ g(x) ≤ g(x_M) = L, for each x ∈ cl(D), implies that g = f is constant on int(D). This implies that f ′(q) = 0 for each q ∈ int(D); a contradiction and the result follows. (Note: The possibility that f ′(p) = ±∞ for some p ∈ int(D) is still valid.)

Corollary 11.15. (Rolle’s theorem) Let a < b, f : (a, b) → IR be differentiable at each p ∈ (a, b),

and f(a+) = f(b−), then there exists some c ∈ (a, b) such that f ′(c) = 0.

The following has a rather involved hypothesis. All of the requirements appear necessary for this

generalization of the generalized mean value theorem. (Condition (1) holds if f and g are continuous

on ∂D. Further, the conclusion obviously holds under certain conditions for any p ∈ int(D) where f ′(p) = ±∞ and g′(p) = ±∞.)

Theorem 11.16. Let f :D → IR, g:D → IR, where D is bounded and has non-empty interior.

Let f and g be finitely differentiable at each p ∈ int(D).

(1) Let lim_{x→a} f(x) ∈ IR, lim_{x→a} g(x) ∈ IR for each a ∈ ∂D and

\[ \Bigl(\lim_{x\to a} f(x) - \lim_{x\to b} f(x)\Bigr)(g(a) - g(b)) = \Bigl(\lim_{x\to a} g(x) - \lim_{x\to b} g(x)\Bigr)(f(a) - f(b)), \quad \forall\, a, b \in \partial D. \]

Then for each a, b ∈ ∂D, there is some p ∈ int(D) such that

\[ f'(p)(g(a) - g(b)) = g'(p)(f(a) - f(b)). \tag{11.17} \]

Proof. Let a, b ∈ ∂D and consider F (x) = f(x)(g(a) − g(b)), G(x) = g(x)(f(a) − f(b)). Now

let h(x) = F (x) − G(x). Then h′(x) = F ′(x) − G′(x) ∈ IR for each x ∈ int(D). Clearly, for each

c ∈ ∂D, lim_{x→c} h(x) = lim_{x→c} f(x)(g(a) − g(b)) − lim_{x→c} g(x)(f(a) − f(b)) and condition (1) yields that lim_{x→a} h(x) = lim_{x→b} h(x) for each a, b ∈ ∂D. Thus, by Theorem 11.14, there is some p ∈ int(D) such

that h′(p) = F ′(p)−G′(p) = 0 and the proof is complete.

Corollary 11.18. (Generalized Mean Value.) Let D = [a, b], a ≠ b and f, g be finitely

differentiable on (a, b) and both are continuous at a and b. Then there exists some p ∈ (a, b) such

that f ′(p)(g(a)− g(b)) = g′(p)(f(a)− f(b)).

Proof. Condition (1) of Theorem 11.16 holds since f, g are both continuous at a, b.

Corollary 11.19. Let D be compact and int(D) ≠ ∅. Let continuous f: D → IR be finitely differentiable at each p ∈ int(D). Then, for each a, b ∈ ∂D, there exists a p ∈ int(D) such that

f ′(p)(b − a) = f(b)− f(a).
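For D = [a, b], Corollary 11.19 asserts that a point p with f ′(p)(b − a) = f(b) − f(a) exists, but it does not say how to find it. The sketch below locates such a point by bisection under the extra assumptions, not made in the corollary, that f ′ is continuous and that f ′(p)(b − a) − (f(b) − f(a)) changes sign on [a, b]. The name mean_value_point and the sample f(x) = x^3 on [0, 2] are hypothetical.

```python
def mean_value_point(f, fprime, a, b, tol=1e-12):
    """Bisection search for p with fprime(p) * (b - a) == f(b) - f(a)."""
    target = f(b) - f(a)

    def phi(p):
        return fprime(p) * (b - a) - target

    lo, hi = a, b
    if phi(lo) * phi(hi) > 0:
        raise ValueError("no sign change on [a, b]; bisection does not apply")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Keep the half interval on which phi still changes sign.
        if phi(lo) * phi(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Assumed sample: f(x) = x^3 on [0, 2], where 3 p^2 = 4 gives p = 2 / sqrt(3).
p = mean_value_point(lambda x: x ** 3, lambda x: 3 * x ** 2, 0.0, 2.0)
```

Bisection is only a convenience here. The corollary itself needs neither the continuity of f ′ nor a sign change.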

I conclude this chapter on basic derivative concepts by applying Theorem 11.16 to the theory of

strictly increasing [resp. decreasing] functions.

Theorem 11.20. If int(D) ≠ ∅, f: D → IR is continuous and f ′(p) > 0 [resp. < 0] for each p ∈ int(D), then f is strictly increasing [resp. decreasing] on every [a, b] ⊂ D, a ≠ b.

Proof. Let a ≠ b, [a, b] ⊂ D. Then [a, b] is compact and (a, b) ⊂ int(D). (The int(D) is the union

of the collection of all open sets that are subsets of D.) Hence, the conditions of Corollary 11.19 hold.

Thus, for any x < y, x, y ∈ [a, b] there exists some p ∈ (x, y) such that f(y)−f(x) = f ′(p)(y−x) > 0


implies that f is strictly increasing on [a, b]. The proof for decreasing is similar and this completes the proof.

Corollary 11.21. If a < b and f: [a, b] → IR is continuous and f ′(p) = 0, ∀ p ∈ (a, b), then f is

constant on [a, b].

Finally, I remark that each of the previous theorems holds under *-transform and yields some

rather interesting conclusions. Here are two examples with the first a slightly modified application

of Corollary 11.19. Indeed, the major interest in the next result is when p ≈ q and this result is

used in the next chapter.

Theorem 11.22. Let int(D) ≠ ∅ and f: D → IR be finitely differentiable at each p ∈ int(D). Then for each p ≠ q, p, q ∈ ∗(int(D)) such that [p, q] ⊂ ∗(int(D)) there exists some c ∈ ∗(p, q) such that

\[ {}^*f'(c) = \frac{{}^*f(p) - {}^*f(q)}{p - q}. \]

Proof. By *-transform.

Theorem 11.23. (The first L’Hospital Rule.) Assume that f: (a, b) → IR, g: (a, b) → IR and for each c ∈ (a, b), f ′(c), g′(c) ∈ IR and g′(c) ≠ 0. If f(a+) = g(a+) = 0 and lim_{x→a+} (f ′(x)/g′(x)) = L ∈ IR [resp. ±∞], then lim_{x→a+} (f(x)/g(x)) = L [resp. ±∞].

Proof. Let (f ′(x)/g′(x)) → L as x → a+ and define f(a) = g(a) = 0. Then f and g are continuous at a. Then f and g satisfy the hypotheses of Corollary 11.18. Let p ∈ µ(a)+ and consider the *-transform of Corollary 11.18. Then there exists some t ∈ µ(a)+ such that a < t < p and L ≈ ∗f ′(t)/ ∗g′(t) = ( ∗f(p) − f(a))/( ∗g(p) − g(a)) ≈ ∗f(p)/ ∗g(p) by considering the standard part operator and the fact that ∗g(p) − g(a) ≠ 0. Thus lim_{x→a+} (f(x)/g(x)) = L. The proof for ±∞ is similar and the proof is complete.

Obviously, this last result holds for the substitution of x → b− for x → a+.
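A quick numerical look at Theorem 11.23 is possible with ordinary floating-point numbers. In the sketch below the functions f(x) = 1 − cos x and g(x) = x^2 on (0, 1), with a = 0 and L = 1/2, are assumptions chosen for the example, and the small real number eps merely imitates a positive infinitesimal.

```python
import math

# Assumed sample functions with f(0+) = g(0+) = 0 and g'(x) != 0 on (0, 1).
f = lambda x: 1.0 - math.cos(x)
g = lambda x: x * x
fp = lambda x: math.sin(x)   # f'
gp = lambda x: 2.0 * x       # g'

eps = 1e-3  # small positive real standing in for an infinitesimal

ratio_functions = f(eps) / g(eps)      # f/g near a = 0
ratio_derivatives = fp(eps) / gp(eps)  # f'/g' near a = 0
```

Both ratios lie close to 1/2, as the rule predicts for this pair of functions.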

Corollary 11.24. Assume that f: (c, b) → IR, g: (c, b) → IR and for each x ∈ ((c, b) − {a}), f ′(x), g′(x) ∈ IR and g′(x) ≠ 0, where a ∈ (c, b). If lim_{x→a} f(x) = lim_{x→a} g(x) = 0 and lim_{x→a} (f ′(x)/g′(x)) = L [resp. ±∞], then lim_{x→a} (f(x)/g(x)) = L [resp. ±∞].

Corollary 11.25. Under the hypotheses of Theorem 11.23 [resp. Corollary 11.24], for each ε, γ ∈ µ′(0)+ [resp. µ(0)], it follows that ∗f(a + ε)/ ∗g(a + γ) ≈ ∗f ′(a + ε)/ ∗g′(a + γ) ≈ L.


12. SOME ADVANCED DERIVATIVE CONCEPTS

Before starting this chapter, one small reminder. I will be working with non-trivial continuity and the derivative at a point p ∈ D. In all cases, these are defined via non-isolated points. A rather simple observation is that p ∈ D is not isolated iff there exists some q ∈ µ(p) ∩ ∗D such that q ≠ p. Let’s consider the ideas of the “higher order” differentials and their relation to “higher order”

increments, as well as uniform differentiability, and some inverse function theorems.

Recall the standard definition of the nth-order increment, where it is assumed the function

is appropriately defined at the indicated domain members. It’s defined by the recursive expression

∆^n f(p, h) = ∆(∆^{n−1}f(p, h)), where ∆^0 f(p, h) = f(p), ∆f(p, h) = f(p + h) − f(p). For example, ∆^2 f(p, h) = ∆(∆f(p, h)) = f(p + 2h) − f(p + h) − f(p + h) + f(p) = f(p + 2h) − 2f(p + h) + f(p). Then ∆^3 f(p, h) = f(p + 3h) − 2f(p + 2h) + f(p + h) − f(p + 2h) + 2f(p + h) − f(p) = f(p + 3h) − 3f(p + 2h) + 3f(p + h) − f(p). From this we have that for any n ∈ IN

\[ \Delta^n f(p,h) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} f(p + (n-k)h) = \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} f(p + kh), \]

\[ \text{where } \binom{n}{k} = \frac{n!}{(n-k)!\,k!},\quad 0 \le k \le n, \]

is a “Binomial Coefficient.” I now consider the nth derivative f^n.
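The recursive definition and the closed binomial form can be compared directly. The following Python sketch implements both. The function names increment_recursive and increment_binomial and the integer-valued sample f(x) = x^3 are assumptions for this example; integer inputs are used so that the comparison is exact.

```python
from math import comb


def increment_recursive(f, p, h, n):
    """nth-order increment from the recursion: difference of (n-1)th increments."""
    if n == 0:
        return f(p)
    return (increment_recursive(f, p + h, h, n - 1)
            - increment_recursive(f, p, h, n - 1))


def increment_binomial(f, p, h, n):
    """nth-order increment from the binomial sum over f(p + k h)."""
    return sum((-1) ** (n - k) * comb(n, k) * f(p + k * h) for k in range(n + 1))


cube = lambda x: x ** 3  # assumed sample function

second = increment_recursive(cube, 2, 1, 2)  # f(4) - 2 f(3) + f(2)
third = increment_recursive(cube, 2, 1, 3)   # f(5) - 3 f(4) + 3 f(3) - f(2)
```

With h = 1 the third-order increment of the cube equals 3! = 6 at every p, which matches the observation that the nth-order increment divided by h^n approximates the nth derivative.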

Theorem 12.1. Let n ∈ IN′, b ∈ IR+ and suppose that f^n: [a, a + nb] → IR. Then there exists some t ∈ (a, a + nb) such that ∆^n f(a, b) = f^n(t)b^n.

Proof. This is established by induction. For n = 1, Corollary 11.19 yields the result. Let g(x, b) = f(x + b) − f(x). Then g^{n−1}(x, b) = f^{n−1}(x + b) − f^{n−1}(x) ∈ IR, for each x ∈ [a, a + (n − 1)b]. Thus, by the induction hypothesis, there exists some t0 ∈ (a, a + (n − 1)b) such that ∆^{n−1}g(a, b) = g^{n−1}(t0, b)b^{n−1}. Observe that ∆^{n−1}g(a, b) = ∆^n f(a, b). Hence, by Corollary 11.19 applied to f^{n−1}, there exists some t1 ∈ (t0, t0 + b) such that g^{n−1}(t0, b) = f^{n−1}(t0 + b) − f^{n−1}(t0) = f^n(t1)b. Consequently, ∆^{n−1}g(a, b) = g^{n−1}(t0, b)b^{n−1} = f^n(t1)b^n = ∆^n f(a, b), where t1 ∈ (a, a + nb). The result follows by induction.

Corollary 12.2. Let f^n: [a, b] → IR. Then for each dx ∈ µ(0)+ and c ∈ ∗[a, b), there exists some t ∈ (c, c + n dx) such that ∆^n ∗f(c, dx) = ∗f^n(t)(dx)^n.

Proof. This follows from *-transform and the fact that [c, c + n dx] ⊂ ∗[a, b).

Observe that Theorem 12.1 and Corollary 12.2 clearly hold for the case that f^n: [a + nb, a] → IR, where b ∈ µ(0)−. Now define the nth order differential at p for y = f(x) by d^n y = d^n f(p) = f^n(p)(dx)^n = f^n(p)dx^n. Of course, f^0(p) = f(p).

Theorem 12.3. Let f^n: [a, b] → IR, n ∈ IN′.

(i) If f^n is continuous at a, then for each dx ∈ µ(0)+ and each p ∈ µ(a) ∩ ∗[a, b],

∗f^n(p) ≈ ∆^n ∗f(p, dx)/dx^n.

(ii) For each c ∈ (a, b) and each dx ∈ µ(0)+,

f^n(c) ≈ ∆^n ∗f(c, dx)/dx^n.

Proof. (i) This obviously holds for n = 0 by continuity at a. From Corollary 12.2 and assuming

that n ≥ 1, we have that for dx ∈ µ(0)+ and p ∈ ∗ [a, b), there is some t ∈ (p, p + ndx) such that


∆^n ∗f(p, dx)/dx^n = ∗f^n(t). Now let p ∈ µ(a) ∩ ∗[a, b]. Then p ≈ a ≈ t and continuity imply that ∗f^n(p) ≈ ∆^n ∗f(p, dx)/dx^n.

(ii) This obviously holds for n = 0, 1. For n ≥ 2, in order to show this, I consider only interior points and establish this by induction. Notice that it’s not required that f^n be continuous on (a, b). Let w ∈ IR+ be such that [c, c + (n − 1)w] ⊂ [c, c + nw] ⊂ [a, b]. Such a w always exists. Define g on [c, c + (n − 1)w] by g(y, w) = f(y + w) − f(y), y ∈ [c, c + (n − 1)w]. The function g(y, w) satisfies the requirements of Theorem 12.1. Hence, there exists some t ∈ (c, c + (n − 1)w) such that ∆^{n−1}g(c, w) = g^{n−1}(t, w)w^{n−1}. By *-transform, we have that if w = dx ∈ µ(0)+, then there exists some t1 ∈ (c, c + (n − 1)dx) such that ∆^{n−1} ∗g(c, dx) = ∗g^{n−1}(t1, dx) dx^{n−1}. The definition of g(y, w), and the fact that for n ≥ 2, in general, g^{n−1}(y, w) = f^{n−1}(y + w) − f^{n−1}(y) yields, by *-transform, that ∆^n ∗f(c, dx) = ∆^{n−1} ∗g(c, dx) and

\[ \Delta^{n-1}\,{}^*g(c,dx) = {}^*g^{n-1}(t_1,dx)\,dx^{n-1} = \frac{{}^*f^{n-1}(t_1+dx) - {}^*f^{n-1}(t_1)}{dx}\,dx^{n}. \]

Consequently,

\begin{align*}
\frac{\Delta^n\,{}^*f(c,dx)}{dx^n} &= \frac{{}^*f^{n-1}(t_1+dx) - {}^*f^{n-1}(t_1)}{dx} \\
&= \frac{{}^*f^{n-1}(t_1+dx) - f^{n-1}(c)}{t_1+dx-c}\cdot\frac{t_1+dx-c}{dx} + \frac{f^{n-1}(c) - {}^*f^{n-1}(t_1)}{c-t_1}\cdot\frac{c-t_1}{dx} \\
&= \frac{{}^*f^{n-1}(c+dx_1) - f^{n-1}(c)}{dx_1}\cdot\frac{t_1+dx-c}{dx} - \frac{f^{n-1}(c) - {}^*f^{n-1}(c+dx_2)}{dx_2}\cdot\frac{c-t_1}{dx} \\
&\approx f^n(c)\,\operatorname{st}\!\left(\frac{t_1+dx-c}{dx}\right) + f^n(c)\,\operatorname{st}\!\left(\frac{c-t_1}{dx}\right) \\
&= f^n(c)\,\operatorname{st}\!\left(\frac{t_1+dx-c+c-t_1}{dx}\right) = f^n(c),
\end{align*}

where dx1 = t1 + dx − c, dx2 = t1 − c and dx1, dx2 ∈ µ(0) and the proof is complete.
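Part (ii) of Theorem 12.3 has a familiar finite counterpart: the nth-order increment divided by the nth power of a small step approximates the nth derivative. The Python sketch below shows this for n = 2. The sample f = exp, the point c = 0.5 and the helper name nth_increment are assumptions for the example, and the real steps 10^-1, 10^-2, 10^-3 only imitate a positive infinitesimal dx.

```python
import math
from math import comb


def nth_increment(f, c, h, n):
    """Binomial form of the nth-order increment of f at c with step h."""
    return sum((-1) ** (n - k) * comb(n, k) * f(c + k * h) for k in range(n + 1))


c, n = 0.5, 2
steps = [10.0 ** (-j) for j in (1, 2, 3)]

# Increment quotients for shrinking steps.
approximations = [nth_increment(math.exp, c, h, n) / h ** n for h in steps]

# Distance to the exact second derivative exp(c).
errors = [abs(a - math.exp(c)) for a in approximations]
```

The errors decrease roughly in proportion to the step, so the quotient settles toward f^n(c) as the step shrinks.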

Theorem 12.3 holds for the appropriate negative increments and these results relate directly to the notion of the nth-order approximation via the nth-order ideals. This is because (i) yields that ∗f^n(p)dx^n ≈ ∆^n ∗f(p, dx) and (ii) that ∗f^n(c)dx^n ≈ ∆^n ∗f(c, dx). I have mentioned the first-order ideal generated by any dx. For n > 1, the nth-order ideal is generated by dx^n and is a strict subset of the (n − 1)th-order ideal generated by dx^{n−1}. I mentioned that many of the standard theorems have useful

nonstandard statements. One of these is the nonstandard mean value theorem. For any x, y ∈ ∗IR,

the nonstandard interval [x, y] restricted to members of a particular set A ⊂ IR is easily defined by

*-transform. We know that ∀x∀y∀z((x ∈ A) ∧ (y ∈ A) ∧ (z ∈ A) → (z ∈ [x, y] ↔ x ≤ z ≤ y)).

Thus, for p, q ∈ ∗A, p ≤ q, one simply considers the symbol [p, q] for this *-transform. Such an

abbreviation occurs in the *-transform of Corollary 11.19.

Theorem 12.4. Let f: D → IR be finitely differentiable at each p ∈ int(D). For distinct p, q ∈ ∗(int(D)) such that [p, q] ⊂ ∗(int(D)), there is some c ∈ ∗(p, q) such that ∗f ′(c) = ( ∗f(p) − ∗f(q))/(p − q).

Theorem 12.4 is useful in the study of derivatives that are also continuous. Indeed, f :D → IR

is said to be continuously differentiable on D iff f ′ is continuous on D.


Theorem 12.5. Let f: G → IR be continuously differentiable on G, where G is a nonempty open subset of IR. Then for each c ∈ G, p, q ∈ µ(c) and dx ∈ µ′(0), ∗f ′(p) ≈ ( ∗f(q + dx) − ∗f(q))/dx.

Proof. Observe that µ(c) ⊂ ∗G. Suppose f ′ is continuous at c. Let dx ∈ µ(0)±. We have that for any p ∈ µ(c), [p, p + dx] ⊂ ∗G or [p + dx, p] ⊂ ∗G, respectively. Thus by Theorem 12.4, there is some s such that, in either case, ∗f ′(s) = ( ∗f(p + dx) − ∗f(p))/dx. Since s ∈ µ(c), continuity gives that for any other q ∈ µ(c), ∗f ′(q) ≈ ∗f ′(s) ≈ ∗f ′(p) and this completes the proof.

As is very well known, the derivative can exist but not be continuous. Determining when the

derivative is continuous is a substantial problem. With this in mind, I show how a slight change in

the conclusion of Theorem 11.2 implies the continuity of the derivative.

Definition 12.6. (Uniformly Differentiable.) Let f: D → IR, c ∈ DNI and f ′(c) ∈ IR. Then f is said to be uniformly differentiable at c iff for each distinct x, y ∈ µ(c) ∩ ∗D,

\[ f'(c) \approx \frac{{}^*f(x) - {}^*f(y)}{x - y}. \]

By now you should have no difficulty translating Definition 12.6 into standard terms, where

p ∈ DNI, f ′(p) ∈ IR. Such a translation gives that for any w > 0, there’s an open interval

(−r + p, p + r) about p ∈ DNI such that whenever distinct x, y ∈ (−r + p, p + r) ∩ D, then

|f ′(p)− (f(x)−f(y))/(x−y)| < w as the equivalent statement. The uniform part is the requirement

that x, y be somewhat unrestricted within an interval about p.
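The standard form of the definition can be probed numerically with a function that is differentiable at a point but not uniformly differentiable there. The example f(x) = x^2 sin(1/x), f(0) = 0, does not appear in the text and is an assumption for this sketch, as is the helper name two_point_quotient. Here f ′(0) = 0, yet the two-point quotients taken at suitably chosen distinct points near 0 stay near −2/π.

```python
import math


def f(x):
    # Assumed example: differentiable everywhere, with f'(0) = 0.
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0


def two_point_quotient(k):
    """Quotient (f(x) - f(y)) / (x - y) at two nearby points close to 0."""
    x = 1.0 / (2.0 * math.pi * k)                  # sin(1/x) is approximately 0
    y = 1.0 / (2.0 * math.pi * k + math.pi / 2.0)  # sin(1/y) is approximately 1
    return (f(x) - f(y)) / (x - y)


# Both points tend to 0 as k grows, but the quotients do not tend to f'(0) = 0.
quotients = [two_point_quotient(k) for k in (10, 100, 1000)]
```

Since both points can be taken inside any interval about 0, no such interval satisfies the condition for w < 2/π, so this f fails to be uniformly differentiable at 0.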

Theorem 12.7. Let G ⊂ IR be nonempty and open, f: G → IR and p ∈ G. If f ′ is continuous at p, then f is uniformly differentiable at p.

Proof. This comes from Theorem 12.5 by letting x − y = dx.

Example 12.8. Uniform differentiability was first investigated rather recently (Bahrens, 1972). Its major contribution is that a major theorem dealing with inverses, which was previously established for continuously differentiable functions, holds true for uniformly differentiable functions. There are many functions that are uniformly differentiable at a point but not differentiable throughout any open interval about that point and, hence, not continuously differentiable at that point. As an example, consider a function f constructed as follows on [−1, 1]. Generate a collection of points by recursion on n > 0: start with n = 1, x = ±1 and f(±1) = 1. Then, for each x = ±1/(n + 1), let f(±1/(n + 1)) = f(±1/n) − 1/(n^2(n + 1)). Then consider line segments, connecting successive pairs of these points as end points, as generating the function f defined on [−1, 1]. For each n > 0, the line segment from (1/(n + 1), f(1/(n + 1))) to (1/n, f(1/n)) has slope 1/n, and its mirror image on the negative side has slope −1/n. It follows that f ′(0) = 0 and that f is uniformly differentiable at p = 0. However, any interval (−r, r), r ∈ IR+ about p = 0 contains a point where f ′ does not exist.
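The node values of this construction and the slopes of the connecting segments can be computed exactly. The Python sketch below does this on the positive side with rational arithmetic. The helper names node_values and segment_slope, and the cutoff of 50 nodes, are assumptions for the example.

```python
from fractions import Fraction


def node_values(count):
    """Return [f(1/1), f(1/2), ..., f(1/count)] from the recursion of the example."""
    values = [Fraction(1)]  # f(1) = 1
    for n in range(1, count):
        # f(1/(n+1)) = f(1/n) - 1/(n^2 (n+1))
        values.append(values[-1] - Fraction(1, n * n * (n + 1)))
    return values


def segment_slope(values, n):
    """Slope of the segment joining (1/(n+1), f(1/(n+1))) and (1/n, f(1/n))."""
    rise = values[n - 1] - values[n]
    run = Fraction(1, n) - Fraction(1, n + 1)
    return rise / run


values = node_values(50)
slopes = [segment_slope(values, n) for n in range(1, 50)]
```

The slopes come out as exactly 1/n, so they tend to 0 toward the origin, while adjacent segments meeting at a node 1/n (n > 1) have the different slopes 1/n and 1/(n − 1), which is why f ′ fails to exist there.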

I mention that for the real numbers if I is an open set such that real p ∈ I, then there always

exists some r ∈ IR+ such that p ∈ (−r + p, p+ r) = Ip ⊂ I. The Ip is an open interval about p.

Theorem 12.9. Let f :D → IR, p ∈ DNI . If I is an open interval about p, and f is uniformly

differentiable for each c ∈ I ∩DNI , then f ′ is continuous at p.

Proof. Observe that p ∈ DNI iff µ′(p) ∩ ∗D ≠ ∅ and p ∈ D. Thus, there are a lot of these open intervals I about p such that I ′ ∩ D ≠ ∅. We first have that f ′(c) ∈ IR, c ∈ DNI. Let r ∈ IR+.


Then there exists a w ∈ IR+ such that for each h ∈ IR such that 0 < |h| < w, c + h ∈ D,

\[ \left| f'(c) - \frac{f(c+h) - f(c)}{h} \right| < r. \]

Let q ∈ µ(p) ∩ ∗I ∩ ∗D and let dx ∈ µ′(0) be such that q + dx ∈ µ(p) ∩ ( ∗I ∩ ∗D). Hence, considering *-transform for arbitrary r ∈ IR+, it follows that

\[ \left| {}^*f'(q) - \frac{{}^*f(q+dx) - {}^*f(q)}{dx} \right| < r. \]

Thus, ∗f ′(q) ≈ ( ∗f(q + dx) − ∗f(q))/dx. Uniform differentiability at p yields that f ′(p) ≈ ( ∗f(q + dx) − ∗f(q))/dx and, hence, f ′(p) ≈ ∗f ′(q). The point q was an arbitrary member of µ(p) ∩ ∗I ∩ ∗D. Since µ(p) ⊂ ∗I, this yields that ∗f ′[µ(p) ∩ ∗D] ⊂ µ(f ′(p)) and the proof is complete.

Corollary 12.10. If nonempty open G ⊂ D and f :D → IR is uniformly differentiable on G

(i.e. at each c ∈ G), then f ′ is continuous on G.

Corollary 12.11. Let f: D → IR, p ∈ DNI and f be uniformly differentiable at p. If for each q ∈ µ(p) ∩ ∗D and dx ∈ µ′(0) such that q + dx ∈ ∗D, ∗f ′(q) ≈ ( ∗f(q + dx) − ∗f(q))/dx, then f ′ is continuous at p.

Although uniform differentiability at a point does not imply that the derivative is continuous

at that point, what does happen is that it forces f to be continuous on an entire non-trivial set that

contains p.

Theorem 12.12. Suppose that f :D → IR is uniformly differentiable at p ∈ int(D). Then there

exists some open interval I ⊂ D about p such that f is continuous on I.

Proof. Since p ∈ int(D), there are many open intervals I about p such that I ⊂ D. Let

L = |f ′(p)| + 1. Assume that for each open interval I ⊂ D about p there is some y ∈ I such

that f is not continuous at y. Since µ(p) ⊂ ∗D, then by *-transform, each microinterval I_γ = (−γ + p, p + γ), γ ∈ µ(0)+ contains some y such that ∗f is not *-continuous at y. This translates to say that there is some r ∈ ∗IR+ such that for each w ∈ ∗IR+ there is some x, where ∗f is defined, such that |x − y| < w and | ∗f(x) − ∗f(y)| ≥ r. Hence, for any such r, there is some x ∈ µ(p) such that x ≠ y and r/L > |x − y|. Thus, | ∗f(x) − ∗f(y)| > L|x − y| > 0. Hence, |( ∗f(x) − ∗f(y))/(x − y)| > |f ′(p)| + 1. This contradicts uniform differentiability at p and the result follows.

Corollary 12.13. Let nonempty open G ⊂ IR and f :G → IR be uniformly differentiable at

p ∈ G. Then there exists an open interval I such that p ∈ I, and f is continuous on I.

I’ll shortly use these ideas on uniform differentiability for an investigation of how the inverse

function for an appropriate differentiable function behaves.

Definition 12.14. (Darboux Property.) A function f: D → IR is said to have the Darboux property on D iff for each a, b ∈ D such that [a, b] ⊂ D, either [f(a), f(b)] ⊂ f[[a, b]] or [f(b), f(a)] ⊂ f[[a, b]]. Also recall that a function f on [a, b] is one-to-one or an injection iff for each distinct x, y ∈ [a, b], f(x) ≠ f(y).

Theorem 12.15. If f : [a, b] → IR, a < b, is an injection and Darboux, then f is either strictly

increasing or strictly decreasing. Further, f [[a, b]] is a non-trivial closed interval with end points

f(a) and f(b).

Proof. Since f(a) ≠ f(b), I can simply assume that f(a) < f(b). Let x, y, z ∈ [a, b], x < z < y, f(x) < f(y), but f(x) ≮ f(z) or f(z) ≮ f(y). One-to-one implies that f(x) > f(z) or f(z) > f(y). If f(z) > f(y) > f(x), then the Darboux property implies that there exists some w


such that x < w < z and f(w) = f(y). Since w ≠ y this contradicts one-to-one. In like manner, for f(x) > f(z). Therefore, f(x) < f(z) < f(y). Now let c, d ∈ [a, b] such that a < c < d < b.

Then f(a) < f(c) < f(d) and f(c) < f(d) < f(b). Thus, in this case, f is strictly increasing and

f [[a, b]] = [f(a), f(b)]. A similar argument shows that if f(b) < f(a), then f is strictly decreasing.

The Darboux property now implies that f [[a, b]] is a nontrivial closed interval.

Corollary 12.16. Let continuous f : [a, b] → IR. Then f is an injection iff f is either strictly

increasing or decreasing on [a, b].

I now show that there are discontinuous functions that have the Darboux property.

Theorem 12.17. If f :D → IR is finitely differentiable on D, then f ′ has the Darboux property.

Proof. All we need to do is to consider what happens if a, b ∈ D and [a, b] ⊂ D and f ′(a) < f ′(b).

Suppose that f ′(a) < k < f ′(b). Then the function g: [a, b] → IR defined by g(x) = f(x) − kx is

finitely differentiable on [a, b]. Hence g is continuous. Thus, there is some c ∈ [a, b] such that g(c) ≤ g(x), ∀x ∈ [a, b] (i.e. g(c) is the minimum value of g on [a, b]). Since g′(x) = f ′(x) − k, then g′(b) = f ′(b) − k > 0. In like manner, g′(a) = f ′(a) − k < 0. Let dx ∈ µ(0)−. Then ( ∗g(b + dx) − g(b))/dx > 0 implies that ∗g(b + dx) < g(b). Consequently, there is some x ∈ [a, b], by reverse *-transform, such that g(x) < g(b). In like manner, there exists some y ∈ [a, b] such that g(y) < g(a). Hence, a, b ≠ c. Therefore, by Theorem 11.12, g′(c) = 0, which implies that f ′(c) = k, and the proof is complete.

I point out that there are examples of functions finitely differentiable on [0, 1] but with
uncountably many discontinuities (Burrill and Knudsen, 1969, p. 191). Finally, in this chapter, I’ll
investigate various types of inverse function theorems. Let f : (a, b) → IR be continuous. Then f
is Darboux on (a, b). Thus, f defined on [c, d] ⊂ (a, b) is an injection iff f is strictly increasing or
decreasing on [c, d]. In this case, f has an inverse function f⁻¹ such that f⁻¹: f[[c, d]] → [c, d] and
f⁻¹ is an injection onto [c, d] which is also strictly monotone in the same sense.

Theorem 12.18. Let the injection f : D → IR be continuous on D, where D is compact. Then the
inverse function f⁻¹: f[D] → D is continuous on f[D].

Proof. Let f(p) ∈ f[D]. We know from one-to-one that ∗f is one-to-one and that for each p ∈ D,
∗f[µ(p) ∩ ∗D] = ∗f[µ(p)] ∩ ∗(f[D]). However, for our purposes consider from continuity that
∗f[µ(p) ∩ ∗D] ⊂ µ(f(p)) ∩ ∗(f[D]). Let q ∈ µ(f(p)) ∩ ∗(f[D]). Then there is some s ∈ ∗D such that ∗f(s) = q.
From compactness, there is a p1 ∈ D such that s ∈ µ(p1), and ∗f[µ(p1) ∩ ∗D] ⊂ µ(f(p1)) ∩ ∗(f[D])
implies that q ∈ µ(f(p)) ∩ µ(f(p1)). Thus f(p1) = f(p). From one-to-one, this gives that p = p1.
Consequently, s ∈ µ(p) implies that q ∈ ∗f[µ(p) ∩ ∗D]. Hence, ∗f[µ(p) ∩ ∗D] = µ(f(p)) ∩ ∗(f[D]).
One-to-one gives that µ(p) ∩ ∗D = ∗f⁻¹[µ(f(p)) ∩ ∗(f[D])]. Thus, f⁻¹ is continuous at f(p).

Theorem 12.19. Let I be an interval with more than one point. If the injection f : I → IR is
continuous on I, then f⁻¹: f[I] → I is continuous on f[I].

Proof. For any p ∈ I, p ∈ [a, b] ⊂ I, for some a, b such that a ≠ b. Now apply Theorem 12.18 to f
restricted to the compact [a, b].

Corollary 12.20. Let non-empty open G ⊂ IR and the injection f : G → IR be continuous on G.
Then f⁻¹: f[G] → G is continuous on f[G].

Theorem 12.21. For a < b, let the injection f : [a, b] → IR be continuous on [a, b]. If non-zero
f′(p) ∈ IR, p ∈ [a, b], then (f⁻¹)′(f(p)) = 1/f′(p).

Proof. Since f is continuous and one-to-one on [a, b], then f is strictly monotone. Assume f is
strictly increasing. Then f[[a, b]] = [f(a), f(b)], f(a) < f(b). The injection f⁻¹: [f(a), f(b)] → [a, b]
is continuous and strictly increasing on [f(a), f(b)]. Let f(p) ∈ [f(a), f(b)]. Clearly f(p) is a cluster
point. Let dx ∈ µ(0)′ and f(p) + dx ∈ µ(f(p)) ∩ ∗[f(a), f(b)] and consider
h = ∗f⁻¹(f(p) + dx) − f⁻¹(f(p)). Since f⁻¹ is continuous and one-to-one, then h ∈ µ′(0). Further, one-to-one also implies
that ∗f(h + p) = f(p) + dx. Now

$$f'(p) \approx \frac{{}^*f(h+p) - f(p)}{h} = \frac{dx}{h}$$

and f′(p) ≠ 0 imply, by considering properties of the standard part operator, that

$$\frac{1}{f'(p)} \approx \frac{h}{dx} = \frac{{}^*f^{-1}(f(p)+dx) - f^{-1}(f(p))}{dx}.$$

This completes the proof.
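The formula can be checked numerically with a small real increment standing in for the infinitesimal dx. The sketch below is only a floating-point illustration under assumptions of my own: the function f(x) = x³ + x on [0, 2], the point p = 1, and the helper name `inverse_by_bisection` are not from the text.

```python
def inverse_by_bisection(f, y, lo, hi, iterations=200):
    """Solve f(x) = y on [lo, hi], assuming f is continuous and strictly increasing."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def f(x):
    return x**3 + x


def f_prime(x):
    return 3 * x**2 + 1


a, b, p = 0.0, 2.0, 1.0
dy = 1e-6                                   # small real stand-in for dx
h = inverse_by_bisection(f, f(p) + dy, a, b) - p
quotient = h / dy                           # plays the role of h/dx in the proof
print(quotient, 1 / f_prime(p))
```

The two printed numbers agree to several decimal places; the residual difference comes from using a finite increment rather than an infinitesimal.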

Corollary 12.22. Let non-empty open G ⊂ IR and the injection f : G → IR be continuous on
G. If for p ∈ G, 0 ≠ f′(p) ∈ IR, then (f⁻¹)′(f(p)) = 1/f′(p).

Theorem 12.23. Let strictly monotone f : D → IR be continuous on compact D and, for p ∈ D,
0 ≠ f′(p) ∈ IR. Then (f⁻¹)′(f(p)) = 1/f′(p).

Proof. Assume that f is strictly increasing and, thus, one-to-one. Then f⁻¹: f[D] → D is
continuous and strictly increasing on compact f[D]. Since p ∈ D^NI, then µ(p) ∩ (∗D − D) ≠ ∅. For
q ∈ µ(p) ∩ (∗D − D), continuity and strictly increasing imply that ∗f(q) ∈ µ(f(p)) ∩ [∗(f[D]) − f[D]].
Thus, f(p) is a cluster point. The proof now follows as in Theorem 12.21.

Example 12.8 can be modified to obtain a strictly increasing function on [−1, 1] such that
f′(0) ≠ 0. By Theorem 12.21, (f⁻¹)′(f(0)) = 1/f′(0) but it is not continuous since f′(p) does not
exist on any open interval that contains 0. It is, however, uniformly differentiable. This is why the
next result is a recent improvement over all other previous results relative to differentiable inverses.

Theorem 12.24. Let f : D → IR, where D is compact, f is strictly monotone on D and, at
p ∈ int(D), f is uniformly differentiable with 0 ≠ f′(p). Then f⁻¹ is uniformly differentiable at f(p) and
(f⁻¹)′(f(p)) = 1/f′(p).

Proof. Assume that f is strictly increasing on D. Then f⁻¹ exists for f[D]. Uniformly differentiable
implies that f is continuous on some I ⊂ D, where I is an open interval about p. Thus, there
exists [a, b], (a ≠ b) such that p ∈ [a, b] ⊂ I ⊂ D. Since [a, b] is compact, then the result that f⁻¹ is
differentiable at f(p) and that (f⁻¹)′(f(p)) = 1/f′(p) follows from Theorem 12.23.

Assume that dy ∈ µ′(0), y ∈ µ(f(p)) ⊂ ∗f[∗D] such that y + dy ∈ ∗f[∗D] and

$$(f^{-1})'(f(p)) \not\approx \frac{{}^*f^{-1}(y+dy) - {}^*f^{-1}(y)}{dy}.$$

Then, from properties of the “st” operator,

$$f'(p) \not\approx \frac{dy}{{}^*f^{-1}(y+dy) - {}^*f^{-1}(y)}.$$

Observe that y = ∗f(q) for some unique q ∈ µ(p) ∩ ∗D. Further, continuity of f⁻¹ at f(p) and
increasing imply that 0 ≠ ∗f⁻¹(y + dy) − ∗f⁻¹(y) = h ∈ µ′(0). Now ∗f⁻¹(y + dy) = q + h ∈ ∗D and,
of course, q + h ∈ µ(p). Thus, y + dy = ∗f(q + h) yields dy = ∗f(q + h) − ∗f(q). Therefore,

$$f'(p) \not\approx \frac{{}^*f(q+h) - {}^*f(q)}{h};$$

a contradiction of the uniform differentiability of f at p. Thus, f⁻¹ is uniformly differentiable at f(p)
and the proof is complete.

Finally, I apply some of these previous results to establish a major classical theorem on inverse
functions: THE inverse function theorem.

Theorem 12.25. Let G be a non-empty open subset of IR. Let f′: G → IR be continuous on G.
Then at any p ∈ G, where f′(p) ≠ 0, there exist open intervals I and U such that p ∈ I ⊂ G, f[I] =
U ⊂ f[G], f⁻¹ exists on U, (f⁻¹)′ is continuous on U, and (f⁻¹)′(x) = 1/f′(y), where f(y) = x, for each x ∈ U.

Proof. Let p ∈ G and f′(p) ≠ 0. I show first that f(p) ∈ int(f[G]). Let p ∈ (a, b) ⊂ G.
Since f′(p) ≠ 0 and f′ is continuous on (a, b), there exists some open interval I0 such that
p ∈ I0 ⊂ (a, b) and f′(x) > 0 for each x ∈ I0 or f′(x) < 0 for each x ∈ I0. Thus, f is strictly
monotone and continuous on I0. So, f is one-to-one on I0. Further, there is a closed non-trivial
interval [c, d] and the open (c, d) such that p ∈ (c, d) ⊂ [c, d] ⊂ I0. Consider the case where
f(c) < f(d). Since [c, d] is compact, it follows that f⁻¹: [f(c), f(d)] → [c, d] is
continuous on [f(c), f(d)] by Theorem 12.18 and, hence, continuous on (f(c), f(d)). Now Theorem
12.15 implies that f[(c, d)] = (f(c), f(d)) ⊂ int(f[G]). Theorem 12.7 yields that f is uniformly
differentiable on (c, d). Theorem 12.24 gives that f⁻¹ is uniformly differentiable on (f(c), f(d))
and (f⁻¹)′(x) = 1/f′(y), f(y) = x, for each x ∈ (f(c), f(d)). Theorem 12.9 yields that (f⁻¹)′ is
continuous on (f(c), f(d)). In like manner for the case f(c) > f(d) and the proof is complete.

13. RIEMANN INTEGRATION

Since I’m using an arbitrary free ultrafilter generated nonstandard model for analysis and the
somewhat weak structure M, one should not expect that M models all aspects needed for Riemann
integration. For this reason, a few results use standard proofs. On the other hand, I’ll obtain many
results relative to the Riemann integral by means of proofs using nonstandard techniques. Unless
otherwise stated, all functions, such as f, discussed in this chapter will be bounded
and, for a < b, map [a, b] into IR. This is a generalization for a basic analysis course of the usual
Calculus I requirement that f is a (bounded) piecewise continuous function defined on [a, b].

One of the problems with Riemann integration is that it may be stated in terms of any partition
of [a, b]. Nonstandard analysis allows us to eliminate this “any partition” notion. For our purposes
a partition of [a, b] is but a finite collection of points {a = x_0 < · · · < x_n ≤ x_{n+1} = b}. There
are different ways to generate “simple partitions.” The one now introduced is considered as a very
simple type of partition of [a, b] and allows any positive infinitesimal to generate a nonstandard
partition.

Definition 13.1. (The Simple Partition.) In this chapter, let ∆x always denote a positive
real number. For ∆x, there exists a largest natural number “n” such that a + n(∆x) ≤ b. Define x_n =
a + n∆x ≤ b. Then there is a unique partition of [a, b], P(∆x) = {a = x_0 < · · · < x_n ≤ x_{n+1} = b},
such that for each [x_i, x_{i+1}], x_{i+1} − x_i = ∆x, i = 0, . . . , n − 1, and x_{n+1} − x_n = b − (a + n∆x) < ∆x
due to the statement dealing with n being the “largest n” such that 0 ≤ b − (a + n∆x).

It’s possible that x_n = x_{n+1}. Indeed, let ∆x = b − a. Then n = 1 and x_1 = x_2 = b. The
existence of this unique largest n can be expressed in our formal language as follows:

∀x((x ∈ IR+) → ∃y((y ∈ IN) ∧ (a + yx ≤ b) ∧ ∀z(((z ∈ IN) ∧ (a + zx ≤ b)) → (z ≤ y)))).

Further, for this unique n, there’s a function from [0, n + 1] → [a, b] that generates all of the partition
points, where x_0 = a and x_{n+1} = b. It’s defined by letting x_k = a + k∆x, 0 ≤ k ≤ n, x_{n+1} = b. Such
functions are called partial sequences. For every ∆x, there exists such a partial sequence. This
allows one to define what is termed as a “fine partition” for each positive infinitesimal. For such ∆x,
the partial sequence has a hyperfinite domain since it’s not difficult to show that if ∆x = γ ∈ µ(0)+,
then the unique n is a member of IN∞. For this reason, such partial sequences generated by positive
infinitesimals are often called hyperfinite sequences.
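For a standard ∆x, the partial sequence of Definition 13.1 is easy to compute. The following sketch uses exact rational arithmetic so that the “largest n” is not disturbed by rounding; the function name `simple_partition` is an assumption made for this illustration, and the inputs should be integers or `Fraction` values.

```python
from fractions import Fraction


def simple_partition(a, b, dx):
    """Return [x_0, ..., x_n, x_{n+1}] with x_k = a + k*dx (0 <= k <= n) and x_{n+1} = b."""
    a, b, dx = Fraction(a), Fraction(b), Fraction(dx)
    if dx <= 0 or a >= b:
        raise ValueError("need dx > 0 and a < b")
    n = (b - a) // dx                 # largest natural number n with a + n*dx <= b
    points = [a + k * dx for k in range(n + 1)]
    points.append(b)                  # x_{n+1} = b, possibly equal to x_n
    return points


print(simple_partition(0, 1, Fraction(3, 10)))
print(simple_partition(0, 1, 1))      # dx = b - a gives n = 1 and x_1 = x_2 = b
```

The second call reproduces the remark that x_n = x_{n+1} can occur.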

Definition 13.2. (A Fine Partition.) Let dx, dy, dz etc. denote members of µ(0)+. For any
dx, there exists a unique Λ ∈ IN∞ such that a + Λ dx ≤ b and, ∀ k ∈ ∗IN, if a + k dx ≤ b, then k ≤ Λ.
The hyperfinite sequence S: [0, Λ] → ∗[a, b] such that x_k = a + k dx, k ∈ [0, Λ], and x_{Λ+1} = b yields
a fine partition P(dx) of (for) [a, b], where x_{k+1} − x_k = dx, 0 ≤ k < Λ, and b − x_Λ = x_{Λ+1} − x_Λ < dx.

Since I only consider bounded functions, then this investigation is based upon the completeness
of the real numbers. So, as usual, for each closed interval [x_i, x_{i+1}], let m_i = inf{f(x) | x_i ≤ x ≤ x_{i+1}}
and M_i = sup{f(x) | x_i ≤ x ≤ x_{i+1}}. (Recall that “inf” is the greatest lower bound of a set,
and “sup” is the least upper bound.) Of course, m_i ≤ M_i.

Definition 13.3. (Upper and Lower Sums). For each ∆x, and the bounded function f,
two operators are defined as follows:

$$L(f,\Delta x) = \left(\sum_{i=0}^{n-1} m_i\,\Delta x\right) + m_n (b - x_n),$$

$$U(f,\Delta x) = \left(\sum_{i=0}^{n-1} M_i\,\Delta x\right) + M_n (b - x_n).$$

For the fixed function f, the lower sum L(f, ·): IR+ → IR, and the upper sum U(f, ·): IR+ → IR
have the usual nonstandard extensions.
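For a general bounded f the numbers m_i and M_i cannot be computed by sampling, but for an increasing f they are simply the end-point values. Under that assumption (and it is only an assumption of this sketch, as are the function names), the two operators for a standard ∆x can be evaluated exactly:

```python
from fractions import Fraction


def simple_partition(a, b, dx):
    """Points a = x_0 < ... < x_n <= x_{n+1} = b with x_k = a + k*dx."""
    a, b, dx = Fraction(a), Fraction(b), Fraction(dx)
    n = (b - a) // dx                          # largest n with a + n*dx <= b
    return [a + k * dx for k in range(n + 1)] + [b]


def lower_upper_sums(f, a, b, dx):
    """(L(f, dx), U(f, dx)) for an INCREASING f, where m_i = f(x_i) and M_i = f(x_{i+1})."""
    xs = simple_partition(a, b, dx)
    widths = [xs[i + 1] - xs[i] for i in range(len(xs) - 1)]   # last width is b - x_n
    lower = sum(f(xs[i]) * w for i, w in enumerate(widths))
    upper = sum(f(xs[i + 1]) * w for i, w in enumerate(widths))
    return lower, upper


print(lower_upper_sums(lambda x: x, 0, 1, Fraction(3, 10)))
```

For f(x) = x on [0, 1] with ∆x = 3/10 this gives L = 9/25 and U = 16/25, which bracket the value 1/2.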

Definition 13.4. (Hyperfinite sums.) For each dx, ∗L(f, dx) and ∗U(f, dx) are called the

lower hyperfinite sum and upper hyperfinite sum, respectively. I’ve used a slight abbreviation

in this notation, where the f in the notation is actually ∗f.

Theorem 13.5. For each dx and any f ,

(i) ∗L(f, dx) ≤ ∗U(f, dx),

(ii) ∗L(f, dx), ∗U(f, dx) ∈ G(0).

Proof. Since f is bounded on [a, b], then there exist m, M ∈ IR such that m ≤ f(x) ≤ M, ∀x ∈ [a, b].
Consider any ∆x. Then m∆x ≤ f(x)∆x ≤ M∆x yields that

$$m\left(\left(\sum_{i=0}^{n-1}\Delta x\right) + (b - x_n)\right) \le \left(\sum_{i=0}^{n-1} m_i\,\Delta x\right) + m_n(b - x_n) \le \left(\sum_{i=0}^{n-1} M_i\,\Delta x\right) + M_n(b - x_n) \le M\left(\left(\sum_{i=0}^{n-1}\Delta x\right) + (b - x_n)\right),$$

since these are finite summations. Hence, m(b − a) ≤ L(f, ∆x) ≤ U(f, ∆x) ≤ M(b − a). Then the
sentence

∀x((x ∈ IR+) → m(b − a) ≤ L(f, x) ≤ U(f, x) ≤ M(b − a))

holds in M; and, hence, in ∗M. By *-transform, the result follows.

In the theory of Riemann integration, refinements of a partition play a significant role. They also

present significant intuitive problems, as well. The next result is similar to a refinement proposition

for Riemann sums.

Theorem 13.6. For every ∆x and for every p ∈ ∗IN′ = ∗IN − {0},

∗L(f, ∆x) ≤ ∗L(f, ∆x/p) ≤ ∗U(f, ∆x/p) ≤ ∗U(f, ∆x).

Proof. It follows from the definition, that for each ∆x and corresponding partition P (∆x)

generated by ∆x that P (∆x) ⊂ P (∆x/n), n ∈ IN′. Now I need to consider a standard argument

at this point and direct you to Theorem 10.1 in Burrill and Knudsen (1969, p. 199), where it is

established that, for our case,

L(f,∆x) ≤ L(f,∆x/n) ≤ U(f,∆x/n) ≤ U(f,∆x),

for n ∈ IN′. The result follows by *-transform.
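The standard inequality chain can be spot-checked for particular data. The sketch below does this for an increasing function, where the infima and suprema are end-point values; the helper names and the sample choice f(x) = x² on [0, 1] with ∆x = 3/10 are assumptions made only for the illustration, and a finite check of this kind is of course no substitute for the cited proof.

```python
from fractions import Fraction


def simple_partition(a, b, dx):
    """Points a = x_0 < ... < x_n <= x_{n+1} = b with x_k = a + k*dx."""
    a, b, dx = Fraction(a), Fraction(b), Fraction(dx)
    n = (b - a) // dx                          # largest n with a + n*dx <= b
    return [a + k * dx for k in range(n + 1)] + [b]


def sums_increasing(f, a, b, dx):
    """(L, U) for an increasing f, where m_i = f(x_i) and M_i = f(x_{i+1})."""
    xs = simple_partition(a, b, dx)
    widths = [xs[i + 1] - xs[i] for i in range(len(xs) - 1)]
    lower = sum(f(xs[i]) * w for i, w in enumerate(widths))
    upper = sum(f(xs[i + 1]) * w for i, w in enumerate(widths))
    return lower, upper


def refinement_chain(f, a, b, dx, p):
    """Return (L(dx), L(dx/p), U(dx/p), U(dx)) for a positive integer p."""
    lower, upper = sums_increasing(f, a, b, dx)
    lower_p, upper_p = sums_increasing(f, a, b, Fraction(dx) / p)
    return lower, lower_p, upper_p, upper


for p in range(1, 7):
    print(p, refinement_chain(lambda x: x * x, 0, 1, Fraction(3, 10), p))
```

Each printed tuple is non-decreasing from left to right, as the displayed inequality requires.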

I point out that Theorem 13.5 shows that for each positive infinitesimal dx, since ∗L(f, dx) ≤ ∗U(f, dx),
then st(∗L(f, dx)) ≤ st(∗U(f, dx)). Also, I won’t continue to mention the fact that
0 ≤ ∗U(f, p) − ∗L(f, p) for each p ∈ ∗IR+.

Definition 13.7. (Integrable Functions.) The function f is (simply) integrable iff there
is some dx such that

st(∗L(f, dx)) = st(∗U(f, dx)) iff ∗U(f, dx) − ∗L(f, dx) ∈ µ(0).

If f is integrable, then we denote st(∗L(f, dx)) = ∫_a^b f dx as the integral, where it’s understood
that this is the simple integral.

I also use the notation ∫_a^b f dx ∈ IR to indicate that f is integrable on [a, b]. At the moment, it
appears that the value of the integral might depend upon the dx chosen. I’ll show, later, that this
is not the case. Of course, it’s clear from above that if ∫_a^b f dx ∈ IR, then ∫_a^b f (dx/n) ∈ IR, n ∈ ∗IN′.

What functions are integrable?

Theorem 13.8. If f is monotone on [a, b], then ∫_a^b f dx ∈ IR.

Proof. Assume that f is increasing. For a ∆x generated partition, M_i = f(x_{i+1}), m_i =
f(x_i), i = 0, . . . , n. For n ∈ IN′, let ∆x = (b − a)/n. Then

U(f, ∆x) − L(f, ∆x) = ((b − a)/n)(f(b) − f(a)).

By *-transform, for each Λ ∈ IN∞, where dx = (b − a)/Λ,

∗U(f, dx) − ∗L(f, dx) = dx(∗f(b) − ∗f(a)) ∈ µ(0),

and the result follows.
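The standard identity used in this proof is a telescoping sum, and it can be confirmed exactly for sample data. In the sketch below the function name `gap_for_increasing` and the test function x³ on [1, 3] are assumptions chosen for the illustration.

```python
from fractions import Fraction


def gap_for_increasing(f, a, b, n):
    """U(f, dx) - L(f, dx) for an increasing f and dx = (b - a)/n, summed term by term."""
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]    # here x_n = b, so the last term b - x_n is 0
    return sum((f(xs[i + 1]) - f(xs[i])) * dx for i in range(n))


def cube(x):
    return x**3


for n in (1, 7, 50):
    print(n, gap_for_increasing(cube, 1, 3, n), Fraction(2, n) * (cube(3) - cube(1)))
```

The two printed values agree for every n, and the gap shrinks in proportion to 1/n, which is the standard counterpart of the gap being infinitesimal when n is replaced by Λ ∈ IN∞.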

What is needed is a general standard characterization for integrability in our sense.

Theorem 13.9. The function f is integrable on [a, b], for some dx ∈ µ(0)+, iff, for each
r ∈ IR+, there is some ∆x ∈ IR+ such that

U(f, ∆x) − L(f, ∆x) < r.

Proof. Assume that ∫_a^b f dx ∈ IR and r ∈ IR+. Then dx ∈ µ(0)+ and ∗U(f, dx) − ∗L(f, dx) < r.
The necessity follows by reverse *-transform.

For the sufficiency, assume that r ∈ IR+ and that there exists some ∆x such that U(f, ∆x) −
L(f, ∆x) < r. If n ∈ IN′, then it also follows that U(f, ∆x/n) − L(f, ∆x/n) < r by Theorem 13.6
restricted to IN′. However, there always exists an n ∈ IN′ such that 0 < ∆x/n < r. Consequently, the
sentence

∀x((x ∈ IR+) → ∃y((y ∈ IR+) ∧ (y < x) ∧ (U(f, y) − L(f, y) < x)))

holds in M; and, hence, in ∗M. Letting γ ∈ µ(0)+, there exists some dx such that
∗U(f, dx) − ∗L(f, dx) < γ. The result follows from Definition 13.7 since ∗U(f, dx) − ∗L(f, dx) ∈ µ(0).

Corollary 13.10. The function f is integrable on [a, b] iff for each γ ∈ µ(0)′ there exists some

dx such that ∗U(f, dx)− ∗L(f, dx) < γ.

I’ll denote Riemann integration by the symbol R∫_a^b f dx. This form of integration is defined
in terms of what appears to be a more general form of partitioning of [a, b]. This may be why
many students, when they first encounter the complete definition for Riemann integration, find it
somewhat difficult to comprehend. I’ll show later that the simple integral as defined here for a rather
simple type of partitioning is equivalent to the Riemann integral. There are two major but equivalent
definitions for the Riemann integral. (A) You consider the same idea of the lower and upper sums,
but you do not restrict the partitions. You define these sums in exactly the same way but for
any partition P′. But, then you must also do the following. You consider the numbers R̲(f) =
sup{L(f, P′) | P′ any partition of [a, b]} and R̄(f) = inf{U(f, P′) | P′ any partition of [a, b]}. If
R̲(f) = R̄(f), then this value is the Riemann integral of f. For the structure I’m working with,
the collection of all such partitions is not part of the structure. So, this is why I need to use
some results obtained by standard means. Then there is the more familiar equivalent definition.
(B) You consider a general partition P′ = {a = x_0 < · · · < x_n = b}, n > 0, where one defines
mesh(P′) = max{∆x_i | (∆x_i = x_i − x_{i−1}) ∧ (i = 1, . . . , n)}. Then you consider any finite
collection q_i ∈ [x_{i−1}, x_i] and evaluate the function at these points and consider the Riemann sum
Σ_{i=1}^{n} f(q_i)(x_i − x_{i−1}). Then a number R∫_a^b f dx is the Riemann integral iff for each r ∈ IR+, there exists
a w ∈ IR+ such that for every partition P′ such that mesh(P′) < w and every q_i ∈ [x_{i−1}, x_i],

$$\left|\sum_{i=1}^{n} f(q_i)(x_i - x_{i-1}) - R\!\int_a^b f\,dx\right| < r.$$
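The ingredients of definition (B), a general partition, its mesh, and a Riemann sum for chosen points q_i, can be written out directly. The function names `mesh` and `riemann_sum`, and the sample partition and points, are assumptions made for this sketch.

```python
from fractions import Fraction


def mesh(points):
    """Largest subinterval length of the partition a = x_0 < ... < x_n = b."""
    return max(points[i] - points[i - 1] for i in range(1, len(points)))


def riemann_sum(f, points, tags):
    """Sum of f(q_i) * (x_i - x_{i-1}) with one chosen q_i in each [x_{i-1}, x_i]."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one tag per subinterval")
    total = 0
    for i in range(1, len(points)):
        q = tags[i - 1]
        if not points[i - 1] <= q <= points[i]:
            raise ValueError("tag outside its subinterval")
        total += f(q) * (points[i] - points[i - 1])
    return total


points = [Fraction(0), Fraction(1, 4), Fraction(1)]     # an unevenly spaced partition of [0, 1]
tags = [Fraction(1, 8), Fraction(1, 2)]
print(mesh(points), riemann_sum(lambda x: x, points, tags))
```

Unlike the simple partition, nothing here forces equal spacing, which is exactly the extra generality that definition (B) quantifies over.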

Although the notation contains the symbol dx, infinitesimals are not mentioned in definitions (A)

and (B). For bounded f , I use Definition (A) for the Riemann integral since the only difference is in

the collection of partitions needed. For dx, the partition notation P (dx) is an abbreviation for the

fine partition that can be explicitly defined for the dx.

Theorem 13.11. If ∫_a^b f dx ∈ IR, then ∫_a^b f dx = R∫_a^b f dx.

Proof. Let U(f, P′), L(f, P′) be the upper and lower Riemann sums for any general partition
P′. Here is where I need a standard result about Riemann integration. It states that R∫_a^b f dx ∈ IR
iff, for each r ∈ IR+, there exists a general partition P′ such that U(f, P′) − L(f, P′) < r. Also,
L(f, P′) ≤ R∫_a^b f dx ≤ U(f, P′) (Burrill and Knudsen, 1969, p. 202). But, a simple partition
P(∆x) is a general partition. Indeed, for our partitions L(f, P′) = L(f, ∆x), U(f, P′) = U(f, ∆x).
Consequently, Theorem 13.9 yields that R∫_a^b f dx ∈ IR. And, further, by *-transform, that for the
*-Riemann partition P(dx), st(∗U(f, dx)) = R∫_a^b f dx = ∫_a^b f dx = st(∗L(f, dx)) and this completes
the proof.

Corollary 13.12. If ∫_a^b f dx, ∫_a^b f dy ∈ IR, then ∫_a^b f dx = ∫_a^b f dy.

Let’s easily establish some of the basic integral properties.

Theorem 13.13. Let bounded f and g be integrable on [a, b] for dx.

(i) For each c ∈ IR, f + g, cf are integrable on [a, b] and ∫_a^b (f + g) dx = ∫_a^b f dx + ∫_a^b g dx,
∫_a^b cf dx = c ∫_a^b f dx, ∫_a^b dx = b − a.

(ii) If f(x) ≤ g(x), ∀x ∈ [a, b], then ∫_a^b f dx ≤ ∫_a^b g dx.

(iii) If m ≤ f(x) ≤ M, ∀x ∈ [a, b], then m(b − a) ≤ ∫_a^b f dx ≤ M(b − a).

Proof. These are all established by simple observations about finite sums.

(i) Observe that

L(f,∆x) + L(g,∆x) ≤ L(f + g,∆x) ≤ U(f + g,∆x) ≤ U(f,∆x) + U(g,∆x).

This result follows by *-transform using the dx and the standard part operator.

Since L(cf, ∆x) = cL(f, ∆x) ≤ U(cf, ∆x) = cU(f, ∆x), then the result follows by using the standard
part operator for the given dx.

Now observe that L(1, ∆x) = U(1, ∆x) = b − a. Thus, for the given dx, ∫_a^b dx = b − a.

(ii) Clearly, for each ∆x, L(f, ∆x) ≤ L(g, ∆x) implies that st(∗L(f, dx)) = ∫_a^b f dx ≤ st(∗U(g, dx)) =
∫_a^b g dx for the given dx.

(iii) Simply apply (i) and (ii) and the proof is complete.

Monotone bounded functions need not be continuous, but they are integrable. The continuous

functions should be integrable or this integral would not be very useful. There are very short

nonstandard proofs of the following result but they require a more comprehensive structure than

I’m using. The nonstandard proof that establishes the next result is longer than the standard proof

since I’ve defined integration nonstandardly. So, I’ll give the usual standard proof that depends

upon the standard characterization of Theorem 13.9.

Theorem 13.14. If f is continuous on [a, b], then ∫_a^b f dx ∈ IR for some dx ∈ µ(0)+.

Proof. Let r ∈ IR+ and let c = r/(b − a). From uniform continuity, there is a w ∈ IR+ such that
for any x, y ∈ [a, b] such that |x − y| < w, then |f(x) − f(y)| < c. Consider any simple partition
P(∆x) for [a, b] with ∆x < w. Let [x_i, x_{i+1}] be one of the subdivisions. Then there are x′_i, y′_i ∈ [x_i, x_{i+1}] such
that m_i = f(x′_i), M_i = f(y′_i). Hence, U(f, ∆x) − L(f, ∆x) = Σ_{i=0}^{n−1} (f(y′_i) − f(x′_i))∆x + (f(y′_n) −
f(x′_n))(b − x_n) < c(b − a) = r. The result follows from Theorem 13.9 and the proof is complete.

I mentioned previously that integration as here defined is equivalent to Riemann integration.

It’s time to establish this. But, due to the weak structure I’m using, I need one more standard result

about general partitions.

Theorem 13.15. For each r ∈ IR+, there exists some w ∈ IR+ such that for all partitions P′,
where mesh(P′) < w,

0 ≤ R̲(f) − L(f, P′) < r, 0 ≤ U(f, P′) − R̄(f) < r.

Proof. This is established in a portion of the proof of Theorem 10.28 in Burrill and Knudsen (1969,
p. 223).

Theorem 13.16. For each dx,

R̲(f) − ∗L(f, dx) ∈ µ(0)+, ∗U(f, dx) − R̄(f) ∈ µ(0)+.

Proof. Assume that there exists some dx such that R̲(f) − ∗L(f, dx) ∉ µ(0)+. Since 0 ≤ R̲(f) − ∗L(f, dx),
then there exists some r ∈ IR+ such that R̲(f) − ∗L(f, dx) ≥ r. Now let arbitrary w ∈ IR+.
Then, 0 < dx < w. But, Theorem 13.15 holds for our simple partitions where mesh(P(∆x)) =
∆x. Hence, there exists some w1 ∈ IR+ such that for each P(∆x), when ∆x < w1, then R̲(f) −
L(f, P(∆x)) = R̲(f) − L(f, ∆x) < r. By *-transform of this conclusion, it follows that R̲(f) −
∗L(f, dx) < r since 0 < dx < w1; a contradiction. Hence, for each dx, R̲(f) − ∗L(f, dx) ∈ µ(0)+. In
like manner, it follows that ∗U(f, dx) − R̄(f) ∈ µ(0)+ and the proof is complete.

Theorem 13.17. If R∫_a^b f dx ∈ IR, then R∫_a^b f dx = ∫_a^b f dx, for each dx.

Proof. Let R∫_a^b f dx ∈ IR. Consider arbitrary dx. Then R̲(f) = R̄(f) = R∫_a^b f dx and, from
Theorem 13.16, R∫_a^b f dx − ∗L(f, dx) ∈ µ(0), ∗U(f, dx) − R∫_a^b f dx ∈ µ(0) yield the result.

So, all we need in order to study Riemann integration are the simple partitions and the simple
integral that’s independent from the actual dx that’s used. This is a significant simplification.
Further, if one wants to generalize the Riemann integral to other functions defined
on [a, b], it’s clear that either the partitions must be of a different type than the simple partition or the
integral must be dependent upon the dx used. The major generalization is called the generalized
integral and, under a definition similar to (B), such a generalized integral is equivalent to the
Lebesgue integral. Anyone who studies the Lebesgue integral from the viewpoint of measure theory
knows the subject can be difficult. It’s a remarkable fact that the Lebesgue integral can be viewed as
a Riemann-styled integral under definition (B) for specially selected partitions and specially selected
values for the function. From the nonstandard viewpoint, one major difference is that the Lebesgue
integral is not infinitesimal independent. The infinitesimals needed are generated by objects called
L-microgauges (Herrmann, 1993, p. 217). But, all of this is well beyond the material in this book.

It’s a useful fact that any dx can be used to obtain the integral. This allows many standard
results to be established easily. Recall that by definition ∫_a^a f dx = 0, ∫_b^a f dx = −∫_a^b f dx.

Theorem 13.18. (i) Suppose that ∫_a^b f dx ∈ IR and c ∈ [a, b]. Then ∫_a^c f dx, ∫_c^b f dx ∈ IR and
∫_a^b f dx = ∫_a^c f dx + ∫_c^b f dx.

(ii) If c ∈ [a, b] and ∫_a^c f dx, ∫_c^b f dx ∈ IR, then ∫_a^b f dx ∈ IR and ∫_a^b f dx = ∫_a^c f dx + ∫_c^b f dx.

Proof. (i) Clearly, the result holds for c = a, c = b. So, let c ∈ (a, b) and ∆x = (c−a)/n, n ∈ IN′.

Then all the points in the simple partition created by ∆x for [a, c] are points in the ∆x generated

simple partitions for [c, b] and [a, b]. Hence, it follows that

L(f,∆x, [a, b]) = L(f,∆x, [a, c]) + L(f,∆x, [c, b]) ≤

U(f,∆x, [a, c]) + U(f,∆x, [c, b]) = U(f,∆x, [a, b]).

Let Λ ∈ IN∞ and dx = (c − a)/Λ. By *-transform and the standard part operator and the fact that
∫_a^b f dx ∈ IR, we have that

st(∗L(f, dx, ∗[a, b])) = st(∗L(f, dx, ∗[a, c])) + st(∗L(f, dx, ∗[c, b])) =
st(∗U(f, dx, ∗[a, c])) + st(∗U(f, dx, ∗[c, b])) = st(∗U(f, dx, ∗[a, b])).

But, st(∗U(f, dx, ∗[a, c])) − st(∗L(f, dx, ∗[a, c])) ≥ 0, st(∗U(f, dx, ∗[c, b])) − st(∗L(f, dx, ∗[c, b])) ≥ 0
imply that st(∗U(f, dx, ∗[a, c])) = st(∗L(f, dx, ∗[a, c])) and st(∗U(f, dx, ∗[c, b])) =
st(∗L(f, dx, ∗[c, b])) and the result follows.

(ii) This follows by considering the same type of simple partition as in (i) and applying the

standard part operator and the proof is complete.

Now let’s consider a few of the most significant properties of the integral.

Theorem 13.19. Let ∫_a^b f dx ∈ IR. For each y ∈ [a, b], let F(y) = ∫_a^y f dx. Then F is uniformly
continuous on [a, b]. If f is continuous at c ∈ [a, b], then F′(c) = f(c).

Proof. From Theorem 13.18, F(y) ∈ IR. Thus, F is a function from [a, b] into IR. Since f is
bounded and again by applying Theorem 13.18, it follows that for any x, y ∈ [a, b], |F(y) − F(x)| ≤
M|y − x|, where |f(z)| ≤ M, ∀ z ∈ [a, b]. Hence, by *-transform, if p, q ∈ ∗[a, b] and p − q ∈ µ(0),
then |∗F(p) − ∗F(q)| ≤ M|p − q| ∈ µ(0) implies by Theorem 10.5 that F is uniformly continuous
on [a, b].

Assume that f is continuous at c ∈ [a, b]. Now considering the integral as the function F(y) =
∫_a^y f dx, y ∈ [a, b], our previous integral properties can be translated into properties about F, where
for z, y ∈ [a, b], ∫_z^y f dx = F(y) − F(z). By *-transform, these ∗F function properties are relative to
z, y ∈ ∗[a, b]. Let p ∈ µ(c) such that p ∈ ∗[a, b]. First, assume that p < c. From the continuity of
f, |∗f(p) − f(c)| = γ ∈ µ(0). Let g(x) = f(x) − f(c) and G(y) = ∫_a^y g dx. Then from the hyper-properties for ∗G(x), we
have that G(c) − ∗G(p) = F(c) − ∗F(p) − f(c)(c − p). Moreover, |F(c) − ∗F(p) − f(c)(c − p)| ≤ γ(c − p).
Consequently,

$$\left|\frac{F(c) - {}^*F(p)}{c - p} - f(c)\right| \in \mu(0).$$

In like manner, for the case that p > c, and the result that F′(c) = f(c) follows.
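A floating-point experiment shows the same behavior with a small real increment in place of the infinitesimal c − p. The sketch below is only an approximation under assumptions of my own: the integrals are computed by a midpoint rule (a swapped-in standard technique, not the simple integral of this chapter), and the function names, the integrand cos, and the point c = 1 are chosen just for the illustration.

```python
import math


def integral(f, a, y, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, y]."""
    h = (y - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def F_quotient(f, a, c, h):
    """Difference quotient (F(c + h) - F(c)) / h, where F(y) is the integral of f over [a, y]."""
    return (integral(f, a, c + h) - integral(f, a, c)) / h


a, c = 0.0, 1.0
print(F_quotient(math.cos, a, c, 1e-3), math.cos(c))
```

The quotient differs from f(c) by roughly the size of the increment, and the difference shrinks as the increment does.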

Corollary 13.20. If ∫_a^b f dx ∈ IR, p, q ∈ ∗[a, b] and p − q ∈ µ(0), then ∗F(p) − ∗F(q) =
∗∫_q^p ∗f dx ∈ µ(0).

It’s beyond the scope of this book to establish a necessary and sufficient condition for ∫_a^b f dx to exist. The
facts are that there are some very unusual functions that are integrable. For example, consider the
non-negative rational numbers (in lowest form) p/q, q > 0. Define on [0, 1] the function f(p/q) = 0
and, for each irrational r, f(r) = r. Then ∫_0^1 f dx exists. I leave it to the reader to find the exact
value. It is rather easy, however, to use our methods to show that the value of the integral is
independent from the value of the bounded function at finitely many points in [a, b].

Theorem 13.21. Let f and g be bounded on [a, b] and suppose there exists a non-empty finite set
of numbers D = {p_0, . . . , p_n} ⊂ [a, b] such that f and g only differ on D. If ∫_a^b f dx ∈ IR, then
∫_a^b f dx = ∫_a^b g dx.

Proof. Consider any dx. Without loss of generality, we may assume that, for [c, d] ⊂ [a, b], c ≠ d,
f and g differ at most at the end points c, d. Then ∗L(f, dx) = m_0 dx + Σ_{i=1}^{Λ−2} m_i dx +
m_{Λ−1} dx + m_Λ(b − x_Λ). Hence, st(∗L(f, dx)) = st(Σ_{i=1}^{Λ−2} m_i dx) = st(∗L(g, dx)). In like manner,
st(∗U(f, dx)) = st(∗U(g, dx)) and the result follows.
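The standard counterpart of this argument is that altering a function at one point changes a lower sum by an amount proportional to ∆x. The sketch below makes this concrete on [0, 1] for an increasing continuous g, where the infimum on each subinterval is known exactly; the function name `lower_sum_modified`, the choice g(x) = x, and the altered value −5 at x = 1/2 are assumptions made for the illustration.

```python
from fractions import Fraction


def lower_sum_modified(g, changes, n):
    """Lower sum on [0, 1] with dx = 1/n for a function equal to g (increasing, continuous)
    except at the points in `changes`, a dict mapping a point to its new value."""
    dx = Fraction(1, n)
    total = Fraction(0)
    for i in range(n):
        left, right = i * dx, (i + 1) * dx
        m = g(left)                          # infimum of g on [left, right]
        for point, value in changes.items():
            if left <= point <= right:
                m = min(m, value)            # a lowered value becomes the new infimum
        total += m * dx
    return total


def identity(x):
    return x


changes = {Fraction(1, 2): Fraction(-5)}
for n in (10, 1000):
    gap = lower_sum_modified(identity, {}, n) - lower_sum_modified(identity, changes, n)
    print(n, gap)
```

The gap is 109/100 for n = 10 and 10999/1000000 for n = 1000, so it is bounded by a constant times ∆x, and for an infinitesimal dx it would be infinitesimal.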

14. WHAT DOES THE INTEGRAL MEASURE?

In this chapter, I present a slightly advanced look at the type of physical properties that the

integral will measure. Also I continue to assume that f is bounded on [a, b].

Definition 14.1. (Additive Function.) A function B: [a, b] × [a, b] → IR is additive if for each
simple partition P(∆x) = {a = x_0 < · · · < x_n ≤ x_{n+1} = b}

B(x_i, x_{i+1}) = B(x_i, x) + B(x, x_{i+1}), x_i ≤ x ≤ x_{i+1}, i = 0, . . . , n.

(This is not the only definition in the literature for this type of additive function.)

I note that in general B(x_i, x_i) = 0. Of course, it’s immediate that if ∫_a^b f dx ∈ IR, then
B(x, y) = F(y) − F(x) = ∫_x^y f dx is additive on [a, b]. But, does the converse hold? If you are given
a specific additive function B, then ∗B has meaning for any fine partition generated by dx since the
definition of B is relative to the partition points and subdivision closed intervals.

Definition 14.2. (Admissible for f.) A function B that is additive on [a, b] is admissible
for f : [a, b] → IR iff there exists some dx and for the fine partition P(dx) = {a = x_0 < · · · < x_Λ ≤ x_{Λ+1} = b}
it generates, for each i = 0, . . . , Λ − 1, where b − x_Λ = 0, there exists some p_i ∈ [x_i, x_{i+1}]
such that

$$\frac{{}^*B(x_i, x_{i+1})}{dx} - {}^*f(p_i) \in \mu(0), \qquad \text{(IC)}$$

and if b − x_Λ ≠ 0, then there also exists some p_Λ ∈ [x_Λ, b] such that

$$\frac{{}^*B(x_\Lambda, b)}{b - x_\Lambda} - {}^*f(p_\Lambda) \in \mu(0). \qquad \text{(IC)}$$

Theorem 14.3. Let B be admissible for f : [a, b] → IR. Then for each r ∈ IR+, there exist dx, dy
such that

−r(b − a) + ∗L(f, dy) < B(a, b) < ∗U(f, dx) + r(b − a).

Proof. Let r ∈ IR+. Assume that for each dx,

$$B(a,b) \ge {}^*U(f + r, dx) = {}^*U(f, dx) + r(b-a). \tag{14.4}$$

I make the following observation about B where B(a, b) ≥ U(f + r, ∆x). Let n > 1, and B(a, b) =
(Σ_{i=0}^{n−1} B(x_i, x_i + ∆x)) + B(x_n, b). Then there exists some k ∈ [0, n − 1] such that B(x_k, x_k + ∆x) ≥
M_k ∆x and if b − x_n ≠ 0 (or n = 1), then B(x_n, b) ≥ M_n(b − x_n), where M_i = sup{f(x) + r |
x ∈ [x_i, x_i + ∆x]}, which exists by boundedness. Thus, by *-transform and assuming that (14.4)
holds, we have that there exists k ∈ [0, Λ − 1], ∗B(x_k, x_{k+1}) ≥ M_k dx and if b − x_Λ ≠ 0, then
∗B(x_Λ, b) ≥ M_Λ(b − x_Λ), where M_i = sup{f(x) + r | x ∈ [x_i, x_i + dx]}, which also all exist from
boundedness and I need not consider the ∗sup since by definition the ∗sup = sup. Consequently,
for each p ∈ [x_k, x_{k+1}], ∗B(x_k, x_{k+1}) ≥ (∗f(p) + r)dx, k ∈ [0, Λ − 1], and if p ∈ [x_Λ, b], then
∗B(x_Λ, b) ≥ (∗f(p) + r)(b − x_Λ), where b − x_Λ ≥ 0. This implies that for each dx, k ∈ [0, Λ − 1],

$$\frac{{}^*B(x_k, x_{k+1})}{dx} - {}^*f(p) \ge r, \quad \forall\, p \in [x_k, x_{k+1}],$$

and if b − x_Λ ≠ 0, then

$$\frac{{}^*B(x_\Lambda, b)}{b - x_\Lambda} - {}^*f(p) \ge r, \quad \forall\, p \in [x_\Lambda, b].$$

This contradicts admissibility. Thus, there exists some dx such that B(a, b) < ∗U(f + r, dx). In like
manner, there exists some dy such that −r(b − a) + ∗L(f, dy) < B(a, b) and this completes the proof.

Theorem 14.5. If B is admissible for integrable f, then B(a, b) = ∫_a^b f dx.

Proof. Let r ∈ IR+. Then, from Theorem 14.3, there exist dx, dy such that −r(b − a) +
∗L(f, dy) < B(a, b) < ∗U(f, dx) + r(b − a). The result follows by taking the standard part operator
and the fact that r is arbitrary.

Thus, the integral can be used to calculate the values of an admissible additive function. However,
the converse of Theorem 14.5 does not hold. Indeed, due to the fact there are many unusual
integrable functions, the converse does not hold where you define the additive function by the integral
itself. In the rather simple example below, it’s shown that there are additive functions, indeed
integrals, where (IC) holds but not for all dx.

Example 14.6. Define the integrable function f(x) = 0, ∀x ∈ [0, 1), f(x) = 1, ∀x ∈ [1, 2]. Define $B(x, y) = \int_x^y f\,dx$ for all x, y ∈ [0, 2]. Then B(x, y) is additive on [0, 2]. Let Λ ∈ IN∞ be *-even, and let dx = 2/Λ. There exists some k ∈ [0, Λ − 1] such that for each p ∈ ∗[0, 2], if p < xk, then ∗f(p) = 0, and if p ≥ xk, then ∗f(p) = 1. We also know that for each k ∈ [0, Λ − 1], mk dx ≤ ∗B(xk, xk+1) ≤ Mk dx, where mk = inf{∗f(x) | x ∈ [xk, xk+1]}, Mk = sup{∗f(x) | x ∈ [xk, xk+1]}. Consequently, we have that for j ∈ [0, Λ − 1], j < k,

$$\frac{{}^*B(x_j, x_{j+1})}{dx} = 0 = {}^*f(p), \quad \forall\, p \in [x_j, x_{j+1}],$$

and for each j ≥ k,

$$\frac{{}^*B(x_j, x_{j+1})}{dx} = 1 = {}^*f(p), \quad \forall\, p \in [x_j, x_{j+1}].$$

Thus, B is admissible. Now let dy = 2/Γ, where Γ is a *-odd number. Then again we have that $B(x, y) = \int_x^y f\,dx = \int_x^y f\,dy$. However, there exists i ∈ [0, Γ − 1] such that 1 is the midpoint of [xi, xi+1] and ∗B(xi, xi+1) = dy/2. From this it follows that the (IC) does not hold for this dy.
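A finite analogue may make the even/odd contrast in Example 14.6 concrete. The sketch below is only an illustration under stated assumptions: it replaces the infinite numbers Λ and Γ by ordinary natural numbers n, and the names `f`, `B`, and `cell_ratios` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def f(x):
    """Step function of Example 14.6: 0 on [0, 1), 1 on [1, 2]."""
    return 0 if x < 1 else 1

def B(x, y):
    """Integral of f over [x, y] for 0 <= x <= y <= 2.

    Because f is 1 exactly on [1, 2], the integral is the length of
    the overlap of [x, y] with [1, 2].
    """
    lower = max(x, Fraction(1))
    return max(y - lower, Fraction(0))

def cell_ratios(n):
    """Return B(x_k, x_{k+1}) / dx for the n equal cells of [0, 2]."""
    dx = Fraction(2, n)
    return [B(k * dx, (k + 1) * dx) / dx for k in range(n)]

if __name__ == "__main__":
    print("even n = 4:", cell_ratios(4))
    print("odd  n = 5:", cell_ratios(5))
```

For even n every ratio is 0 or 1, a value that f takes. For odd n the cell whose midpoint is 1 has ratio 1/2, which f never takes; this is the finite shadow of the failure of (IC) for dy = 2/Γ.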

The point x = 1 in the above example is a point of discontinuity for f. If you alter the definition of admissibility to require that the (IC) hold for all dx and for all pi ∈ [xi, xi+1], you get a notion I called supernearness. I show in Herrmann (1993) that an additive function B is supernear to f iff f is continuous. And, of course, B is then equal to the integral. Let’s complete this chapter by considering an additional property for an additive function, a property that models various geometric and physical notions.

Definition 14.7. (Rectangular Property) An additive function A: [a, b] × [a, b] → IR has the rectangular property for f iff for any c, d ∈ [a, b], c ≤ d, m(d − c) ≤ A(c, d) ≤ M(d − c), where, as usual, m = inf{f(x) | x ∈ [c, d]}, M = sup{f(x) | x ∈ [c, d]}.
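As a concrete check of Definition 14.7, the following sketch tests the rectangular property and additivity on a finite grid. The choices here are assumptions made for illustration only: f(x) = x² on [0, 2], A(c, d) = (d³ − c³)/3 (its integral), and the helper names. Since this f is increasing on [0, 2], m = f(c) and M = f(d) exactly, so no numerical sup or inf is needed.

```python
from fractions import Fraction

def f(x):
    return x * x

def A(c, d):
    """Integral of x^2 over [c, d], from the antiderivative x^3 / 3."""
    return (d**3 - c**3) / 3

def has_rectangular_property(points):
    """Check m(d - c) <= A(c, d) <= M(d - c) for all sample pairs c <= d.

    f is increasing on [0, 2], so m = f(c) and M = f(d).
    """
    return all(
        f(c) * (d - c) <= A(c, d) <= f(d) * (d - c)
        for c in points for d in points if c <= d
    )

def is_additive(points):
    """Check A(x, y) + A(y, z) == A(x, z) for all sample x <= y <= z."""
    return all(
        A(x, y) + A(y, z) == A(x, z)
        for x in points for y in points for z in points if x <= y <= z
    )

GRID = [Fraction(k, 10) for k in range(21)]  # sample points in [0, 2]

if __name__ == "__main__":
    print(has_rectangular_property(GRID), is_additive(GRID))
```

Exact rational arithmetic is used so that the equalities in the additivity check are not disturbed by rounding.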

What does the addition of the rectangular property do for us? By *-transform, consider dx. Then for each k ∈ [0, Λ − 1], mk dx ≤ ∗A(xk, xk+1) ≤ Mk dx, and if xΛ+1 = b, b − xΛ ≠ 0, then mΛ(b − xΛ) ≤ ∗A(xΛ, b) ≤ MΛ(b − xΛ), where the mi, Mi are defined in the usual way. Consequently, in general, for such functions, for each k ∈ [0, Λ − 1] there is some pk ∈ [xk, xk+1] such that

$$\left| \frac{{}^*A(x_k, x_{k+1})}{dx} - {}^*f(p_k) \right| \le M_k - m_k$$


and if b − xΛ ≠ 0, then there exists some pΛ ∈ [xΛ, b] such that

$$\left| \frac{{}^*A(x_\Lambda, b)}{b - x_\Lambda} - {}^*f(p_\Lambda) \right| \le M_\Lambda - m_\Lambda.$$

This discussion leads to the following theorem.

Theorem 14.8. Assume that A is additive and has the rectangular property for f and that there exists some dx such that for the simple partition P(dx), whenever k ∈ [0, Λ − 1], then Mk − mk ∈ µ(0), where mk = inf{f(x) | x ∈ [xk, xk+1]}, Mk = sup{f(x) | x ∈ [xk, xk+1]}. Further, if b − xΛ ≠ 0, then MΛ − mΛ ∈ µ(0), mΛ = inf{f(x) | x ∈ [xΛ, b]}, MΛ = sup{f(x) | x ∈ [xΛ, b]}. Then A is admissible for f.

Why is the integral usually applied to functions that are piecewise continuous on [c, d]? Well, first of all, since the value of the integral is independent of the values of the function at the end points of the intervals of definition, all that is needed is to consider why for a specific closed interval [a, b]. The next theorem shows why the result in Example 14.6 occurs.

Theorem 14.9. A function f is continuous on [a, b] iff for each dx and, hence, each fine partition P(dx), whenever k ∈ [0, Λ − 1], it follows that Mk − mk ∈ µ(0) and, if b − xΛ ≠ 0, then MΛ − mΛ ∈ µ(0).

Proof. Let f be continuous on [a, b]. Then f is uniformly continuous. So, let’s consider any dx and |p − q| ≤ dx, p, q ∈ ∗[a, b]. Then p − q ∈ µ(0) implies that ∗f(p) − ∗f(q) ∈ µ(0). Consider the simple partition P(dx). Since, for each k ∈ [0, Λ − 1], there exist p, q ∈ [xk, xk+1] such that Mk = ∗f(p), mk = ∗f(q), and likewise for the case that b − xΛ ≠ 0, the necessity follows.

For the sufficiency, let p − q ∈ µ(0), p, q ∈ ∗[a, b]. Then there exists dx such that |p − q| ≤ dx. Consider a P(dx) fine partition. First, assume that for some k ∈ [0, Λ − 1], p, q ∈ [xk, xk+1] or that p, q ∈ [xΛ, b]. Then since Mk − mk ∈ µ(0), ∗f(p) − ∗f(q) ∈ µ(0). If there does not exist some k ∈ [0, Λ] such that p, q ∈ [xk, xk+1], then p, q are in adjacent intervals by *-transform of the standard case. So consider 2dx = dy and apply the first case argument to show that ∗f(p) − ∗f(q) ∈ µ(0). Thus, f is (uniformly) continuous on [a, b] and the proof is complete.
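The oscillation condition Mk − mk ∈ µ(0) has a simple finite shadow: for a continuous function the largest oscillation over the cells of an equal partition shrinks as the cells shrink, while for the step function of Example 14.6 it stays at 1. The sketch below is an illustration only; `max_oscillation`, `square`, and `step` are hypothetical names, and the end-point shortcut is valid only because both sample functions are nondecreasing.

```python
from fractions import Fraction

def max_oscillation(f, a, b, n):
    """Largest f(x_{k+1}) - f(x_k) over the n equal cells of [a, b].

    For a nondecreasing f this equals max_k (M_k - m_k), because the
    sup and inf on each cell are attained at its end points.
    """
    dx = Fraction(b - a, n)
    xs = [a + k * dx for k in range(n + 1)]
    return max(f(xs[k + 1]) - f(xs[k]) for k in range(n))

def square(x):
    """Continuous and increasing on [0, 2]."""
    return x * x

def step(x):
    """Discontinuous at x = 1, as in Example 14.6."""
    return 0 if x < 1 else 1

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_oscillation(square, 0, 2, n), max_oscillation(step, 0, 2, n))
```

For `square` the value is 8/n − 4/n², which tends to 0; for `step` it is 1 for every n, even or odd.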

Corollary 14.10. Let A be additive on [a, b] and have the rectangular property for f. If f is continuous on [a, b], then $A(x, y) = \int_x^y f\,dx$, x ≤ y, x, y ∈ [a, b], and the function A is unique.

Thus, if you start with a geometric or physical property that is measured by an additive function A with the rectangular property for a continuous function f, then A is uniquely modeled by the integral. On the other hand, although the function f need not be continuous, if $\int_a^b f\,dx$ ∈ IR, then for x ≤ y, x, y ∈ [a, b] the function $A(x, y) = \int_x^y f\,dx$ is additive and has the rectangular property.


15. GENERALIZATIONS

Much of what I’ve covered can be highly generalized. It should be obvious that this nonstandard approach, although very restricted in the language used, does not depend upon the codomain of the set of sequences used to obtain the equivalence classes with respect to a free ultrafilter. Hence, the set IR and all the additional ones used in M can be replaced with the set X ∪ IR, where X is non-empty. The *-transform method holds and many of the general results, such as Theorem 3.2, follow since they are all obtained from the properties of the ultrafilter. It would be better to have a stronger language and a structure where we can use ∈ over variables. But, even in our restricted language, much can be done. I give just a few brief examples.

Definition 15.1. (Real Metric Space.) A nonempty set X is called a metric space iff

there exists a function d:X ×X → IR such that for each x, y, z ∈ X,

(i) d(x, y) = d(y, x) ≥ 0,

(ii) d(x, y) = 0 iff x = y,

(iii) d(x, y) ≤ d(x, z) + d(z, y).
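The three axioms of Definition 15.1 can be tested mechanically on a finite sample of points. The sketch below is illustrative only: passing on a sample does not prove that d is a metric, although a single failure does refute it. The names `is_metric_on`, `absolute`, and `squared` are hypothetical.

```python
from itertools import product

def is_metric_on(points, d):
    """Check axioms (i)-(iii) of Definition 15.1 on a finite sample."""
    points = list(points)
    for x, y in product(points, repeat=2):
        if d(x, y) != d(y, x) or d(x, y) < 0:   # (i) symmetry, non-negativity
            return False
        if (d(x, y) == 0) != (x == y):          # (ii) d(x, y) = 0 iff x = y
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y):         # (iii) triangle inequality
            return False
    return True

def absolute(x, y):
    """The usual metric on the real numbers."""
    return abs(x - y)

def squared(x, y):
    """Not a metric: the triangle inequality fails, e.g. for 0, 1, 2."""
    return (x - y) ** 2

if __name__ == "__main__":
    print(is_metric_on(range(-3, 4), absolute))  # sample of integers
    print(is_metric_on([0, 1, 2], squared))
```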

Definition 15.2. (General Finite Points and Monads.) Any q ∈ ∗X is finite iff there is some p ∈ X such that ∗d(q, p) ∈ G(0). For each p ∈ X, the monad of p is µ(p) = {x | (x ∈ ∗X) ∧ (∗d(x, p) ∈ µ(0))} = {x | (x ∈ ∗X) ∧ ∀r((r ≠ 0) ∧ (r ∈ IR) → ∗d(x, p) < |r|)}. The set ns(∗X) = ⋃{µ(p) | p ∈ X}.

In this more general case, what was previously the set G(0) is now denoted by fin( ∗X), the set

of all finite points in ∗X. And, as before, ns( ∗X) ⊂ fin( ∗X). These sets are equal in the case that

X = IR, but for metric spaces in general they are not equal.

A closed sphere about p ∈ X is the set S[p, r] = {x | d(x, p) ≤ r}. A set B ⊂ X, for the metric space (X, d), is bounded iff there is some closed sphere S[p, r] such that B ⊂ S[p, r]. Now I assume that the theorem on the *-transform has been established for our structure.

Theorem 15.3. For the metric space (X, d), B ⊂ X is bounded iff ∗B ⊂ fin( ∗X).

Proof. For the necessity, the sentence ∀x((x ∈ S[p, r]) → d(x, p) ≤ r) holds in M; and, hence, in ∗M. Thus, by *-transform, ∀x((x ∈ ∗(S[p, r])) → ∗d(x, p) ≤ r). Consequently, ∗B ⊂ ∗(S[p, r]) ⊂ fin(∗X).

For the sufficiency, assume that B ⊂ X is not bounded. Let p ∈ X. Then the sentence ∀x((x ∈ IR+) → ∃y((y ∈ B) ∧ (d(p, y) > x))) holds in M. Thus, by *-transform, letting Λ ∈ IR^+_∞, there exists some q ∈ ∗B such that ∗d(q, p) > Λ. Now let p′ ∈ X. Then ∗d(q, p′) cannot be a finite *-real number. For if we assume that ∗d(q, p′) ∈ G(0), then since d(p, p′) ∈ G(0) we would have that ∗d(p, q) ≤ d(p, p′) + ∗d(p′, q) ∈ G(0); a contradiction. Hence, q ∉ ∗(S[p′, r]) for any r ∈ IR+ and any p′ ∈ X. This completes the proof.

Corollary 15.4. A sequence S: IN → X is bounded iff ∗S(Λ) ∈ fin( ∗X) for each Λ ∈ IN∞.

The following results are obtained immediately in the same manner as the corresponding real

number results.

Theorem 15.5. For a metric space (X, d), a sequence S: IN → X converges to L iff ∗S(Λ) ∈ µ(L) for each Λ ∈ IN∞.

Theorem 15.6. For a metric space, every convergent sequence is bounded.


Theorem 15.7. A point p ∈ X is an accumulation point for a sequence S: IN → X iff there

exists some Λ ∈ IN∞ such that ∗S(Λ) ∈ µ(p).

Corollary 15.8. A sequence S: IN → X has a convergent subsequence iff there exists some

Λ ∈ IN∞ such that ∗S(Λ) ∈ ns( ∗X).

Theorem 15.9. A sequence S: IN → X is Cauchy iff ∗d(∗S(Λ), ∗S(Ω)) ∈ µ(0) for each Λ, Ω ∈ IN∞.

It’s possible for a metric space, including the real numbers, to define monads at points q ∈ ∗X − X by letting µ(q) = {x | ∗d(x, q) ∈ µ(0)}. Then it follows that S: IN → X is Cauchy iff there exists some q ∈ ∗X such that ∗S(Λ) ∈ µ(q), ∀Λ ∈ IN∞. One of the most significant metric spaces is the normed linear (vector) space. I consider a linear space over the real numbers for my example.

If V is a linear space over the real numbers, then a norm is a map ‖ · ‖: V → IR with the properties that, for each x, y ∈ V, (i) ‖x‖ ≥ 0. (ii) For each r ∈ IR, ‖rx‖ = |r| ‖x‖. (iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖. The metric is defined by letting d(x, y) = ‖x − y‖. You can now apply nonstandard analysis to this space along with its additional linear space properties. For example, we have that µ(p) = {p + γ | γ ∈ µ(0)} for such a metric space, in general. Nonstandard analysis has been applied extensively to linear spaces. For the major generalization known as the topological spaces, where I have established some of the original results, we need a structure more directly related to set-theory, and such an appropriate structure is not what I would consider as elementary in character.
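The monad identity for a normed linear space quoted above can be unwound in one short chain. This is a sketch under two assumptions that the text does not spell out: the vector operations extend to ∗V by *-transform, and µ(0) in ∗V denotes the monad of the zero vector, that is, the set of γ ∈ ∗V with ∗‖γ‖ infinitesimal. Note that µ(0) on the first two lines is the set of real infinitesimals, while on the last line it is the monad of the zero vector.

```latex
\begin{align*}
x \in \mu(p)
  &\iff {}^*d(x,p) = {}^*\|x - p\| \in \mu(0) \subset {}^*\mathbb{R} \\
  &\iff \gamma := x - p \ \text{satisfies}\ {}^*\|\gamma\| \in \mu(0) \subset {}^*\mathbb{R} \\
  &\iff x = p + \gamma \ \text{for some}\ \gamma \in \mu(0) \subset {}^*V .
\end{align*}
```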


APPENDIX

Theorem A1. Let F be any filter on X. Then there exists an ultrafilter UX ⊃ F .

Proof. You can either use the set-theoretic axiom that states that this statement holds; or use Zorn’s Lemma, which is equivalent to the Axiom of Choice. Let G be the set of all filters that contain F. Suppose that C is a chain with respect to ⊂ in G. I show that ⋃C is a filter that obviously would be an upper bound for this chain and is contained in G. Clearly, ∅ ∉ ⋃C. Let A ∈ ⋃C. Then A ∈ F1 for some F1 ∈ C. Hence, if A ⊂ B, then B ∈ F1 implies that B ∈ ⋃C. Now let A, B ∈ ⋃C. Since C is a chain with respect to ⊂, there is some F2 ∈ C such that A, B ∈ F2. Hence, A ∩ B ∈ F2 implies that ⋃C is a filter that contains F. By Zorn’s Lemma, there is a member of G that is a maximal member U with respect to ⊂. If U is not an ultrafilter, then there would be a filter U1 not equal to U such that U ⊂ U1. But, then U1 ∈ G; a contradiction of the (⊂) maximal property for U. This completes this proof.
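On a finite set no choice principle is needed, and Theorem A1 can be verified by brute force. The sketch below enumerates every filter on a three-element set and checks that each one extends to a maximal filter. It assumes the usual definition of a filter (a nonempty family of subsets that excludes ∅ and is closed under supersets and finite intersections), which I take to match Chapter 1; all names are hypothetical. It illustrates the statement only, not the Zorn's Lemma argument, and on a finite set every ultrafilter is principal, so nothing here produces a free ultrafilter.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

def powerset(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

SUBSETS = powerset(X)  # the 8 subsets of X

def is_filter(fam):
    """Nonempty, excludes the empty set, closed under supersets and meets."""
    if not fam or frozenset() in fam:
        return False
    for a in fam:
        if any(a <= b and b not in fam for b in SUBSETS):
            return False
        if any((a & b) not in fam for b in fam):
            return False
    return True

def all_filters():
    """Every family of subsets of X that is a filter (255 candidates)."""
    found = []
    for r in range(1, len(SUBSETS) + 1):
        for c in combinations(SUBSETS, r):
            fam = frozenset(c)
            if is_filter(fam):
                found.append(fam)
    return found

FILTERS = all_filters()
# An ultrafilter is a filter not properly contained in another filter.
ULTRAFILTERS = [u for u in FILTERS if not any(u < g for g in FILTERS)]

def extends_to_ultrafilter(fam):
    return any(fam <= u for u in ULTRAFILTERS)

def principal(x):
    """The principal ultrafilter of all subsets containing x."""
    return frozenset(a for a in SUBSETS if x in a)

if __name__ == "__main__":
    print(len(FILTERS), len(ULTRAFILTERS))
    print(all(extends_to_ultrafilter(fam) for fam in FILTERS))
```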

I assume that the reader knows what I mean by a first-order language L with equality, where the constants represent objects in IR, IR², . . ., and P(IR), P(IR²), . . . . Equality is interpreted to be the identity on IR or set-theoretic equality elsewhere. The variables are denoted by Roman font. The first class of atomic formulas consists of x ∈ Y and, for n > 1, (x1, . . . , xn) ∈ Y, where Y is a constant, and all possible permutations of members of the n-tuples, where xk is either a constant or variable. I leave to the reader the trivial cases where the various expressions only contain constants and assume that, for all other formulas, at least one of the symbols that can differ from a constant is a variable. (The Y includes the +, ·, ≤. Our result below holds for many other collections of atomic formulas that describe members of more comprehensive structures, but I don’t use them for this monograph.) Finally, a = b, where a, b are both variables or, at most, one is a constant. (Note: the symbol = is interpreted as a special binary relation within our language.)

Only a special set K of formulas built from these atomic formulas is used. Further, P ∈ K if and only if P has only bounded quantifiers. That is, each quantifier contained in P is restricted to subsets of IR or IRⁿ. Indeed, in this book, all P with bounded quantifiers are equivalent to a form ∀x((x ∈ X) → · · ·) or the form ∃x((x ∈ X) ∧ · · ·). The reason I’m using these special forms is that the *-transform (Leibniz) property for such formulas can be established without the Axiom of Choice. Given any P, then ∗P is obtained by replacing every constant A in P by ∗A. (Note: The +, ·, ≤ are constants that technically should carry the ∗ notation, but it’s customary to drop this notation when the context is known.) In what follows, (a) = a and x ∈ IR is considered in two contexts, either as a constant for a member of IR or as varying over a subset of IR, as the case may be. Let K be our set of formulas and

M = 〈IR, . . . , IRⁿ, . . . , P(IR), . . . , P(IRⁿ), . . . , +, ·, ≤〉

∗M = 〈∗IR, . . . , ∗IRⁿ, . . . , P(∗IR), . . . , P(∗IRⁿ), . . . , +, ·, ≤〉.
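The purely syntactic rule that ∗P is obtained from P by replacing each constant A with ∗A can be mimicked on a toy formula tree. The sketch below is an illustration only: the tuple encoding, the tags `"const"`, `"var"`, `"in"`, `"forall"`, and the functions `star` and `show` are all hypothetical, and it handles just the bounded-quantifier form ∀x((x ∈ X) → · · ·). An ASCII `*` stands in for the ∗ prefix.

```python
def star(node):
    """Return a copy of the formula tree with every constant A renamed *A."""
    if isinstance(node, tuple):
        if len(node) == 2 and node[0] == "const":
            return ("const", "*" + node[1])
        return tuple(star(child) for child in node)
    return node  # tags such as "forall" and variable names are left alone

def show(node):
    """Render the small formula trees used here as a string."""
    tag = node[0]
    if tag in ("const", "var"):
        return node[1]
    if tag == "in":
        return f"({show(node[1])} ∈ {show(node[2])})"
    if tag == "forall":  # bounded quantifier: ∀x((x ∈ X) → body)
        x, domain, body = node[1], node[2], node[3]
        return f"∀{show(x)}(({show(x)} ∈ {show(domain)}) → {show(body)})"
    raise ValueError(f"unknown tag: {tag}")

# P says: every x in the constant B is also in the constant S.
P = ("forall", ("var", "x"), ("const", "B"),
     ("in", ("var", "x"), ("const", "S")))

if __name__ == "__main__":
    print(show(P))
    print(show(star(P)))
```

Only the constants change; the variable x and the logical shape of the formula are untouched, which is why bounded quantifiers over X become bounded quantifiers over ∗X.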

Theorem A2. Let P(x1, . . . , xp) ∈ K contain at least one variable and let X be a member of M. Define A = {(x1, . . . , xp) | ((x1, . . . , xp) ∈ X) ∧ (P holds in M)}. Then

∗A = {(x1, . . . , xp) | ((x1, . . . , xp) ∈ ∗X) ∧ (∗P holds in ∗M)}.

Proof. Let P = (x ∈ Y), where Y ⊂ IRⁿ. Then A = {x | (x ∈ X) ∧ (x ∈ Y)} = {x | x ∈ X} ∩ {x | x ∈ Y}. By Theorem 3.2 (vi)(xi), ∗A = ∗X ∩ ∗Y = {x | (x ∈ ∗X) ∧ (x ∈ ∗Y)}. Let P = (x = y) (or P = (x = x)). Let X ⊂ IR, X × X = Y and A = {(x, y) | ((x, y) ∈ Y) ∧ (x = y)}.


The result follows from ∗X × ∗X = ∗Y and if a, b ∈ X and a = b, then [A] = [B]. For the case that

X ⊂ IRⁿ, n > 1, we need either to identify X × X with the obvious 2n-ary relation or we need to

extend the structure to include such objects and extend the results in Theorem 3.2 to cover these

objects. This also depends upon your definition of the n-tuple. The case where x = a or a = x also

follows in like manner. Note that since function or term symbols are not used in the language, then

the = can be considered as used to generate specific relations that are elements of our structure.

For the atomic formula (x1, . . . , xp) ∈ Y, the result follows by application of Theorem 3.2 and the definition of ∗Y noting, of course, that (x1, . . . , xp) ∈ X.

Every first-order formula is equivalent to a formula that has all of its quantifiers to the left of a quantifier-free formula built only from atomic formulas and the connectives ∧, ∨, →, ↔, ¬. As for the connectives, it is well known that if we assume that the result holds for quantifier-free formulas V and W, then all we need to do to show, by induction, that the result holds in general for quantifier-free formulas is to show that it holds for V ∧ W and for ¬V. For all but the atomic formula (x1, . . . , xp) ∈ Y this is immediate by Theorem 3.2. Now let V be the expression (x1, . . . , xp) ∈ Y and ∗A = {(x1, . . . , xp) | ((x1, . . . , xp) ∈ ∗X) ∧ ((x1, . . . , xp) ∈ ∗Y)}. Then consider ∗B = {(x1, . . . , xp) | ((x1, . . . , xp) ∈ ∗X) ∧ ((x1, . . . , xp) ∉ ∗Y)}. The *-transfer holds since ∗B = ∗X − ∗A from Theorem 3.2.

Now let V = (x1, . . . , xp, y1, . . . , yq) ∈ Y, W = (x1, . . . , xp, z1, . . . , zr) ∈ Z, where I assume the possibility that both V and W contain x1, . . . , xp and the other constants or variables are distinct from these. Let (x1, . . . , xp, y1, . . . , yq, z1, . . . , zr) = (x1, . . . , zr). Let B = {(x1, . . . , zr) | ((x1, . . . , zr) ∈ X) ∧ ((x1, . . . , xp, y1, . . . , yq) ∈ Y)} and C = {(x1, . . . , zr) | ((x1, . . . , zr) ∈ X) ∧ ((x1, . . . , xp, z1, . . . , zr) ∈ Z)}. Then A = {(x1, . . . , zr) | ((x1, . . . , zr) ∈ X) ∧ (V ∧ W)} = B ∩ C. The result holds from the induction hypothesis, in this case, since ∗A = ∗B ∩ ∗C.

As mentioned, any first-order formula is equivalent to one which can be written as V = (qxn+1) · · · (qx1)W, where x1, . . . , xn+1 are free variables in W and W is a finite combination via ∧ and ¬ of all of our quantifier-free atomic formulas. Hence, represent this formula by W(y1, . . . , yp, x1, . . . , xn+1). We can always assume that (qxn+1)V = (∃xn+1)V (for if not, consider ¬V) and V is also in this special quantifier form. If n = 0, then the result has been established. Assume the result holds for an appropriate member of K with the number of quantifiers ≤ n. Under our requirements, xn+1 is restricted to a member Z of our structure. Let D = {((y1, . . . , yp), xn+1) | (((y1, . . . , yp), xn+1) ∈ X × Z) ∧ (qxn . . . qx1)W}, where X ⊂ IRᵖ is also in the standard structure. Then, by induction and Theorem 3.2, ∗D = {((y1, . . . , yp), xn+1) | (((y1, . . . , yp), xn+1) ∈ ∗X × ∗Z) ∧ (qxn . . . qx1) ∗W}. Let A = {(y1, . . . , yp) | ((y1, . . . , yp) ∈ X) ∧ (∃xn+1((xn+1 ∈ Z) ∧ (qxn . . . qx1)W))}. Using this and a simple modification of the proof of Theorem 3.2 (x), it follows that the domain of ∗D = ∗A, where ∗A = {(y1, . . . , yp) | ((y1, . . . , yp) ∈ ∗X) ∧ (∃xn+1((xn+1 ∈ ∗Z) ∧ (qxn . . . qx1) ∗W))} = {(y1, . . . , yp) | ((y1, . . . , yp) ∈ ∗X) ∧ ∗V}. By induction this completes the proof.

Theorem A3. Let V ∈ K be a sentence with necessary quantifiers or be composed only of connected atomic formulas expressed only in constants. Then V holds in M iff ∗V holds in ∗M.

Proof. Since V is a sentence it has no free variables. If V contains no quantifiers, then V only

contains constants and the result follows from the definition of the hyper-extension and Theorem

3.2.

Now assume that V contains quantifiers and that it is written in the equivalent form (prenex

normal form) V = (qxn · · · qx1)W and qxn = ∃xn. (If this is not the case, consider the negation.) To


say that V holds in M means that A = {xn | (xn ∈ X) ∧ ((qxn−1 · · · qx1)W holds in M)} ≠ ∅, where X is the domain for ∃xn. But A ≠ ∅ iff ∅ = ∗∅ ≠ ∗A = {xn | (xn ∈ ∗X) ∧ ((qxn−1 · · · qx1) ∗W holds in ∗M)}. This completes the proof.

REFERENCES

Bahrens, M. (1972), A local inverse function theorem, in Victoria Symposium on Nonstandard Analysis, Lecture Notes in Mathematics #369, Springer-Verlag, NY, pp. 34-36.

Burrill and Knudsen (1969), Real Analysis, Holt, Rinehart and Winston, NY.

Herrmann, R. A. (1991), Some Applications of Nonstandard Analysis to Undergraduate Mathematics - Infinitesimal Modeling, Elementary Physics, and Generalized Functions, Instructional Development Project, Math. Dept., U. S. Naval Academy, Annapolis, MD. http://arxiv.org/abs/math/0312432 http://www.serve.com/herrmann/cont2s.htm.

Jech, T. J. (1971), Lecture Notes in Set Theory, Lecture Notes in Mathematics #217, Springer-Verlag, NY.

Luxemburg, W. A. J. (1962), Non-Standard Analysis - Lectures on A. Robinson’s Theory of Infinitesimals and Infinitely Large Numbers, Math. Dept., California Institute of Technology Bookstore, Pasadena, CA.

Robinson, A. (1966), Non-standard Analysis, North-Holland, Amsterdam.

Robinson, A. (1961), Non-standard analysis, Nederl. Akad. Wetensch. Proc. Ser. A62, and Indag. Math, 23:432-440.

Rudin, W. (1953), Principles of Mathematical Analysis, McGraw Hill, NY.

Suppes, P. (1960), Axiomatic Set Theory, D. Van Nostrand, NY.
