complexity of bezout’s theorem v: polynomial time · operations to nd d l 1approximate zeros of f...

26
Complexity of Bezout’s theorem V: Polynomial time Michael Shub IBM, TJ Watson Research Center Yorktwn Heights, NY-10598-0218 e-mail: shub@ watson.ibm.com Steve Smale Mathematics Department University of California Berkeley, CA-94720 e-mail: smale@ math.berkeley.edu 1. Introduction The main goal of this paper is to show that the problem of finding approximately a zero of a polynomial system of equations can be solved in polynomial time, on the average. The number of arithmetic operations is bounded by cN 4 where N is the number of input variables and c is a universal constant. Let us be more precise. For d =(d 1 ,...,d n ) each d i a positive integer, let H (d) be the linear space of all maps f :C n+1 C n , f =(f 1 ,...,f n ), where each f i is a homogeneous polynomial of degree d i . The notion of an approximate zero z in projective space P (C n+1 ) of f has been defined in [Bez I, Bez II, Bez IV, Malajovich-Mu=F1oz] 1 and below. It means that Newton’s method converges quadratically, immediately, to an actual zero ζ of f , starting from z . Given an approximate zero, an ε approximation of an actual zero can be obtained with a further log | log ε| number of steps. A probability measure on the projective space (of lines) P (H (d) ) was de- veloped in [Kostlan], [Bez I] and “average” below refers to that measure. Let N =dimension H (d) as a complex vector space. Main Theorem. Fixing d, the average number of arithmetic operations to find an approximate zero of f P (H (d) ) is less than cN 4 , c a universal constant, unless n 4 or some d i =1. Remark. If n 4 or some d i =1, we get cN 5 . The result is also valid in the non-homogeneous case f :C n C n . The import of the Main Theorem can be understood especially clearly in the case of quadratic polynomials. Thus consider the case d = (2,..., 2) of this theorem. Here we have that the average arithmetic complexity is bounded by a polynomial function of the dimension n since an easy count shows that N n 3 . This seems quite surprizing in view of the history of complexity results for polynomial systems (see [Bez I] for references). 1 Bez I, Bez II, Bez IV refer to the corresponding papers [Shub-Smale]. 1

Upload: others

Post on 20-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Complexity of Bezout’s theorem V:

Polynomial time

Michael Shub

IBM, TJ Watson Research Center

Yorktwn Heights, NY-10598-0218

e-mail: [email protected]

Steve Smale

Mathematics Department

University of California

Berkeley, CA-94720

e-mail: smale@ math.berkeley.edu

1. Introduction

The main goal of this paper is to show that the problem of finding approximatelya zero of a polynomial system of equations can be solved in polynomial time,on the average. The number of arithmetic operations is bounded by cN4 whereN is the number of input variables and c is a universal constant.

Let us be more precise. For d = (d1, . . . , dn) each di a positive integer, letH(d) be the linear space of all maps f : Cn+1 → Cn, f = (f1, . . . , fn), whereeach fi is a homogeneous polynomial of degree di.

The notion of an approximate zero z in projective space P (Cn+1) of f hasbeen defined in [Bez I, Bez II, Bez IV, Malajovich-Mu=F1oz]1 and below. Itmeans that Newton’s method converges quadratically, immediately, to an actualzero ζ of f , starting from z. Given an approximate zero, an ε approximation ofan actual zero can be obtained with a further log | log ε| number of steps.

A probability measure on the projective space (of lines) P (H(d)) was de-veloped in [Kostlan], [Bez I] and “average” below refers to that measure. LetN =dimension H(d) as a complex vector space.

Main Theorem. Fixing d, the average number of arithmetic operations tofind an approximate zero of f ∈ P (H(d)) is less than cN4, c a universal constant,unless n ≤ 4 or some di = 1.

Remark. If n ≤ 4 or some di = 1, we get cN5.

The result is also valid in the non-homogeneous case f : Cn → Cn.The import of the Main Theorem can be understood especially clearly in

the case of quadratic polynomials. Thus consider the case d = (2, . . . , 2) ofthis theorem. Here we have that the average arithmetic complexity is boundedby a polynomial function of the dimension n since an easy count shows thatN ≤ n3. This seems quite surprizing in view of the history of complexity resultsfor polynomial systems (see [Bez I] for references).

1Bez I, Bez II, Bez IV refer to the corresponding papers [Shub-Smale].

1

Page 2: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

The special case of quadratic systems has extra significance in view of theNP completeness theorems of [BSS].2

Here it is shown that the decision problem “quadratic systems” is NP com-plete over IR or over C. This problem is given k quadratic inhomogeneousequations (homogeneous over IR is sufficient), in n variables, to decide if thereis a common zero. For various reasons, it seems unlikely that there is a polyno-mial time algorithm, even with exact arithmetic (in the sense of [BSS]) for thisproblem.

Moreover in the recent “weak model” of [Koiran], quadratic systems def-initely do not admit a polynomial time algorithm, so that P 6= NP , as wasshown in [Cucker-Shub-Smale].

One might reasonably ask about the analog of the Main Theorem for theworst case, rather than the average. In a trivial sense the corresponding conclu-sion can’t be true since some polynomial systems have no approximate zeros.

But there seems to be a deeper sense in which the result (or a modificationthereof) fails for the worst case. The algorithms we use here are robust inthat they work well in the presence of round off error (see [Kim, Malajovich-Mu=F1oz]).

This could be made more formal, more conceptual, by the introduction ofa “δ-machine”, see especially [Malajovich-Mu=F1oz], but also ([Priest], [Shub],[Smale, 1990]) for background.

A δ-machine is defined to introduce a relative error δ at each computationnode of a [BSS] machine.

Then it could no doubt be proved that a δ-machine would be sufficient inour Main Theorem, where δ > 0 could be well-estimated.

On the other hand for n > 1, polynomial systems may have one dimensionalsets of solutions (“excess components”) and that fact seems to imply, that theworst case complexity problem is undecidable with δ-machines, for any δ > 0.Even the linear case produces an argument. These ideas need formalizationand development, but one would expect that to happen in view of [Malajovich-Mu=F1oz].

The algorithm of the Main Theorem, developed in [Bez I, Bez II, Bez IV],is a homotopy method, with steps based on a version (projective) of Newton’sMethod. There is a weak spot in its present use in that the existence of a startsystem-zero pair (g, ζ) is proved, but not constructively. Thus the algorithmsdepends on d = (d1, . . . , dn), and even on a probability of failure σ. It is notuniform in the sense of [BSS] in d and σ.

In Section 2 an obvious candidate for (g, ζ) is given. If our (highly likely)conjecture stated there is true, then the uniformity of the algorithms is achieved.

The Main Theorem has the following generalization, which includes the casestudied in [Bez IV]. We say that z1, . . . , zl are l (distinct) approximate zeros off ∈ P (H(d)) if they converge under iteration of (projective) Newton’s methodto l distinct roots ζ1, . . . , ζl of f .

Generalized Main Theorem. Fixing d, the average number of arithmetic2BSS refers to the paper [Blum-Shub-Smale].

2

Page 3: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

operations to find D ≥ l ≥ 1 approximate zeros of f ∈ P (H(d)) is less thanc l2N4, c a universal constant, unless n ≤ 4 or some di = 1 in which casec l2N5 suffice.

Here D =∏ni=1 di is the Bezout number.

2. Main Theorem, Weak Version

Let H(d) be as in Section 1 and we suppose it endowed with the Hermitian innerproduct in [Kostlan], [Bez I] invariant under the unitary group U(n+ 1). ThenS(H(d)) denotes the unit sphere in H(d) and

V = {(f, ζ) ∈ S(H(d)) × P (Cn+1) | f(ζ) = 0} .Then let Σ′ be the set of singular points of the restriction π1 : V → S(H(d))(i.e.(f, ζ) such that the derivative Dπ1 : Tf,ζ(V ) → Tf(S(H(d))) is singular).Compare all this with the similar notions for V of [Bez I, Bez II,= Bez IV]. Infact one has the fibration V → V with fibers SO(2) induced by the fibrationS(H(d)) → P (H(d)); thus the vertical distance to Σ′ of [Bez I] defines a similarvertical distance ρ (same notation) and neighborhood Nρ(Σ′) of Σ′ in V .

Let Lg be the space of great circles of S(H(d)) which contain g ∈ S(H(d)).Let ψ : S(H(d)) − {±g} → Lg, ψ(f) = Lf be the map which sends f into theunique great circle containing f and g.

For such an f we may define Lf = π−11 (Lf ) ⊂ V . If Lf ∩ Σ = ∅, where

Σ = π1(Σ′) is the discriminant locus [Bez I], then Lf is a one dimensionalsubmanifold in V oriented by going from g to f omitting −g. If in addition, ζis a zero of g, then there is a unique arc in Lf starting at (g, ζ) and ending atthe first point of π−1

1 (f) met on Lf . Let us call that arc L(f, g, ζ).

Remark. L(f, g, ζ) may be interpreted as a path of zeros of “the homotopy”tf + (1 − t)g as t goes from 0 to 1.

Our Hermitian structure on H(d) induces natural Riemannian metrics andprobability measures on S(H(d)), P (H(d)) and Lg see [Bez II, Bez IV]. More-over with these measures, the natrual maps S(H(d)) → P (H(d)), ψ : S(H(d)) −{±g} → Lg are measure preserving, in the usual sense that meas ψ−1A = measA.

Fixing (g, ζ) as above let σ = σ(ρ, g, ζ), 0 < σ < 1 be the probabilitythat L(f, g, ζ) meets Nρ(Σ′) for f ∈ S(H(d)). Later we will see how σ may beinterpreted as the “probability of failure”.

Theorem 1. For each ρ > 0, there is a (g, ζ) ∈ V such that

σ ≤ cρ2N2n3D3/2 .

Here, as throughout this paper, c is some universal constant. Moreover

D = maxi=1,...,n

(di)

3

Page 4: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Proposition. We have n3D3 ≤ cN unless some di = 1 or n ≤ 4 in whichcase there is a slightly weaker estimate.

The proof is left to the reader (use(D + nn

)≤ N).

Using this proposition and [Bez I] for the case n = 1, one can use Theorem 1to get the estimate:

σ ≤ cρ2N3 .

Theorem 1 has a ready interpretation in terms of the condition number off ,

µg,ζ(f) = max(h,z)∈L(f,g,ζ)

µnorm (h, z) .

Here see [Bez I, Bez II, Bez IV] for µnorm(h, z) as well as the condition numbertheorem

ρ(h, z) =1

µnorm(h, z), (h, z) ∈ V , (or V ) .

LetSg,ζ,ρ = {f ∈ S(H(d)) − {±g} | L(f, g, ζ) ∩Nρ(Σ′) 6= ∅}

Theorem 2. Given ρ > 0, there is a g ∈ S(H(d)) such that

1D

∑ζ

g(ζ)=0Vol Sg,ζ,ρ

Vol S(H(d))≤ cρ2N2n3D3/2 .

Note that Theorem 1 is a consequence of Theorem 2, since the left hand sideof Theorem 2 is an average over the zeros ζ of g. For at least one ζ, one getsless than the average. Hence there exist a pair (g, ζ) ∈ V such that

σ =Vol Sg,ζ,ρ

Vol S(H(d))≤ cρ2N2n3D3/2

proving Theorem 1.

Conjecture. The pair (g, ζ) of Theorem 1 given by gi(z) = zdi−10 zi, i =

1, . . . , n, and ζ = e0 = (1, 0, . . . , 0) makes the conclusion of Theorem 1 true.

The truth of this conjecture would make our algorithms more constructiveand in fact algorithms in the sense of [BSS] with input, (d, σ, f).

Remark. Let Sg,ρ =⋃

ζ

g(ζ)=0

Sg,ζ,ρ. So

Vol (Sg,ρ) ≤∑ζ

Vol Sg,ζ,ρ .

4

Page 5: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

In this way it is seen that Theorem 2 is a sharp form of Theorem 2 of [Bez IV].In fact that suggests a proof. Theorem 2 is proved in the next section following[Bez IV], but with a multiplicity function taken into account.

Next we use the Main Theorem of [Bez I] and Theorem 1 above to obtain aweak version of our main result.

Main Theorem (weak version). Let be given a probability of failure σ, 0 <σ < 1. Then there exists (g, ζ) ∈ V such that a number of projective Newtonsteps k sufficient to find an approximate zero of input f ∈ S(H(d)) is

k ≤ cN3

σ.

(If n ≤ 4, or some di = 1, one obtains onlycN4

σ).

Proof. The Main Theorem of [Bez I] asserts that k ≤ cD3/2

ρ2. But by The-

orem 1, we can take σ = cρ2N2D3/2n3. Thus by the Lemma, we obtain theestimate of our Theorem.

The number of arithmetic operations of a projective Newton step can be

bounded by cN , so we getcN4

σarithmetic operations.

This theorem is easier to prove than the Main Theorem, the weakness coming

from the factor1σ

which can’t be averaged. Thus we must be able to replace1σ

by1

σ1−ε , Sections 4, 5, 6, and 7 are devoted to this.

3. The proof of Theorem 2 of Section 2

As we have noted Theorem 2 of Section 2 uses a sharpened form of Theorem 2 of[Bez IV]. This suggests a proof. To this end we sharpen accordingly, Theorem Cof [Bez II] and then Proposition 4a, 4b of [Bez IV].

Let M2ρ : S(H(d)) → Z+ be defined by: M2ρ(f) is the number of roots of fin N2ρ(Σ′) (perhaps ∞) or more properly, the cardinality of π−1

1 (f) ∩N2ρ(Σ′).Here we are following the notation of Section 2.

Theorem 1. For any ρ > 0

1Vol S(H(d))

∫S(H(d))

M2ρ(f) ≤ cρ4n3N2D .

This is a sharper version of Theorem C of [Bez II], but the same proof works.

Define for (g, ζ) ∈ V , Lg,ζ,ρ ⊂ Lg as follows: Suppose that

L ∩ Σ = ∅ (L, Σ ⊂ S(H(d))) .

5

Page 6: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Then π−11 (L) → L is a D-fold covering map and π−1

1 (L) − π−11 (−g) consists of

D open arcs in V . Let Ag,ζ denote the arc among then that contains ζ. Thendefine:

Lg,ζ,ρ = {L ∈ Lg |Ag,ζ ∩Nρ(Σ′) 6= ∅} .

Lemma 1. With notation as above,∑ζ

g(ζ)=0Vol Sg,ζ,ρ

Vol S(H(d))≤∑

ζ

g(ζ)=0Vol Lg,ζ,ρ

Vol Lg .

The proof follows from the fact that ψ : S(H(d))−{±g} → Lg preserves theprobability measures and that

Sg,ζ,ρ ⊂ ψ−1(Lg,ζ,ρ) .

Lemma 2. For any ρ > 0, d = (d1, . . . , dn) there is a g ∈ S(H(d)) such that∑ζ

g(ζ)=0Vol Lg,ζ,ρ

Vol Lg ≤(cρ2

D3/2

)−1 Vol S1

Vol S(H(d))

∫S(H(d))

M2ρ(f) .

(Here Vol S1 = 2π) .

Note that Lemmas 1 and 2, and Theorem 1 give the proof of Theorem 2 ofSection 2. One has to just check that the constants come out correctly.

Thus it remains to prove Lemma 2. For this we sharpen Propositions 4a,4b, of [Bez IV] as follows.

Let L denote the space of great circles in S(H(d)).

Proposition. a) There is a g ∈ S(H(d)) such that

1Vol Lg

∫Lg

∫f∈L

M2ρ(f) ≤ 1Vol L

∫L

∫f∈L

M2ρ(f) .

Moreover, b)

1Vol L

∫L

∫f∈L

M2ρ(f) =Vol S1

Vol S(H(d))

∫S(H(d))

M2ρ(f) .

The proof follows so closely that of Propositions 4a, 4b of [Bez IV] that weleave it to the reader.

Corollary. For any d, ρ there is a g ∈ S(H(d)) such that

1Vol Lg

∫Lg

∫L

M2ρ(f) ≤ Vol S1

Vol S(H(d))

∫S(H(d))

M2ρ(f) .

6

Page 7: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Let τρ : Lg → Z+ be defined by: τρ(L) is the number of Ag,ζ meetingNρ(Σ′).

Then from the definitions∑ζ

g(ζ)=0

Vol Lg,ζ,ρ =∫L∈Lg

τρ(L) .

¿From the Corollary of Theorem 1 of [Bez IV] we have

τρ(L) ≤(cρ2

D3/2

)−1 ∫f∈L

M2ρ(f) .

Therefore we obtain for any g ∈ S(H(d)),∑ζ Vol Lg,ζ,ρVol Lg ≤ 1

Vol Lg

(cρ2

D3/2

)−1 ∫Lg

∫L

M2ρ(f) .

The corollary of the previous proposition now finishes the proof.

4. Integral Geometry

The goal here is to estimate the volume of certain real algebraic sets. Thearguments go back to Crofton and Santal=F3, but we use a modern form closerto [Bez II, Bez IV].

The following theorem illustrates what we are doing.

Theorem 1. Let M ⊂ P (IR l) be a real algebraic variety, given by the vanishingof real homogeneous equations with its complexification having dimension m anddegree δ, over C. Then the m-dimensional volume of M is less than or equal toδVol P (IRm+1).

The affine version can be dealt with by the same methods and in fact is in[Smale, FTA] for the 1-dimensional case in IR2.

Let V ⊂ S(H(d)) × P (Cn+1) be as in Section 2 with restrictions of theprojections denoted by π1 : V → S(H(d)), π2 : V → P (Cn+1). Let L be agreat circle in S(H(d)), with L ∩ Σ = ∅. Then π−1

1 (L) is a 1-dimensional (real)submanifold in V . Also π2(π−1

1 (L)) is a curve B in P (Cn+1).

Theorem 2. The length of B is less than or equal to 2D2.

We sketch some basic results on integration, especially Fubini’s theorem, ina Riemannian manifold setting (the Coarea Formula, see [Morgan]).

Suppose F : X → Y is a surjective map from a Riemannian manifold X to aRiemannian manifold Y , and suppose the derivativeDF (x) : Tx(X) → Tf(x)(Y )

7

Page 8: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

is surjective for almost all x ∈ X . The horizontal subspace Hx of Tx(X) isdefined as the orthogonal complement to kerDf(x).

The horizontal derivative of F at X is the restriction of DF (x) to Hx.The Normal Jacobian NJF (x) is the absolute value of the determinant of thehorizontal derivative, defined almost everywhere on X .

Example 1. Suppose a compact Lie group G acts transitively and isometricallyon a manifold S. Fixing s0 ∈ S, the normal Jacobian of the map G → S,g → gs0 is a constant.

More generally it is easily seen that:

Proposition 1. Let F : X → Y be a map of Riemannian manifolds, equivari-ant under the action of a compact group G of isometries of X and Y . If G actstransitively on X then the normal Jacobian is a constant.

Fubini’s theorem takes the following form.

Coarea Formula. Let F : X → Y be a map of Riemannian manifolds satis-fying the above surjectivity conditions. Then for ϕ : X → IR∫

X

ϕ(x) =∫Y

∫F−1(y)

ϕ(x)1

NJF (x).

Here the usual integrability conditions of Fubini’s theorem are supposed.

Next suppose that G is a compact Lie group acting transitively and isomet-rically on the manifold S. Let N be a submanifold of S such that the subgroupIN of G leaving N invariant acts transitively on N . Thus the quotient space

GN = G/IN = {gN | g ∈ G}represents the various images of N under applications of elements of G.

Our application will be to the case S is real projective space P (IR l), G is theorthogonal group O(l) and N is P (IRk+1) considered as inbedded in P (IR l) as acoordinate k subspace. In this case GN can be identified with the GrassmannianGk of k-dimensional linear subspaces of P (IR l).

Returning to the general setting let W ⊂ GN × S be the submanifold

W = {(gN, s) | s ∈ gN} .Let p1 : W → GN , p2 : W → S be the restrictions of the natural projections.

The following can be easily proved:

Proposition 2. The above W is indeed a submanifold, the product action ofG on GN ×S leaves W invariant, and acts isometrically and transitively on W .Moreover p1 and p2 are equivariant under G.

Corollary to Propositions 1 and 2. The normal Jacobians of p1 and p2

are constant.

8

Page 9: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Since G acts on S it acts also on the tangent bundle T (S) by the derivative.It also acts on the associated bundle Gm(T (S)) with fiber, all m planes throughthe origin in Ts(S). We say that the action of G on S is m-transitive if this lastaction is transitive. Note that in our application, m-transitivity is satisfied forall the relevant integers m.

Proposition 3. Let M be an m-dimensional submanifold of S. Suppose thatG as above is also m-transitive on S. Let M = p−1

1 (M) ⊂ W . Then therestriction p2|M : M → GN has normal Jacobian a constant c (possibly notdefined everywhere) depending only on G,S,N and m.

Proof. Define an associated bundle E(T (W )) = E overW of T (W ) as follows.Let w ∈ W : we will define the fiber Ew by

Ew = {Dp2(w)−1(L) ⊂ Tw | L ∈ Gm(Tp2(w)(S))} .

Then the induced action of G on E is transitive by our hypothesis of m-transitivity. Let Hw be the orthogonal space to kerDp1(w) in Ew. ThenNJp1|M (w) is the determinant of the restriction of the derivative of p1 to Hw.By our transitivity we are finished, noting also that the surjectivity of the deriv-ative holds everywhere if at one point.

Theorem 3. Let M be as above. Then

Vol M =c

Vol W0

∫gN∈GN

Vol (gN ∩M)

where c is the constant of Proposition 3 and W0 = p−12 (s0) for some fixed s0 ∈ S.

The volumes are of course in the appropriate dimensions.

Proof. Apply the Coarea Formula and Proposition 2 to p2 restricted to M toobtain:

Vol M = Vol M Vol W0 .

Next apply the same argument to p1 restricted to M to obtain:

Vol M =∫gN∈GN

Vol (gN ∩M)(

1NJp1|M

).

By Proposition 3, and by elimination of Vol M we obtain the result.

Returning to our special case of projective spaces recall that Gk denotes thespace of k-linear subspaces of P (IR l).

Theorem 4. Let M ⊂ P (IR l) be an m-submanifold. Then

Vol M =Vol P (IRm+1)

Vol P (IRm+k−l+2)1

Vol Gk

∫L∈Gk

Vol (L ∩M) .

9

Page 10: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Proof. It suffices to prove that

c

Vol W0=

Vol P (IRm+1)Vol P (IRm+k−l+2)

1Vol Gk

.

Apply Theorem 3 to M = P (IRm+1). So

Vol P (IRm+1) =c

Vol W0

∫L∈Gk

Vol (L ∩ P (IRm+1)) .

The theorem follows, noting that Vol (L ∩ P (IRm+1)) = Vol P (IRm+k−l+2).

We now pass to the proof of Theorem 1. Thus let MC ⊂ P (Cl) be thecomplexification of M ⊂ P (IR l) of the theorem. So M = MC∩P (IR l), P (IR l) ⊂P (Cl). The real dimension of M is less than or equal to m and we suppose thatit is m. The generic (l−m−1) linear subspace in P (IR l) meets M transversallyand in at most δ points since its complexification can meet MC in at most δpoints of transversal intersection. Thus∫

L∈Gl−m−1

Vol (L ∩M) ≤ δVol (Gl−m−1) .

Since Vol P (IR1) = 1, and we are finished by Theorem 4.We proceed to the proof of Theorem 2.Real projective 2n+1 space P (IR2(n+1)) fibers over P (Cn+1) with S1 fibers,

by the isometric action of the unit complex numbers= mod±1. Let

q : P (IR2(n+1)) → P (Cn+1)

be this fibration. Denote by A, q−1(B) where B is as in Theorem 2. Note thatA is a surface.

Lemma. a) The length of B equals1π

area A.

b) A generic (2n − 1) linear subspace of P (IR2(n+1)) meets A in at= most D2

points.

We first show that Theorem 2 follows from the Lemma.

We use Theorem 4 just as in the proof of Theorem 1. Thus

AreaA ≤ D2 Vol P (IR3)Vol P (IR1)

≤ Vol P (IR3)D2

and so length B ≤ Vol P (IR3)π

D2 proving Theorem 2.

So it remains to prove the Lemma.

10

Page 11: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Part a) of the Lemma is again a (rather simple) case of the coarea formula.Now consider b). The idea is to lift the setting to P (IR2n+2) and then complex-ity.

Observe that L ⊂ P (H(d)), so π−11 L ⊂ L × P (Cn+1) ⊂ P (H(d)) × P (Cn+1)

and of course π−11 L ⊂ V .

The following diagram helps:

π−11 (L) = (L × P (Cn+1)) ∩ V -

π2P (Cn+1)

? ?

q∗?

q

q−1∗ π−11 (L) = (L × P (IR2n+2)) ∩ V -

π2P (IR2n+2)

Here q∗ : L × P (IR2n+2) → L × P (Cn+1) is induced by q and V = q−1∗ (V ).

Thus A = π2q−1∗ π−1

1 L.Now Lmay be described as tg1+(s−t)g2 for particular g1, g2 ∈ H(d), t, s ∈ IR

and thus L may be identified with P (IR2) ⊂ P (C2). Then q−1∗ π−1

1 L = X maybe defined by the 2n= real homogeneous equations

tRef + (s− t)Reg = 0

tImf + (s− t)Img = 0 .

These equations are homogeneous of degree 1 in (s, t) and= (d1, . . . , dn) in(xj , yj) where zj = xj +

√−1 yj . Complexifiying these equations in s, t, xj , yj ,we obtain a varietyXC in P (C2)×P (C2(n+1)). The generic 2n−1 linear subspaceK ⊂ P (IR2n+2) has= the property that the complexification P (C2)×KC meetsXC in at most D2 points of transversal intersections (via elementary intersectiontheory).

This yields the upper bound D2 for the number of real intersection points,finishing the proof of the Lemma and hence Theorem 2.

11

Page 12: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

5. Some approximate zero Theory

In this section we do some of the α-theory and approximate zero= theory of[Smale, ICM] and [Bez I] in the projective Newton setting. See also [Malajovich-Mu=F1oz].

We take a slightly different perspective than [Bez I], but still relying on it inpart. In this account we do not attempt to get the best constants.

Let f : Cn+1 → Cn, f = (f1, . . . , fn) where fi is analytic and homogeneousof degree di, e.g. f ∈ H(d).

Recall that the projective Newton method is a map Nf : Cn+1 →= Cn+1,defined by X → X−(Df(x)|Nullx)

−1f(x) and an induced map Nf : P (Cn+1) →

P (Cn+1) which we have denoted by the same letter.= We also sometimes iden-tify x and its equivalence class in P (Cn+1). Also Nf is obviously not definedeverywhere, so the above represents a certain abuse of notation.

A point x ∈ Cn+1 or P (Cn+1) will be called an approximate zero 1 for f ifthe sequence xi defined by x = x0 and Nf (xi) = xi+1 is defined for all natural

numbers i, and there is a zero ζ of f such that dR(xi, ζ) ≤(

12

)2i−1

dR(x0, ζ).=

Here dR is the Riemannian distance in P (Cn+1) and ζ is the= associated zeroof x, i.e. ζ = limxi.

The next theorems are devoted to giving criteria for a point to be an ap-proximate zero in terms of invariants α, β0, γ0 and the distance function dR.

β0(f, x) =1

||x||∥∥∥(Df(x)|Nullx)

−1f(x)

∥∥∥γ0(f, x) = sup

k≥2

(1k!

∥∥∥(Df(x)|Nullx)−1Dkf(x)

∥∥∥) 1k−1

||x||

α(f, x) = β0(f, x)γ0(f, x)

β0, γ0, α are taken as infinite if (Df(x)|Nullx)−1 does not exist.

Note. We have not restricted Dkf(x) to Nullx as in [Bez I]. This change doesnot effect the Theorems of [Bez I]; they still hold with this γ0.

If ζ is a simple zero of f , i.e. (Df(ζ)|Nullζ)−1 exists, then ζ is a fixed point of

Nf . We begin by showing that Nf contracts discs of a certain size in P (Cn+1),centered at ζ, towards ζ.

Theorem 1. There are constants c > 0 and u∗ > 0 such that given f as above,and x, ζ ∈ P (Cn+1) with

f(ζ) = 0

dR(x, ζ) ≤ 11In [Smale, ICM] there are approximate zeros of the first and= second kind. Our approx-

imate zeros here correspond to the second kind.

12

Page 13: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

dR(x, ζ)γ0(f, ζ) ≤ u∗ ,

then ||DNf (x)|| < c dR(x, ζ)γ0(f, ζ). Here DNf (x) : TxP (Cn+1) → TNf (x)P (Cn+1)is the derivative.

Remark. Theorem 1 is sharpr if we maximize u∗ and minimize c. In any

case in the sequel we assume u∗ ≤ 12c

.

Let Br(x) denote the closed ball of radius r around x.

Corollary 1. If r < min(1,u∗

γ0(f, ζ)), then Nf : Br(ζ) → P (Cn+1) is well

defined and Nf (Br(ζ)) ⊂ Br(ζ). It is a contraction with contraction constantc rγ0(f, g), so Nf (Br(ζ)) ⊂ Br(ζ), r′ = c r2γ0(f, ζ).

Proof. Observe that Br(ζ) is convex. By Theorem 1 the length of the imageof a curve of length L in Bζ(r) is at most c rγ0(f, ζ)L. This establishes theassertions on the contraction constant. Applying the contraction estimate tothe straight lines from ζ to x, x ∈ Br(ζ), shows that Nf (Br(ζ)) ⊂ Br′(ζ),r′ = c r2γ0(f, ζ). As c rγ0(f, ζ) < 1, Nf is indeed a contraction andNf (Br(ζ)) ⊂Br(ζ).

Contraction mappings have a convenient property, which we pause to record.

Let X be a complete metric space with metric d, φ : X → X a contractionmap with contraction constant k and unique fixed point p.

Proposition 1. Let φ,X, p, k be as above. Then for any x ∈ X

11 + k

d(x, φ(x)) ≤ d(x, p) ≤ 11 − k

d(x, φ(x)) .

Proof. Both inequalities are standard. We prove only the left hand one.

d(x, φ(x)) ≤ d(x, p) + d(φ(x), p) ≤ (1 + k)d(x, p) .

Corollary 2. If f(ζ) = 0, dR(x, ζ) < 1 and dR(x, ζ)γ0(f, ζ) < u∗. Then x isan approximate zero of f with associated zero ζ.

Proof. By Corollary 1,

dR(Nf (x), ζ) ≤ c dR(x, ζ)2γ0(f, g)

and by induction

dR(Nkf (x), ζ) ≤ (c dR(x, ζ)γ0(f, ζ))2

k−1dR(x, ζ) .

Since u∗ ≤ 12c

, c dR(x, ζ)γ0(f, g) ≤ 12.

13

Page 14: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Theorem 1 follows immediately from the next two propositions which are ofindependent interest.

Proposition 2. Let f be as above. For x ∈ P (Cn+1)

||DNf (x)|| ≤ 2α(f, x) .

Proof. Let

Ex = x+ Nullx

ENf (x) = Nf(x) + NullNf (x) .

We use Ex and ENf (x) as charts for P (Cn+1) at x and Nf(x) respectively in theobvious way.

Let v ∈ Nullx.

DNf(x)(v) = π (Df(x)|Nullx)−1DxDf(x)|Nullx(v) (Df(x)|Nullx)

−1f(x)

where π is the orthogonal projection onto the null space of Nf (x).Thus ||DNf (x)v||Cn+1 ≤ 2γ(f, x)β(f, x)||v||Cn+1 . Recall that Nf (x) ∈ x +

Nullx so ||Nf (x)||Cn+1 ≥ ||x||Cn+1 .Now

||DNf (x)v||TNf (x)P (Cn+1) =||DNf (x)v||Cn+1

||Nf(x)||Cn+1

≤ ||DNf (x)v||Cn+1

||x||Cn+1

≤ 2α(f, x)||v||Cn+1

||x||Cn+1

= 2α(f, x)||v||TxP (Cn+1) .

For the next proposition we use a simple geometry lemma.

Lemma. Let x, y ∈ Cn+1 − {0}. If dR(x, y) < 1 in P (Cn+1) then x ∈ λy +

Nullλy for some λ and||x− λy||||λy|| = tan dR(x, y).

Proof.||x− λy||

||x|| = sindR(x, y) by Proposition 4, Section I of [Bez I]. So

||x− λy||||λy|| = tandR(x, y).

Proposition 3. There exist constants c > 0, u∗ > 0 such that, if

dR(x, ζ) < 1 and

dR(x, ζ)γ0(f, ζ) < u∗ .

14

Page 15: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Then 2α(f, x) < c dR(x, ζ)γ0(f, ζ).

Proof. It follows from the lemma that

dR(x, ζ) ≤ ||x− λζ||||λζ|| ≤ tan(1)dR(x, ζ)

where x ∈ λζ + Nullλζ. Also, γ0(f, λζ) = γ0(f, ζ) so

dR(x, ζ)γ0(f, ζ) and u =||x− λζ||||λζ|| γ0(f, g)

also differ at most by a multiple of tan(1).Now apply Proposition 2, Section III-2 of [Bez I] to conclude that 2α(f, x) <

2κ2u

ψ(u)2. κ = κ(u) and ψ(u) are close to one for u small so we are done. Here

we have been using the notation of [Bez I].

Next we prove a version of the α-Theorem.

Definition. Let γ0(f, x) = max(γ0(f, x), 1) and α(f, x) = γ0(f, x)β0(f, x).

Theorem 2 (Projective α-Theorem). There is an αproj > 0 such that ifα(f, x) < αproj then x is an approximate zero of f .

Compare this to [Malajovich-Mu=F1oz].

In fact, we will prove a little more:

Proposition 4. There are constants α∗ > 0, c > 0 such that if α(f, x) < α∗then there is a zero ζ of f and dR(x, ζ)γ0(f, ζ) < cα(f, x).

First we prove Theorem 2 from Proposition 4.

Proof. Just let αproj = min(α∗,u∗c

) and apply Proposition 4, Corollary 2.Now we prove Proposition 4.By the Domination Theorem (following the notation of that theorem) (The-

orem 2, Section I-2 of [Bez I]) for α(f, x) < α0 there is a zero ζ of f in

Ex = x+ Nullx such that ||x− ζ|| < τ(α(f, x))γ(f, x)

.

Thus

||x− ζ||||x|| γ0(f, x) < τ(α(f, x)). (∗)

Now if α(f, x) is small so is β0(f, x) and so is||x− ζ||||x|| by the Domination

Theorem again. Thus||x− ζ||||x|| and dR(x, ζ) differ by a multiplicative constant

15

Page 16: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

close to one and dR(x, ζ) is also small. Since ζ is a zero, kerDf(ζ)−1 = Null(ζ)and ∥∥∥(Df(ζ)|Nullζ)

−1Df(ζ)|Nullx

∥∥∥ ≤ 1 .

Hence in Proposition 2, Section III-2 of [Bez I], κ may be taken as 1 and

γ0(f, ζ) ≤ γ0(f, x)ψ(τ(α(f, x))(1 − τ(α(f, x))))

.

Substituting in (∗)dR(x, ζ)γ0(f, ζ) ≤ cα(f, x)

for α(f, x) small enough. Finally α(f, x) < α(f, x) so if αproj is small enoughwe are done.

6. The Homotopy

The goal of this section is to give the proof of Theorem 1 below.

Throughout this section we suppose that (ft, ζt) is a curve in V − Σ′, 0 ≤t ≤ 1. Except for Proposition 1 and Lemma 1, we assume moreover that ft canbe represented as ft = tf +(1− t)g for some f, g ∈ S(H(d))). Let γ be an upperbound for 1 and γ0(ft, ζt), 0 ≤ t ≤ 1.

Theorem 1. With ft as above, there is a partition

t0 = 0 , ti < ti+1 , tk = 1

withx0 = ζ0 , xi = Nfti

(xi−1) , i = 1, . . . , k

well-defined for each i and xi is an approximate zero of fti with associated zeroζti . Also

k ≤ cµ(g,ζ)(f)(1 +D3/2L)

where L is the length of the curve ζt. Moreover, (as we will see) ti can be easilycalculated at ti−1.

Recall that Nft is given by projective Newton’s method.

Towards the proof we have:

Proposition 1. There exist universal constants α∗, u∗ with the following prop-erty. Suppose

t , t+ 4t ∈ [0, 1] , xt ∈ P (Cn+1)

satisfy:

γdR(xt, ζt) ≤ u∗

γβ0(ft′ , xt) ≤ α∗ all t′ ∈ [t, t+ 4t]

16

Page 17: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

then for all t′ ∈ [t, t + 4t], γdR(x′, ζt′) ≤ u∗, where x′ = Nft′ (xt) and xt is anapproximate zero of ft′ with associated zero ζt′ .

Moreover given any positive constant K we may take u∗ ≤ Kα∗. In fact

K =148

in what follows.

Proof of Proposition 1. As long as α∗ < αproj, xt is an approximate zerofor ft′ from Theorem 2 of Section 5. That ζt′ is the associated zero is a simplecontinuity argument. As usual there is a constant K1 close to one such that,

dR(Nft′ (xt), xt) ≤ K1β0(ft′ , xt) ≤ K1α∗

γ.

So from Proposition 1 of Section 5,

dR(xt, ζt′) ≤ 2K1α∗

γ

and by Corollary 1 of Section 5

dR(Nf ′t(xt), ζt′) ≤ c

(2K1α

γ

)2

γ =(2K1α

∗)2cγ

c the constant of Corollary 1.

Choose α∗, u∗ so that:2K2

1c(α∗)2 < u∗

and u∗ < Kα∗.

Lemma 1. There are universal constants K > 0, u∗∗ > 0 with the followingproperty. If f ∈ P (H(d)), f(ζ) = 0 and

dR(x, ζ)γ0(f, ζ) ≤ u∗∗

dR(x, ζ) ≤ 1D

then µnorm(f, x) ≤ Kµnorm(f, ζ).

For the proof we may take x ∈ ζ + Nullζ.

Following [Bez I] and the notation there,

µnorm(f, x) = ||f ||∥∥∥(Df(x)|Nullx)

−1 ∆(d1/2i )∆(||x||di)

∥∥∥≤ ||f ||

∥∥∥(Df(x)|Nullx)−1Df(x)|Nullζ

∥∥∥ ∥∥∥(Df(x)|Nullζ)−1Df(ζ)|Nullζ

∥∥∥∥∥∥∥∥(Df(ζ)|Nullζ)−1 ∆(d1/2

i )∆(||ζ||di)|| ||∆(( ||x||

||ζ||)di)∥∥∥∥∥

17

Page 18: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

= µnorm(f, ζ)∥∥∥(Df(x)|Nullx)

−1Df(x)|Nullζ

∥∥∥ ∥∥∥(Df(x)|Nullζ)−1Df(ζ)|Nullζ

∥∥∥∥∥∥∥∥∆(( ||x||

||ζ||)di)∥∥∥∥∥ .

Let r0 =||x− ζ||||ζ|| and u = r0γ0(f, ζ) so r0 and dR(x, ζ) differ by a multi-

plicative constant close to 1, and the same for u and dR(x, ζ)γ0(f, g).By Proposition 1, Section III-2 of [Bez I]∥∥∥(Df(x)|Nullx)

−1Df(x)|Nullζ

∥∥∥ ≤ (1 + r20)1/2

1 − r0

((2−u)uψ(u)

) .By Lemma 3 (2), Section II-1 of [Bez I]∥∥∥(Df(x)|Nullζ)

−1Df(ζ)|Nullζ

∥∥∥ ≤ (1 − u)2

ψ(u)

and these quantities are both bounded as soon as dR(x, ζ)γ0(f, g)= is smallenough.

Finally

||x||||ζ|| = (1 + r20)

1/2 ≤(

1 +(K1

D

)2)1/2

for K1 near 1

so( ||x||||ζ||

)di

is also bounded and we are done.

Now let

Wt =∥∥∥(Dft(xt)|Nullxt)

−1 ((f − g)(xt))∥∥∥

Bt =∥∥∥(Dft(xt)|Nullxt)

−1D(f − g)(xt)

∥∥∥Proposition 2.

a) Bt ≤ 2µnorm(ft, xt).

b) β− ≤ β0(ft′ , xt) ≤ β+

whereβ± =

4tWt ± β0(ft, xt)1 ∓4tBt

4t = |t′ − t|

as long as 4t ≤ 1Bt

.

18

Page 19: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Proof. We may assume ||xt|| = 1. Then

Bt ≤∥∥∥(Dft(xt)|Nullxt)

−1∥∥∥ ‖D(f − g)(xt)‖

≤ µ(ft, xt) (||Df(xt)|| + ||Dg(xt)||)≤ 2µ(ft, xt)

using Proposition 2, Section III-1 of [Bez I]. Finally

µ(ft, xt) ≤ µnorm(ft, xt)

proving a).Since

β0(ft′ , xt) =∥∥∥(t′ − t) (Dft′(xt)|Nullxt)

−1 (f − g)(xt) + (Dft′(xt)|Nullxt)−1 f(xt)

∥∥∥Proposition 2 b) follows from:

Lemma 2.

a)∥∥∥(Dft′(xt)|Nullxt)

−1Dft(xt)|Null=xt

∥∥∥ ≤ 11 − |t′ − t| ||Bt|| .

b)∥∥∥(Dft′(xt)|Nullxt)

−1Dft(xt)|Null=xt

∥∥∥min

≥ 11 + |t′ − t| ||Bt|| .

Proof.

Dft′(xt)|Nullxt = Dft(xt)|Nullxt + (t′ − t)D(f − g)(xt)|Nullxt

so

(Dft(xt)|Nullxt)−1Dft′(xt) = I + (t′ − t) (Dft(xt)|Nullxt)

−1D(f − g)(xt) .

Now the minimum norm∥∥∥(Dft′(xt)|Nullxt)−1Dft(xt)|Nullxt

∥∥∥min

=1∥∥∥(Dft(xt)|Nullxt)−1Dft′(xt)|Nullxt

∥∥∥≥ 1

1 + |t′ − t|Bt .

Which proves b) of Lemma 2. Part a) follows from the additional fact that if

||I −A−1B|| < c < 1 then ||B−1A|| < 11 − c

.

Defineµ = µg,ζ(f) = max

tµnorm(ft, ζt) .

19

Page 20: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Choose γ = D3/2µ. This is permissible by Proposition 3, Section I-3 of[Bez I].

Set α∗∗ = min(α∗, u∗∗), α∗, u∗∗ of Proposition 1, Lemma 1 above respec-tively.

Also 4t = |t′ − t|.Proposition 3. With notation as above, there exists a universal constant c asfollows. Given t with

γdR(xt, ζt) ≤ u∗

then there is a 4t such that

β+(4t) ≤ α∗∗

γ

and4t ≥ c

µor else dR(ζt, ζt+4t) ≥ cα∗∗

γ.

In fact 4t is easily computed as will be seen. Also there is an obviousadjustment to make in case t+ 4t > 1.

Proof. Ifβ+|4t= 1

2βt≤ α∗∗

γ

let 4t =1

2βt. Then by Proposition 2

β(ft′ , xt) ≤ α∗∗

γ.

Moreover by the same proposition, Bt ≤ 2µnorm(ft, xt). Then by Lemma 1we obtain

4t =1

2Bt≥ 1

4µnorm(ft, xt)≥ 1

4Kµnorm(ft, ζt).

Otherwise let 4t be the solution of

β+(4t) =α∗∗

γ.

Then 4t ≤ 12Bt

and it only remains to show that

dR(ζt, ζt′) ≥ cα∗∗

γ.

Lemma 3. Under the conditions of Proposition 3

a) dR(xt, ζt) , dR(xt′ , ζt′) ≤ 148

α∗∗

γ.

20

Page 21: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

b) β0(ft, xt) ≤ 18α∗∗

γ.

c) β−(4t) ≥ 16α∗∗

γ.

d) dR(xt, xt′) ≥ 112

α∗∗

γ.

e) dR(ζt, ζt′) ≥ 124

α∗∗

γ.

Since e) gives our proof of Proposition 3 it only remains to prove Lemma 3.The first part of a) is in the hypothesis and since (Proposition 2)

β0(ft′ , xt) ≤ β+(4t) =α∗∗

γ,

proposition 1 yields the second part of a).Use a) (first part), that β0(ft, ζt) = 0 and Proposition 2, Section II-1 of

[Bez I] to easily obtain b).For c) we argue as follows:

Recall 4tBt ≤ 12. Since

β+(4t) =4tWt + β0(ft, ζt)

1 −4tBt =α∗∗

γ,

using b)

4tWt ≥ 12α∗∗

γ− β0(ft, ζt) ≥ 3

8α∗∗

γ.

Therefore

β−(4t) =4tWt − β0(ft, ζt)

1 + 4t Bt ≥ 23

(38− 1

8

)α∗∗

γ≥ 1

6α∗∗

γ

We obtain d) using Proposition 2, β0(ft′ , xt) ≥ β−(4t) and c). This uses thedefinition of β0 as the Newton vector and the exponential map.

Finally e) follows from

dR(ζt, ζt′) ≥ dR(xt, xt′) − dR(xt, ζt) − dR(xt′ , ζt′)

using a) and d).

We finish this section with the proof of Theorem 1. We use Proposition 1and 3. Let t0 = 0, ti = ti−1 + 4t according to Proposition 3.= So at each step4t satisfies one of the alternatives in b).

Since ∑dR(ζti , ζti−1) ≥ L , µ = µ(g,ζ)(f) .

We get the result.

21

Page 22: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

7. The Main Theorem

The goal of this section is to prove the Main Theorem of Section 1. To this endwe first prove two theorems on the number of projective Newton steps sufficientto find a zero.

Theorem 1. Fix d = (d1, . . . , dn) and a probability of failure σ, 0 < σ < 1.Then there exists (g, ζ) ∈ V such that the number k of projective Newton steps,starting from (g, ζ), sufficient to find an approximate zero of input f ∈ S(H(d))is

k ≤ cN3

σ1−ε , ε =1

logD

(orcN4

σ1−ε if some di = 1 or n ≤= 4).

Thus the set of f where the algorithm fails to produce such an approximatezero in k steps has probability measure less than σ.

For the proof we have:

Proposition 1. Fix (d) = (d1, . . . , dn) and suppose (g, ζ) ∈ V , f ∈ S(H(d))−{±g} and 0 ≤ ε ≤ 1 are given. Then

k ≤ c(µ(g,ζ)(f))2−ε(D2)εD3/2

steps of Projective Newton’s method are sufficient to produce an approximatezero of f .

In Section 2 we have defined µg,ζ(f) and an arc L(f, g, ζ) in V ⊂ S(H(d))×P (Cn+1). Let L be the length of π2(L(f, g, ζ)).

22

Page 23: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Lemma.a) L ≤ 2D2.

b) L ≤ πµg,ζ(f).

Part a) of the lemma is a consequence of Theorem 2 of Section 4. Part b) isa projective space version of the Proposition, Section 1D of [Bez IV]. The proofis the same noting that µ(h, z) ≤ µnorm(h, z) for all (h, z) ∈ V , and that thelength of a great circle in S(H(d)) is 2π.

Returning to the proof of the Proposition, we have from Theorem 1, Sec-tion 6, that

cµ(g,ζ)(f)LD3/2

steps of projective Newton suffice. Hence by the lemma b),

cµ(g,ζ)(f)2−εLεD3/2

steps suffice.Finally from the lemma a)

cµ(g,ζ)(f)2−εD2εD3/2

steps suffice.

Proof of Theorem 1. In Proposition 1 take ε =2

logD so that D2ε is a

universal constant.By Proposition 1 we need to show there exists (g, ζ) ∈ V such that the

function µ(g,ζ)(f)2−εD2εD3/2 is bounded above bycN3

σ1−ε for a subset of f ∈S(H(d)) of probability measure at least 1 − σ.

Solve the equation σ = cρ2N2n3D3/2 for ρ. Apply Theorem 1, Section 2 forthis ρ and the condition number theorem to conclude the existence of (g, ζ) suchthat (

µ(g,ζ)(f))2(1−ε) ≤ cN2n3D3/2

σ1−ε

for all f in a subset of probability measure at least 1 − σ. Now Proposition 1,Section 2 and a little arithmetic finish the proof.

Theorem 2. The average number of projective Newton steps sufficient tofind an approximate zero of f ∈ S(H(d)) is less than or equal to c(logD)N3.(c logDN4 if some di = 1 or n ≤ 4).

In Theorem 2 we employ a quasi-algorithm. This construction fails to bean algorithm because its employs an infinite sequence (gi, ζi) ∈ V i = 1, 2, 3, . . .without exhibiting them.

The idea is quite simple. Start with (g, ζ) as in Theorem 1 (in this Section)

to insure a “small” chance of failure say σ =12

initially. If on input f thealgorithm fails, halve the chance of failure and start over.

23

Page 24: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

More formally let parameters of our quasi-algorithm (gi, ζi) ∈ V be given by

Theorem 1 (in this Section), with probability of failure σi =12i

, i = 1, 2, 3, 4.

Let K(σ) =cN3

σ1−ε (orcN4

σ(1−ε) if some di = 1 or n ≤ 4).

For input f , i = 1, do K(σi) projective Newton steps for the homotopy(1 − t)gi + tf starting at ζc as in Theorem 1 of Section 6.

If XK(σi) is an approximate zero of f by the alpha test, Theorem 2, Section5 halt and output XK(σi).

If not set i = i+ 1 and repeat (some f).The average number of steps of this algorithms is less than or equal to

∞∑i=1

σi

∑j≤i

K(σj)

≤ c′∞∑i=1

σiK(σi) ≤ c′′∫ 1

0

K(σ) dσ .

The first inequality follows by summing a geometric series. For the second notethat σi = |σi− σi−1|, i = 1, . . . with σ0 = 1, that K is monotone on the interval

[σi, σi−1] andK(σi)K(σi−1)

= 2(1−ε) so the Riemann sum

∞∑i=1

|σi − σi−1|k(σi) ≤ 2(1−ε)∫ 1

0

K(σ) dσ .

Finally∫ 1

0

K(σ) dσ = c logDN3 (or c logDN4 if some di = 1 or n ≤ 4).

See [Shub–Smale, 1986] for more arguments of this sort.

Proof of the Main Theorem. To prove the Main Theore, we need onlymake the passage from the number of Newton steps to the number of arithmeticoperations. This argument uses well-known facts from numerical analysis aboutthe number of arithmetic operations needed for approximations, for solving linearproblems, etc. We omit the details.

Proof of Generalized Main Theorem. We sketch some of the changesnecessary for the proof.

Theorem 1 of Section 2 has the follow version which also follows from The-orem 2 of Section 2.

Fixing g ∈ S(H(d)) and ζ1, . . . , ζl ∈ P (Cn+1), l distinct zeros of q. Let σl =σl(ρ, g, ζ1, . . . , ζl) 0 < σl < 1 be the probability for f ∈ S(H(d)) that for somei = 1, . . . , l, L(f, g, ζi) meets Nζ(Σ′).

Theorem (1 Section 2)’. For each ρ > 0 and l, 1 ≤ l ≤ D there is ag ∈ S(H(d)) and distinct zeros ζ1, . . . , ζl ∈ P (Cn+1) of g such that

σl ≤ ρ2n3D3/2l .

24

Page 25: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

Now we can apply Proposition 1 to each of the l homotopies starting at (g, ζi),i = 1, . . . , l as in the proof of Theorem 1. To prove an l-zero version (change an

approximate zero to l approximate zeros and k ≤ cN3

σ1−ε to k ≤ cN3l2

σ1−ε ) one factorof l is for the probability estimate the other because we follow l homotopies.The l-zero version of Theorem 2 follows similarly.

References

[1] Blum, L., Shub, M., and Smale, S. [1989], On a theory of computation andcomplexity over the real numbers: NP -completeness, Recursive funcionsand Universal Machines, Bull. Amer. Math. Soc. 21 1-46. (Referred to asBSS in text).

[2] Cucker, F., Shub, M., and Smale, S. [1993], Separation of Complexityclasses in Koiran’s weak model, preprint.

[3] Kim, M.-H., [1988], Error analysis and bit complexity: Polynomial rootfinding problem, Part I, preprint, Bellcore, Morristown, NJ.

[4] Koiran, P. [1993], A weak version of the Blum, Shub and Smale model,34th Found. of Comp. Sc. pp. 486–495.

[5] Kostlan, E. [1987], Random polynomials and the statistical fundamentaltheorem of algebra, Preprint, Univ. of Hawaii.

[6] Malajovich-Mu=F1oz, G. [1993], On the complexity of path-following New-ton algorithms for solving systems of polynomial equations with integercoefficients, Ph. D. Thesis, University of California at Berkeley.

[7] Morgan, F. [1988], Geometric Measure Theory, A Beginners Guide, Acad-emic Press.

[8] Priest, D. [1992], On properties of floating point arithmetics: numericalstability and cost of accurate computations, Ph. D. Thesis, University ofCalifornia at Berkeley.

[9] Shub, M. [1993], On the work of Steve Smale on the theory of compu-tation, in From Topology to Computation: Proceedings of the Smalefest,(eds. Hirsch, M., Marsden, J., and Shub, M.) Springer, pp. 281–301.

[10] Shub, M. and Smale, S. [1986], Computational complexity, on the geometryof polynomials and a theory of cost: Part II, SIAM J. Computing 15 145–161.

[11] Shub, M. and Smale, S. [1993], Complexity of Bezout’s Theorem I: Geo-metric aspects (referred to as Bez I), J. of the Amer. Math. Soc. 6, 459–501.

25

Page 26: Complexity of Bezout’s theorem V: Polynomial time · operations to nd D l 1approximate zeros of f 2 P(H(d)) is less than cl2N4,ca universal constant, unless n 4 or some d i =1in

[12] Shub, M. and Smale, S. [1993], Complexity of Bezout’s Theorem II: Vol-umes and probabilities (referred to as Bez II), Computational AlgebraicGeometry (F. Eyssette and A. Galligo, Eds.) Progress in Mathematics,Vol. 109, Birkhauser, 267–285.

[13] Shub, M. and Smale, S. [1993], Complexity of Bezout’s Theorem III: Con-dition number and packing (referred to as Bez III), J. of Complexity 9,4–14.

[14] Shub, M. and Smale, S. [1993], Complexity and Bezout’s Theorem IV:Probability of Success, Extensions, (referred to Bez IV), preprint.

[15] Smale, S. [1981], The fundamental theorem of algebra and complexity the-ory, Bull. Amer. Math. Soc. (N.S.) 4, 1–36.

[16] Smale, S. [1986], Algorithms for solving equations, Proceedings of the In-ternational Congress of Mathematicians, Berkeley, CA, American Mathe-matical Society, Providence, R.I., pp. 172–195 (referred to as ICM).

[17] Smale, S. [1990], Some remarks on the foundations of numerical analysis,SIAM Rev. 32, 211–220.

26