
Survey Review, 41, 311 pp.26-43 (January 2009)

Contact: M M El-Habiby e-mail: [email protected] © 2009 Survey Review Ltd. DOI 10.1179/003962608X389988

COMPARISON AND ANALYSIS OF NON-LINEAR LEAST SQUARES METHODS FOR 3-D COORDINATES TRANSFORMATION

M. M. El-Habiby, Y. Gao and M. G. Sideris

Department of Geomatics Engineering, Faculty of Engineering, University of Calgary, 2500 University Dr. N.W., Calgary, Alberta, Canada, T2N 1N4.

ABSTRACT

Four different methods are evaluated by solving the Molodensky 3-D coordinate transformation problem. These methods are Steepest Descent, Trust Region, Gauss-Newton and Levenberg-Marquardt. Also, the problem has been solved using the traditional combined least-squares adjustment. The solutions of these methods are compared by the number of iterations required for the objective function to converge to its minimum value. Externally, the RMSE of the transformed check stations of the geodetic network (curvilinear coordinates) are compared to the RMSE obtained by transforming the same set of check stations using the transformation parameters recommended by the Egyptian Survey Authority.

KEYWORDS. Coordinate transformation. Optimisation procedures. National geodetic networks. WGS84

INTRODUCTION

Extending existing geodetic control with traditional terrestrial techniques has become impractical because of time and cost concerns. Instead, modern space techniques using artificial satellites, particularly the Global Positioning System (GPS), are employed. The combination of terrestrial and satellite networks is therefore essential to benefit from the technology and accuracy of satellite measurements. GPS results are delivered in the global average terrestrial system of the WGS84 datum, while terrestrial geodetic network results are expressed in the geodetic coordinate system of the national, local or regional geodetic datum, which generally differs from the WGS84 datum. A transformation is therefore required between the two systems.

The combination of GPS and terrestrial coordinate systems requires a mathematical model with appropriate transformation parameters for carrying out the transformation and combination between these different systems of coordinates. The classical mathematical models for datum transformation (e.g. Bursa, Molodensky, Veis, etc.) [13] have been found suitable for this purpose, due to the use of one set of rotations only. The more advanced models, such as the Krakiwsky-Thomson model [22], although they involve more than one set of rotations, are also found impractical because of the tight constraints they need and the ambiguous definition of the inner zone of data surrounding the immediate vicinity of the point under transformation. It has been found that the transformation parameters computed from these modern models usually vary from one point to another. This variation is basically due to the errors and biases accumulated during the establishment and computation of terrestrial networks [6].

The main problem, however, is to find constant or fixed values for the required transformation parameters that will be reliable enough and valid for the entire region of interest. Of course, such a requirement necessitates the availability of a sufficiently well-distributed number of common points of known coordinates in the two systems under transformation. In addition, due to the nature of the problem, a least squares adjustment should be conducted. The 3-D transformation problem has been solved before by [4], [18], [27], [5], and [8] using traditional least squares adjustment techniques. Many other optimization methods exist for non-linear least squares adjustment, but their effectiveness differs from one problem to another: an algorithm that works for one problem is not guaranteed to work for others. It is, therefore, important to investigate the characteristics and performance of different optimization algorithms in solving one of the main geodetic problems, which is the datum transformation of geodetic networks [11].

The main objective of the current research can be explicitly stated as follows: the investigation and testing of the major effects associated with using different optimization procedures in solving the 3-D coordinate transformation problem between national geodetic networks and new GPS satellite datum of WGS84.

In this paper, several non-linear least squares estimation algorithms are tested and analyzed to solve the 3-D coordinate transformation problem for geodetic networks, and their efficiency in solving this kind of transformation problem is evaluated. The paper is organized as follows: first, the 3-D coordinate transformation problem is described and the Egyptian geodetic networks are introduced as an example. The mathematical background of different nonlinear least squares optimization algorithms is then presented. Numerical results are presented to assess their performance by comparing them to the solution from a linear least squares adjustment method (combined case).

3-D COORDINATES DATUM TRANSFORMATION

A terrestrial geodetic datum is defined as an adopted uniform surface, which is usually a rotational ellipsoid, specified by its size and shape defining parameters, and rigidly fixed to the geoid through a certain defined set of positional parameters. Such a terrestrial geodetic datum could be a regional non-geocentric datum or a global geocentric datum, if its geometric center does not coincide or is coincident with the center of gravity of the earth, respectively. Hence, such a geodetic datum will be used as a basis for geodetic computations, in addition to modeling the earth gravity field irregularities and their effects on geodetic terrestrial measurements. Actually, the geodetic datum defines a 3-D geodetic coordinate system, to which all established geodetic control networks should be related. Such terrestrial networks are usually established by relative geodetic measurements taken with terrestrial surveying instruments and classical terrestrial survey techniques, such as triangulation, trilateration and hybrid techniques [4].

Conversely, modern geodetic control networks are established by modern space satellite techniques, out of which the global positioning system GPS is the most up-to-date. As mentioned before, GPS is a three-dimensional positioning technique, in terms of Cartesian or curvilinear coordinates. This is the case since an artificial geodetic satellite like GPS is orbiting the earth and controlled by its external gravitational field, which is directly connected with the center of gravity of the earth. In other words, all satellite geodetic datums must be geocentric. Any terrestrial geodetic datum can be completely defined by six topocentric positional parameters at a certain point in the network known as the “datum initial point” in addition to the two parameters defining the size and shape of the adopted reference ellipsoid [25].

Different countries use GPS satellite technology to update and refine their geodetic datum, because most of these countries already have their own national datum. This leads to the need for datum transformation from the GPS datum to their local national datum. In this research study, the discussion is limited to countries with areas of around one million square kilometres. Hungary is one of these countries where this kind of transformation was needed. Seven parameter 3-D transformations have been performed between two 3-D systems. The Hungarian national GPS frame network consists of 43 points with given EURF and HD72 ellipsoidal coordinates and with WGS84 as a reference system. The existing reference system is called Hungarian-Datum 1972 (HD72), which was developed by applying the ellipsoid of the IUGG Geodetic Reference System-1976. Traditional least squares adjustment has been used in solving the coordinate transformation problem; the results of the transformation were in the range of 0.20 to 0.3 ms for 2-D Helmert transformation [18].

Datum transformation from the Korean geodetic system to WGS84 was carried out by [27], with the aim of finding the relation between the Korean local datum and the world geodetic system. The Korean datum employs the Bessel ellipsoid as a reference, with its origin fixed at Tokyo, Japan. Tying GPS surveys to existing triangulation monuments and vertical benchmarks enables the transformation of the GPS datum to the national datum. Fifty-eight triangulation points were used to determine the seven transformation parameters between the Korean datum and WGS84 with the Bursa seven-parameter model, and traditional least squares adjustment was used to estimate the transformation parameters. All the common points were used in the computations, which found differences between the Korean datum and WGS84 horizontal coordinates for the same point in the range of approximately -360 m to 310 m in the northward direction and 180 m to 210 m in the eastward direction. No internal accuracy was calculated in that research work.

Kutoglu (2004) introduced a case study for a GPS project (GPS Control Network Project of the Istanbul Metropolitan Area) in Istanbul City, Turkey. The project was to compute the coordinates of the new referenced GPS points (ITRF94) in the national datum ED50 (European Datum 1950). The Bursa seven-parameter transformation was used, and traditional least squares adjustment was again applied to determine the transformation parameters between the two datums.

The main reason for giving the examples above is that the traditional least squares method is usually the only method used when dealing with this kind of problem. It is very difficult to compare these results with each other, because the accuracy after the transformation depends not only on the transformation formulae or the optimization method used, but also on the internal accuracy of the system. The internal accuracy of these systems obviously differs and depends on the circumstances of each local network.

In this research, the case study is Egypt. The terrestrial datum in Egypt is the EGD of Helmert 1906; all the terrestrial observations are taken relative to it. The size and shape parameters defining the Helmert ellipsoid are the semi-major axis a = 6378200 m and the reciprocal flattening 1/f = 298.3. The initial point is the “Venus” station on the Moquattam hill (near Cairo on the east side of the Nile) [2]. The satellite geodetic datum, newly used in Egypt, is the GPS satellite datum of WGS84, whose size and shape defining parameters are a = 6378137 m and 1/f = 298.257. It is a global geodetic datum that defines the 3-D average terrestrial or conventional global coordinate system. Nearly all the local and regional terrestrial geodetic datums have been redefined on the basis of satellite geocentric positioning approaches. Finally, the main advantage of working with geodetic satellite datums, instead of terrestrial regional geodetic datums, is the ease, speed and high productivity of modern GPS surveying techniques compared to the old traditional surveying techniques. Many modern practical and scientific geodetic applications are related to this new satellite datum as opposed to the old one.



The Molodensky coordinate transformation model has been used in this case study. It describes the relationship between any two different three-dimensional coordinate systems, one global satellite system and one terrestrial system, by seven unknown parameters. These parameters are three shift components $[X_0, Y_0, Z_0]$, three rotation parameters $(\omega_x, \omega_y, \omega_z)$ of the terrestrial network as connected to the datum initial point relative to the geodetic terrestrial system, and one unique scale factor $(1 + k)$ for the entire geodetic network to account for the scale error or distortion $k$ between the satellite and terrestrial systems. The mathematical expression of the Molodensky coordinate transformation model [3] takes the following form:

$X_p^{GPS} = X_0 + x_i^{Helmert} + (1 + k)\, R\, \Delta x_{ip}^{Helmert}$ (1)

The Molodensky model assumes parallelism between the regional terrestrial geodetic system and that of the global satellite system, as shown in Figure 1.

Fig.1. Molodensky seven parameter transformation model between Egyptian Geodetic Datum (EGD) and WGS84

In addition, the rotations and the scale are applied only on the topocentric vector $\Delta x_{ip}$ between any terrain point $p$ and the datum initial point $i$. This means that the rotation and the scale parameters in this model are applied on the geodetic terrestrial network itself, which is being rotated and scaled. The matrix $R$ in equation 1 is a combined rotation that consists of three rotations, relative to the terrestrial geodetic system. If $R$ is expanded and approximated by considering the rotation angles to be very small, the final form of equation 1, in matrix notation, becomes the following:

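$\begin{bmatrix} X_p \\ Y_p \\ Z_p \end{bmatrix}_{GPS} = \begin{bmatrix} X_0 \\ Y_0 \\ Z_0 \end{bmatrix} + \begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}_{Helmert} + (1 + k) \begin{bmatrix} 1 & \omega_z & -\omega_y \\ -\omega_z & 1 & \omega_x \\ \omega_y & -\omega_x & 1 \end{bmatrix} \begin{bmatrix} \Delta x_{ip} \\ \Delta y_{ip} \\ \Delta z_{ip} \end{bmatrix}_{Helmert}$ (2)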

The Molodensky model (equation 2) provides three parametric condition equations per point in seven unknown parameters. Hence, at least three control points common to both systems (nine equations) are required for the determination of these parameters. In practice, redundancy is introduced for better accuracy and reliability.
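As an illustration of how equation 2 is evaluated for a single point, the following minimal Python sketch applies the shifts, the small-angle rotation matrix and the scale to the topocentric vector. The function name, the argument layout and the sign convention of the off-diagonal rotation terms are assumptions made for this example (the common small-angle form is used), and the numbers are hypothetical.

```python
def molodensky_transform(point, initial_point, shifts, rotations, k):
    """Transform one terrestrial point into the satellite system.

    point, initial_point: (x, y, z) in the terrestrial system, in metres.
    shifts: (X0, Y0, Z0) in metres; rotations: (wx, wy, wz) in radians;
    k: scale distortion (for example 2.7e-6 for 2.7 ppm).
    """
    wx, wy, wz = rotations
    # Topocentric vector from the datum initial point i to the point p.
    delta = [p - i for p, i in zip(point, initial_point)]
    # Small-angle rotation matrix (sign convention assumed for this sketch).
    rot = [[1.0, wz, -wy],
           [-wz, 1.0, wx],
           [wy, -wx, 1.0]]
    rotated = [sum(rot[r][c] * delta[c] for c in range(3)) for r in range(3)]
    # Shift + initial point + scaled, rotated topocentric vector.
    return [shifts[r] + initial_point[r] + (1.0 + k) * rotated[r]
            for r in range(3)]


# Hypothetical numbers, chosen only to exercise the function.
initial = (1000.0, 2000.0, 3000.0)
at_initial = molodensky_transform(initial, initial, (1.0, 2.0, 3.0),
                                  (1e-6, 2e-6, 3e-6), 5e-6)
print(at_initial)
```

At the datum initial point the topocentric vector vanishes, so only the shifts act on the coordinates, whatever the rotations and the scale are. This reflects the statement that rotations and scale are applied to the network relative to the initial point.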



NON-LINEAR LEAST SQUARES: DIFFERENT OPTIMIZATION ALGORITHMS

Optimization in general can be defined as the determination of a number of parameters through minimization or maximization. The problem is formulated with an objective function whose optimum yields the optimal values of the parameters. A least squares adjustment is called non-linear when the parameters enter the objective function through a non-linear combination. Nonlinear unconstrained problems have been extensively studied since 1960 [6]. Most optimization algorithms for the least squares estimation of non-linear parameters revolve around two general approaches. The first approach expands the model as a Taylor series; corrections to the estimated unknown parameters are applied at consecutive iteration steps, based on local linearity. The second approach is mainly built on modifications of the steepest descent method, which is a gradient method. Neither approach can stand alone, because of their simplicity and lack of convergence to the optimal solution. An optimal interpolation between the two approaches can be used in the representation of a nonlinear model [13].

The following subsection introduces the general numerical characteristics of a number of iterative algorithms for solving non-linear least squares problems. All the methods mentioned belong to the iterative descent family. Five optimization methods are applied to the datum transformation problem: Steepest Descent, Trust Region, Gauss-Newton, Levenberg-Marquardt, and the traditional combined least squares adjustment with weighted observations [20]. These methods are optimal in the case of small residuals, which suits our case study well. Only the first four methods are described in the following section, because the combined least squares adjustment method has been well documented in the geodetic literature.

In general, a non-linear optimization problem without constraints can be described by the following minimization equation [22]; [23]:

$\min F(x), \quad x \in R^n, \quad F: R^n \to R$ (3)

where

$x$ unknown parameters to be estimated

$F(x)$ objective function

$R^n$ space of the $n$ real-valued unknowns

This minimization problem is solved through iterative algorithms, as mentioned before. The main target is to determine new values for the unknown parameters that lead to a descent of the objective function value (minimization) [26]. The minimization problem can usually be solved through the following general scheme:

$x_{k+1} = x_k + t_k d_k, \quad k = 0, 1, 2, \ldots$ (4)

where

$x_{k+1}, x_k$ estimated unknown parameters at iteration steps $k+1$ and $k$, respectively

$t_k$ positive scalar

$d_k$ direction vector

Generally, for the different optimization methods used in the current research study, the following steps can be followed [23]:

1. Provide a set of initial values for the unknowns and start the iteration at $k = 0$.

2. Determine the direction vector $d_k$.

3. Determine a positive scalar $t_k$ according to an adopted line search strategy, so that $F(x_{k+1}) \le F(x_k)$.

4. Assess predefined conditions that the obtained solution must satisfy.

5. Stop the iteration if the predefined conditions are satisfied and take $x_{k+1}$ as the solution of equation 3. Otherwise, increment $k$ by one and repeat steps 2 to 5.
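The five steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the steepest descent direction is used in step 2, the line search in step 3 is a simple step-halving rule with a sufficient-decrease constant, and the stopping condition is a small gradient norm. All of these choices, as well as the function and parameter names and the test function, are assumptions for the example and are not taken from the paper.

```python
def iterative_descent(F, grad, x0, tol=1e-8, max_iter=1000, armijo=0.5):
    """Generic descent iteration x_{k+1} = x_k + t_k d_k (steps 1 to 5)."""
    x = list(x0)                      # step 1: initial values, k = 0
    for k in range(max_iter):
        g = grad(x)
        g2 = sum(gi * gi for gi in g)
        # steps 4 and 5: stop when the gradient norm is below the tolerance
        if g2 ** 0.5 < tol:
            return x, k
        d = [-gi for gi in g]         # step 2: steepest descent direction
        # step 3: halve t_k until F decreases sufficiently
        t, fx = 1.0, F(x)
        while t > 1e-20:
            x_new = [xi + t * di for xi, di in zip(x, d)]
            if F(x_new) <= fx - armijo * t * g2:
                break
            t *= 0.5
        else:
            return x, k               # no acceptable step length found
        x = x_new
    return x, max_iter


# Hypothetical elongated quadratic with its minimum at (1, -2).
def cost(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def cost_grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

solution, iterations = iterative_descent(cost, cost_grad, [0.0, 0.0])
print(solution, iterations)
```

The other methods discussed in this section differ only in how step 2 (the direction $d_k$) and step 3 (the scalar $t_k$) are carried out; the surrounding loop stays the same.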

The least squares criterion enters these methods when the objective function is the weighted sum of squares:

$F(x) = \frac{1}{2}\, \| y - H(x) \|^2$ (5)

where

$y$ $m$-dimensional observation vector

$H(x)$ nonlinear vector function mapping $R^n$ into $R^m$

The positive definite matrix $Q(x_k)$ is assumed to be as follows:

$Q(x_k) = [\partial_x H(x_k)^T\, C_y^{-1}\, \partial_x H(x_k)]^{-1}$ (6)

where $C_y$ is the positive definite observation covariance matrix, and since

$\partial_x F(x_k) = -\partial_x H(x_k)^T\, C_y^{-1}\, e(x_k)$ (7)

in addition to

$e(x) = y - H(x)$ (8)

This minimization least squares procedure will be implemented in the nonlinear methods that will be discussed later, in the following subsections, in parallel with the iterative descent algorithms.

The main difference between the optimization methods used, all of which depend on this general procedure, is the choice of the direction vector $d_k$ and the scalar $t_k$. In the current research study, gradient methods are used: the direction vector $d_k$ is selected from the values of the partial derivatives of the objective function $F(x)$ with respect to the independent variables, in addition to the value of the objective function itself. The information obtained from the previous iteration step is also used. When dealing with practical problems, some modifications are necessary; these are described in the following for the different optimization methods.

Iterative Descent Gradient Algorithm

First, a brief introduction is given to descent methods in general, which helps to explain the assumptions of the steepest descent method and serves as a gateway to the optimization algorithms that follow. As mentioned before, the direction vector $d_k$ is called a ‘descent’ direction if the positive scalar $t_k$ satisfies the following condition [16]:

$F(x_k + t_k d_k) < F(x_k)$ (9)



By applying a Taylor expansion to $F(x_k + t_k d_k)$ at $x_k$:

$F(x_k + t_k d_k) = F(x_k) + t_k\, \partial_x F(x_k)\, d_k + \text{higher order terms}$ (10)

the higher order terms are neglected, leading to

$\partial_x F(x_k)\, d_k < 0$ (11)

with a chance to choose a positive scalar $t_k$ that satisfies equation 9. Different direction vectors $d_k$ are shown in the following figure [20]:

Fig.2. Objective function contours and the descent directions at value kx

The direction vector that satisfies equation 11 is in the descent direction. Generally, the descent direction can be introduced as

$d_k = -Q(x_k)\, \partial_x F(x_k)$ (12)

where $Q(x_k)$ is an arbitrary positive definite matrix depending on the estimated values at the iteration step $k$. This leads to the general descent form:

$x_{k+1} = x_k - t_k\, Q(x_k)\, \partial_x F(x_k)$ (13)

The choice of the two variables $t_k$ and $Q(x_k)$ in equation 13 orients the procedure towards a particular optimization algorithm.

Steepest Descent Method

The Steepest Descent Method is one of the oldest and most widely used optimization methods, mainly for minimizing a function of several variables. In this method, the unit matrix is chosen for the arbitrary positive definite matrix $Q(x_k)$, so the steepest descent iteration (equation 13) takes the following form:

$x_{k+1} = x_k - t_k\, \partial_x F(x_k)$ (14)


The gradient of the function $F(x)$ at a point is the direction of the most rapid increase in the value of the function at that point. The descent direction can be obtained by reversing the gradient (multiplying it by -1).

The main advantage of the steepest descent method is its simplicity because of the use of partial derivatives no higher than first order. Conversely, this simplicity is the main disadvantage of this method, since its effectiveness depends on the initial assumption to the unknown parameters to avoid falling in a local minimum. This deficiency is the result of the tendency of this method to zigzag, especially when the contour of the objective function is elongated, as shown in Figure 3.

Fig.3. Steepest descent tendency to zigzag

The steepest descent method always searches for the nearest minimum, so the obtained solution ends up at a local minimum [12]. When the objective function is the weighted sum of squares (least squares adjustment), equations 5 and 7, the steepest descent iteration becomes:

$x_{k+1} = x_k + t_k\, \partial_x H(x_k)^T\, C_y^{-1}\, e(x_k)$ (15)

Trust Region Method

The Trust Region Method is one of the basic descent methods and it is still one of the most powerful methods for dealing with nonlinear least squares problems. The positive definite matrix in this method is chosen to have the following form [13]:

$t_k\, Q(x_k) = [\partial^2_{xx} F(x_k) + \alpha_k R]^{-1}$ (16)

where

$\alpha_k$ non-negative scalar

$R$ positive definite matrix

Therefore, equation 13 takes the trust region form

$x_{k+1} = x_k - [\partial^2_{xx} F(x_k) + \alpha_k R]^{-1}\, \partial_x F(x_k)$ (17)

Formulating the descent algorithm in this form (Trust Region method) improves the stability of the solution when starting from an initial assumption far from the solution. The positive definite matrix $R$ keeps the matrix in equation 16 positive definite when the non-negative scalar is large enough. By controlling $\alpha_k$, the lack of a descent direction can be prevented, even when $\partial^2_{xx} F(x_k)$ is not positive definite. With this method, a trial to minimize the objective function $F(x)$ is always introduced in the procedure. Suppose that the procedure starts at a certain value $x_0$ and needs to move to a lower function value. The basic idea is to approximate the objective function $F(x)$ by a simpler function $f$ that reflects the behavior of $F(x)$ in a neighborhood around the point $x_0$, namely the trust region $M$. To this end, an initial increment $\delta$ is computed by an approximate minimization of the objective function $F(x)$, i.e., by minimizing the approximating function $f(\delta)$ over the neighborhood of the initial values [21].

$\min f(\delta), \quad \delta \in M$ (18)

The initial value is updated to $(x + \delta)$, leading to one of two cases: either $F(x + \delta) < F(x)$ is fulfilled or not. In the first case, the value is updated to $(x + \delta)$. In the second case, the initial value stays as it is, $x_0$, and a new increment is chosen after the trust region $M$ is changed (shrunk). The main problems in this method are the definition and computation of the approximation $f$ (defined at the current point $x$), the modification of the trust region $M$, and the solution of the minimization problem of equation 18; together they determine the increment $\delta$. The trust region iteration based on the minimization of weighted least squares terms takes the following form:

$x_{k+1} = x_k + [\partial_x H(x)^T C_y^{-1}\, \partial_x H(x) - e(x)^T C_y^{-1}\, \partial^2_{xx} H(x) + \alpha_k R]^{-1}\, \partial_x H(x)^T C_y^{-1}\, e(x)$ (19)

because

$\partial^2_{xx} F(x) = \partial_x H(x)^T C_y^{-1}\, \partial_x H(x) - e(x)^T C_y^{-1}\, \partial^2_{xx} H(x)$ (20)

Furthermore, if $R$ is equal to the unit matrix, the trust region method can be regarded as an intermediate method between the simple Newton methods and steepest descent. The value of the positive scalar $\alpha_k$ controls the approximation between the quadratic approximation (Newton’s methods) and the linear approximation (steepest descent method), as shown in Figure 4.

Fig.4. Trust region: a method between the Newton and steepest descent methods
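The accept-or-shrink logic of the trust region procedure described above can be illustrated with a deliberately simplified Python sketch. It assumes a linear approximation $f$ of the objective function and a spherical region $M$, so that the minimizer of equation 18 is a step of length equal to the region radius against the gradient. This is a simplification chosen for the example, not the weighted least squares form of equation 19, and the function names, the halving factor and the test function are hypothetical.

```python
def trust_region_linear(F, grad, x0, radius=1.0, tol=1e-8, max_iter=500):
    """Accept/shrink loop with a linear model f over a ball M of given radius."""
    x = list(x0)
    for _ in range(max_iter):
        if radius < tol:
            break                      # region too small to make progress
        g = grad(x)
        g_norm = sum(gi * gi for gi in g) ** 0.5
        if g_norm == 0.0:
            break                      # stationary point reached
        # Minimizer of the linear model over the ball: a step of length
        # `radius` against the gradient.
        delta = [-radius * gi / g_norm for gi in g]
        trial = [xi + di for xi, di in zip(x, delta)]
        if F(trial) < F(x):
            x = trial                  # first case: accept x + delta
        else:
            radius *= 0.5              # second case: keep x, shrink M
    return x


# Hypothetical objective with its minimum at (3, -1).
def bowl(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2

def bowl_grad(x):
    return [2.0 * (x[0] - 3.0), 2.0 * (x[1] + 1.0)]

print(trust_region_linear(bowl, bowl_grad, [0.0, 0.0]))
```

Each rejected trial costs an iteration without moving the current point, which is one reason why a trust region procedure can need many more iterations than a method that applies its step directly.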

The Gauss-Newton Method

The Gauss-Newton Method is a central method. It can be derived from the Newton-Raphson method with the implementation of the minimum sum of squares in Newton's model [19]. In this method, the objective function will be the weighted sum of squares:

$F_{GN}(x) = \frac{1}{2}\, \| y - H(x) \|^2$ (21)

where

$y$ $m$-dimensional observation vector

$H(x)$ nonlinear vector function mapping $R^n$ into $R^m$

The positive definite matrix $Q(x_k)$ is assumed to be as follows:

$Q(x_k) = [\partial_x H(x_k)^T\, C_y^{-1}\, \partial_x H(x_k)]^{-1}$ (22)

where $C_y$ is the positive definite observation covariance matrix, and since

$\partial_x F_{GN}(x_k) = -\partial_x H(x_k)^T\, C_y^{-1}\, e(x_k)$ (23)

in addition to

$e(x) = y - H(x)$ (24)

The Gauss-Newton method can be expressed by the following equation:

$x_{k+1} = x_k + [J^T C_y^{-1} J]^{-1}\, J^T C_y^{-1}\, e(x_k)$ (25)

where the Jacobian matrix $J$ is the first derivative of the mapping vector $H(x)$. The Gauss-Newton method can jam at a point where the Jacobian $J$ is rank deficient. To avoid this problem, a positive definite matrix can be added to $J^T C_y^{-1} J$ inside the inverse $[J^T C_y^{-1} J]^{-1}$, which leads to an algorithm known as the Levenberg-Marquardt method [10].
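The Gauss-Newton update of equation 25 can be illustrated on a small problem. The sketch below is not the seven-parameter Molodensky adjustment of this paper. It estimates only a scale $s$ and a rotation angle $\theta$ of a 2-D similarity transformation, with a diagonal covariance matrix (one standard deviation per point), so that the 2 x 2 normal equations can be solved in closed form. The model, the function names and the noise-free data are assumptions made for the example.

```python
import math


def similarity(s, theta, p):
    """H(x): scale and rotate a 2-D point p = (u, v)."""
    c, sn = math.cos(theta), math.sin(theta)
    return (s * (c * p[0] - sn * p[1]), s * (sn * p[0] + c * p[1]))


def gauss_newton(points, observed, sigmas, x0, n_iter=10):
    """Iterate x_{k+1} = x_k + [J^T C^-1 J]^-1 J^T C^-1 e(x_k) for x = (s, theta)."""
    s, theta = x0
    for _ in range(n_iter):
        n11 = n12 = n22 = u1 = u2 = 0.0
        c, sn = math.cos(theta), math.sin(theta)
        for p, y, sig in zip(points, observed, sigmas):
            w = 1.0 / sig ** 2                      # weight from C_y^-1
            hx, hy = similarity(s, theta, p)
            ex, ey = y[0] - hx, y[1] - hy           # residual e = y - H(x)
            # Jacobian columns for this point: d/ds and d/dtheta.
            jsx, jsy = c * p[0] - sn * p[1], sn * p[0] + c * p[1]
            jtx, jty = -hy, hx
            n11 += w * (jsx * jsx + jsy * jsy)      # J^T C^-1 J
            n12 += w * (jsx * jtx + jsy * jty)
            n22 += w * (jtx * jtx + jty * jty)
            u1 += w * (jsx * ex + jsy * ey)         # J^T C^-1 e
            u2 += w * (jtx * ex + jty * ey)
        det = n11 * n22 - n12 * n12
        s += (n22 * u1 - n12 * u2) / det            # 2x2 solve (Cramer's rule)
        theta += (n11 * u2 - n12 * u1) / det
    return s, theta


# Hypothetical noise-free data generated with s = 1.2 and theta = 0.1 rad.
pts = [(1.0, 0.0), (0.0, 1.0), (2.0, 3.0), (-1.0, 2.0)]
obs = [similarity(1.2, 0.1, p) for p in pts]
print(gauss_newton(pts, obs, [0.01, 0.02, 0.01, 0.03], (1.0, 0.0)))
```

With an initial guess this close to the solution and zero residuals, the full Gauss-Newton step converges in a handful of iterations. If the determinant `det` approached zero, which corresponds to the rank deficient case mentioned above, the division would fail.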

The Levenberg-Marquardt Method

The Levenberg-Marquardt Method can be considered as a modified Gauss-Newton algorithm using the trust region procedure. A scalar quantity is introduced to control the magnitude and the direction of the descent. This quantity determines the use of either the steepest descent direction or the Gauss-Newton direction. The scalar quantity is symbolized as $\alpha$. The search direction is determined as a subproblem from the solution of a set of linear equations, in the following form [12]:

$[J^T C_y^{-1} J + \alpha I]\, d_k = -\partial_x F(x_k)$ (26)

where, by equation 23, the right-hand side equals $J^T C_y^{-1} e(x_k)$. In the case of $\alpha = 0$, the algorithm is the same as the Gauss-Newton Method. If $\alpha$ tends to a relatively large value, theoretically infinity, it turns into the simple steepest descent. The difficulty of the Levenberg-Marquardt method lies in controlling the size of $\alpha$ so that each iteration step is efficient. The method used to control the step size is to estimate the relative nonlinearity of the objective function $F(x)$ using a linear predicted sum of squares $f_{lp}(x_k)$:


$f_{lp}(x_k) = J(x_{k-1})\, d_{k-1} + F(x_{k-1})$ (27)

and a cubically interpolated estimate of the minimum $f_{cl}(x)$, where $f_{cl}(x)$ is obtained by cubically interpolating $f(x_k)$ and $f(x_{k-1})$ [13]. If $f_{lp}(x_k)$ is greater than $f_{cl}(x)$, then $\alpha$ is reduced; otherwise, it is increased to ensure that $f(x_{k+1}) < f(x_k)$ at each iteration step, so that the algorithm keeps descending towards the solution [14], [9], [17].
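The effect of the scalar $\alpha$ in equation 26 on the search direction can be checked numerically with a short Python sketch. It solves the damped 2 x 2 system for a small and a very large $\alpha$. The matrix and vector values are hypothetical, and the helper name is an assumption for this example; the control of $\alpha$ through equation 27 is not reproduced here.

```python
def lm_direction(N, u, alpha):
    """Solve [N + alpha*I] d = u for a 2x2 matrix N = J^T C_y^-1 J."""
    a11, a12 = N[0][0] + alpha, N[0][1]
    a21, a22 = N[1][0], N[1][1] + alpha
    det = a11 * a22 - a12 * a21
    return [(a22 * u[0] - a12 * u[1]) / det,
            (a11 * u[1] - a21 * u[0]) / det]


N = [[4.0, 1.0], [1.0, 3.0]]   # hypothetical J^T C_y^-1 J
u = [1.0, 2.0]                 # hypothetical J^T C_y^-1 e (minus the gradient)

gn_dir = lm_direction(N, u, 0.0)   # alpha = 0: Gauss-Newton direction
sd_dir = lm_direction(N, u, 1e6)   # large alpha: close to u / alpha
print(gn_dir, sd_dir)
```

For $\alpha = 0$ the direction is the Gauss-Newton step, here proportional to (1, 7). For the large $\alpha$ the direction is nearly parallel to `u`, that is, to the negative gradient, but with a length of roughly $1/\alpha$ times that of `u`. A large damping therefore means both a steepest descent direction and a short step.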

NUMERICAL RESULTS AND ANALYSIS

A comparison among different least squares methods will be conducted through the solution of the 3-D coordinates transformation problem for geodetic networks. Sixteen common stations with known coordinates in EGD and WGS84 are used in the computations of the Molodensky seven transformation parameters. This step is followed by the transformation of the 23 check stations, using the transformation parameters determined from the different least squares algorithms. The analysis of the solutions from different methods will be carried out in two major steps. First, a demonstration of the computed transformation parameters and the algorithms characteristics are provided through a comparison among different methods. Second, a comparison of the RMSE at 23 checkpoints is conducted among different methods.
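The external comparison relies on the RMSE of the differences at the check stations. A minimal Python sketch of such a computation is given below. It assumes that each station is represented by three coordinate components and that the RMSE is formed per component; the function name and the sample numbers are hypothetical and are not data from this study.

```python
import math


def rmse_per_component(transformed, reference):
    """RMSE of each coordinate component over all check stations."""
    n = len(reference)
    sums = [0.0, 0.0, 0.0]
    for t, r in zip(transformed, reference):
        for c in range(3):
            sums[c] += (t[c] - r[c]) ** 2   # squared difference per component
    return [math.sqrt(s / n) for s in sums]


# Hypothetical coordinates of two check stations.
transformed = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
reference = [(1.0, 2.0, 2.0), (4.0, 5.0, 7.0)]
print(rmse_per_component(transformed, reference))
```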

Data Used

The 16 common points are available between the terrestrial geodetic network related to the old datum Helmert 1906 (non-geocentric datum) and the satellite geodetic network (geocentric datum). These points are well distributed (according to the current situation of the network) as shown in Figure 5 (a), which is helpful for the convergence of the solution. The second part of the data is 23 common points, which are used as check points to assess the reliability of the obtained transformation parameters. The distribution of these points is given in Figure 5 (b).

Results and analysis

The first method used to find a solution for this non-linear least squares problem is the steepest descent. This method failed to find a solution because the iterated solutions diverged and never reached a solution, even when a significant number of iterations was forced. This failure can easily be seen in Figure 6. The main drawback of this method is the need for a good initial value $x_0$, which must be very close to the solution. Such a requirement is very difficult to meet in practice. Also, the use of the original steepest descent is not practical for most cases because of its tendency to zigzag, as mentioned before. The steepest descent method minimizes the cost function for only ten iterations; after that, its value does not change and the procedure terminates, giving a solution very far from the expected one. This is mainly because it falls into a local minimum rather than the absolute one. The output transformation parameters can be seen in Table 1.

From Table 1 it is clear that all the values of the seven transformation parameters are far from the expected solution in the first trial, where all the unknown parameters have zeros as the initial value. This is because the iteration stopped after it reached the first local minimum. In order to show this drawback, another trial was conducted by assigning the parameters with initial values very close to the expected solution, obtained from other methods, so they are guaranteed to be close to it.

Even though the solution looks close to the values announced by the Egyptian Surveying Authority (ESA), it is false, as the iteration falls into the local minimum closest to the initial values of the unknown parameters. Moreover, this trial is generally not practical, because a close approximation to the solution (unknowns) is rarely known. The question that can be raised now is why this method is used in this research even though it was expected to fail. The answer is that this method is useful for showing the advantages of the other methods. It highlights the modifications introduced in the other methods to overcome this weakness, even though they also depend on similar gradient algorithms.

Fig.5a Sixteen common stations used in the computation of the seven transformation parameters

Fig.5b Twenty-three common check stations



The trust region method converges to the expected solution, but only after a large number of iterations, equal to 16495. This is a very large number, but the algorithm finally reached a reasonable minimization of the objective function, and the values of the parameters are close to those announced by the ESA, as shown in Figure 6 and Table 3. From Figure 6, it is clear that the minimization rate of the objective function is very high in the first part of the iteration procedure and decreases slightly in the second part of the curve; it then becomes steep again until it turns nearly constant at the minimum sum of squares, equal to 23.856.

Fig.6. Relation between the number of iterations and the value of the cost (objective function) for 4 different optimization algorithms (horizontal axis: No. of iterations, logarithmic scale from 10^0 to 10^5; vertical axis: cost function value, 0 to 5 x 10^5; curves: Steepest, Gauss-Newton, Levenberg-Marquardt, Trust Region)

Table 1. Steepest descent approach: first and second trials

Transformation parameter    First Trial     Second Trial

X0 (m)                      -1.8443E-08     -126.000000010044

Y0 (m)                      1.68E-08        113.000000010077

Z0 (m)                      -1.4E-09        -10.65000001

ωx (s)                      17.74357        -0.34734

ωy (s)                      24.16675        -0.341387

ωz (s)                      24.41236        -1.3463

k (ppm)                     -53.9413        2.74917

From Figure 6, the control of the approximation can be recognized in the first part of the trust region cost function curve, which corresponds to the steepest descent and coincides with the curve obtained from the standalone steepest descent trial. As mentioned before, the trust region is a modified descent method. This modification comes from the introduction of the scalar value $\alpha_k$, which changes in each iteration step according to the evaluation of the objective function. At the beginning of the iteration process, the control of the trust region is oriented to the steepest descent, until it is trapped by the first local minimum. This is similar to the first steepest descent trial, which starts with the same initial values as the trust region case here. After a while, the control of the approximation starts assuming different values for $\alpha_k$, enforcing global properties, or in other words moving to the quadratic approximation of Newton’s methods, until it reaches the solution after a number of iterations.

This kind of control improves the robustness of the solution to some extent. In contrast to the steepest descent method, the trust region solution has global convergence properties. The reason for the huge number of iterations is the way the iteration is carried out. The procedure forms a neighborhood (trust region) around the current values of the unknown parameters, starting from their initial values. If the step computed with the current step factor α_k reduces the cost function, the algorithm moves to the new point and searches for still smaller function values. Otherwise, the trust region is shrunk around the current point, and the smaller neighborhood is used to estimate a new step α_{k+1}. Consequently, several iterations may be spent at one set of parameter values before the trust region moves to a new center. This significantly increases the number of iteration steps compared with the other methods, which do not create a neighborhood around the point and use the step directly to update the unknown parameters.
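The accept-or-shrink logic described above can be sketched as follows. This is a minimal textbook-style trust-region loop with a Cauchy-point step, not the implementation used in this study. The test function, the thresholds 0.25 and 0.75, the scaling factors and all names are assumptions made for illustration.

```python
import math


def trust_region(f, grad, hess, x, radius=1.0, tol=1e-6, max_iter=200):
    """Basic trust-region loop using a Cauchy-point (steepest descent) step."""
    n = len(x)
    rejected = 0
    for _ in range(max_iter):
        g = grad(x)
        gnorm = math.sqrt(sum(gi * gi for gi in g))
        if gnorm < tol:
            break
        B = hess(x)
        Bg = [sum(B[i][j] * g[j] for j in range(n)) for i in range(n)]
        gBg = sum(gi * bi for gi, bi in zip(g, Bg))
        # Cauchy point: minimise the quadratic model along -g inside the region.
        tau = 1.0 if gBg <= 0.0 else min(1.0, gnorm ** 3 / (radius * gBg))
        s = [-tau * radius / gnorm * gi for gi in g]
        Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
        predicted = -(sum(gi * si for gi, si in zip(g, s))
                      + 0.5 * sum(si * bi for si, bi in zip(s, Bs)))
        trial = [xi + si for xi, si in zip(x, s)]
        # Ratio of the actual to the predicted reduction of the cost.
        rho = (f(x) - f(trial)) / predicted
        if rho < 0.25:
            radius *= 0.25   # model was poor: shrink the neighbourhood
        elif rho > 0.75:
            radius *= 2.0    # model was good: allow longer steps
        if rho > 0.0:
            x = trial        # accept the step and move the region's centre
        else:
            rejected += 1    # keep the same centre and retry with a smaller region
    return x, rejected


# Hypothetical smooth test function with its minimum at (0, 0).
def f(p):
    return math.exp(p[0]) - p[0] + p[1] ** 2


def grad(p):
    return [math.exp(p[0]) - 1.0, 2.0 * p[1]]


def hess(p):
    return [[math.exp(p[0]), 0.0], [0.0, 2.0]]


solution, n_rejected = trust_region(f, grad, hess, [3.0, 2.0])
print(solution, n_rejected)
```

Each rejected trial costs an iteration without moving the centre, which is the mechanism the text gives for the large iteration count.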

The Gauss-Newton method converges much more rapidly than the trust region method, as shown in Figure 6: only 13 iterations are required. The main reason for this quick convergence is that the observations (16 common points) have small residuals (Table 3). The solution is given in Table 2. Because the residuals are small, the Gauss-Newton method provides a fast quadratic approximation of the function without needing to enforce a global parameter as in the trust region method.
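A minimal sketch of a Gauss-Newton iteration on a weakly non-linear, small-residual problem is given below. It does not use the Molodensky model or the data of this study. A hypothetical 2-D scale-and-rotation model with synthetic, error-free points is assumed, and the names `gauss_newton`, `K_TRUE` and `THETA_TRUE` are illustrative only.

```python
import math


def residuals_and_jacobian(k, theta, src, dst):
    """Residuals and Jacobian of the model dst = k * R(theta) * src."""
    c, s = math.cos(theta), math.sin(theta)
    r, J = [], []
    for (x, y), (xt, yt) in zip(src, dst):
        u, v = c * x - s * y, s * x + c * y   # rotated source point
        r += [k * u - xt, k * v - yt]
        J += [(u, -k * v), (v, k * u)]        # columns: d/dk, d/dtheta
    return r, J


def gauss_newton(src, dst, k=1.0, theta=0.0, tol=1e-12, max_iter=50):
    for it in range(1, max_iter + 1):
        r, J = residuals_and_jacobian(k, theta, src, dst)
        # Normal equations (J^T J) d = -J^T r, solved in closed form (2 x 2).
        a11 = sum(j[0] * j[0] for j in J)
        a12 = sum(j[0] * j[1] for j in J)
        a22 = sum(j[1] * j[1] for j in J)
        b1 = -sum(j[0] * ri for j, ri in zip(J, r))
        b2 = -sum(j[1] * ri for j, ri in zip(J, r))
        det = a11 * a22 - a12 * a12
        dk = (b1 * a22 - a12 * b2) / det
        dtheta = (a11 * b2 - a12 * b1) / det
        k, theta = k + dk, theta + dtheta
        if abs(dk) < tol and abs(dtheta) < tol:
            return (k, theta), it
    return (k, theta), max_iter


# Synthetic data: small scale change and rotation (assumed values).
ARCSEC = math.pi / (180.0 * 3600.0)
K_TRUE, THETA_TRUE = 1.0000031, -1.45 * ARCSEC
src = [(1000.0, 200.0), (-300.0, 800.0), (450.0, -700.0), (-900.0, -150.0)]
dst = [(K_TRUE * (math.cos(THETA_TRUE) * x - math.sin(THETA_TRUE) * y),
        K_TRUE * (math.sin(THETA_TRUE) * x + math.cos(THETA_TRUE) * y))
       for x, y in src]

(k_hat, theta_hat), n_iter = gauss_newton(src, dst)
print(k_hat, theta_hat / ARCSEC, n_iter)
```

Because the parameters are small and the residuals vanish at the solution, the linearized step is already very close to the full correction, so few iterations are needed in this sketch.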

Table 2. Solutions of the four successful optimization procedures for the 3-D transformation problem

Transformation   Trust Region   Gauss-Newton   Levenberg-    Combined        Recommended transformation
parameters                                     Marquardt     least squares   parameters [5]
X0 (m)           -126.331       -126.3720      -126.3720     -125.7668       -125.9300
Y0 (m)           113.6059       113.6344       113.6344      113.4115        112.2900
Z0 (m)           -10.649295     -10.6561       -10.6561      -10.4108        -9.6100
ωX (s)           -0.45846       -0.4625        -0.46226      -0.53303        -0.16710
ωY (s)           -0.38517       -0.39395       -0.39375      0.10559         -0.1351
ωZ (s)           -1.44530       -1.45432       -1.45358      -1.69241        -1.90690
k (ppm)          3.085000       3.102675       3.102657      3.414906        3.13740


This method performed very well in our nonlinear problem. In other cases, the procedure can fall into a local minimum and fail to reach the absolute minimum, depending mainly on the degree of non-linearity of the model; the Gauss-Newton method then needs to be modified in a way similar to the trust region algorithm. The degree of non-linearity in our case is weak because the transformation parameters are small in comparison with the coordinate values.

As mentioned before, the Levenberg-Marquardt method is a modification of the Gauss-Newton method. The modification controls the approximation used in the algorithm, making it either linear or quadratic, by updating a positive scalar α according to whether the value of the objective function decreases (descent). In this case, the algorithm reached the solution in 19 iterations, slightly more than the 13 needed by the Gauss-Newton method. Figure 6 shows that the two curves are almost identical, but with six additional ineffective iteration trials for Levenberg-Marquardt, because the algorithm keeps changing the value of the positive scalar to be sure of reaching an absolute minimum. Owing to the small size of the residuals relative to the observations, Levenberg-Marquardt neither changes the solution appreciably nor improves the minimization process compared with the original Gauss-Newton method. Consequently, the two methods behave almost identically in solving this 3-D transformation problem.
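The role of the positive scalar α can be sketched with a one-parameter example. The code below is a simplified Levenberg-Marquardt loop under stated assumptions: the exponential model, the factor-of-ten update rule for α and the function name are hypothetical and are not taken from the implementation used here.

```python
import math


def levenberg_marquardt(ts, ys, b, alpha=1e-3, tol=1e-12, max_iter=100):
    """Fit y = exp(b * t) by damped Gauss-Newton steps."""
    def cost(value):
        return sum((math.exp(value * t) - y) ** 2 for t, y in zip(ts, ys))

    alphas = []
    for _ in range(max_iter):
        r = [math.exp(b * t) - y for t, y in zip(ts, ys)]
        J = [t * math.exp(b * t) for t in ts]
        jtj = sum(j * j for j in J)
        jtr = sum(j * ri for j, ri in zip(J, r))
        # Damped normal equation: (J^T J + alpha) * step = -J^T r.
        step = -jtr / (jtj + alpha)
        if abs(step) < tol:
            break
        if cost(b + step) < cost(b):
            b += step
            alpha /= 10.0   # descent: move towards the Gauss-Newton step
        else:
            alpha *= 10.0   # no descent: move towards a short gradient step
        alphas.append(alpha)
    return b, alphas


# Synthetic, error-free observations generated with b = 0.5 (assumed value).
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.5 * t) for t in ts]

b_hat, alphas = levenberg_marquardt(ts, ys, 2.0)
print(b_hat, len(alphas))
```

When every trial step already reduces the cost, α only shrinks and the iteration is effectively Gauss-Newton, which is consistent with the nearly identical curves reported for the two methods.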

Table 3. Residuals of the observations (16 common points) using Gauss-Newton

        RESIDUAL                                      RESIDUAL
point   X          Y          Z           point   X            Y              Z
1       -0.785     -0.33157   0.530892    9       -0.1879269   0.61963858     1.337837
2       0.632721   0.336022   0.146952    10      1.2113846    -0.330635608   -0.80374
3       0.00188    -0.75466   -0.7879     11      -0.1993715   0.005747872    -0.27717
4       -0.04594   1.186604   1.248978    12      -0.2587492   -0.088886518   -0.78392
5       -0.51016   -1.20177   0.080968    13      -0.4118953   0.146445891    0.778373
6       0.252075   -0.18248   -0.37269    14      -0.1796459   -0.286204561   0.194761
7       -0.61793   -0.86837   -0.80885    15      1.3194063    1.350235822    -0.30494
8       -0.56716   -0.0828    1.847306    16      -0.6631413   -0.597105701   0.062434

It is worth mentioning that the update of the positive scalar controls the direction of descent. Figure 7 shows how α changes from one iteration to the next. Its value decreases through the first part of the iteration process, where it works effectively, up to iteration no. 10. It then starts to vary strongly in an attempt to decrease the cost function value below 23.86, but fails. The solution therefore terminates because two successive sets of values of the unknown parameters agree. The seven transformation parameters obtained by this method are given in Table 2, above.

The final algorithm used to solve this 3-D transformation problem was the traditional combined least-squares adjustment with weighted residuals. The importance of this method lies in its effectiveness in updating the values of the unknown parameters, according to the evaluation of their values at each iteration step and to a pre-specified accuracy, which leads to a descent direction. This method is compared with the methods discussed previously, which are not frequently used in geodetic applications. The results obtained are given in Table 2. The algorithm reaches the solution in three iteration steps. The iteration stops when the difference between the last two successive values of the unknown parameters falls below the required accuracy, which is 1 mm for the translations, 0.1 sec for the angles, and 10^-12 for the scale factor. This solution is compared with the one already recommended by the Egyptian Survey Authority, which was derived by El Tokhey [5]; the standard deviations of both solutions are given in Table 4.
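The stopping rule just described can be written as a small check on two successive parameter estimates. The sketch below is illustrative only: the ordering of the parameter vector, the units (translations in metres, rotations in arc seconds, scale as a unitless ratio) and the function name `has_converged` are assumptions, not details given in the text.

```python
# Tolerances quoted in the text; the units of the parameter vector are assumed.
TOL_TRANSLATION_M = 1.0e-3    # 1 mm
TOL_ROTATION_ARCSEC = 0.1     # 0.1 sec
TOL_SCALE = 1.0e-12           # scale factor


def has_converged(previous, current):
    """Compare two successive estimates (X0, Y0, Z0, wX, wY, wZ, k)."""
    tolerances = ([TOL_TRANSLATION_M] * 3
                  + [TOL_ROTATION_ARCSEC] * 3
                  + [TOL_SCALE])
    return all(abs(c - p) <= tol
               for p, c, tol in zip(previous, current, tolerances))


# Hypothetical successive estimates differing by 0.5 mm in X0 only.
previous = (-125.7668, 113.4115, -10.4108, -0.53303, 0.10559, -1.69241, 3.414906e-6)
current = (-125.7663, 113.4115, -10.4108, -0.53303, 0.10559, -1.69241, 3.414906e-6)
print(has_converged(previous, current))
```

All seven differences must fall below their tolerances at the same time, so the strictest criterion (here the scale factor) decides when the iteration stops.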

Fig.7. Variation of the positive definite scalar α through the iteration process

Table 4. A comparison between the estimated standard deviations of the Molodensky seven-parameter transformation model for the current combined least squares solution and the El Tokhey [5] solution

Transformation parameters   Current solution   El Tokhey [5] solution
X0 (m)                      0.165473           0.8400
Y0 (m)                      0.165065           0.1300
Z0 (m)                      0.164779           0.4200
ωX (s)                      0.102089           0.3000
ωY (s)                      0.161649           0.4200
ωZ (s)                      0.191845           0.4800
k (ppm)                     0.460764           1.3800

El Tokhey [5] derived the latest and most reliable set of transformation parameters of the Molodensky seven-parameter model for the transformation from EGD to WGS84, using the ASU97 adjusted EGD values [1]. This solution is available and is recommended to the Egyptian Survey Authority, at least for the moment [7]. The El Tokhey [5] parameters were deduced using the 15 common stations between the old Egyptian geodetic network, with ASU97 adjusted coordinates, and the established GPS HARN network [4]. The results of these investigations are given in Table 2 in terms of the transformation parameter values, and in Table 4 for the corresponding standard deviations.

Finally, the different methods used to solve the problem are compared by examining their effect on the stations of the geodetic network transformed from the old system (Helmert 1906) to the new system (WGS84). The root mean square error (RMSE) of the transformed points, shown in Figure 8, illustrates the accuracy in the three coordinate components. The RMSE is computed from the differences between the coordinates of the stations transformed using the newly computed transformation parameters and the known measured values for these stations.
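The RMSE computation described above can be sketched per coordinate component as follows. The function name and the toy coordinates are hypothetical; in the study the inputs would be the transformed and the measured coordinates of the check stations.

```python
import math


def rmse_per_component(transformed, measured):
    """RMSE of the differences, separately for each of the three components."""
    n = len(transformed)
    sums = [0.0, 0.0, 0.0]
    for t, m in zip(transformed, measured):
        for i in range(3):
            sums[i] += (t[i] - m[i]) ** 2
    return tuple(math.sqrt(s / n) for s in sums)


# Toy example with two stations (made-up numbers, not data from this study).
transformed = [(1.0, 2.0, 3.0), (3.0, 2.0, 1.0)]
measured = [(0.0, 2.0, 3.0), (3.0, 4.0, 1.0)]
print(rmse_per_component(transformed, measured))
```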


Figure 8 shows that the RMSE of the different methods varies with the curvilinear coordinate considered. For the latitude, the best solution (minimum RMSE) is that of the Gauss-Newton method, followed by Levenberg-Marquardt. For the longitude, the traditional combined least squares has the minimum RMSE. For the geodetic height, the combined solution is again the best, with the minimum RMSE.

Although there are some differences between the methods, none deviates significantly from the practical accuracy required by the Egyptian Survey Authority (ESA). The results of the steepest descent algorithm have not been taken into account, because that solution depends on assuming initial values close to the correct solution, which is not practical; the algorithm failed when started from the initial values used in all the other optimization algorithms.

Fig.8. RMSE of 3-D coordinate differences between the transformed and the measured data

CONCLUSIONS

Most of the methods used are gradient-based techniques. As expected, the steepest descent method failed to reach a solution in this non-linear problem, mainly because of its local behavior in minimizing the objective function. The trust region method succeeded in determining a solution to the transformation problem. Its use of a positive scalar factor improves the robustness of the solution, but it leads to a huge number of iterations as a result of creating a trust region at each iteration step.

Newton methods, which use the second partial derivatives, can solve a quadratic optimization problem in one iteration step. The Gauss-Newton method worked successfully in this non-linear problem despite its local behavior, which has no effect on the solution because of the weak nonlinearity of the problem. The Levenberg-Marquardt method is an improvement on the Gauss-Newton method because it controls the approximation between Gauss-Newton and steepest descent in order to introduce global properties. Here it has no significant effect, since the properties of the traditional Gauss-Newton method are sufficient for solving this non-linear problem.
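The one-step property of Newton's method on a quadratic can be verified directly. The symbols below ($A$, $b$, $c$, $x_0$, $x_1$) are introduced only for this illustration and do not appear elsewhere in the paper; $A$ is assumed symmetric and positive definite.

```latex
\begin{align}
f(x) &= \tfrac{1}{2}\, x^{T} A x - b^{T} x + c, \\
\nabla f(x) &= A x - b, \qquad \nabla^{2} f(x) = A, \\
x_{1} &= x_{0} - \left[\nabla^{2} f(x_{0})\right]^{-1} \nabla f(x_{0})
       = x_{0} - A^{-1}\,(A x_{0} - b) = A^{-1} b .
\end{align}
```

Since $\nabla f(x_1) = A A^{-1} b - b = 0$, the minimum is reached after a single step from any starting point $x_0$. For a weakly non-linear least-squares problem the cost is close to quadratic near the solution, which is consistent with the small number of iterations reported above.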

All methods except steepest descent reached a solution successfully because of the weak non-linearity of this model, which results from the very small values of the rotation angles. It is recommended to apply the trust region and Levenberg-Marquardt methods to other non-linear problems because of their global behaviour.

[Figure 8: RMSE (m), vertical axis from 0 to 1.6, for the Trust Region, Combined, Levenberg-Marquardt and Gauss-Newton solutions; series: Latitude, Longitude, Height]


References

1. Awad, M. I., 1997. Studies towards the rigorous adjustment and analysis of the Egyptian Primary Geodetic Networks using Personal Computers. Ph.D. thesis, Department of Public Works, Faculty of Engineering, Ain Shams University, Cairo, Egypt.

2. Cole, J. H., 1944. Geodesy in Egypt. Report published at the Surveying Department, Ministry of Finance, Government Press, Cairo, Egypt.

3. El Hoseny, M. S., 1990. Doppler satellite control and its application in the Egyptian geodetic network. Survey Review, Vol. 30, No. 235, pp. 221, London, UK.

4. El Habiby, M. M., 2002. Effects of Coordinate Transformation from Helmert 1906 Old System to WGS84 GPS New System on Geodetic Networks and Related Mapping System in Egypt. M.Sc. thesis, Public Works Department, Faculty of Engineering, Ain Shams University, Cairo, Egypt.

5. El Tokhey, M. E., 2000. Towards transforming the Egyptian primary geodetic coordinates to the WGS84 coordinates system. The Scientific Bulletin, Vol. 35, No. 3, Faculty of Engineering, Ain Shams University, Cairo, Egypt.

6. Eriksson, J., 1996. Optimization and regularization of nonlinear least squares problems. Ph.D. thesis, UMINF – 96. 09. Department of Computing Science, Umea University, Umea, Sweden.

7. ESA, 2001. Final report on the Egyptian Survey Authority ESA - Ain Shams University ASU joint committee on deciding upon the ESA proposed program for the transformation of ESA geodetic and mapping operations from the old Egyptian datum EGD of Helmert 1906 to the new global satellite datum of WGS84. ESA Headquarters, Giza, Egypt.

8. Fayad, A. T., 1996. Merging both GPS and terrestrial data in the computations of the geodetic control points. Ph.D. thesis, Department of Public Works, Faculty of Engineering, Ain Shams University, Cairo, Egypt.

9. Fletcher, R., 2000. Practical methods of optimization. 2nd edition, John Wiley & Sons.

10. Jacoby, S., Kowalik, J. and Pizzo, J., 1972. Iterative methods for nonlinear optimization problems. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA.

11. Kutoglu, H. S., 2004. Figure condition in datum transformation. Journal of Surveying Engineering, ASCE, August, 2004.

12. Luenberger, D., 1984. Linear and nonlinear programming. Addison-Wesley Publishing Company.

13. Marquardt, D. W., 1963. An Algorithm for least squares estimation of nonlinear parameters. J. Soc. Indust. Appl. Math, Vol. No. 2, June, 1963, USA.

14. MathWorks, Inc., 2003. Optimization toolbox for use with Matlab: computation, visualization, programming. www.mathworks.com.

15. Nassar, M., El Tokhey, M., El Maghraby, M. and Issa, M., 1999. Analysis and assessment of the quality of Finnmap first order GPS network establishment on the Egyptian Eastern Desert. ASU Survey Group, Surveying Engineering Consultation and Research Unit, Department of Public Works, Faculty of Engineering, Ain Shams University, Technical report No. 45.

16. Naylor, A. W., and Sell, G. R., 1982. Linear operator theory in engineering and science. Springer.

17. Nocedal, J., and Wright, S. J., 1999. Numerical optimization. Springer series in operations research, Springer-Verlag.

18. Papp, E., Szucs, L. and Vagra, J., 2002. Hungarian GPS network transformation into different datums and projection systems. Periodica Polytechnica Ser. Civ. Eng., Vol. 46, No. 2, pp. 199-204.

19. Pope, A., 1974. Two approaches to Nonlinear Least Squares Adjustments. The Canadian Surveyor, Vol. 28, No. 5.

20. Ramsin, H. and Wedin, P., 1977. A comparison of some algorithm for the nonlinear least squares problem. Bit, Vol. 17, pp. 72-90, the Swedish Institute of Applied Mathematics, Stockholm.

21. Shultz, G., Byrd, R. and Schnabel, R., 1988. Approximate solution of the trust region problem by minimization over two dimensional subspaces. Mathematical Programming, Vol. 40, pp. 247-263, Holland.

22. Teunissen, P.J.G., and Knickmeyer, E.H., 1988. Nonlinearity and Least Squares. CISM Journal ACSG, Vol. 42, No. 4, pp. 321-330.

23. Teunissen, P.J.G., 1990. Nonlinear Least Squares. Manuscripta Geodaetica, Vol. 15, pp. 137-150.

24. Thomson, D.B., 1976. Combination of geodetic networks. Technical report No. 30, Department of Surveying Engineering, University of New Brunswick, Canada.

25. Vanicek, P. and Krakiwsky, E., 1982. Geodesy: the concepts. North Holland Publishing Company, Amsterdam, Netherlands.

26. Venkataraman, P., 2002. Applied optimization with MATLAB programming. A Wiley-Interscience Publication, New York, USA.

27. Yang, C. and Kim, S., 2001. The current status of GPS network, datum transformation and real time Kinematic GPS positioning in Korea. New Technology for a New Century, International Conference, FIG Working Week 2001, Seoul, Korea 6–11 May 2001.
