The method of conjugate directions. Conjugate directions

CONJUGATE DIRECTIONS

A pair of directions emanating from a point P of a surface S and such that the lines containing them are conjugate diameters of the Dupin indicatrix of the surface S at the point P. In order for the directions (du:dv), (δu:δv) at the point P of the surface S to be conjugate directions, it is necessary and sufficient that the condition

L du δu + M(du δv + dv δu) + N dv δv = 0

be satisfied, where L, M and N are the coefficients of the second fundamental form of the surface S evaluated at the point P. Examples: asymptotic directions, principal directions.

Lit.: Pogorelov A. V., Differential Geometry, 5th ed., Moscow, 1969.
E. V. Shikin.

Mathematical Encyclopedia. Moscow: Soviet Encyclopedia. Ed. I. M. Vinogradov. 1977-1985.

See what "CONNECTED DIRECTIONS" is in other dictionaries:

    Section of geometry, in which geometrical are studied. images, primarily curves and surfaces, by mathematical methods. analysis. Usually in DGs the properties of curves and surfaces are studied in the small, that is, the properties of arbitrarily small pieces of them. Besides, in … Mathematical Encyclopedia

    1) The sum of the squares of the lengths of the conjugate half-diameters of an ellipse is a constant value, equal to the sum of the squares of the lengths of its semiaxes. 2) The area of ​​\u200b\u200bthe parallelogram described around the ellipse, the sides of which have conjugate directions, is constant and equal to ... ... Mathematical Encyclopedia

    A direction on a regular surface in which the curvature of a normal section of the surface is zero. In order for the direction at the point P to be A. n., it is necessary and sufficient to fulfill the condition: where are the internal coordinates on the surface, and L, M and N ... ... Mathematical Encyclopedia

    Numerical methods is a branch of computational mathematics devoted to mathematical. description and study of the processes of numerical solution of linear algebra problems. Among the tasks of L. a. two are of greatest importance: the solution of a system of linear algebraic. equations ... ... Mathematical Encyclopedia

    A network of lines on a surface formed by two families of lines such that at each point on the surface the lines of the network of different families have conjugate directions. If the coordinate network is C. s., then the coefficient M of the second quadratic form ... ... Mathematical Encyclopedia

    SO 34.21.308-2005: Hydraulic engineering. Basic concepts. Terms and Definitions- Terminology SO 34.21.308 2005: Hydraulic engineering. Basic concepts. Terms and definitions: 3.10.28 outport: water area limited by wave protection dams in the upstream of the hydroelectric complex, equipped with mooring facilities and intended for placement ... Dictionary-reference book of terms of normative and technical documentation

    I I. The history of the development of railways. Zh. the road, in the form in which it exists now, was not invented immediately. The three elements, its constituents, rail track, means of transportation and motive force have gone through each separate stage of development, ... ... Encyclopedic Dictionary F.A. Brockhaus and I.A. Efron

    Wage- (Wages) The most important means of increasing the interest of workers Participation of workers in the share of newly created material and spiritual wealth Contents Contents. > wages are the most important means of increasing interest… … Encyclopedia of the investor

    Diversification- (Diversification) Diversification is an investment approach aimed at reducing financial markets The concept, main methods and goals of diversifying production, business and financial risks in the currency, stock and commodity markets Contents ... ... Encyclopedia of the investor

    XIII. Internal affairs (1866-1871). On April 4, 1866, at four o'clock in the afternoon, Emperor Alexander, after his usual walk in the Summer Garden, was getting into a carriage when an unknown person fired a pistol at him. At this moment, standing in ... ... Big biographical encyclopedia

Step 1. Specify the starting point x(0) and a system of N linearly independent directions; in particular, one may take s(i) = e(i), i = 1, 2, 3, ..., N.

Step 2. Minimize f(x) by moving sequentially along the (N+1) directions; each one-dimensional search starts from the minimum point obtained in the preceding search, and the direction s(N) is used in both the first and the last search.

Step 3. Determine a new conjugate direction using the generalized parallel subspace property.

Step 4. Replace s(1) with s(2), s(2) with s(3), and so on. Replace s(N) with the new conjugate direction. Go to step 2.

In order to apply the described method in practice, it must be supplemented with procedures for checking the convergence and linear independence of the direction system. Checking for linear independence is especially important when the function f(x) is not quadratic.

It follows from the construction of the algorithm that, in the case when the objective function is quadratic and has a minimum, the minimum point is found after N cycles of steps 2, 3 and 4, where N is the number of variables. If the function is not quadratic, more than N cycles are required. At the same time, it can be rigorously proved that, under certain assumptions, Powell's method converges to a local minimum point at a superlinear rate (see the definition below).
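The following is a minimal Python sketch of one cycle of steps 1-4, assuming exact one-dimensional minimization via scipy.optimize.minimize_scalar (Brent's method); the names powell and line_min are illustrative, and the convergence and linear-independence checks mentioned above are omitted.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def line_min(f, x, s):
    """Minimize f along the direction s starting from x (1-D search, Brent's method)."""
    lam = minimize_scalar(lambda a: f(x + a * s)).x
    return x + lam * s

def powell(f, x0, tol=1e-8, max_cycles=100):
    """One cycle = steps 2-4 above; step 1 takes the coordinate directions."""
    n = len(x0)
    dirs = [np.eye(n)[i] for i in range(n)]          # step 1: s(i) = e(i)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_cycles):
        x_prev = x.copy()
        x = line_min(f, x, dirs[-1])                 # step 2: first search along s(N)
        x_first = x.copy()
        for s in dirs:                               # ...then along s(1), ..., s(N)
            x = line_min(f, x, s)
        new_dir = x - x_first                        # step 3: new conjugate direction
        if np.linalg.norm(new_dir) > 0:
            dirs = dirs[1:] + [new_dir / np.linalg.norm(new_dir)]   # step 4
        if np.linalg.norm(x - x_prev) < tol:         # simple convergence check
            break
    return x

# Example of use on a simple quadratic (minimum at (1, -2)):
print(powell(lambda x: (x[0] - 1)**2 + 10*(x[1] + 2)**2, [0.0, 0.0]))
```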

Convergence rate. The method under consideration generates a sequence of points x(k) that converges to the solution x*. The method is called convergent if the inequality

||ε(k+1)|| / ||ε(k)|| ≤ 1, where   (3.39)

ε(k) = x(k) - x*,   (3.40)

is satisfied at every iteration. Since calculations usually operate with finite decimal numbers, even the most efficient algorithm would require an infinite sequence of iterations to reach the solution exactly. Therefore, the asymptotic convergence properties of the methods under study are of primary interest. We shall say that an algorithm has convergence of order r if

lim(k→∞) ||ε(k+1)|| / ||ε(k)||^r = C,   (3.41)

where C is a constant. From formula (3.39) it follows that for r = 1 the inequality C ≤ 1 holds. If r = 1 or r = 2, the algorithm has a linear or quadratic rate of convergence, respectively. For r = 1 and C = 0 the algorithm has a superlinear rate of convergence.
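As an illustration of definitions (3.39)-(3.41), the following small snippet estimates such ratios numerically for a given sequence of iterates; the helper name convergence_ratios and the test sequence are illustrative only.

```python
import numpy as np

def convergence_ratios(xs, x_star, r=1.0):
    """Ratios ||eps(k+1)|| / ||eps(k)||**r with eps(k) = x(k) - x*, cf. (3.39)-(3.41)."""
    eps = [np.linalg.norm(np.asarray(x, dtype=float) - x_star) for x in xs]
    return [eps[k + 1] / eps[k] ** r for k in range(len(eps) - 1)]

# The scalar sequence x(k) = 2**(-2**k) converges to 0 with quadratic rate:
xs = [2.0 ** -(2 ** k) for k in range(1, 6)]
print(convergence_ratios(xs, 0.0, r=1))   # ratios tend to 0  -> superlinear (C = 0)
print(convergence_ratios(xs, 0.0, r=2))   # ratios stay at 1  -> order r = 2 with C = 1
```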

Example 3.6. Powell's Conjugate Direction Method

Find the minimum point of a function

f(x) = 2x1³ + 4x1x2³ - 10x1x2 + x2²

if the starting point x(0) = [5, 2]T is given, where f(x(0)) = 314.

Step 1. s(1) = [1, 0]T, s(2) = [0, 1]T.

Step 2. (a) Find a value of λ such that

f (x (0) + λ s(2)) → min.

We obtain λ* = -0.81, whence

x(1) = [5, 2]T - 0.81·[0, 1]T = [5, 1.19]T, f(x(1)) = 250.

(b) Find a value of λ such that f (x(1) + λ s(1)) → min.

λ* = -3.26, x(2) = [1.74, 1.19]T, f(x(2)) = 1.10.

(c) Find a value of λ such that f (x(2) +λ s(2)) → min.

λ* = -0.098, x(3) = [1.74, 1.092]T, f(x(3)) = 0.72.

Step 3. Put s(3) = x(3) - x(1) = [-3.26, -0.098]T. After normalization we get

s(3) = [-3.26, -0.098]T / ||[-3.26, -0.098]T|| = [-0.99955, -0.03]T.

Let's put s (1) = s (2) , s (2) = s (3) and go to step 2 of the algorithm.

Step 4. Find a value of λ for which f (x(3) +λ s(2)) → min.

λ* = -0.734, x(4) = x(3) - 0.734·s(2), f(x(4)) = 2.86.

Note. If f(x) were a quadratic function, the resulting point would be the solution of the problem (neglecting rounding errors). Since that is not the case here, the iterations must be continued until a solution is obtained.

The search directions obtained during the implementation of the method are shown in fig. 3.13.

The results of computational experiments suggest that the Powell method (complemented by the procedure for checking the linear dependence of directions) is at least as reliable as other direct search methods, and in some cases is much more efficient. Therefore, the problem of choosing a direct search algorithm is often (and reasonably) resolved in favor of the Powell method.

This concludes the consideration of methods for direct search for solutions in unconstrained optimization problems. The following section describes methods based on the use of derivatives.

3.3. Gradient methods

In the previous section, methods were considered that make it possible to obtain a solution to the problem based only on the values of the objective function. The importance of direct methods is undoubted, since in a number of practical engineering problems information about the values of the objective function is the only reliable information available to the researcher.

f(x) = 2x1³ + 4x1x2³ - 10x1x2 + x2²

Fig. 3.13. Solving the problem from Example 3.6 using the Powell conjugate directions method.

On the other hand, when using even the most efficient direct methods, obtaining a solution sometimes requires an extremely large number of evaluations of the objective function. This circumstance, together with the entirely natural desire to be able to find stationary points [i.e., points satisfying the first-order necessary condition (3.15a)], leads to the need to consider methods based on the use of the gradient of the objective function. These methods are iterative in nature, since the components of the gradient are, in general, nonlinear functions of the controlled variables.

In what follows it is assumed throughout that f(x), ∇f(x) and ∇²f(x) exist and are continuous. Methods using both first and second derivatives are discussed only briefly, mainly in connection with more useful methods. Particular attention is paid to a detailed presentation of the conjugate gradient methods, which are based on the concept of conjugacy of directions introduced above, and of the so-called quasi-Newton methods, which are similar to Newton's method but use only information about the first derivatives. It is assumed that the components of the gradient can be written in analytical form or computed with sufficiently high accuracy by numerical methods. In addition, methods for the numerical approximation of gradients are considered. All the methods described are based on an iterative procedure implemented in accordance with the formula

x(k+1) = x(k) + α(k) s(x(k)),   (3.42)

where x(k) is the current approximation to the solution x*; α(k) is a parameter characterizing the step length; s(x(k)) = s(k) is the search direction in the N-dimensional space of the controlled variables xi, i = 1, 2, 3, ..., N. The way in which s(x) and α are determined at each iteration depends on the particular method. Usually α(k) is chosen by solving the problem of minimizing f(x) in the direction s(x(k)). Therefore, when implementing the methods under study, it is necessary to use efficient one-dimensional minimization algorithms.
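A schematic implementation of the iterative procedure (3.42) might look as follows; this is a sketch only, in which the direction rule and the use of scipy.optimize.minimize_scalar for the one-dimensional search are our assumptions, not a prescription of the text.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def descent(f, grad, x0, direction, tol=1e-6, max_iter=200):
    """Generic iteration x(k+1) = x(k) + alpha(k) * s(x(k))  (3.42).
    `direction(x, g)` returns the search direction s(x(k)); alpha(k) is found
    by a one-dimensional minimization of f along that direction."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:              # terminate when the gradient is small
            break
        s = direction(x, g)
        alpha = minimize_scalar(lambda a: f(x + a * s)).x
        x = x + alpha * s
    return x

# Choosing direction(x, g) = -g gives the Cauchy (steepest descent) method of
# Section 3.3.1; a Newton-type method would instead solve H(x) s = -g for s.
```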

3.3.1. Cauchy method

Suppose that at some point x̄ of the space of controlled variables it is required to determine the direction of fastest local descent, i.e., of the largest local decrease of the objective function. As before, we expand the objective function in a neighborhood of the point x̄ in a Taylor series

f(x) = f(x̄) + ∇f(x̄)T Δx + …   (3.43)

and discard the terms of second and higher order. It is easy to see that the local decrease of the objective function is determined by the second term, since the value of f(x̄) is fixed. The largest decrease of f is associated with choosing the direction in (3.42) that corresponds to the largest negative value of the scalar product appearing as the second term of the expansion. It follows from the properties of the scalar product that this choice is provided by

s(x̄) = -∇f(x̄),   (3.44)

and the second term takes the form

-α ∇f(x̄)T ∇f(x̄).

The case considered corresponds to the fastest local descent. Therefore, the simplest gradient method is based on the formula

x(k+1) = x(k) - α ∇f(x(k)),   (3.45)

where α is a given positive parameter. The method has two disadvantages: first, it becomes necessary to choose a suitable value of α, and second, the method is characterized by slow convergence to the minimum point because of the smallness of ∇f in the vicinity of this point.

Thus, it is advisable to determine the value of α at each iteration

x(k+1) = x(k) - α(k) ∇f(x(k)),   (3.46)

The value of α(k) is calculated by solving the problem of minimizing f(x(k+1)) along the direction -∇f(x(k)) using one or another one-dimensional search method. The gradient method considered is called the steepest descent method, or Cauchy method, since Cauchy was the first to use a similar algorithm to solve systems of linear equations.

Search along a straight line in accordance with formula (3.46) provides a higher reliability of the Cauchy method compared to the simplest gradient method, but the rate of its convergence in solving a number of practical problems remains unacceptably low. This is quite understandable, since the changes in variables directly depend on the magnitude of the gradient, which tends to zero in the vicinity of the minimum point, and there is no mechanism for accelerating the movement to the minimum point at the last iterations. One of the main advantages of the Cauchy method is related to its stability. The method has an important property, which is that, for a sufficiently small step length, iterations ensure the fulfillment of the inequality

f(x(k+1)) ≤ f(x(k)).   (3.47)

With this property in mind, we note that the Cauchy method, as a rule, can significantly reduce the value of the objective function when moving from points located at significant distances from the minimum point, and therefore is often used as an initial procedure when implementing gradient methods. Finally, using the example of the Cauchy method, one can demonstrate individual techniques that are used in the implementation of various gradient algorithms.

Example 3.7. Cauchy method

Consider the function

f(x) = 8x1² + 4x1x2 + 5x2²

and use the Cauchy method to solve the problem of its minimization.

Solution. First of all, we calculate the components of the gradient

∂f/∂x1 = 16x1 + 4x2,   ∂f/∂x2 = 10x2 + 4x1.

In order to apply the steepest descent method, we set the initial approximation

x(0) = [10, 10]T

and using formula (3.46) we construct a new approximation

x(1) = x(0) - α(0) ∇f(x(0)).


f(x) = 8x1² + 4x1x2 + 5x2²

Fig. 3.14. Iterations of the Cauchy method using the quadratic interpolation method.

Table 3.1. Results of calculations by the Cauchy method

k        x1          x2         f(x)
1     -1.2403      2.1181     24.2300
2      0.1441      0.1447      0.3540
3     -0.0181      0.0309      0.0052
4      0.0021      0.0021      0.0000

We choose α(0) so that f(x(1)) → min; α(0) = 0.056. Hence x(1) = [-1.20, 2.16]T. Next, find the point

x(2) = x(1) - α(1) ∇f(x(1)),

calculating the gradient at the point x(1) and searching along a straight line.

Table 3.1 presents the data obtained during iterations based on a one-dimensional search using the quadratic interpolation method. The sequence of obtained points is shown in fig. 3.14.
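For a quadratic function the minimizing step along -∇f(x) can be computed in closed form as α = gᵀg / (gᵀHg). The following sketch uses this exact step for the function of Example 3.7, so its output should be close to Table 3.1 (the table itself was obtained with quadratic interpolation, hence small differences may appear).

```python
import numpy as np

# f(x) = 8*x1**2 + 4*x1*x2 + 5*x2**2 = 0.5 * x^T H x with this H:
H = np.array([[16.0, 4.0],
              [4.0, 10.0]])
f = lambda x: 0.5 * x @ H @ x

x = np.array([10.0, 10.0])                  # starting point x(0) of Example 3.7
for k in range(1, 5):
    g = H @ x                               # gradient of the quadratic
    alpha = (g @ g) / (g @ H @ g)           # exact minimizing step along -g
    x = x - alpha * g                       # Cauchy step (3.46)
    print(k, x, f(x))                       # compare with Table 3.1
```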

Despite the fact that the Cauchy method is not of great practical importance, it implements the most important steps of most gradient methods. The block diagram of the Cauchy algorithm is shown in fig. 3.15. Note that the algorithm terminates when the modulus of the gradient or the modulus of the vector ∆x becomes small enough.


Fig. 3.15. Block diagram of the Cauchy method.

3.3.2. Newton's method

It is easy to see that the "best" local search strategy using a gradient is applied in the Cauchy method. However* movement in the direction opposite to the gradient leads to a minimum point only when the level lines of the function f are circles. Thus, the direction opposite to the gradient is, in general, Not may be acceptable global the direction of the search for optimum points of nonlinear functions. The Cauchy method is based on a successive linear approximation of the objective function and requires the calculation of the values ​​of the function and its first derivatives at each iteration. In order to build a more general search strategy, one should involve information about the second derivatives of the objective function.

We again expand the objective function in a Taylor series

f(x) = f(x(k)) + ∇f(x(k))T Δx + ½ ΔxT ∇²f(x(k)) Δx + O(||Δx||³).

Discarding all terms of the expansion of the third order and higher, we obtain a quadratic approximation f(x):

f̃(x; x(k)) = f(x(k)) + ∇f(x(k))T Δx + ½ ΔxT ∇²f(x(k)) Δx,   (3.48)

where f̃(x; x(k)) is an approximating function of the variable x constructed at the point x(k). On the basis of this quadratic approximation of f(x), a sequence of iterates is formed in such a way that at the newly obtained point x(k+1) the gradient of the approximating function vanishes. We have

∇f̃(x; x(k)) = ∇f(x(k)) + ∇²f(x(k)) Δx = 0,   (3.49)
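Solving (3.49) for Δx gives the Newton step Δx = -[∇²f(x(k))]⁻¹ ∇f(x(k)). A minimal sketch of the resulting iteration is given below (the function names are illustrative); for a quadratic function such as that of Example 3.7, a single step reaches the minimum.

```python
import numpy as np

def newton(f_grad, f_hess, x0, tol=1e-10, max_iter=50):
    """Newton iteration: solve (3.49) for dx and set x(k+1) = x(k) + dx."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = f_grad(x)
        if np.linalg.norm(g) < tol:
            break
        dx = np.linalg.solve(f_hess(x), -g)   # dx = -[H(x)]^-1 grad f(x)
        x = x + dx
    return x

# For f(x) = 8*x1**2 + 4*x1*x2 + 5*x2**2 one Newton step from any starting
# point lands on the minimizer x* = (0, 0):
H = np.array([[16.0, 4.0], [4.0, 10.0]])
print(newton(lambda x: H @ x, lambda x: H, [10.0, 10.0]))
```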



The high rate of convergence of Newton's method is due to the fact that it minimizes a quadratic function

f(x) = ½⟨Ax, x⟩ + ⟨b, x⟩ + c,   (2.1)

where A is a symmetric positive definite matrix of size n×n, in one step. Quasi-Newton methods allow one to find the minimum of a quadratic function in n steps. The idea of the method of conjugate directions is also based on the desire to minimize a quadratic function in a finite number of steps. More precisely, in the methods of conjugate directions it is required to find directions such that a sequence of one-dimensional minimizations along these directions leads to the minimum of the function (2.1) from any starting point.

It turns out that the indicated property is possessed by the system of directions mutually conjugate with respect to the matrix A

Let A be a symmetric positive definite matrix of size n×n.

Definition 2.1. Nonzero vectors (directions) h′ and h″ are called conjugate (with respect to the matrix A) if ⟨Ah′, h″⟩ = 0. Vectors (directions) h0, h1, ..., hm are called mutually conjugate (with respect to the matrix A) if they are all nonzero and ⟨Ahi, hj⟩ = 0 for all i ≠ j.   (2.3)

Lemma 3.1. Let the vectors be mutually conjugate. Then they are linearly independent.

Proof. Suppose the contrary, i.e., that c0h0 + c1h1 + ... + cmhm = 0 with not all coefficients equal to zero, say cj ≠ 0. Taking the scalar product of this equality with Ahj gives cj⟨Ahj, hj⟩ = 0, which is possible only for cj = 0, since the matrix A is positive definite. The resulting contradiction proves the lemma.

Consider the problem of minimizing the function (2.1) on R^n. We shall solve it by method (2.2), i.e., by successive one-dimensional minimizations along directions h_k. If the vectors h_k are mutually conjugate, then method (2.2) can be called a method of conjugate directions. However, this name is usually used only for those methods in which it is precisely the desire to achieve the condition of mutual conjugacy that determines the choice of directions; the implementation of a quite different idea can also lead to the fulfillment of the same condition.

Theorem 3.1. If the vectors h0, h1, ..., hm-1 in method (2.2) are mutually conjugate, then for the function f given by formula (2.1)

f(xm) = min { f(x) : x ∈ x0 + L(h0, ..., hm-1) },   (2.4)

where L(h0, ..., hm-1) is the linear subspace spanned by the indicated vectors.

Proof. Taking into account 2.2 and Definition 2.1, we have

(2.5)

Using this equality, we get

(2.6)

Corollary. If the vectors h0, h1, ..., hn-1 in method (2.2) are mutually conjugate, then for the function f given by formula (2.1) and an arbitrary starting point x0 the point xn is the minimum point of f on the whole of R^n.

Thus, method 2.2 allows finding the minimum point of the quadratic function 2.1 in no more than n steps.
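A small numerical illustration of this fact (not from the text): for a quadratic function of the form (2.1), exact one-dimensional minimization along n mutually conjugate directions, here built from the coordinate basis by Gram-Schmidt orthogonalization in the A-inner product, reaches the minimizer in n steps.

```python
import numpy as np

# Minimize f(x) = 0.5 <Ax, x> + <b, x> by n exact line searches along
# A-conjugate directions; the result coincides with x* = -A^{-1} b.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, -2.0, 3.0])
grad = lambda x: A @ x + b

# Build A-conjugate directions from the coordinate basis (Gram-Schmidt in the A-inner product).
dirs = []
for e in np.eye(3):
    h = e.copy()
    for p in dirs:
        h -= (p @ A @ e) / (p @ A @ p) * p
    dirs.append(h)

x = np.zeros(3)
for h in dirs:
    alpha = -(grad(x) @ h) / (h @ A @ h)   # exact 1-D minimizer along h
    x = x + alpha * h

print(x, -np.linalg.solve(A, b))           # the two vectors coincide
```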

2.2. Method of conjugate directions of zero order.

The algorithm consists of a sequence of cycles, the k-th of which is determined by a starting point t0(k) and minimization directions p0(k), p1(k), ..., pn-1(k). In the zero cycle an arbitrary point is chosen as t0(0), and the directions of the coordinate axes are taken as p0(0), p1(0), ..., pn-1(0).

The k-th cycle consists in the sequential solution of the one-dimensional minimization problems (2.5) along the directions p0(k), ..., pn-1(k); this determines the step from the point ti(k) to the point ti+1(k) (2.6), where the step lengths are chosen so that the function is minimized along each direction. After the k-th cycle is finished, the starting point and the minimization directions of the (k+1)-th cycle are determined by formulas (2.7). The stopping criterion can be the fulfillment of the inequality ||tn(k) - t0(k)|| ≤ ε, where ε is a preselected small positive number.

Theorem 3.2. If the direction vectors in method (2.5)-(2.7) are nonzero, then for the function f given by formula (2.1) the minimum point is reached after at most n cycles.

Proof. Taking into account the corollary to Theorem 3.1, it suffices to show that the constructed direction vectors are mutually conjugate. Assuming that the vectors obtained in the previous cycles are mutually conjugate, we prove that the new vector is conjugate to them.

Note that the point tn(k) is, according to formulas (2.5), obtained from the point tn-k(k) by a sequence of one-dimensional minimizations along the corresponding directions. By virtue of Theorem 2.1, this means that

Likewise, the point t0(k) is obtained from the point tn-k+1(k) by a sequence of one-dimensional minimizations along the same directions, and therefore

The assertion being proved now follows immediately from Lemma 2.2.

The assumption of Theorem 3.2 that the vectors are different from zero is not always satisfied. For some k the system of vectors may turn out to be linearly dependent (or "almost" linearly dependent), as a result of which the method may fail to find the minimum even of a quadratic function.

Let us describe a modification of method 2.5–2.7 that leads to an efficient minimization algorithm.

After the k-th cycle is finished, the fulfillment of the stopping inequalities is checked. If at least one of them is satisfied, the process stops. Otherwise, the fulfillment of the inequality

(2.16)

is checked. If it is satisfied, the minimization directions of the (k+1)-th cycle remain the same, i.e., they coincide with the directions of the k-th cycle. If it is not, the minimization directions of the (k+1)-th cycle are determined by the corresponding update formulas. In both cases the starting point of the (k+1)-th cycle is computed in the same way as in the original algorithm.


Powell Method refers to direct methods (zero-order methods). This method most effectively minimizes functions that are close to quadratic. At each iteration of the algorithm, the search is carried out along the system of conjugate directions.
Two search directions Si, Sj are called conjugate if (Si)T H Sj = 0 for i ≠ j,
where H is a positive definite square matrix.
Justification of the use of conjugate directions in optimization algorithms. In the Powell method, H=▽²f(x k) is the matrix of second partial derivatives. The ideas of Powell's method are related to the quadratic function f(x).
The main idea is as follows: if at each stage of the search the minimum of the quadratic function f(x) is found along each of p (p < n) conjugate directions, and if then in each of the directions a step is made to the minimum point, then the full displacement from the beginning to the step with number p is conjugate to all of the search subdirections used.
The idea of ​​using conjugate directions underlies a number of algorithms.
Let f(x) be a quadratic function and let the minimization process start at the point x0 with the initial direction S1. For convenience we take this vector to be a unit vector, i.e. (S1)T S1 = 1. Then the vector x1 = x0 + λ1·S1, and the step length λ1 is determined from the condition that the function be minimal in this direction, i.e.

d f(x0 + λ·S1)/dλ = 0.
For a quadratic function
▽f(x0)T S1 + λ·(S1)T H S1 = 0,   (1)
and, thus, the optimal value of λ at the first step is determined in accordance with the relation
λ1 = -▽f(x0)T S1 / ((S1)T H S1),   (2)
where H=▽²f(x k).
From the point x 1, the minimization process must be carried out in another conjugate direction S 2 and, at the same time,
(S2)T H S1 = 0.
In general, a system of n linearly independent search directions S1, S2, ..., Sn is called conjugate with respect to some positive definite matrix H if (Si)T H Sj = 0, 1 ≤ i ≠ j ≤ n.
Since the conjugate directions are linearly independent, any vector in the space E n can be expressed in terms of S 1 , S 2 ,..., S n as follows:
x = Σi δi·Si, where δi = (Si)T H x / ((Si)T H Si).   (3)
For any such matrix H there always exists at least one system of n mutually conjugate directions, since the eigenvectors of the matrix H themselves form such a system.
Note that the following relation, which will be needed below, holds:

H⁻¹ = Σi Si (Si)T / ((Si)T H Si).   (4)

To verify its validity, consider the matrix on the right-hand side of (4). Multiplying it from the right by H Sk gives

Σi Si (Si)T H Sk / ((Si)T H Si) = Sk,

since by conjugacy all terms with i ≠ k vanish.
Generally speaking, the general rule is that if conjugate directions are used to find the minimum of a quadratic function f(x), then this function can be minimized in n steps, one step in each of the conjugate directions. Moreover, the order in which the conjugate directions are used is immaterial.
Let us show that this is indeed the case. Let ▽f(x) = b + H x. At the minimum point ▽f(x*) = 0, and this point is x* = -H⁻¹ b.
Note that ▽ T f(x k) S k =(S k) T ▽f(x k).
Since x1 = x0 + λ1 S1,   (5)
where λ 1 is determined in accordance with relation (2):
λ1 = -(S1)T ▽f(x0) / ((S1)T H S1),
the minimum is then sought in the next conjugate direction by similar formulas: at the i-th step the function f(xi-1 + λi·Si) is minimized in the direction Si to obtain λi, which leads to the following expression (based on (2)):

λi = -(Si)T ▽f(xi-1) / ((Si)T H Si).   (7)
Besides, xi-1 = x0 + λ1S1 + ... + λi-1Si-1, and

(Si)T ▽f(xi-1) = (Si)T (H x0 + b),

since all (Si)T H Sk = 0 for i ≠ k. Expressing x0 and H⁻¹b through the system of conjugate vectors Si (by analogy with (3)):

x0 = Σi Si (Si)T H x0 / ((Si)T H Si),
H⁻¹ b = Σi Si (Si)T b / ((Si)T H Si),

and substituting these expressions into (7), we obtain

xn = x0 - x0 - H⁻¹ b = -H⁻¹ b.   (9)
Thus, the point x n , obtained as a result of minimizing the quadratic function at the nth step, coincides with the minimum point of the quadratic function f(x ).
Let us show that for conjugate directions, if f(x ) is minimized each time in the conjugate direction S j in accordance with formula (2), then the following equality holds:
(Sj)T ▽f(xl) = 0,   1 ≤ j ≤ l-1,
when using no more than n directions, that is, ▽f(x l) is orthogonal to the conjugate directions used.
For a quadratic function ▽f(xl) = ▽f(xk) + H(xl - xk), where xk is an arbitrary point from which the search along the conjugate directions starts and the displacement xl - xk is a sum of steps λjSj along the conjugate directions. Multiplying this equality on the left by (Sk-1)T gives

(Sk-1)T ▽f(xl) = (Sk-1)T ▽f(xk) + Σj λj (Sk-1)T H Sj.
The first term on the right side (S k-1) T ·▽f(x k)=0, since the gradient at point x k is orthogonal to the direction of the previous descent if the point is obtained by minimizing the function in that direction. In addition, all other terms under the sum sign disappear due to the conjugacy of the directions S k-1 and S j , and thus
(S j) T ▽f(x l)=0, 1≤j≤l-1 . (10)

Powell's algorithm

The transition from the point x k 0 to the point x k n at the k -th step of the Powell algorithm is carried out in accordance with the formula:
x k n = x k 0 + λ k 1 S k 1 + λ k 2 S k 2 + ... + λ k n S k n .
In this case, the minimization of the original function along the conjugate directions S k 1 , ... ,S k n is carried out sequentially. The result of minimization in each of the conjugate directions is a system of parameters λ 1 k ,...,λ n k , for which the function is minimal in each of the conjugate directions:
λ k i = arg min λ f(x k i-1 + λ·S k i),   i = 1, ..., n.
The initial system of conjugate directions can be chosen parallel to the axes of the coordinate system. At the end of each iteration of the Powell algorithm, it is necessary to choose a new system of conjugate directions, because if this is not done, then we get a simple coordinate-wise search. The construction of the new system is based on the following theorem.

Theorem. If, starting from the initial point x0, the minimum of the function f(x) in the direction of the vector S is located at the point xa, and, starting from the initial point x1 ≠ x0, the search for the minimum of f(x) in the same direction S leads to the point xb, then the direction (xb - xa) is conjugate to the direction S.

Proof. Using the previously obtained results (10), we can write that in the first case
S T ▽f(x a)=S T (H x a +b )=0,
similarly, in the second case one can write
S T ▽f(x b)=S T (H x b +b )=0,
Subtracting the second expression from the first expression, we get that
S T H (x b -x a)=0,
Therefore, the vectors S and (x b -x a) are conjugate.
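Before extending the theorem, here is a small numerical check of it; the quadratic function, the Hessian H, and the use of scipy.optimize.minimize_scalar for the one-dimensional searches are illustrative assumptions, not part of the original text.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Minimizing f(x) = 0.5 x^T H x + b^T x along the same direction S from two
# different starting points gives x_a, x_b with S^T H (x_b - x_a) = 0.
H = np.array([[16.0, 4.0],
              [4.0, 10.0]])
b = np.array([-1.0, 2.0])
f = lambda x: 0.5 * x @ H @ x + b @ x

def line_min(x, s):
    lam = minimize_scalar(lambda a: f(x + a * s)).x
    return x + lam * s

S = np.array([1.0, 1.0])
x_a = line_min(np.array([0.0, 0.0]), S)
x_b = line_min(np.array([5.0, -3.0]), S)
print(S @ H @ (x_b - x_a))   # ~0: S and (x_b - x_a) are conjugate w.r.t. H
```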
This theorem can be directly extended to the case of several conjugate directions as follows: if, starting from the point x0, the point xa is determined after using several conjugate directions p (p < n), and, starting from the point x1, the point xb is determined after using the same p directions, then the direction (xb - xa) is conjugate to all p directions used. The following figure serves as an illustration of the theorem.




Figure.
Suppose that at the initial moment, for a two-dimensional problem, the search is carried out from the point x 0 along directions parallel to the coordinate axes, S 0 1 and S 0 2. The points x 0 1, x 0 2, x 0 3 are found successively (see the figure).
Thus, we have identified 2 conjugate directions in which to search: S 0 2 and (x 0 3 -x 0 1). In the system of initial directions, S 0 1 should be replaced by (x 0 3 -x 0 1), which is a complete displacement from the first minimum. Search directions at the next stage:
S 1 1 = S 0 2,
S 1 2 = x 0 3 - x 0 1.

The second stage starts with minimization along the direction S 1 2 , then, if necessary, moving in the direction S 1 1 . But in the case of a quadratic function of two variables, after minimization in two conjugate directions, a minimum point will be reached.
In the general case, the k -th step of the Powell algorithm uses n linearly independent search directions. The search starts from the point x k 0 and is carried out according to the following algorithm:
1. Starting from the point x k 0, one-dimensional searches are performed successively in the directions S k 1, ..., S k n. In this case points x k 1, ..., x k n are found that minimize the original function in the given directions, with x k 1 = x k 0 + λ1 S k 1, x k 2 = x k 1 + λ2 S k 2, ..., x k n = x k n-1 + λn S k n.
2. The search carried out at the first step can lead to linearly dependent directions if, for example, in one of the directions S i it is not possible to find a smaller value of the function; two directions can then become collinear. Therefore, in the system of conjugate directions one should not replace an old direction by a new one if after such a replacement the directions of the new set become linearly dependent.
On the example of a quadratic function, Powell showed that when normalizing search directions in accordance with the relation:
(S k i) T H S k i = 1,   i = 1, ..., n,
the determinant of the matrix whose columns represent the search directions takes on a maximum value if and only if the S k i are mutually conjugate with respect to the matrix H. He came to the conclusion that the direction of full displacement at the k-th step should replace the previous direction only if the replacement vector increases the determinant of the search direction matrix, since only then will the new set of directions be more effective.
For such a check, an additional step is taken from the point x k n in the direction (x k n - x k 0), corresponding to the full displacement at the k-th stage, and the point (2x k n - x k 0) is obtained. Step 3 is carried out to verify whether the determinant of the search direction matrix increases when the new direction is included.
3. Denote by Δ k the largest decrease of the function f(x) obtained at the k-th stage, and by S k m the direction in which this largest decrease was achieved.
Denote:
f 1 = f(x k 0), f 2 = f(x k n), f 3 = f(2x k n - x k 0),
where x k 0 = x k-1 n.
Then, if f 3 ≥f 1 and (or) (f 1 -2f 2 +f 3) (f 1 -f 2 -Δ k) 2 ≥0.5*Δ k (f 1 -f 3) 2, then you should use at the (k+1) -th stage the same directions S k 1 , ... , S k n as at the k -th stage, i.e. S k+1 i =S k i , i=1,n , and start search from the point x k+1 0 =x k n or from the point x k+1 0 =2x k n -x k 0 =x k n+1 , depending on the point at which the function takes the minimum value.
4. If the test at step 3 fails, then the minimum of f(x) is sought in the direction of the vector S k n+1 drawn from x k 0 to x k n: S k n+1 = (x k n - x k 0). The point of this minimum is taken as the starting point at the (k+1)-th stage. In the system of conjugate directions everything is preserved except for the direction S k m, which is replaced by the new direction S k n+1; the new direction is placed in the last column of the direction matrix. At the (k+1)-th stage the following directions will be used:
S k 1 , ..., S k m-1 , S k m+1 , ..., S k n , S k n+1 .
5. Stopping criterion. The algorithm is terminated if the change in each variable is less than the specified accuracy in the corresponding variable or ||x k n -x k 0 ||≤ε.
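A Python sketch of steps 1-5 is given below. It is an interpretation under assumptions (scipy.optimize.minimize_scalar for the line searches, Euclidean normalization of the new direction instead of the H-normalization (S k i)T H S k i = 1), not the exact implementation used by the calculator.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def line_min(f, x, s):
    """One-dimensional minimization of f along direction s starting from x."""
    lam = minimize_scalar(lambda a: f(x + a * s)).x
    return x + lam * s

def powell_full(f, x0, eps=1e-8, max_stages=200):
    """Steps 1-5 above; Euclidean normalization replaces (S_i)^T H S_i = 1."""
    n = len(x0)
    S = [np.eye(n)[i] for i in range(n)]           # initial directions: coordinate axes
    x_start = np.asarray(x0, dtype=float)
    for _ in range(max_stages):
        x = x_start.copy()
        delta, m = 0.0, 0
        for i in range(n):                         # step 1: n one-dimensional searches
            x_new = line_min(f, x, S[i])
            if f(x) - f(x_new) > delta:            # remember the largest single decrease
                delta, m = f(x) - f(x_new), i
            x = x_new
        if np.linalg.norm(x - x_start) <= eps:     # step 5: stopping criterion
            return x
        f1, f2 = f(x_start), f(x)
        x_ext = 2.0 * x - x_start                  # extra step along the full displacement
        f3 = f(x_ext)
        keep = (f3 >= f1) or \
               ((f1 - 2*f2 + f3) * (f1 - f2 - delta)**2 >= 0.5 * delta * (f1 - f3)**2)
        if keep:                                   # step 3: keep the old direction set
            x_start = x if f2 <= f3 else x_ext
        else:                                      # step 4: replace S[m] by the new direction
            s_new = x - x_start
            x_start = line_min(f, x, s_new)
            S = S[:m] + S[m+1:] + [s_new / np.linalg.norm(s_new)]
    return x_start

# Example #1 below: the minimum of 4*(x1-5)**2 + (x2-6)**2 from (8, 9) is (5, 6).
print(powell_full(lambda x: 4*(x[0]-5)**2 + (x[1]-6)**2, [8.0, 9.0]))
```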

Example #1. Using the Powell method, find the minimum point of the function 4(x1 - 5)² + (x2 - 6)² if the starting point x(0) = (8, 9)T is given.
Solution:
Function gradient: ▽f(x) = (8(x1 - 5), 2(x2 - 6))T.

Iteration #0.

Let's check the stopping criterion: |▽f(X 0)|< ε

Let's calculate the value of the function at the starting point f(X 0) = 45.
Search direction:
p 1 = [1, 0]T
p 2 = [0, 1]T

Step #1. Let's take a step along the search direction p 2 = [0, 1]T.

f(X1) = 4(8 - 5)² + ((h + 9) - 6)² → min
f(X1) = h² + 6h + 45 → min
Let's find a step h such that the objective function reaches a minimum along this direction. From the necessary condition for an extremum (f′(h) = 0):
2h + 6 = 0, which gives the step h = -3 and the point X1 = (8, 6)T.

Step #2. Let's take a step along the other search direction p 1 = [1, 0]T.

f(X2) = 4((h + 8) - 5)² + (6 - 6)² → min
f(X2) = 4h² + 24h + 36 → min
Let's find a step h such that the objective function reaches a minimum along this direction. From the necessary condition for an extremum (f′(h) = 0):
8h + 24 = 0, which gives the step h = -3.
Executing this step leads to the point X2 = (5, 6)T.

Step #3. Let's take a step again along the search direction p 2 = [0, 1]T.

f(X3) = 4(5 - 5)² + ((h + 6) - 6)² → min
f(X3) = h² → min
Let's find a step h such that the objective function reaches a minimum along this direction. From the necessary condition for an extremum (f′(h) = 0):
2h = 0, which gives the step h = 0.
Executing this step leads to the point X3 = (5, 6)T.

Step #4. We choose the conjugate direction: p 2 = X3 - X1.
p 2 = [5, 6]T - [8, 6]T = [-3, 0]T

Iteration #1.

Let's check the stopping criterion:
|▽f(X 3)|< ε

Let us calculate the value of the function at the initial point f(X 3) = 0.
Answer: X = [5, 6]T
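As a cross-check, SciPy's built-in Powell implementation (scipy.optimize.minimize with method='Powell') applied to the same function and starting point should return approximately the same answer:

```python
import numpy as np
from scipy.optimize import minimize

res = minimize(lambda x: 4*(x[0] - 5)**2 + (x[1] - 6)**2,
               x0=np.array([8.0, 9.0]), method='Powell')
print(res.x)   # approximately [5. 6.]
```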

Example #2. Minimize the function f(x) by the conjugate direction method, terminating the calculations when |∂f(x)/∂xi| < 10⁻³, i = 1, 2, ..., n.
f(x) = x1⁴ + 2x2⁴ + x1²x2² + 2x1 + x2
Function gradient: ▽f(x) = (4x1³ + 2x1x2² + 2, 8x2³ + 2x1²x2 + 1)T.

Successive one-dimensional searches along the search directions lead to the point

X = [-0.759, -0.4074]T.

Answer: X = [-0.759;-0.4074] T

Iteration #2.

▽f(X6) = [-0.00093, -0.0103]T

Let's check the stopping criterion:
|▽f(X6)| < ε
Let's calculate the value of the function at the new point f(X 6) = -1.443.
We complete the iteration process.
Answer: X = [-0.759;-0.4074] T
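A quick sanity check of this answer: evaluating the gradient of f at the reported point gives components close to zero, matching the value of ▽f(X6) reported above.

```python
import numpy as np

def grad(x):
    """Gradient of f(x) = x1**4 + 2*x2**4 + x1**2*x2**2 + 2*x1 + x2."""
    return np.array([4*x[0]**3 + 2*x[0]*x[1]**2 + 2,
                     8*x[1]**3 + 2*x[0]**2*x[1] + 1])

X = np.array([-0.759, -0.4074])
print(grad(X))   # approximately [-0.001, -0.010], as in the check above
```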