13
Nonlinear Programming

Numerous mathematical-programming applications, including many introduced in previous chapters, are cast naturally as linear programs. Linear programming assumptions or approximations may also lead to appropriate problem representations over the range of decision variables being considered. At other times, though, nonlinearities in the form of either nonlinear objective functions or nonlinear constraints are crucial for representing an application properly as a mathematical program.

This chapter provides an initial step toward coping with such nonlinearities, first by introducing several characteristics of nonlinear programs and then by treating problems that can be solved using simplex-like pivoting procedures. As a consequence, the techniques to be discussed are primarily algebra-based. The final two sections comment on some techniques that do not involve pivoting.

As our discussion of nonlinear programming unfolds, the reader is urged to reflect upon the linear-programming theory that we have developed previously, contrasting the two theories to understand why the nonlinear problems are intrinsically more difficult to solve. At the same time, we should try to understand the similarities between the two theories, particularly since the nonlinear results often are motivated by, and are direct extensions of, their linear analogs. The similarities will be particularly visible for the material of this chapter, where simplex-like techniques predominate.

13.1 NONLINEAR PROGRAMMING PROBLEMS

A general optimization problem is to select n decision variables x1, x2, ..., xn from a given feasible region in such a way as to optimize (minimize or maximize) a given objective function f(x1, x2, ..., xn) of the decision variables. The problem is called a nonlinear programming problem (NLP) if the objective function is nonlinear and/or the feasible region is determined by nonlinear constraints.
Thus, in maximization form, the general nonlinear program is stated as:

Maximize f(x1, x2, ..., xn),

subject to:

g1(x1, x2, ..., xn) ≤ b1,
          ⋮
gm(x1, x2, ..., xn) ≤ bm,

where each of the constraint functions g1 through gm is given. A special case is the linear program that has been treated previously. The obvious association for this case is

f(x1, x2, ..., xn) = Σ_{j=1}^{n} cj xj
and

gi(x1, x2, ..., xn) = Σ_{j=1}^{n} aij xj   (i = 1, 2, ..., m).

Note that nonnegativity restrictions on variables can be included simply by appending the additional constraints:

gm+i(x1, x2, ..., xn) = −xi ≤ 0   (i = 1, 2, ..., n).

Sometimes these constraints will be treated explicitly, just like any other problem constraints. At other times, it will be convenient to consider them implicitly, in the same way that nonnegativity constraints are handled implicitly in the simplex method.

For notational convenience, we usually let x denote the vector of n decision variables x1, x2, ..., xn — that is, x = (x1, x2, ..., xn) — and write the problem more concisely as

Maximize f(x),

subject to:

gi(x) ≤ bi   (i = 1, 2, ..., m).

As in linear programming, we are not restricted to this formulation. To minimize f(x), we can of course maximize −f(x). Equality constraints h(x) = b can be written as two inequality constraints h(x) ≤ b and −h(x) ≤ −b. In addition, if we introduce a slack variable, each inequality constraint is transformed into an equality constraint. Thus sometimes we will consider an alternative equality form:

Maximize f(x),

subject to:

hi(x) = bi   (i = 1, 2, ..., m),
xj ≥ 0   (j = 1, 2, ..., n).

Usually the problem context suggests either an equality or inequality formulation (or a formulation with both types of constraints), and we will not wish to force the problem into either form. The following three simplified examples illustrate how nonlinear programs can arise in practice.

Portfolio Selection

An investor has $5000 and two potential investments. Let xj for j = 1 and j = 2 denote his allocation to investment j in thousands of dollars. From historical data, investments 1 and 2 have an expected annual return of 20 and 16 percent, respectively.
Also, the total risk involved with investments 1 and 2, as measured by the variance of total return, is given by 2x1² + x2² + (x1 + x2)², so that risk increases with total investment and with the amount of each individual investment. The investor would like to maximize his expected return and at the same time minimize his risk. Clearly, both of these objectives cannot, in general, be satisfied simultaneously. There are several possible approaches. For example, he can minimize risk subject to a constraint imposing a lower bound on expected return. Alternatively, expected return and risk can be combined in an objective function, to give the model:

Maximize f(x) = 20x1 + 16x2 − θ[2x1² + x2² + (x1 + x2)²],

subject to:

g1(x) = x1 + x2 ≤ 5,
x1 ≥ 0, x2 ≥ 0,

(that is, g2(x) = −x1, g3(x) = −x2). The nonnegative constant θ reflects his tradeoff between risk and return. If θ = 0, the model is a linear program, and he will invest completely in the investment with greatest expected return. For very large θ, the objective contribution due to expected return becomes negligible and he is essentially minimizing his risk.
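As an illustrative sketch (not part of the original model statement), this portfolio model can be solved numerically by projected gradient ascent; the step size, iteration count, and the sample values of θ used below are arbitrary choices made for this example:

```python
# Projected gradient ascent for the portfolio model:
#   maximize 20*x1 + 16*x2 - theta*(2*x1**2 + x2**2 + (x1 + x2)**2)
#   subject to x1 + x2 <= 5, x1 >= 0, x2 >= 0.
# The step size and iteration count are illustrative choices.

def solve_portfolio(theta, lr=0.02, iters=5000):
    def grad(x1, x2):
        # Partial derivatives of the objective with respect to x1 and x2.
        g1 = 20 - theta * (4 * x1 + 2 * (x1 + x2))
        g2 = 16 - theta * (2 * x2 + 2 * (x1 + x2))
        return g1, g2

    def project(x1, x2):
        # Euclidean projection onto {x >= 0, x1 + x2 <= 5}
        # (exact whenever both coordinates remain positive).
        x1, x2 = max(x1, 0.0), max(x2, 0.0)
        excess = x1 + x2 - 5.0
        if excess > 0:
            x1, x2 = x1 - excess / 2, x2 - excess / 2
        return max(x1, 0.0), max(x2, 0.0)

    x1, x2 = 0.0, 0.0
    for _ in range(iters):
        g1, g2 = grad(x1, x2)
        x1, x2 = project(x1 + lr * g1, x2 + lr * g2)
    return x1, x2

# theta = 2: the optimum is an interior point (x1 + x2 < 5).
print(solve_portfolio(2.0))   # approximately (1.2, 1.4)
# theta = 1: the optimum lies on the boundary x1 + x2 = 5.
print(solve_portfolio(1.0))   # approximately (2.33, 2.67)
```

Note how the optimum moves from the boundary of the feasible region (θ = 1) into its interior (θ = 2) as the risk penalty grows; this is precisely the geometric behavior discussed in Section 13.2.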
Water Resources Planning

In regional water planning, sources emitting pollutants might be required to remove waste from the water system. Let xj be the pounds of Biological Oxygen Demand (an often-used measure of pollution) to be removed at source j. One model might be to minimize total costs to the region to meet specified pollution standards:

Minimize Σ_{j=1}^{n} fj(xj),

subject to:

Σ_{j=1}^{n} aij xj ≥ bi   (i = 1, 2, ..., m),
0 ≤ xj ≤ uj   (j = 1, 2, ..., n),

where

fj(xj) = cost of removing xj pounds of Biological Oxygen Demand at source j,
bi = minimum desired improvement in water quality at point i in the system,
aij = quality response, at point i in the water system, caused by removing one pound of Biological Oxygen Demand at source j,
uj = maximum pounds of Biological Oxygen Demand that can be removed at source j.

Constrained Regression

A university wishes to assess the job placements of its graduates. For simplicity, it assumes that each graduate accepts either a government, industrial, or academic position. Let

Nj = number of graduates in year j   (j = 1, 2, ..., n),

and let Gj, Ij, and Aj denote the number entering government, industry, and academia, respectively, in year j (Gj + Ij + Aj = Nj). One model being considered assumes that a given fraction of the student population joins each job category each year. If these fractions are denoted λ1, λ2, and λ3, then the predicted number entering the job categories in year j is given by the expressions

Ĝj = λ1 Nj,   Îj = λ2 Nj,   Âj = λ3 Nj.

A reasonable performance measure of the model's validity might be the difference between the actual number of graduates Gj, Ij, and Aj entering the three job categories and the predicted numbers Ĝj, Îj, and Âj, as in the least-squares estimate:

Minimize Σ_{j=1}^{n} [(Gj − Ĝj)² + (Ij − Îj)² + (Aj − Âj)²],

subject to the constraint that all graduates are employed in one of the professions.
In terms of the fractions entering each profession, the model can be written as:

Minimize Σ_{j=1}^{n} [(Gj − λ1 Nj)² + (Ij − λ2 Nj)² + (Aj − λ3 Nj)²],
subject to:

λ1 + λ2 + λ3 = 1,
λ1 ≥ 0, λ2 ≥ 0, λ3 ≥ 0.

This is a nonlinear program in three variables λ1, λ2, and λ3. There are alternative ways to approach this problem. For example, the objective function can be changed to:

Minimize Σ_{j=1}^{n} [|Gj − Ĝj| + |Ij − Îj| + |Aj − Âj|].†

This formulation is appealing since the problem now can be transformed into a linear program. Exercise 28 (see also Exercise 20) from Chapter 1 illustrates this transformation.

The range of nonlinear-programming applications is practically unlimited. For example, it is usually simple to give a nonlinear extension to any linear program. Moreover, the constraint x = 0 or 1 can be modeled as x(1 − x) = 0, and the constraint x integer as sin(πx) = 0. Consequently, in theory any application of integer programming can be modeled as a nonlinear program. We should not be overly optimistic about these formulations, however; later we shall explain why nonlinear programming is not attractive for solving these problems.

13.2 LOCAL vs. GLOBAL OPTIMUM

Geometrically, nonlinear programs can behave much differently from linear programs, even for problems with linear constraints. In Fig. 13.1, the portfolio-selection example from the last section has been plotted for several values of the tradeoff parameter θ. For each fixed value of θ, contours of constant objective values are concentric ellipses. As Fig. 13.1 shows, the optimal solution can occur:

a) at an interior point of the feasible region;
b) on the boundary of the feasible region, which is not an extreme point; or
c) at an extreme point of the feasible region.

As a consequence, procedures, such as the simplex method, that search only extreme points may not determine an optimal solution. Figure 13.2 illustrates another feature of nonlinear-programming problems. Suppose that we are to minimize f(x) in this example, with 0 ≤ x ≤ 10. The point x = 7 is optimal.
Note, however, that in the indicated dashed interval, the point x = 0 is the best feasible point; i.e., it is an optimal feasible point in the local vicinity of x = 0 specified by the dashed interval.

The latter example illustrates that a solution optimal in a local sense need not be optimal for the overall problem. Two types of solution must be distinguished. A global optimum is a solution to the overall optimization problem. Its objective value is as good as that of any other point in the feasible region. A local optimum, on the other hand, is optimal only with respect to feasible solutions close to that point. Points far removed from a local optimum play no role in its definition and may actually be preferred to the local optimum. Stated more formally,

Definition. Let x = (x1, x2, ..., xn) be a feasible solution to a maximization problem with objective function f(x). We call x

1. A global maximum if f(x) ≥ f(y) for every feasible point y = (y1, y2, ..., yn);

† | | denotes absolute value; that is, |x| = x if x ≥ 0 and |x| = −x if x < 0.
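The local-versus-global distinction can also be reproduced numerically. The function below is a hypothetical stand-in (it is not the curve plotted in Fig. 13.2) chosen so that, on 0 ≤ x ≤ 10, it has a local minimum near x = 1 and its global minimum near x = 7; a purely local method such as gradient descent returns whichever minimum lies downhill from its starting point:

```python
# Hypothetical objective with two minima on [0, 10]: a local minimum
# near x = 1 and the global minimum near x = 7. (This is NOT the
# function of Fig. 13.2, merely an illustration of the same idea.)

def f(x):
    return (x - 1) ** 2 * (x - 7) ** 2 + (10 - x)

def fprime(x):
    # Derivative of f, used by the descent iteration.
    return 2 * (x - 1) * (x - 7) ** 2 + 2 * (x - 1) ** 2 * (x - 7) - 1

def gradient_descent(x0, lr=1e-3, iters=20000):
    # Descend from x0, projecting each iterate back onto [0, 10].
    x = x0
    for _ in range(iters):
        x = min(max(x - lr * fprime(x), 0.0), 10.0)
    return x

x_local = gradient_descent(0.0)   # settles in the nearby local minimum
x_global = gradient_descent(9.0)  # settles in the global minimum
print(x_local, f(x_local))    # roughly x = 1.01
print(x_global, f(x_global))  # roughly x = 7.01
assert f(x_global) < f(x_local)
```

The two starting points yield different answers, yet the method reports each as "optimal": it has no way of looking past the hill separating the two valleys. This is exactly why algorithms that rely only on local information cannot, in general, certify a global optimum.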