

UPDATED FALL 2018

Network Optimization: Notes and Exercises

Michael J. Neely
University of Southern California
http://www-bcf.usc.edu/~mjneely

Abstract

These notes provide a tutorial treatment of topics of Pareto optimality, Lagrange multipliers, and computational algorithms for multiobjective optimization, with emphasis on applications to data networks. Problems with two objectives are considered first, called bicriteria optimization problems (treated in Sections I and II). The main concepts of bicriteria optimization naturally extend to problems with more than two objectives, called multicriteria optimization problems. Multicriteria problems can be more complex than bicriteria problems, and often cannot be solved without the aid of a computer. Efficient computational methods exist for problems that have a convex structure. Convexity is formally defined in Section IV. Section V describes a general class of multicriteria problems with a convex structure, called convex programs. A drift-plus-penalty algorithm is developed in Section VI as a computational procedure for solving convex programs. The drift-plus-penalty algorithm extends as an online control technique for optimizing time averages of system objectives, even when the underlying system does not have a convex structure. An enhanced algorithm with faster convergence time is also described. Section VIII focuses on application of drift-plus-penalty theory to multi-hop networks, including problems of network utility maximization and power-aware routing. Exercises are provided to reinforce the theory and the applications.

HOW TO REFERENCE THESE NOTES

Sections I-V present material on optimization and Lagrange multipliers that may be newly presented in this manner, but that is well known and/or easily derived from basic definitions (see also [1][2]). Sections VI-VIII present more advanced material on drift-plus-penalty theory for convex programs and data networks.
Readers who want to cite this material should cite the related published works [3][4][5].

I. BICRITERIA OPTIMIZATION

Consider a system that has a collection $M$ of different operating modes, where $M$ is an abstract (possibly infinite) set that contains at least one element. Each operating mode $m \in M$ determines a two-dimensional vector $(x(m), y(m))$, where $x(m)$ and $y(m)$ represent distinct system objectives of interest. Suppose it is desirable to keep both objectives $x(m)$ and $y(m)$ as small as possible. We want to find a mode $m \in M$ that “minimizes both” $x(m)$ and $y(m)$. Of course, it may not be possible to simultaneously minimize both objectives. This tension motivates the study of bicriteria optimization.

Example I.1. (Distance-aware and energy-aware routing) Consider the problem of finding the best route to use for sending a single message over a network. The network has multiple nodes, multiple links that are represented by ordered pairs $(i, j)$ for nodes $i$ and $j$, a single source node $s$, and a single destination (or “termination”) node $t$. Let $M$ represent the set of all available routes, where each route $m \in M$ is itself an ordered set of links $(i, j)$ that specify a path from source $s$ to destination $t$ over the network:

$$m = \{(i_0(m), i_1(m)), (i_1(m), i_2(m)), \ldots, (i_{h(m)-1}(m), i_{h(m)}(m))\}$$

where $h(m)$ is the number of hops for route $m$, $i_0(m) = s$ is the source node, and $i_{h(m)}(m) = t$ is the destination node. Suppose each link $(i, j)$ in the network has a link distance $d_{ij}$ and a link energy expenditure $e_{ij}$. For each route $m \in M$, let $x(m)$ be the total distance of the route, and let $y(m)$ be the total energy used. Thus,

$$x(m) = \sum_{(i,j) \in m} d_{ij}$$

$$y(m) = \sum_{(i,j) \in m} e_{ij}$$

It is desirable to choose a route $m \in M$ that keeps both objectives $x(m)$ and $y(m)$ small.

Example I.2. (Power allocation over one wireless link) Consider the problem of transmitting over a single wireless link.
Let $p$ be a variable that represents the amount of power used, and suppose this variable must be chosen over an interval $[0, p_{\max}]$ for some positive maximum power level $p_{\max}$. The power used determines the transmission rate $\mu(p) = \log(1 + p)$. The goal is to operate the system while minimizing power and maximizing transmission rate. Define set $M$ as the interval $[0, p_{\max}]$. For each $p \in M$, define $x(p) = p$ as the power used and $y(p) = -\mu(p)$ as $-1$ times the transmission rate achieved (so that minimizing $y(p)$ is the same as maximizing $\mu(p)$). We want to choose $p \in M$ to keep both objectives $x(p)$ and $y(p)$ small.
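As a rough illustration of Examples I.1 and I.2, the sketch below evaluates the objective pair $(x, y)$ for a few operating modes. The network, its link distances and energies, the three routes, and the value of `P_MAX` are all hypothetical values chosen for illustration; none of them come from the notes.

```python
import math

# Hypothetical network: link (i, j) -> (distance d_ij, energy e_ij).
LINKS = {
    ("s", "a"): (1.0, 4.0),
    ("a", "t"): (1.0, 4.0),
    ("s", "b"): (2.0, 1.0),
    ("b", "t"): (3.0, 1.0),
    ("s", "t"): (5.0, 6.0),
}

# Hypothetical route set M: each route is an ordered list of links from s to t.
ROUTES = {
    "via_a": [("s", "a"), ("a", "t")],
    "via_b": [("s", "b"), ("b", "t")],
    "direct": [("s", "t")],
}


def route_objectives(route):
    """Example I.1: return (x(m), y(m)) = (total distance, total energy)."""
    x = sum(LINKS[link][0] for link in route)
    y = sum(LINKS[link][1] for link in route)
    return x, y


def power_objectives(p):
    """Example I.2: return (x(p), y(p)) = (p, -log(1 + p))."""
    return p, -math.log(1.0 + p)


route_points = {name: route_objectives(r) for name, r in ROUTES.items()}

P_MAX = 2.0  # assumed maximum power level
# Sample five power levels evenly spaced over [0, P_MAX].
power_points = [power_objectives(P_MAX * k / 4) for k in range(5)]
```

In this toy network no single route is best on both counts: `via_a` has the smallest distance and `via_b` the smallest energy, which is the tension the section describes.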

Example I.3. (Rate and power over a 3-user wireless system) Consider a wireless device that transmits to three different users over orthogonal links. The device must choose a power vector $(p_1, p_2, p_3) \in \mathbb{R}^3$ that satisfies the following constraints:

$$p_1 + p_2 + p_3 \leq p_{\max} \tag{1}$$

$$p_i > 0 \quad \forall i \in \{1, 2, 3\} \tag{2}$$

where $p_{\max}$ is a positive real number that constrains the sum power usage. For each $i \in \{1, 2, 3\}$, let $\mu_i(p_i) = \log(1 + \gamma_i p_i)$ be the transmission rate achieved over link $i$ as a function of the power variable $p_i$, where $\gamma_i$ is some known attenuation coefficient for link $i$. Define $M$ as the set of all $(p_1, p_2, p_3) \in \mathbb{R}^3$ that satisfy the constraints (1)-(2). Define

$$x(p_1, p_2, p_3) = p_1 + p_2 + p_3$$

$$y(p_1, p_2, p_3) = -[\mu_1(p_1) + \mu_2(p_2) + \mu_3(p_3)]$$

Thus, $x(p_1, p_2, p_3)$ represents the sum power used, while $y(p_1, p_2, p_3)$ is $-1$ times the sum rate over all three links. The goal is to choose $(p_1, p_2, p_3) \in M$ to keep both $x(p_1, p_2, p_3)$ and $y(p_1, p_2, p_3)$ small.

Example I.4. (Network utility maximization) Consider the same 3-link wireless system as Example I.3. However, suppose we do not care about power expenditure. Rather, we care about:

• Maximizing the sum rate $\mu_1(p_1) + \mu_2(p_2) + \mu_3(p_3)$.
• Maximizing the proportionally fair utility metric $\log(\mu_1(p_1)) + \log(\mu_2(p_2)) + \log(\mu_3(p_3))$. This is a commonly used notion of fairness for rate allocation over multiple users.¹

Again let $M$ be the set of all vectors $(p_1, p_2, p_3) \in \mathbb{R}^3$ that satisfy (1)-(2). Define

$$x(p_1, p_2, p_3) = -[\mu_1(p_1) + \mu_2(p_2) + \mu_3(p_3)]$$

$$y(p_1, p_2, p_3) = -[\log(\mu_1(p_1)) + \log(\mu_2(p_2)) + \log(\mu_3(p_3))]$$

so that $x(p_1, p_2, p_3)$ is $-1$ times the sum rate, and $y(p_1, p_2, p_3)$ is $-1$ times the proportionally fair utility metric. The goal is to choose $(p_1, p_2, p_3) \in M$ to minimize both $x(p_1, p_2, p_3)$ and $y(p_1, p_2, p_3)$.
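The objectives of Examples I.3 and I.4 can be evaluated numerically for any candidate power vector. The following is a minimal sketch: the function names, the attenuation coefficients `GAMMA`, and the budget `P_MAX` are assumptions made for illustration and are not specified in the notes.

```python
import math


def is_feasible(p, p_max):
    """Constraints (1)-(2): sum power at most p_max, every p_i strictly positive."""
    return sum(p) <= p_max and all(p_i > 0 for p_i in p)


def rates(p, gamma):
    """mu_i(p_i) = log(1 + gamma_i * p_i) for each link i."""
    return [math.log(1.0 + g * p_i) for p_i, g in zip(p, gamma)]


def example_i3(p, gamma):
    """Example I.3: (x, y) = (sum power, -1 times the sum rate)."""
    return sum(p), -sum(rates(p, gamma))


def example_i4(p, gamma):
    """Example I.4: (x, y) = (-sum rate, -proportionally fair utility)."""
    mu = rates(p, gamma)
    return -sum(mu), -sum(math.log(m) for m in mu)


GAMMA = (1.0, 2.0, 0.5)  # assumed attenuation coefficients
P_MAX = 3.0              # assumed sum-power budget
p = (1.0, 1.0, 1.0)      # one candidate power vector

x3, y3 = example_i3(p, GAMMA)
x4, y4 = example_i4(p, GAMMA)
```

Because constraint (2) requires strictly positive powers, every rate is strictly positive and the logarithms in `example_i4` are well defined for any feasible vector.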
Example I.1 emphasizes that the set $M$ can have any size and structure that we want, and its elements can be any type of object that we want (in that example, $M$ is a finite set of possible routes). Examples I.2-I.4 show that the set $M$ can be an infinite set, such as an interval of power levels $p$ or a set of power vectors $(p_1, p_2, p_3)$. Examples I.2-I.4 also show how a bicriteria optimization problem that seeks to maximize one objective while minimizing another, or that seeks to maximize both objectives, can be transformed into a bicriteria minimization problem by multiplying the appropriate objectives by $-1$. Hence, without loss of generality, it suffices to assume the system controller wants both components of the vector of objectives $(x, y)$ to be small.

A. Pareto optimality

Define $A$ as the set of all $(x, y)$ vectors in $\mathbb{R}^2$ that are achievable via system modes $m \in M$:

$$A = \{(x(m), y(m)) \in \mathbb{R}^2 : m \in M\}$$

Every $(x, y)$ pair in $A$ is a feasible operating point. Once the set $A$ is known, system optimality can be understood in terms of selecting a desirable 2-dimensional vector $(x, y)$ in the set $A$. With this approach, the study of optimality does not require knowledge of the physical tasks the system must perform for each mode of operation in $M$. This is useful because it allows many different types of problems to be treated with a common mathematical framework.

The set $A$ can have an arbitrary structure. It can be finite, infinite, closed, open, neither closed nor open, and so on. Assume the system controller wants to find an operating point $(x, y) \in A$ for which both $x$ and $y$ are small.

Definition I.1. A vector $(x, y) \in A$ is preferred over (or dominates) another vector $(w, z) \in A$, written $(x, y) \prec (w, z)$, if the following two inequalities hold

• $x \leq w$
• $y \leq z$

and if at least one of the inequalities is strict (so that either $x < w$ or $y < z$).

Definition I.2. A vector $(x^*, y^*) \in A$ is Pareto optimal if there is no vector $(x, y) \in A$ that satisfies $(x, y) \prec (x^*, y^*)$.

A set can have many Pareto optimal points.
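For a finite set $A$, Definitions I.1 and I.2 can be checked directly by comparing every pair of points. The sketch below does this with a simple quadratic-time scan; the six points are a made-up example, and the function names are illustrative.

```python
def dominates(u, v):
    """True if u = (x, y) is preferred over v = (w, z) per Definition I.1."""
    x, y = u
    w, z = v
    return x <= w and y <= z and (x < w or y < z)


def pareto_optimal(points):
    """Return the points of a finite set A that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical finite set A of achievable (x, y) operating points.
A = [(1, 5), (2, 3), (2, 4), (3, 3), (4, 1), (5, 2)]
front = pareto_optimal(A)
print(front)  # [(1, 5), (2, 3), (4, 1)]
```

Here $(2, 4)$ and $(3, 3)$ are both dominated by $(2, 3)$, and $(5, 2)$ is dominated by $(4, 1)$, which leaves three Pareto optimal points.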
An example set $A$ and its Pareto optimal points are shown in Fig. 1. For each vector $(a, b) \in \mathbb{R}^2$, define $S(a, b)$ as the set of all points $(x, y)$ that satisfy $x \leq a$ and $y \leq b$:

$$S(a, b) = \{(x, y) \in \mathbb{R}^2 : x \leq a, y \leq b\}$$

¹ See [6] for a development of proportionally fair utility and its relation to the $\log(\mu)$ function. The constraints (2) avoid the singularity of the $\log(\mu)$ function at 0, so that $\log(\mu_1(p_1)) + \log(\mu_2(p_2)) + \log(\mu_3(p_3))$ is indeed a real number whenever $(p_1, p_2, p_3)$ satisfies (1)-(2). An alternative is to use constraints $p_i \geq 0$ (which allow zero power in some channels), but to modify the utility function from $\log(\mu)$ to $(1/b)\log(1 + b\mu)$ for some constant $b > 0$.

Pictorially, the set $S(a, b)$ is an infinite square in the 2-dimensional plane with upper-right vertex at $(a, b)$ (see Fig. 1). If $(a, b)$ is a point in $A$, any other vector in $A$ that is preferred over $(a, b)$ must lie in the set $S(a, b)$. If there are no points in $A \cap S(a, b)$ other than $(a, b)$ itself, then $(a, b)$ is Pareto optimal.

Fig. 1. An example set $A$ (in orange) that contains an irregular-shaped connected component and 7 additional isolated points. The Pareto optimal points on the connected component are colored in green, and the two Pareto optimal isolated points are circled. The rectangle set $S(a, b)$ is illustrated for a particular Pareto optimal point $(a, b)$. Note that $(a, b)$ is Pareto optimal because $S(a, b)$ intersects $A$ only at the point $(a, b)$.

B. Degenerate cases and the compact assumption

In some cases the set $A$ will have no Pareto optimal points. For example, suppose $A$ is the entire set $\mathbb{R}^2$. If we choose any point $(x, y) \in \mathbb{R}^2$, there is always another point $(x - 1, y) \in \mathbb{R}^2$ that is preferred. Further, it can be shown that if $A$ is an open subset of $\mathbb{R}^2$, then it has no Pareto optimal points (see Exercise IX-A.5). To avoid these degenerate situations, it is often useful to impose the further condition that the set $A$ is both closed and bounded. A closed and bounded subset of $\mathbb{R}^N$ is called a compact set. If $A$ is a finite set then it is necessarily compact. It can be shown that if $A$ is a nonempty compact set, then:

1) It has Pareto optimal points.
2) For every point $(a, b) \in A$ that is not Pareto optimal, there is a Pareto optimal point that is preferred over $(a, b)$.

See Exercise IX-A.11 for a proof of the above two claims. Therefore, when $A$ is compact, we can restrict attention to choosing an operating point $(x, y)$ that is Pareto optimal.

II. OPTIMIZATION WITH ONE CONSTRAINT

Let $A \subseteq \mathbb{R}^2$ be a set of all feasible $(x, y)$ operating points.
Assume the system controller wants to make both components of the vector $(x, y)$ small. One way to approach this problem is to minimize $y$ subject to the constraint $x \leq c$, where $c$ is a given real number. To this end, fix a constant $c \in \mathbb{R}$ and consider the following constrained optimization problem:

$$
\begin{array}{lll}
\text{Minimize:} & y & (3) \\
\text{Subject to:} & x \leq c & (4) \\
 & (x, y) \in A & (5)
\end{array}
$$

The variables $x$ and $y$ are the optimization variables in the above problem, while the constant $c$ is assumed to be a given and fixed parameter. The above problem is feasible if there exists an $(x, y) \in \mathbb{R}^2$ that satisfies both constraints (4)-(5).

Definition II.1. A point $(x^*, y^*)$ is a solution to the optimization problem (3)-(5) if the following two conditions hold:

• $(x^*, y^*)$ satisfies both constraints (4)-(5).
• $y^* \leq y$ for all points $(x, y)$ that satisfy (4)-(5).

It is possible for the problem (3)-(5) to have more than one optimal solution. It is also possible to have no optimal solution, even if the problem is feasible. This happens when there is an infinite sequence of points $\{(x_n, y_n)\}_{n=1}^{\infty}$ that satisfy the constraints (4)-(5) with strictly decreasing values of $y_n$, but for which the limiting value of $y_n$ cannot be achieved (see Exercise IX-B.1). This can only happen if the set $A$ is not compact. On the other hand, it can be shown that if $A$ is a compact set, then the problem (3)-(5) has an optimal solution whenever it is feasible.
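When $A$ is a finite set (and hence compact), problem (3)-(5) can be solved by filtering on constraint (4) and taking the smallest remaining $y$. The sketch below assumes a small made-up set $A$; the function names and the convention of returning `None` for an infeasible problem are illustrative choices, not part of the notes.

```python
def solve_constrained(points, c):
    """Minimize y over (x, y) in a finite set A subject to x <= c.

    Returns one optimal point, or None when no point satisfies x <= c.
    """
    feasible = [(x, y) for (x, y) in points if x <= c]
    if not feasible:
        return None
    return min(feasible, key=lambda pt: pt[1])


def all_solutions(points, c):
    """Return every optimal point of (3)-(5); empty list when infeasible."""
    best = solve_constrained(points, c)
    if best is None:
        return []
    return [(x, y) for (x, y) in points if x <= c and y == best[1]]


# Hypothetical finite set A of achievable (x, y) operating points.
A = [(1, 5), (2, 3), (2, 4), (3, 3), (4, 1), (5, 2)]

print(solve_constrained(A, 4))    # (4, 1)
print(all_solutions(A, 3))        # [(2, 3), (3, 3)]
print(solve_constrained(A, 0.5))  # None
```

With $c = 3$ the problem has two optimal solutions, both with $y^* = 3$, which illustrates that a solution need not be unique.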

A. The tradeoff function

The problem (3)-(5) uses a parameter $c$ in the inequality constraint (4). If the problem (3)-(5) is feasible for some given parameter $c$, then it is also feasible for every parameter $c'$ that satisfies $c' \geq c$. Thus, the set of all values $c$ for which the problem is feasible forms an interval of the real number line of the form either $(c_{\min}, \infty)$ or $[c_{\min}, \infty)$. Call this set the feasibility interval. The value $c_{\min}$ is the infimum of the set of all real numbers in the feasibility interval. For each $c$ in the feasibility interval, define $\psi(c)$ as the infimum value of the objective function in problem (3)-(5) with parameter $c$. In particular, if $(x^*, y^*)$ is an optimal solution to (3)-(5) with parameter $c$, then $\psi(c) = y^*$. If $A$ is a compact set, it can be shown that the feasibility interval has the form $[c_{\min}, \infty)$ and that problem (3)-(5) has an optimal solution for all $c \in [c_{\min}, \infty)$.

The function $\psi(c)$ is called the tradeoff function. The tradeoff function establishes the tradeoffs associated with choosing larger or smaller values of the constraint $c$. Intuitively, it is clear that increasing the value of $c$ imposes less stringent constraints on the problem, which allows for improved values of $\psi(c)$. This is formalized in the next lemma.

Fig. 2. The set $A$ from Fig. 1 with its (non-increasing) tradeoff function $\psi(c)$ drawn in green. Note that $\psi(c)$ is discontinuous at points $c_1$, $c_2$, $c_3$.

Lemma II.1. The tradeoff function $\psi(c)$ is non-increasing over the feasibility interval.

Proof. For simplicity assume $A$ is compact. Consider two values $c_1$ and $c_2$ in the interval $[c_{\min}, \infty)$, and assume $c_1 \leq c_2$. We want to show that $\psi(c_1) \geq \psi(c_2)$. Let $(x_1^*, y_1^*)$ and $(x_2^*, y_2^*)$ be optimal solutions of (3)-(5) corresponding to parameters $c = c_1$ and $c = c_2$, respectively.
Then:

$$y_1^* = \psi(c_1)$$

$$y_2^* = \psi(c_2)$$

By definition of $(x_2^*, y_2^*)$ being optimal for the problem with parameter $c = c_2$, we know that for any vector $(x, y) \in A$ that satisfies $x \leq c_2$, we have:

$$y_2^* \leq y \tag{6}$$

On the other hand, we know $(x_1^*, y_1^*)$ is a point in $A$ that satisfies $x_1^* \leq c_1 \leq c_2$, so (6) gives:

$$y_2^* \leq y_1^*$$

Substituting $y_1^* = \psi(c_1)$ and $y_2^* = \psi(c_2)$ gives the result.

Note that the tradeoff function $\psi(c)$ is not necessarily continuous (see Fig. 2). It can be shown that it is continuous when the set $A$ is compact and has a convexity property.² Convexity is defined in Section IV.

The tradeoff curve is defined as the set of all points $(c, \psi(c))$ for $c$ in the feasibility interval. Exercise IX-A.8 shows that every Pareto optimal point $(x^{(p)}, y^{(p)})$ of $A$ is a point on the tradeoff curve, so that $\psi(x^{(p)}) = y^{(p)}$.

² In particular, $\psi(c)$ is both continuous and convex over $c \in [c_{\min}, \infty)$ whenever $A$ is compact and convex. Definitions of convex set and convex function are provided in Section IV.
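For a finite set $A$, the tradeoff function can be tabulated on a grid of $c$ values, and both Lemma II.1 and the claim that Pareto optimal points lie on the tradeoff curve can be spot-checked numerically. The sketch below uses a made-up set $A$ whose Pareto optimal points are $(1, 5)$, $(2, 3)$, and $(4, 1)$. Returning `math.inf` for values of $c$ outside the feasibility interval is a convenience assumed here; the notes define $\psi(c)$ only on that interval.

```python
import math


def tradeoff(points, c):
    """psi(c): smallest y over (x, y) in a finite set A with x <= c.

    Returns math.inf when no point satisfies x <= c (an assumed convention).
    """
    return min((y for (x, y) in points if x <= c), default=math.inf)


# Hypothetical finite set A of achievable (x, y) operating points.
A = [(1, 5), (2, 3), (2, 4), (3, 3), (4, 1), (5, 2)]
PARETO = [(1, 5), (2, 3), (4, 1)]  # Pareto optimal points of this set A

c_min = min(x for x, _ in A)  # left end of the feasibility interval
grid = [c_min + 0.5 * k for k in range(10)]  # c = 1.0, 1.5, ..., 5.5
psi = [tradeoff(A, c) for c in grid]

# Lemma II.1: psi is non-increasing over the feasibility interval.
non_increasing = all(a >= b for a, b in zip(psi, psi[1:]))

# Every Pareto optimal point (x, y) satisfies psi(x) = y.
on_curve = all(tradeoff(A, x) == y for (x, y) in PARETO)
```

For this set, $\psi(c)$ is a step function that drops at $c = 2$ and $c = 4$, so it is non-increasing but discontinuous, in line with the behavior described for Fig. 2.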
