I've written about optimization before -- Pareto optimization and mixed-integer linear programming. Both of these are interesting topics: in Pareto optimization we're looking for the best solution to a problem with multiple objectives (e.g., a capitalistic health care system with conflicting objectives of maximizing profit and minimizing poor health); in (mixed-integer) linear programming we're finding the minimum of a function that is linear in all its variables, within a polytope of constraints.

I should start with some basic terms and concepts. Recall that finding the maximum of a function is equivalent to finding the minimum of the negative of that function, so without loss of generality, we can just talk about minimization. A minimum is the smallest value of the objective function, and a minimizer is the coordinates at which the minimum occurs.
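That equivalence is easy to check numerically. Here's a tiny sketch (the function f is just an illustrative example) that evaluates a function on a grid and confirms that the maximum of f equals the negative of the minimum of -f:

```python
# Maximizing f is the same as minimizing -f: max f = -min(-f),
# and the maximizer of f is the minimizer of -f.
def f(x):
    return -(x - 2.0) ** 2 + 5.0  # example function with a peak of 5 at x = 2

xs = [i / 100.0 for i in range(-500, 501)]  # grid on [-5, 5]

max_f = max(f(x) for x in xs)
min_neg_f = min(-f(x) for x in xs)

print(max_f)       # 5.0
print(-min_neg_f)  # 5.0 -- same value, as expected
```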

A local minimizer is a point x where the objective function has a smaller value than it does at any other point in some "neighborhood" of x. You know when you go to the store after a rain and there's one parking spot right near the front that's vacant because it's a big puddle? That point is a local minimizer of the parking lot. A global minimizer is a point x where the objective function has a smaller value than it does anywhere else in the domain.

Global optimization is the act of determining the global minimizer of an objective function. This sounds no different from what we do in linear programming, except in the case of linear programming, thanks to the fact that the objective function is linear, we know that the minimum we've found must be the global minimum.

There's another class of functions for which we can guarantee that the minimum we find is the global minimum: convex functions. Intuitively, a convex function is bowl shaped -- so if we just follow the slope of the function to the bottom of the bowl, we've found the global minimum. Formally, a convex function has the property that if you take two points in its domain (let's call them x and y), the graph of the function between them always lies on or below the chord connecting (x, f(x)) and (y, f(y)), i.e., f(ax + (1-a)y) ≤ a f(x) + (1-a) f(y) for 0 ≤ a ≤ 1. For twice-differentiable one-dimensional functions, this means that the second derivative is always non-negative; equivalently, for multidimensional functions, the Hessian is positive semidefinite. The implication is that from any point other than a minimizer, there's always a direction of descent you can follow to find a point where the value of the objective function is smaller.
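You can sanity-check the defining inequality by sampling: take a convex function (f(x) = x² here, just as an example), draw random pairs of points and random blend weights, and verify the function at the blend never exceeds the chord:

```python
import random

def f(x):
    return x * x  # convex: f''(x) = 2 >= 0 everywhere

random.seed(0)
for _ in range(1000):
    x = random.uniform(-10, 10)
    y = random.uniform(-10, 10)
    a = random.random()
    lhs = f(a * x + (1 - a) * y)     # function evaluated at the blend
    rhs = a * f(x) + (1 - a) * f(y)  # chord evaluated at the same blend
    assert lhs <= rhs + 1e-12        # convexity inequality holds

print("convexity inequality held at every sampled pair")
```

Of course, sampling can only fail to disprove convexity; the second-derivative test is what actually proves it.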

The trouble is, there are many objective functions we want to minimize that may not have such nice properties. There are lots of real-world objective functions that we know are not convex -- or even worse, we don't even know what they look like or how they behave.

There's a class of continuous functions known as coercive, with the property that the function approaches positive infinity as |x| grows large (|x| is the magnitude of x -- for a 2-D function, this would be (x^2 + y^2)^{1/2}). These functions look like a big bowl shape on a macroscopic scale, but up close they have some wiggles and wobbles. If a function is coercive, we know that it has to have at least one global minimizer. Intuitively, this is because the function has to be bounded from below, and you could find a bound that just touches the bottom of the function, which would touch right at the global minimizer(s). Unconstrained global optimization (that is, optimizing over the entire function space) is generally not possible if the objective function is not coercive.
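Here's a minimal sketch of such a function: the quadratic term 0.1x² makes it coercive (my choice for illustration), while the sin(x) term adds the wiggles. Scanning a grid for points lower than both neighbors turns up more than one local minimizer, yet the function still has a global minimum:

```python
import math

def f(x):
    # coercive: 0.1 * x^2 dominates for large |x|; sin(x) adds wiggles
    return 0.1 * x * x + math.sin(x)

xs = [i / 100.0 for i in range(-2000, 2001)]  # grid on [-20, 20]
vals = [f(x) for x in xs]

# grid points lower than both neighbors approximate local minimizers
local_mins = [xs[i] for i in range(1, len(xs) - 1)
              if vals[i] < vals[i - 1] and vals[i] < vals[i + 1]]

print(len(local_mins))  # more than one local minimizer
print(min(vals))        # but a global minimum still exists
```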

But even if the function is coercive, there are no guarantees that we will find the global minimum. Indeed, without any other helpful properties, we can never be certain that we've found the global minimum!
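To see why there are no guarantees, here's a sketch of plain gradient descent (the step size and starting points are arbitrary choices for illustration) on the same coercive, non-convex function from above. Descent can only slide downhill into the basin it starts in, so different starting points land in different local minima:

```python
import math

def f(x):
    return 0.1 * x * x + math.sin(x)  # coercive but not convex

def fprime(x):
    return 0.2 * x + math.cos(x)

def gradient_descent(x, step=0.1, iters=2000):
    # follows the local slope downhill, so it can only ever reach
    # the local minimizer of whatever basin it starts in
    for _ in range(iters):
        x -= step * fprime(x)
    return x

# different starting points can land in different local minima
for x0 in (-3.0, 0.0, 5.0):
    xm = gradient_descent(x0)
    print(f"start {x0:5.1f} -> x = {xm:6.3f}, f(x) = {f(xm):6.3f}")
```

Nothing in the descent itself tells you whether the minimum you landed in is the global one; that's exactly the gap that global-optimization methods try to fill.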

Here's why. Suppose that you are given a "black box" function that you are told is coercive. As it turns out, whatever goes on inside the black box is controlled by your arch-enemy. You would never be able to find the global minimum of the function in a finite amount of time. The reason is that any optimization method you would try would basically boil down to sampling the function at a bunch of points. Your nemesis would just keep moving the location of the "true global minimizer" to places that you hadn't yet sampled. The only way you would finally find it would be when you had exhausted every single point in the domain, which would take (literally) forever.

So if global optimization is such a hopeless business, what can we do? Well, many objective functions aren't as pathological as the inner workings of your worst enemy's mind, and the global minimum is findable. For the rest of them, we just settle for what we think is probably the global minimum. As for how we actually do the work of global optimization, this blog entry is long enough, and that will have to wait for next time.
