Escaping Local Optima
One of the challenges with heuristics is that they can get
stuck in a local optimum, meaning that they find a solution that is good within
their search space, but it may not be the global optimum (the best possible
solution).
One way to try to escape local optima is to use a technique
called simulated annealing, which is inspired by the way that metals are cooled
and solidified. The idea is that we allow the heuristic to "explore"
and potentially move to worse solutions, but with a decreasing probability as
the number of iterations increases. This can help the heuristic to escape local
optima and potentially find the global optimum.
Local search
Local search is a type of heuristic search that focuses on finding a solution that is locally optimal,
rather than globally optimal. This means that the algorithm looks for the best
solution within a certain range or neighborhood, rather than searching the
entire space for the absolute best solution. The advantage of local search is
that it can often find a good solution quickly, but the disadvantage is that it
can get stuck in a local optimum, meaning it finds a solution that is good
within its neighborhood but not necessarily the best solution overall.
One way to try to
escape local optima in local search is to use techniques such as simulated
annealing or genetic algorithms, which allow the algorithm to make moves that
may not be locally optimal in order to explore different areas of the search
space and potentially find a better global solution.
In general, local search algorithms differ from exhaustive
(or brute force) search in that they try to find a solution by searching only a
small number of candidate solutions in order to find an answer.
This is done by starting at a solution and then making small, local changes to it in order to try to find a better solution. Local search algorithms often work by iteratively improving a solution, moving to a neighbouring solution with a better objective value. They can be seen as a form of heuristic, as they do not guarantee to find the globally optimal solution, but they can often find good solutions quickly.
There are two main types of
local search algorithms: construction algorithms and iterative improvement
algorithms. Construction algorithms search the space of partial solutions,
looking for a feasible one, while iterative improvement algorithms search the space
of feasible solutions, looking for an improvement in the objective function.
Local search algorithms can be
very effective for finding good solutions to hard optimization problems, but
they can also get stuck in local optima, where further improvements are not
possible. To escape local optima, local search algorithms can use various
techniques such as randomization, perturbation, and memory.
This means that, in the case of NP-hard problems, a local search cannot guarantee an optimal answer, but it is designed to be as effective as possible most of the time.
There are a few ways that local
search algorithms can escape from local optima, or suboptimal solutions:
1. Random restarts: This involves starting the search from a randomly chosen initial solution and repeating the process multiple times. This can help the algorithm escape from local optima by giving it a chance to explore different parts of the search space.
2. Perturbation: This involves intentionally introducing small changes or "perturbations" to the current solution in order to escape from local optima. This can involve randomly altering the values of some variables or swapping the values of two variables.
3. Simulated annealing: This is a probabilistic technique that can be used to escape from local optima. It is inspired by the annealing process used in metallurgy, where a material is heated and then cooled slowly in order to reduce defects and increase its structural purity. In the context of local search, simulated annealing involves allowing the algorithm to make "bad" moves with a certain probability, which can help it escape from local optima and potentially find a global optimum.
4. Genetic algorithms: This is a heuristic optimization method inspired by the process of natural evolution. It involves using principles of genetics and natural selection to evolve a population of solutions over time, with the goal of finding a global optimum. Genetic algorithms can help escape from local optima by maintaining a diverse population of solutions and using crossover and mutation operators to explore different parts of the search space.
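To make the first of these techniques concrete, here is a minimal random-restarts hill climber in Python. The objective function and neighbourhood at the end are toy stand-ins (a unimodal function, so every restart happens to find the same optimum; restarts only pay off on landscapes with many local optima):

```python
import random

def hill_climb(start, objective, neighbors):
    """Move to the best improving neighbour until no neighbour is better."""
    current = start
    while True:
        best_neighbor = max(neighbors(current), key=objective, default=current)
        if objective(best_neighbor) <= objective(current):
            return current  # local optimum reached
        current = best_neighbor

def random_restart_hill_climb(restarts, random_start, objective, neighbors):
    """Run hill climbing from several random starting points, keep the best."""
    best = None
    for _ in range(restarts):
        result = hill_climb(random_start(), objective, neighbors)
        if best is None or objective(result) > objective(best):
            best = result
    return best

# Toy problem: maximise f(x) = -(x - 7)^2 over the integers 0..20,
# moving one step left or right at a time.
f = lambda x: -(x - 7) ** 2
step = lambda x: [n for n in (x - 1, x + 1) if 0 <= n <= 20]
best = random_restart_hill_climb(10, lambda: random.randint(0, 20), f, step)
```

Each restart is independent, so the only cost of the technique is running the inner climb several times.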
[Diagram: a generic local search process — initial solution → make a small change to the current solution(s) → evaluate new solution → accept/reject new solution → repeat]
Iterative improvement
Iterative improvement algorithms are local search
algorithms that start with a solution and try to find a better solution by making
small changes to the current one. They continue this process until they reach a
local optimum, at which point they stop. A local optimum is a solution that is
better than or equal to any of its neighbours (i.e. solutions that can be
reached by making a single change to the current solution).
This is the simplest form of local search technique and is
commonly found as a subroutine in other more complex heuristics.
It works by starting with a feasible solution, and then
repeatedly making small changes to the solution in an attempt to improve it.
The process is repeated until no further improvements can be made, at which
point the algorithm returns the current solution as the answer.
[Diagram: iterative improvement — initial solution → make a small change to the current solution → evaluate new solution → accept new solution if better → repeat]
Example: 2-opt heuristic for the TSP
The 2-opt heuristic for the traveling salesman problem iteratively improves the current solution by removing two edges and reconnecting the tour with two different edges between the same vertices (which reverses the segment between them). This process is repeated until no further improvement can be made. The algorithm is called 2-opt because it only considers pairs of edges (opt is short for "optimization"). The following is pseudocode for the 2-opt heuristic:
function 2-opt(tour):
    best_tour = tour
    improved = True
    while improved:
        improved = False
        for i in range(len(best_tour) - 1):
            for j in range(i + 2, len(best_tour)):
                new_tour = two_opt_swap(best_tour, i, j)
                if tour_length(new_tour) < tour_length(best_tour):
                    best_tour = new_tour
                    improved = True
    return best_tour
The function two_opt_swap(tour, i, j) returns a new tour obtained by removing the edges (i, i+1) and (j, j+1) from the input tour and reconnecting it by reversing the segment between them. The function tour_length(tour) returns the total length of the tour.
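The two helper functions are not defined in the pseudocode, but they might be sketched in Python as follows, assuming a tour is a list of city indices and distances come from a list of 2-D coordinates (passing the coordinates to tour_length explicitly is an addition to the interface above):

```python
import math

def two_opt_swap(tour, i, j):
    """Remove edges (i, i+1) and (j, j+1) and reconnect the tour
    by reversing the segment tour[i+1 .. j]."""
    return tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]

def tour_length(tour, coords):
    """Total length of the closed tour through 2-D coordinates."""
    return sum(math.dist(coords[a], coords[b])
               for a, b in zip(tour, tour[1:] + tour[:1]))

# Four cities on a unit square: the tour 0-2-1-3 crosses itself, and a
# single 2-opt move (removing edges (0,2) and (1,3)) uncrosses it.
coords = [(0, 0), (1, 0), (1, 1), (0, 1)]
crossing = [0, 2, 1, 3]
uncrossed = two_opt_swap(crossing, 0, 2)  # -> [0, 1, 2, 3], the perimeter
```

The uncrossed tour has length 4 (the square's perimeter), shorter than the crossing tour's 2 + 2√2.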
The 2-opt heuristic is a simple and effective local search
algorithm for the traveling salesman problem, but it can get stuck in local
optima. The Lin-Kernighan algorithm is a more advanced local search algorithm
that uses the 2-opt heuristic as a subroutine and can escape local optima by
making multiple 2-opt moves at a time.
In the travelling salesman problem, a common approach is to
take an initial solution (perhaps created by a greedy algorithm) and repeatedly
swap two edges over until no more improvement can be made. This is called a
2-opt search.
Local and Global optimality
Local optimality is when a solution is the best among its neighbouring solutions, while global optimality is when a solution is the best among all possible solutions. Local search algorithms tend to find locally optimal solutions, while exhaustive search algorithms can find globally optimal solutions. However, due to their exponential running time, exhaustive search algorithms are not practical for large problems.
When an iterative improvement algorithm finds no further improvement, it has reached a local optimum: a solution which cannot be improved any further by the small change the algorithm uses. However, this local optimum may not be the global optimum, which is the best solution possible out of all the solutions we could have found. It is possible to get stuck at a local optimum, unable to find a better solution even though one exists. This is known as being trapped at a local optimum.
To try and escape local optima and find the global optimum,
we can use more complex heuristics such as genetic algorithms or simulated
annealing. These techniques involve making larger changes to the current
solution in order to explore more of the search space and potentially find a better
solution.
Metaphors galore
There are many different metaphors that have been used to describe the concept of a local optimum. One common one is being stuck in a local "valley" where every direction leads uphill, so a descent-based algorithm can go no lower, even though a deeper valley may exist elsewhere. Another is standing on the top of a hill and not being able to see any higher hills in the distance, so the current position is assumed to be the highest point. Regardless of the metaphor used, the idea is the same: a local optimum is a solution that cannot be improved upon by making small, local changes to it. This can be contrasted with a global optimum, which is the best solution possible out of all the possible solutions.
At this point we start to think about the shape of our problems.
Linear programming problems have a favourable search landscape, because
all local optima are global optima.
This means that if we find a solution that cannot be
improved by making small changes, we know that it is the best possible solution.
Other problem types do not have such nice landscapes. For example, imagine a problem whose objective function looks like a landscape of many hills and valleys. A minimising algorithm may get stuck in a shallow valley, even though a deeper valley lies nearby.
There are many ways to try and escape these local optima,
and these methods are collectively known as metaheuristics. Some examples of
metaheuristics include simulated annealing, tabu search, and genetic
algorithms. These methods try to find better solutions by making larger changes
to the current solution, or by trying out many different solutions
simultaneously.
General optimisation problems aren’t like this
They can have multiple local optima, and it is possible for the global optimum to be different from all the local optima. In these cases, iterative improvement algorithms might get stuck at a local optimum, unable to find the global one. This is called getting trapped at a local optimum.
● Imagine you want to climb to the top of a perfect cone. All you have to do is keep heading upwards and you’re guaranteed to get there.
● Now imagine you want to climb to the top of a mountain. There are many different paths you can take, and you might find yourself at a local peak where you cannot climb any higher. However, if you keep exploring, you may find a path that leads you to the global peak, or the highest point on the mountain.
This metaphor illustrates the difference between local and
global optima. In an optimisation problem, a local optimum is a solution that
cannot be improved upon by making small changes to the solution, while a global
optimum is the best possible solution to the problem. In a problem with a
favourable search landscape, all local optima are also global optima, but in a
problem with a more complex search landscape, there may be multiple local
optima, and the global optimum may not be a local optimum.
● Does this work with a real hill (iterative improvement algorithms are often called ‘hill climbers’)?
No, a real hill usually has many local peaks or optima, and
a hill climber may get stuck at a local peak that is not the global peak. To
reach the global peak, the hill climber may need to take a step down and then
climb up again, which is not possible with an iterative improvement algorithm.
Escaping local optima
There are many ways to try and escape local optima when
using iterative improvement algorithms. One common method is called random
restarts, where the algorithm is run multiple times with different random
starting points and the best solution found is returned. Another method is
called simulated annealing, where the algorithm is allowed to accept worse
solutions with a certain probability in order to try and escape local optima.
There are also methods that use more complex search patterns, such as genetic
algorithms, which mimic the process of natural evolution to try and find good
solutions.
So having established that our hill climbing (or steepest
descent, if we’re minimising) algorithm may leave us stuck on top of a traffic
cone just outside Bangor, when trying to find the top of Snowdon, we need to
think about how we escape local optima.
There are several
ways to escape local optima in a search algorithm:
1. Restarting the search from a different starting point can sometimes help, as it may allow the algorithm to find a different path to the global optimum.
2. Using a more complex search algorithm, such as simulated annealing or evolutionary algorithms, which allow for the possibility of "jumping" out of a local optimum and exploring other areas of the search space.
3. Introducing randomness or noise into the search process, which can sometimes help the algorithm escape local optima and explore other areas of the search space.
4. Using a multi-start or iterated local search approach, which involves running the search algorithm multiple times from different starting points and keeping the best solution found.
Good local search algorithms need to be able to accept
non-improving solutions (take a step back) to have a real chance of finding a
true optimum.
One way to do this is to introduce randomness, or noise,
into the search process. This can be as simple as adding a probability of
accepting a non-improving solution, or by randomly perturbing the current
solution and accepting it with a certain probability. This is called simulated
annealing, as it is inspired by the annealing process used in metallurgy to
purify and strengthen metals.
Another way to escape local optima is by using a multi-start
method, where the algorithm is run multiple times from different randomly
generated starting points and the best solution found is returned. This can
also be combined with simulated annealing to further improve the chances of
finding the global optimum.
Finally, evolutionary algorithms such as genetic algorithms
use the concept of evolution and natural selection to escape local optima.
These algorithms work by generating a population of candidate solutions and
iteratively applying genetic operators such as crossover and mutation to
generate new solutions. The best solutions are then selected to form the next
generation, and the process is repeated until a satisfactory solution is found.
A good example of an algorithm that does this is Simulated Annealing.
Simulated Annealing is a probabilistic technique used to
find the global optimum of a function. It is an iterative method that starts
with a randomly generated solution and iteratively makes small changes to it,
in order to try to improve the solution. The algorithm accepts non-improving
solutions with a certain probability, in order to avoid getting stuck in local
optima. This probability is reduced over time, according to a schedule, hence
the name "Simulated Annealing", which is inspired by the annealing
process used in metallurgy to purify metals.
Here is an example of Simulated Annealing in Python:
import random
import math

def simulated_annealing(initial_solution, cost_function, neighbors_function, temperature_function):
    current_solution = initial_solution
    current_cost = cost_function(current_solution)
    for t in temperature_function:
        next_solution = random.choice(neighbors_function(current_solution))
        next_cost = cost_function(next_solution)
        delta_cost = next_cost - current_cost
        if delta_cost < 0:
            # improvement: always accept
            current_solution = next_solution
            current_cost = next_cost
        else:
            # worse (or equal): accept with probability exp(-delta/t)
            probability = math.exp(-delta_cost / t)
            if random.random() < probability:
                current_solution = next_solution
                current_cost = next_cost
    return current_solution
This algorithm takes four arguments:
● initial_solution: the starting solution supplied to the algorithm.
● cost_function: a function that takes a solution and returns its cost.
● neighbors_function: a function that takes a solution and returns a list of its neighbors (i.e., solutions that can be obtained by making small changes to it).
● temperature_function: an iterable that yields the sequence of temperatures used to control the probability of accepting non-improving solutions.
The algorithm
iteratively generates a new solution from the current solution's neighbors, and
accepts it as the new current solution if its cost is lower, or with a certain
probability if its cost is higher. The probability of accepting a higher-cost
solution is controlled by the current temperature, which is reduced over time
according to the temperature function. This way, the algorithm initially has a
higher chance of accepting non-improving solutions, in order to explore the
search space more widely, and then reduces this probability over time, in order
to converge to the global optimum.
Simulated Annealing (SA)
[Diagram: simulated annealing — initial solution → make a small change to the current solution → evaluate new solution → accept new solution if better, or with some probability if worse; the probability depends on how much worse the new solution is, and is reduced as the algorithm goes on (cooling)]
The SA algorithm
Simulated Annealing (SA) is a local search algorithm that is
used to find a good approximate solution to a problem that is difficult to
solve exactly. It is based on the idea of annealing in metallurgy, where a
material is heated and then cooled slowly in order to reduce defects and
increase its structural purity.
The algorithm works by simulating the process of annealing
in a material by starting at a high temperature and gradually decreasing it
over time. At high temperatures, the algorithm is more likely to accept
non-improving solutions, which allows it to explore a larger portion of the
search space and potentially escape local optima. As the temperature decreases,
the algorithm becomes more selective and is less likely to accept non-improving
solutions, which helps it to converge on a better solution.
Here is a simple example of how the SA algorithm might be
implemented in Python:
import math
import random

def simulated_annealing(problem, max_iterations, temperature_function):
    current_solution = problem.get_initial_solution()
    best_solution = current_solution
    for i in range(max_iterations):
        temperature = temperature_function(i)
        next_solution = problem.get_random_neighbor(current_solution)
        delta_e = problem.get_value(next_solution) - problem.get_value(current_solution)
        if delta_e > 0:
            current_solution = next_solution
            if problem.get_value(current_solution) > problem.get_value(best_solution):
                best_solution = current_solution
        else:
            probability = math.exp(delta_e / temperature)
            if random.random() < probability:
                current_solution = next_solution
    return best_solution
This implementation of the SA algorithm takes as input a problem
object that represents the problem to be solved, the maximum number of
iterations to run the algorithm for, and a temperature_function
that returns the temperature at each iteration. The problem
object should have methods for getting an initial solution, generating a random
neighbor of a given solution, and computing the value of a solution.
The algorithm starts by getting an initial solution and
setting the best solution found so far to this initial solution. It then
iterates for a given number of times, getting the temperature at each iteration
using the temperature function and generating a random neighbor of the current
solution. If the value of the neighbor is greater than the value of the current
solution, the current solution is updated to be the neighbor and the best
solution is updated if the current solution is better than the best solution
found so far. If the value of the neighbor is not greater than the value of the
current solution, the current solution is still updated to be the neighbor with
a probability equal to exp(delta_e / temperature), where delta_e is
the difference in value between the neighbor and the current solution. This
probability allows the algorithm to occasionally accept non-improving
solutions, especially at high temperatures. After the maximum number of
iterations has been reached, the algorithm returns the best solution found.
Simulated annealing works just like iterative improvement, but with a modification to the acceptance criterion. It uses the Metropolis criterion, derived from the cooling (annealing) of molten crystals. A new solution is accepted in two circumstances:
● If the new solution is better (as in iterative improvement)
● If it is worse and r < exp(δ/T), where:
○ r is a random number between 0 and 1
○ δ is the difference in evaluation between the old and new solutions (it should always be negative here: old − new for minimisation, new − old for maximisation)
○ T is a variable called the temperature, which reduces as the algorithm continues. This makes the process more likely to accept non-improving solutions at the beginning and less likely towards the end, usually by periodically multiplying T by a value in the range 0.9–0.99.
Tabu Search
Tabu search is a local search algorithm that uses a
"memory" of past moves to guide the search process. It is designed to
escape local optima by allowing the algorithm to move to a non-optimal solution
if it has not been visited in the recent past. This helps to prevent the
algorithm from getting stuck in a local optimum, and can also help to prevent
cycling. The algorithm maintains a list of "tabu" moves, which are
moves that are not allowed to be made for a certain number of iterations. This
list is updated at each iteration, and the algorithm tries to find a move that
is not tabu and that improves the solution. The use of a tabu list helps to
prevent the algorithm from getting stuck in a cycle and allows it to explore a
wider range of the search space.
An alternative approach to escape local optima is given by
the Tabu Search framework.
● Rather than picking any improving solution at each iteration, always pick the best neighbouring move (even if it is not an improvement).
● Remember which choices you made recently and do not unmake them (this stops you climbing straight back up).
This approach uses adaptive memory to keep a list of
recently altered components. These components are marked as tabu (or taboo
- as in forbidden) for a set number of iterations (the tabu tenure)
before they can be changed again.
The hope is that by restricting the search to new
solutions, the algorithm will be able to escape local optima and find better
solutions. Tabu search has been very successful in practice and is used in a
wide range of applications.
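A bare-bones tabu search in Python might look like the following; the bit-flip neighbourhood, the OneMax objective, and the tenure of 5 are illustrative choices:

```python
import random
from collections import deque

def tabu_search(start, objective, n_bits, iterations=100, tenure=5):
    """Maximise objective over bit strings, forbidding recently flipped bits."""
    current = list(start)
    best = current[:]
    tabu = deque(maxlen=tenure)  # positions flipped in the last `tenure` moves
    for _ in range(iterations):
        # Evaluate every non-tabu single-bit flip and take the best move,
        # even if it is worse than the current solution.
        candidates = []
        for i in range(n_bits):
            if i in tabu:
                continue
            neighbor = current[:]
            neighbor[i] = 1 - neighbor[i]
            candidates.append((objective(neighbor), i, neighbor))
        if not candidates:
            break
        value, move, neighbor = max(candidates, key=lambda c: c[0])
        current = neighbor
        tabu.append(move)  # forbid undoing this flip for `tenure` iterations
        if value > objective(best):
            best = current[:]
    return best

random.seed(0)
start = [random.randint(0, 1) for _ in range(10)]
best = tabu_search(start, sum, 10)
```

Because the best move is taken even when it worsens the solution, the search can walk down off a local peak, and the tabu list stops it from immediately undoing that move and climbing straight back up.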
