Friday, 27 January 2023

Escaping Local Optima

 


One of the challenges with heuristics is that they can get stuck in a local optimum: a solution that is the best within its immediate neighbourhood of the search space, but not necessarily the global optimum (the best possible solution).

One way to try to escape local optima is to use a technique called simulated annealing, which is inspired by the way that metals are cooled and solidified. The idea is that we allow the heuristic to "explore" and potentially move to worse solutions, but with a decreasing probability as the number of iterations increases. This can help the heuristic to escape local optima and potentially find the global optimum.

Local search

Local search is a type of heuristic search that focuses on finding a locally optimal solution rather than a globally optimal one: the algorithm looks for the best solution within a certain neighbourhood, rather than searching the entire space for the absolute best solution. The advantage of local search is that it can often find a good solution quickly; the disadvantage is that it can get stuck in a local optimum, a solution that is the best within its neighbourhood but not necessarily the best overall.

One way to try to escape local optima in local search is to use techniques such as simulated annealing or genetic algorithms, which allow the algorithm to make moves that may not be locally optimal in order to explore different areas of the search space and potentially find a better global solution.

In general, local search algorithms differ from exhaustive (or brute-force) search in that they examine only a small fraction of the candidate solutions rather than the entire search space.

This is done by starting at a solution and then making small, local changes to it in order to try to find a better solution. Local search algorithms often work by iteratively improving a solution, moving to a neighboring solution that is closer to the optimal solution. They can be seen as a form of heuristic, as they do not guarantee to find the global optimal solution, but they can often find good solutions quickly.

There are two main types of local search algorithms: construction algorithms and iterative improvement algorithms. Construction algorithms search the space of partial solutions, looking for a feasible one, while iterative improvement algorithms search the space of feasible solutions, looking for an improvement in the objective function.

Local search algorithms can be very effective for finding good solutions to hard optimization problems, but they can also get stuck in local optima, where further improvements are not possible. To escape local optima, local search algorithms can use various techniques such as randomization, perturbation, and memory.

This means that, for NP-hard problems, a local search does not guarantee the correct (optimal) answer, but it is designed to produce good answers most of the time.

There are a few ways that local search algorithms can escape from local optima, or suboptimal solutions:

1.       Random restarts: This involves starting the search from a randomly chosen initial solution and repeating the process multiple times. This can help the algorithm escape from local optima by giving it a chance to explore different parts of the search space.

2.       Perturbation: This involves intentionally introducing small changes or "perturbations" to the current solution in order to escape from local optima. This can involve randomly altering the values of some variables or swapping the values of two variables.

3.       Simulated annealing: This is a probabilistic technique that can be used to escape from local optima. It is inspired by the annealing process used in metallurgy, where a material is heated and then cooled slowly in order to reduce defects and increase its structural purity. In the context of local search, simulated annealing involves allowing the algorithm to make "bad" moves with a certain probability, which can help it escape from local optima and potentially find a global optimum.

4.       Genetic algorithms: This is a heuristic optimization method inspired by the process of natural evolution. It involves using principles of genetics and natural selection to evolve a population of solutions over time, with the goal of finding a global optimum. Genetic algorithms can help escape from local optima by maintaining a diverse population of solutions and using crossover and mutation operators to explore different parts of the search space.
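To make the first of these concrete, here is a minimal sketch of random restarts wrapped around a best-improvement hill climber, for a minimisation problem. The function names (hill_climb, random_restart) and the callback-style interface are illustrative choices, not taken from any particular library.

```python
import random

def hill_climb(solution, evaluate, neighbours):
    # Best-improvement iterative improvement: move to the best neighbour
    # for as long as doing so lowers the evaluation.
    while True:
        options = neighbours(solution)
        if not options:
            return solution
        best_n = min(options, key=evaluate)
        if evaluate(best_n) >= evaluate(solution):
            return solution  # local optimum: no neighbour is better
        solution = best_n

def random_restart(make_random_solution, evaluate, neighbours, restarts=10):
    # Run hill climbing from several random starting points and keep the
    # best local optimum found across all runs.
    best = hill_climb(make_random_solution(), evaluate, neighbours)
    for _ in range(restarts - 1):
        candidate = hill_climb(make_random_solution(), evaluate, neighbours)
        if evaluate(candidate) < evaluate(best):
            best = candidate
    return best
```

Each restart can land in a different basin of attraction, so the more restarts, the better the chance that one of them descends to the global optimum.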

A generic local search process:

     Start from an initial solution
     Make a small change to the current solution(s)
     Evaluate the new solution
     Accept or reject the new solution
     Repeat

Iterative improvement

Iterative improvement algorithms are local search algorithms that start with a solution and try to find a better solution by making small changes to the current one. They continue this process until they reach a local optimum, at which point they stop. A local optimum is a solution that is better than or equal to any of its neighbours (i.e. solutions that can be reached by making a single change to the current solution).

This is the simplest form of local search technique and is commonly found as a subroutine in other more complex heuristics.

It works by starting with a feasible solution, and then repeatedly making small changes to the solution in an attempt to improve it. The process is repeated until no further improvements can be made, at which point the algorithm returns the current solution as the answer.

The basic loop:

     Start from an initial solution
     Make a small change to the current solution
     Evaluate the new solution
     Accept the new solution only if it is better
     Repeat until no small change gives an improvement

Example: 2-opt heuristic for the TSP

The 2-opt heuristic for the travelling salesman problem iteratively improves the current tour by removing two edges and reconnecting the two resulting paths the other way round, which reverses one segment of the tour. This process is repeated until no such move gives an improvement. The algorithm is called 2-opt because each move involves exactly two edges ("opt" is short for "optimal"). The following is pseudocode for the 2-opt heuristic:

function two_opt(tour):
  best_tour = tour
  improved = True
  while improved:
    improved = False
    for i in range(len(best_tour) - 1):
      for j in range(i + 2, len(best_tour)):
        new_tour = two_opt_swap(best_tour, i, j)
        if tour_length(new_tour) < tour_length(best_tour):
          best_tour = new_tour
          improved = True
  return best_tour

The function two_opt_swap(tour, i, j) returns a new tour obtained by reversing the segment of tour between positions i+1 and j, which replaces the edges (i, i+1) and (j, j+1) with the edges (i, j) and (i+1, j+1). The function tour_length(tour) returns the total length of the tour.
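For completeness, here is one possible implementation of those two helpers, assuming each city in the tour is represented as an (x, y) coordinate pair. That representation is an assumption made for illustration, not part of the pseudocode above.

```python
import math

def two_opt_swap(tour, i, j):
    # Reversing the segment tour[i+1 .. j] replaces edges (i, i+1) and
    # (j, j+1) with edges (i, j) and (i+1, j+1).
    return tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]

def tour_length(tour):
    # Total Euclidean length of the closed tour (returns to the start city).
    return sum(math.dist(tour[k], tour[(k + 1) % len(tour)])
               for k in range(len(tour)))
```

On a four-city tour that crosses itself, a single swap removes the crossing and shortens the tour.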

The 2-opt heuristic is a simple and effective local search algorithm for the traveling salesman problem, but it can get stuck in local optima. The Lin-Kernighan algorithm is a more advanced local search algorithm that uses the 2-opt heuristic as a subroutine and can escape local optima by making multiple 2-opt moves at a time.

In the travelling salesman problem, a common approach is to take an initial solution (perhaps created by a greedy algorithm) and repeatedly apply such edge swaps until no further improvement can be made. This is called a 2-opt search.
Local and Global optimality

Local optimality is when a solution is the best among its neighboring solutions, while global optimality is when a solution is the best among all possible solutions. Local search algorithms tend to find local optimal solutions, while exhaustive search algorithms can find global optimal solutions. However, due to their exponential running time, exhaustive search algorithms are not practical for large problems.

When an iterative improvement algorithm can find no further improvement, it has reached a local optimum: a solution that cannot be improved any further by the small change the algorithm uses. However, this local optimum is not necessarily the global optimum, the best solution possible out of all the solutions we could have found. It is therefore possible to get stuck at a local optimum, unable to find a better solution even though one exists; this is known as being trapped in a local optimum.

To try and escape local optima and find the global optimum, we can use more complex heuristics such as genetic algorithms or simulated annealing. These techniques involve making larger changes to the current solution in order to explore more of the search space and potentially find a better solution.

 

Metaphors galore

There are many different metaphors that have been used to describe the concept of a local optimum. One common one (for minimisation) is being stuck in a shallow valley: every direction out of it leads upward, so there is no way to move down any further, even though a deeper valley may exist elsewhere. Another (for maximisation) is standing on top of a small hill and not being able to see any higher hills in the distance, so the current position is assumed to be the highest point. Regardless of the metaphor used, the idea is the same: a local optimum is a solution that cannot be improved upon by making small, local changes to it. This can be contrasted with a global optimum, which is the best solution possible out of all the possible solutions.

At this point we start to think about the shape of our problems. Linear programming problems have a favourable search landscape, because all local optima are global optima.

This means that if we find a solution that cannot be improved by making small changes, we know that it is the best possible solution.

Other problem types do not have such nice landscapes. For example, imagine a minimisation problem whose objective function looks like a landscape of many valleys separated by hills. In this case, our algorithm may get stuck in a shallow valley, even though a deeper valley lies nearby.

There are many ways to try and escape these local optima, and these methods are collectively known as metaheuristics. Some examples of metaheuristics include simulated annealing, tabu search, and genetic algorithms. These methods try to find better solutions by making larger changes to the current solution, or by trying out many different solutions simultaneously.

General optimisation problems aren’t like this

They can have multiple local optima, and at most one of these is the global optimum. In these cases, iterative improvement algorithms might get stuck at a local optimum, unable to find the global one. This is called getting trapped in a local optimum.

     Imagine you want to climb to the top of a perfect cone. All you have to do is keep heading upwards and you’re guaranteed to get there.

     Now imagine you want to climb to the top of a mountain. There are many different paths you can take, and you might find yourself at a local peak where you cannot climb any higher. However, if you keep exploring, you may find a path that leads you to the global peak, or the highest point on the mountain.

This metaphor illustrates the difference between local and global optima. In an optimisation problem, a local optimum is a solution that cannot be improved upon by making small changes to the solution, while a global optimum is the best possible solution to the problem. In a problem with a favourable search landscape, all local optima are also global optima, but in a problem with a more complex search landscape, there may be multiple local optima, and the global optimum may not be a local optimum.

 

     Does this work with a real hill (iterative improvement algorithms are often called ‘hill climbers’)?

No, a real hill usually has many local peaks or optima, and a hill climber may get stuck at a local peak that is not the global peak. To reach the global peak, the hill climber may need to take a step down and then climb up again, which is not possible with an iterative improvement algorithm.

Escaping local optima

There are many ways to try and escape local optima when using iterative improvement algorithms. One common method is called random restarts, where the algorithm is run multiple times with different random starting points and the best solution found is returned. Another method is called simulated annealing, where the algorithm is allowed to accept worse solutions with a certain probability in order to try and escape local optima. There are also methods that use more complex search patterns, such as genetic algorithms, which mimic the process of natural evolution to try and find good solutions.

So having established that our hill climbing (or steepest descent, if we’re minimising) algorithm may leave us stuck on top of a traffic cone just outside Bangor, when trying to find the top of Snowdon, we need to think about how we escape local optima.

There are several ways to escape local optima in a search algorithm:

1.       Restarting the search from a different starting point can sometimes help, as it may allow the algorithm to find a different path to the global optimum.

2.       Using a more complex search algorithm, such as simulated annealing or evolutionary algorithms, which allow for the possibility of "jumping" out of a local optimum and exploring other areas of the search space.

3.       Introducing randomness or noise into the search process, which can sometimes help the algorithm escape local optima and explore other areas of the search space.

4.       Using a multi-start or iterated local search approach, which involves running the search algorithm multiple times from different starting points and keeping the best result found.

Good local search algorithms need to be able to accept non-improving solutions (take a step back) to have a real chance of finding a true optimum.

One way to do this is to introduce randomness, or noise, into the search process. This can be as simple as adding a probability of accepting a non-improving solution, or by randomly perturbing the current solution and accepting it with a certain probability. This is called simulated annealing, as it is inspired by the annealing process used in metallurgy to purify and strengthen metals.

Another way to escape local optima is by using a multi-start method, where the algorithm is run multiple times from different randomly generated starting points and the best solution found is returned. This can also be combined with simulated annealing to further improve the chances of finding the global optimum.

Finally, evolutionary algorithms such as genetic algorithms use the concept of evolution and natural selection to escape local optima. These algorithms work by generating a population of candidate solutions and iteratively applying genetic operators such as crossover and mutation to generate new solutions. The best solutions are then selected to form the next generation, and the process is repeated until a satisfactory solution is found.
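As a sketch of that idea, the following minimal genetic algorithm evolves bitstrings to maximise a caller-supplied fitness function, using tournament selection, one-point crossover, and bit-flip mutation. All names and parameter values here are illustrative defaults; real GA implementations add refinements such as elitism.

```python
import random

def genetic_algorithm(fitness, length=20, pop_size=30, generations=60,
                      mutation_rate=0.02):
    # Start from a random population of bitstrings.
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def tournament():
        # Pick two individuals at random; the fitter one gets to reproduce.
        a, b = random.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = random.randrange(1, length)           # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if random.random() < mutation_rate else bit
                     for bit in child]                  # bit-flip mutation
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)
```

For example, with fitness=sum (the "OneMax" problem), the population quickly converges towards the all-ones string.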

A good example of an algorithm that does this is Simulated Annealing.

Simulated Annealing is a probabilistic technique used to find the global optimum of a function. It is an iterative method that starts with a randomly generated solution and iteratively makes small changes to it, in order to try to improve the solution. The algorithm accepts non-improving solutions with a certain probability, in order to avoid getting stuck in local optima. This probability is reduced over time, according to a schedule, hence the name "Simulated Annealing", which is inspired by the annealing process used in metallurgy to purify metals.

Here is an example of Simulated Annealing in Python:

import math
import random

def simulated_annealing(initial_solution, cost_function, neighbors_function, temperature_function):
  current_solution = initial_solution
  current_cost = cost_function(current_solution)
  for t in temperature_function:
    next_solution = random.choice(neighbors_function(current_solution))
    next_cost = cost_function(next_solution)
    delta_cost = next_cost - current_cost
    if delta_cost < 0:
      # Improvement (lower cost): always accept
      current_solution = next_solution
      current_cost = next_cost
    else:
      # Worse or equal: accept with probability exp(-delta/t)
      probability = math.exp(-delta_cost / t)
      if random.random() < probability:
        current_solution = next_solution
        current_cost = next_cost
  return current_solution

This algorithm takes four arguments:

     initial_solution: the solution the search starts from (for example, one generated at random or by a greedy heuristic).

     cost_function: a function that takes a solution and returns its cost.

     neighbors_function: a function that takes a solution and returns a list of its neighbors (i.e., solutions that can be obtained by making small changes to it).

     temperature_function: an iterable (for example, a generator) yielding the sequence of temperatures used to control the probability of accepting non-improving solutions.

The algorithm iteratively generates a new solution from the current solution's neighbors, and accepts it as the new current solution if its cost is lower, or with a certain probability if its cost is higher. The probability of accepting a higher-cost solution is controlled by the current temperature, which is reduced over time according to the temperature function. This way, the algorithm initially has a higher chance of accepting non-improving solutions, in order to explore the search space more widely, and then reduces this probability over time, in order to converge to the global optimum.
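For example, the temperature_function could be a simple geometric cooling schedule. The generator below is an illustrative sketch; the starting temperature, cooling factor and number of steps would all need tuning for a real problem.

```python
def geometric_cooling(t_start=100.0, alpha=0.95, steps=200):
    # Yield a geometrically decreasing sequence of temperatures:
    # t_start, t_start * alpha, t_start * alpha**2, ...
    t = t_start
    for _ in range(steps):
        yield t
        t *= alpha
```

Because it is an iterable, it can be passed directly as the temperature_function argument above.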

Simulated Annealing (SA)

The SA loop:

     Start from an initial solution
     Make a small change to the current solution
     Evaluate the new solution
     Accept the new solution if better - or with some probability if worse
     Make that probability depend on how much worse the new solution is
     Reduce this probability as the algorithm goes on (cooling)

The SA algorithm

Simulated Annealing (SA) is a local search algorithm that is used to find a good approximate solution to a problem that is difficult to solve exactly. It is based on the idea of annealing in metallurgy, where a material is heated and then cooled slowly in order to reduce defects and increase its structural purity.

The algorithm works by simulating the process of annealing in a material by starting at a high temperature and gradually decreasing it over time. At high temperatures, the algorithm is more likely to accept non-improving solutions, which allows it to explore a larger portion of the search space and potentially escape local optima. As the temperature decreases, the algorithm becomes more selective and is less likely to accept non-improving solutions, which helps it to converge on a better solution.

Here is a simple example of how the SA algorithm might be implemented in Python:

import math
import random

def simulated_annealing(problem, max_iterations, temperature_function):
  current_solution = problem.get_initial_solution()
  best_solution = current_solution

  for i in range(max_iterations):
    temperature = temperature_function(i)

    next_solution = problem.get_random_neighbor(current_solution)
    delta_e = problem.get_value(next_solution) - problem.get_value(current_solution)

    if delta_e > 0:
      current_solution = next_solution
      if problem.get_value(current_solution) > problem.get_value(best_solution):
        best_solution = current_solution
    else:
      probability = math.exp(delta_e / temperature)
      if random.random() < probability:
        current_solution = next_solution

  return best_solution

This implementation of the SA algorithm takes as input a problem object that represents the problem to be solved, the maximum number of iterations to run the algorithm for, and a temperature_function that returns the temperature at each iteration. The problem object should have methods for getting an initial solution, generating a random neighbor of a given solution, and computing the value of a solution.

The algorithm starts by getting an initial solution and setting the best solution found so far to this initial solution. It then iterates for a given number of times, getting the temperature at each iteration using the temperature function and generating a random neighbor of the current solution. If the value of the neighbor is greater than the value of the current solution, the current solution is updated to be the neighbor and the best solution is updated if the current solution is better than the best solution found so far. If the value of the neighbor is not greater than the value of the current solution, the current solution is still updated to be the neighbor with a probability equal to exp(delta_e / temperature), where delta_e is the difference in value between the neighbor and the current solution. This probability allows the algorithm to occasionally accept non-improving solutions, especially at high temperatures. After the maximum number of iterations has been reached, the algorithm returns the best solution found.

Simulated annealing works just like iterative improvement, but with a modification to the acceptance criterion. It uses the metropolis criterion, derived from the cooling (annealing) of molten crystals. A new solution is accepted in two circumstances

     If the new solution is better (like iterative improvement)

     If it is worse and r < exp(δ/T)

     r is a random number between 0 and 1

     δ is the difference in evaluation between the old and new solutions (negative when the new solution is worse: old − new for minimisation, new − old for maximisation).

     T is a variable called the temperature, which reduces as the algorithm continues - this makes the process more likely to accept non-improving solutions at the beginning and less likely towards the end. T is usually reduced by periodically multiplying it by a value in the range 0.9 - 0.99.
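The metropolis criterion above can be written as a small helper function; `accept` is an illustrative name, and it assumes T > 0.

```python
import math
import random

def accept(old_value, new_value, temperature, minimising=True):
    # Metropolis criterion: always accept an improvement; accept a worse
    # solution with probability exp(delta / T), where delta < 0 measures
    # how much worse the new solution is.
    delta = (old_value - new_value) if minimising else (new_value - old_value)
    if delta >= 0:
        return True  # new solution is at least as good
    return random.random() < math.exp(delta / temperature)
```

At high T the exponent is close to 0, so exp(δ/T) is close to 1 and worse moves are usually accepted; as T falls, the acceptance probability shrinks towards 0.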

Tabu Search

Tabu search is a local search algorithm that uses a "memory" of past moves to guide the search process. It is designed to escape local optima by allowing the algorithm to move to a worse solution, provided the move has not been made in the recent past. The algorithm maintains a "tabu" list of moves that may not be made again for a certain number of iterations; this list is updated at each iteration, and the algorithm always applies the best move that is not tabu. The tabu list prevents the search from cycling straight back into the local optimum it has just left, and lets it explore a wider range of the search space.

An alternative approach to escape local optima is given by the Tabu Search framework.

     Rather than picking any improving solution at each iteration, always pick the best (even if it is not an improvement).

     Remember which choices you made recently and do not unmake them (stops you climbing straight back up)

The basic loop: at each iteration, evaluate the neighbouring solutions of the current one and move to the best neighbour reachable by a non-tabu move, even if it is worse than the current solution. The move just made (or its reverse) is then marked tabu for a set number of iterations, and the best solution seen so far is recorded separately. This repeats until an iteration limit is reached.

This approach uses adaptive memory to keep a list of recently altered components. These components are marked as tabu (or taboo - as in forbidden) for a set number of iterations (the tabu tenure) before they can be changed again.

The hope is that by restricting the search to new solutions, the algorithm will be able to escape local optima and find better solutions. Tabu search has been very successful in practice and is used in a wide range of applications.
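A minimal sketch of this idea, for a minimisation problem. For simplicity this version marks recently visited solutions (rather than attributes of moves) as tabu, which is a common teaching simplification; the names and interface are illustrative, and solutions are assumed to be hashable.

```python
def tabu_search(initial, evaluate, neighbours, tenure=5, iterations=50):
    current = initial
    best = initial
    tabu = {initial: tenure}   # solution -> remaining tabu iterations
    for _ in range(iterations):
        candidates = [n for n in neighbours(current) if n not in tabu]
        if not candidates:
            break              # every neighbour is tabu: stop
        # Always move to the best non-tabu neighbour, even if it is worse.
        current = min(candidates, key=evaluate)
        # Age the tabu list and mark the new solution tabu.
        tabu = {s: k - 1 for s, k in tabu.items() if k > 1}
        tabu[current] = tenure
        if evaluate(current) < evaluate(best):
            best = current
    return best
```

Because the search cannot immediately return to where it just was, it is forced over the "hill" next to a local optimum instead of sliding straight back down into it.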
