Friday, 27 January 2023

Heuristics


 


Heuristics are problem-solving techniques that do not guarantee an optimal solution but often produce good enough solutions in a reasonable amount of time. They can be used when an exact solution is not necessary, or when it is not possible to solve the problem exactly within a reasonable amount of time. Some common types of heuristics include genetic algorithms, simulated annealing, and greedy algorithms.

What happens if the problem’s just too big?

If a problem is too big to be solved exactly, or if we just don’t have the time to wait for an exact solution, we can use heuristics to find a solution that is likely to be close to the optimal one. Heuristics are techniques that have no guarantee of finding the optimal solution, but which work well in practice. Some common heuristics include:

     Genetic algorithms: inspired by natural evolution, these algorithms involve generating a population of candidate solutions, evaluating their quality, and then selecting the best ones to create the next generation of solutions.

     Simulated annealing: this heuristic involves starting with a randomly generated solution and then gradually refining it by making small changes and accepting or rejecting them based on a probability that decreases over time.

     Greedy algorithms: as we discussed earlier, these algorithms involve making the locally optimal choice at each step and hoping that these choices lead to a globally optimal solution.

A problem may be:

     Too big for brute force (likely)

     Too big or hard to transform into Integer Programming (less likely but still common)

     Too big for SAT solvers? Constraint programming? (not covered in this module but worth reading about)

In short, too big for even the most powerful exact solvers to find the optimum solution in an acceptable amount of time. In these cases, heuristics come to the rescue. Heuristics are like algorithms, but they are not guaranteed to find the optimal solution; they are used to find good solutions quickly. They can be very effective in practice, but it is important to be aware of their limitations.

What we often find ourselves doing is writing algorithms that look to find a solution efficiently but can’t guarantee to return the true optimum. These processes are called heuristics.

Some common heuristics include:

1. Genetic algorithms: These algorithms use principles of natural selection and genetics to iteratively improve a solution.

2. Simulated annealing: This algorithm involves slowly adjusting a solution to find the global optimum, similar to the way a metal cools and solidifies.

3. Ant colony optimization: This algorithm involves simulating the behavior of ants as they search for food and use pheromone trails to guide each other to the best solution.

4. Hill climbing: This is a simple heuristic that involves starting with a random solution and iteratively improving it by making small changes until no further improvement can be found.

5. Randomized search: This involves randomly generating and evaluating potential solutions until a satisfactory one is found.

While heuristics cannot guarantee to find the optimal solution, they can often find good solutions quickly, and are often used in practice when faced with large or complex optimization problems.

An example

Let’s take the vertex cover problem, since we looked at it last week. We know we almost certainly can’t solve this in polytime, but how about we try something that might get us a reasonable solution most of the time.

     Select the node with the largest number of incident edges

     Remove those edges from the graph

     Repeat until there are no more edges left

Stated step by step, the heuristic is:

Start with an empty vertex cover (no vertices selected)

Select the vertex with the largest number of incident edges and add it to the vertex cover

Remove that vertex and all of its incident edges from the graph

Repeat the previous two steps until no edges remain

This heuristic doesn't always produce the optimal vertex cover, but it often returns a solution that is close to the optimal solution and runs in polynomial time.
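The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the graph is represented as a simple edge list and the function name is my own.

```python
# Greedy construction heuristic for vertex cover, as described above.
# Each edge is a pair of vertex labels.

def greedy_vertex_cover(edges):
    cover = set()
    remaining = set(edges)                  # edges not yet covered
    while remaining:
        # Count how many uncovered edges each vertex touches
        degree = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        # Pick the vertex incident to the most uncovered edges
        best = max(degree, key=degree.get)
        cover.add(best)
        # Remove every edge the chosen vertex covers
        remaining = {e for e in remaining if best not in e}
    return cover

# A star: vertex 0 joined to 1..4 — greedy picks 0, the optimal cover
print(greedy_vertex_cover([(0, 1), (0, 2), (0, 3), (0, 4)]))  # {0}
```

On a star graph the first pick already covers everything, so the loop runs once; on denser graphs it repeats until the edge set is empty.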

Can you find a simple instance where this works successfully?

Yes: take a star graph, with one centre vertex joined to several leaves. The centre has the highest degree, and selecting it covers every edge, so the heuristic returns the optimal cover of size 1. (Of course, since vertex cover is NP-complete, we cannot expect any polynomial-time heuristic to succeed on every instance.)

More importantly, can you find one where it doesn’t?

First, for contrast, consider the following graph, a four-cycle with edges 1-2, 1-3, 2-4 and 3-4:

       1

    2     3

       4

The optimal vertex cover has size 2, and the greedy heuristic finds it: every vertex has degree 2, so it might pick vertex 1, removing edges 1-2 and 1-3, and then vertex 4, covering the rest. The heuristic succeeds here.

On the other hand, consider a path on five vertices, with edges 1-2, 2-3, 3-4 and 4-5:

       1

       |

       2

       |

       3

       |

       4

       |

       5

The optimal vertex cover has size 2: vertices 2 and 4 cover all four edges. But vertices 2, 3 and 4 all have the maximum degree of 2, so the heuristic may break the tie by selecting the middle vertex 3 first. That leaves edges 1-2 and 4-5, which share no vertex, so two further selections are needed, giving a cover of size 3. The highest-degree heuristic is therefore not always optimal.

Greedy heuristics

A greedy heuristic is an algorithm that makes the locally optimal choice at each step, hoping that these choices will lead to a globally optimal solution. In other words, it makes the most "appealing" choice at each step, without worrying about the consequences of future choices.

For example, a greedy algorithm for the vertex cover problem might choose the vertex with the most incident edges at each step, hoping that this will lead to the smallest possible vertex cover. This does not always produce the optimal solution, which is why it is a heuristic rather than an exact algorithm.

The previous example is another greedy algorithm - but in this case it does not necessarily find the global optimum.

Greedy algorithms are quite common in heuristic design, as they are simple to write and can often be improved using fairly standard strategies (next week)

Another example for vertex cover might be:

     Start with all nodes in the vertex cover

     Remove the node with the smallest number of incident edges

     Check to see if the cover is still valid, if not, replace node and mark as ‘unremovable’

     Repeat until there are no more removable nodes
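A sketch of this drop heuristic in Python, interpreting the removal rule as "fewest incident edges first"; the representation and names are illustrative:

```python
def drop_heuristic(edges, vertices):
    """Start with every vertex in the cover and greedily try to drop vertices."""
    cover = set(vertices)
    frozen = set()                        # vertices marked 'unremovable'

    def is_cover(c):
        return all(u in c or v in c for u, v in edges)

    while cover - frozen:
        # Try to drop the candidate with the fewest incident edges first
        node = min(cover - frozen, key=lambda v: sum(v in e for e in edges))
        cover.discard(node)
        if not is_cover(cover):
            cover.add(node)               # cover became invalid: put it back
        frozen.add(node)                  # either way, never try this node again
    return cover
```

On the path 1-2-3-4 it first drops the degree-1 endpoints, then finds that neither 2 nor 3 can go, returning the optimal cover {2, 3}.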

 

Try both heuristics on these

You could use the integer program from last week to check if the results are optimal.

Construction and perturbation

Construction heuristics involve building up a solution incrementally, one piece at a time. This can be done either in a greedy manner (selecting the best option at each step) or by following a specific ordering of the choices. Perturbation heuristics involve starting with an initial solution and then making small changes to try to improve it. This can be done either randomly or by following a specific set of rules. Both techniques can be effective for finding approximate solutions to optimization problems, but neither can guarantee finding the true optimal solution.

There are essentially two kinds of heuristic:

     Construction search, which starts with an empty solution and adds elements to it until a complete, feasible solution is found. A common example is the nearest neighbour algorithm for the travelling salesman problem, where at each step the closest unvisited city is added to the tour.

     Perturbation search (also called local search), which starts with a feasible solution and makes small local changes to it in an attempt to improve it, until no more improvements can be found. An example is the 2-opt algorithm for the travelling salesman problem, which repeatedly swaps pairs of edges in the tour to try to find a shorter route.

The greedy heuristics for vertex cover were construction search, while the drop algorithm is a (very simple) form of perturbation search.

Construction heuristics work by building up a solution incrementally, usually by adding one element at a time. They often work by selecting the most promising element to add next according to some criterion. Perturbation heuristics, on the other hand, work by starting with an existing solution and then making small changes to it in the hope of finding a better solution. This process is repeated until no further improvements can be found.

Generic algorithms

There are also generic algorithms that can be applied to a wide range of problems, such as genetic algorithms and simulated annealing. These algorithms work by starting with a random initial solution and then iteratively making small changes to it in order to try to improve upon it. The goal is to try to find a good solution in a reasonable amount of time, even if it is not guaranteed to be the optimal solution. These algorithms are often used in practice to solve hard optimization problems where no other efficient solution is known.

Construction search:

     Start with an empty solution

     Select an element to add, according to some criterion

     Is the solution feasible yet?

     If so, stop

     Else repeat

Construction search involves starting with an empty solution and adding elements to it iteratively until we have a valid solution. This can be done in a greedy manner (choosing the best option at each step) or using a more complex procedure. An example of a construction heuristic for the vertex cover problem is the greedy algorithm which always adds the vertex with the most incident edges to the cover. This is not guaranteed to find the optimal solution, but often performs well in practice.

Perturbation search (iterative improvement):

     Start with a feasible solution (perhaps found by a construction algorithm)

     Make a small change to the current solution

     Is it better?

     If so, keep the new solution

     Else change back

     Repeat until no more improvements can be found

In perturbation search, we start with an initial solution and then repeatedly make small changes to it in an attempt to improve it. The changes made to the solution are called "perturbations." This process is repeated until no further improvements can be found. Perturbation search is often combined with construction: the initial solution is obtained through a construction heuristic and then improved through iterative perturbations.

Construction algorithms search the space of partial solutions, looking for a feasible one, while iterative improvement algorithms search the space of feasible solutions, looking for an improvement in objective function value.
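The perturbation loop above can be written generically. The `perturb` and `cost` functions and the stopping rule (give up after a run of failed changes) are my own illustrative choices:

```python
import random

def iterative_improvement(initial, perturb, cost, max_fails=200):
    """Keep a perturbed solution only if it improves the cost."""
    current = initial
    fails = 0
    while fails < max_fails:              # stop after many failed perturbations
        candidate = perturb(current)
        if cost(candidate) < cost(current):
            current, fails = candidate, 0  # keep the new solution
        else:
            fails += 1                     # "change back": discard the candidate
    return current

# Toy usage: walk an integer towards the minimum of (x - 3)^2
random.seed(0)                             # seed for reproducibility
best = iterative_improvement(10, lambda x: x + random.choice([-1, 1]),
                             lambda x: (x - 3) ** 2)
print(best)  # 3
```

Any problem-specific move (dropping a vertex, swapping two tour edges) can be plugged in as `perturb`, with the objective as `cost`.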

Practice

As with many things in life, practice is essential when learning to do something new. Try to produce a heuristic for each of the following problems:

     Set cover

     Travelling salesman

     Subset sum

     3-SAT

Set Cover:

     A greedy heuristic for the set cover problem could be to repeatedly choose the set that covers the most uncovered elements and add it to the solution, until all elements are covered. This heuristic is not guaranteed to find the optimal solution, but it often performs well in practice.
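That greedy rule might be sketched as follows (subsets given as a list of Python sets; names are illustrative):

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset covering the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("these subsets cannot cover the universe")
        chosen.append(best)
        uncovered -= best                 # mark the new elements as covered
    return chosen

# Covers {1..5} with two subsets: {1,2,3} first, then {4,5}
print(greedy_set_cover({1, 2, 3, 4, 5}, [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]))
```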

Travelling Salesman:

     A common heuristic for the travelling salesman problem is the nearest neighbor algorithm, which involves starting at a random city and choosing the nearest unvisited city as the next destination until all cities have been visited. The solution is then completed by returning to the starting city. This heuristic is not guaranteed to find the optimal solution, but it often performs well in practice.
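A sketch of nearest neighbour, assuming the instance is given as a symmetric distance matrix:

```python
def nearest_neighbour_tour(dist, start=0):
    """dist[i][j] is the distance between cities i and j."""
    n = len(dist)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        here = tour[-1]
        # Greedily visit the closest city we haven't seen yet
        nxt = min(unvisited, key=lambda city: dist[here][city])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour  # the tour closes by returning from the last city to `start`
```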

Subset Sum:

     A simple heuristic for the subset sum problem could be to sort the elements in the set in non-decreasing order and then iteratively add the largest remaining element to the solution if it does not exceed the target sum. This heuristic is not guaranteed to find the optimal solution, but it often performs well in practice.
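In code, that greedy rule is just a sort and a single pass (names are illustrative):

```python
def greedy_subset_sum(numbers, target):
    """Sort in non-increasing order; take each element that still fits."""
    chosen, total = [], 0
    for x in sorted(numbers, reverse=True):
        if total + x <= target:           # only add elements that keep us <= target
            chosen.append(x)
            total += x
    return chosen, total

# Target 11: takes 8, skips 6 and 5 (too big), takes 3, skips 1
print(greedy_subset_sum([8, 6, 5, 3, 1], 11))  # ([8, 3], 11)
```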

3-SAT:

     A heuristic for 3-SAT could be to start from a random truth assignment and then repeatedly pick an unsatisfied clause and flip the value of one of its variables, chosen greedily or at random, as in local search algorithms such as WalkSAT. This is not guaranteed to find a satisfying assignment, but it often performs well in practice.
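A much-simplified random-flip search in that spirit (the clause encoding and names are my own: a clause is a tuple of non-zero integers, and a positive literal v means variable v is true):

```python
import random

def sat_local_search(clauses, n_vars, max_flips=1000):
    """Random-flip local search for SAT; returns an assignment or None."""
    assignment = {v: random.choice([True, False]) for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any((lit > 0) == assignment[abs(lit)] for lit in clause)

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assignment             # every clause is satisfied
        # Flip a random variable from a random unsatisfied clause
        lit = random.choice(random.choice(unsat))
        assignment[abs(lit)] = not assignment[abs(lit)]
    return None                           # give up after max_flips
```

Real solvers such as WalkSAT refine the flip choice (e.g. preferring flips that break the fewest other clauses), but the skeleton is the same.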

When you’re done, read up on known perturbation heuristics such as 2-opt and the Lin-Kernighan algorithm for the TSP.

For the travelling salesman problem, one heuristic is called 2-opt. This heuristic works by repeatedly removing two edges from the tour and reconnecting the two resulting paths the other way round, which reverses the segment of the tour between them. The swap is kept whenever it shortens the tour, and the process is repeated until no improving pair of edges can be found.

Another heuristic for the TSP is the Lin-Kernighan algorithm. This algorithm works by iteratively replacing a sequence of edges in the current tour with a shorter alternative route. This process is repeated until no further improvements can be found.
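A compact 2-opt sketch, again assuming a distance-matrix representation; this is segment reversal, the standard way to realise the two-edge swap:

```python
def tour_length(tour, dist):
    """Total length of a closed tour under distance matrix dist."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt(tour, dist):
    """Reverse segments of the tour while doing so makes it shorter."""
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                # Remove two edges and reconnect with the segment reversed
                candidate = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(candidate, dist) < tour_length(tour, dist):
                    tour, improved = candidate, True
    return tour
```

Recomputing the full tour length for every candidate is wasteful; practical implementations compare only the two removed and two added edges, but the simple version makes the move itself clear.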
