Heuristics
Heuristics are techniques for solving problems that do not guarantee an optimal solution, but often produce good-enough solutions in a reasonable amount of time. Heuristics can be used when an exact solution is not necessary, or when it is not possible to solve the problem exactly within a reasonable amount of time. Some common types of heuristics include genetic algorithms, simulated annealing, and greedy algorithms.
What happens if the problem’s just too big?
If a problem is too big to be solved exactly, or if we simply cannot wait for an exact solution, we can use heuristics to find a solution that is likely to be close to optimal. Heuristics offer no guarantee of finding the optimum, but they work well in practice. Some common heuristics include:
● Genetic algorithms: inspired by natural
evolution, these algorithms involve generating a population of candidate
solutions, evaluating their quality, and then selecting the best ones to create
the next generation of solutions.
● Simulated annealing: this heuristic involves
starting with a randomly generated solution and then gradually refining it by
making small changes and accepting or rejecting them based on a probability
that decreases over time.
● Greedy algorithms: as we discussed earlier,
these algorithms involve making the locally optimal choice at each step and
hoping that these choices lead to a globally optimal solution.
● Too big for brute force (likely)
● Too big for even the most powerful solvers
● Too big or hard to transform into Integer Programming (less likely but still common)
● Too big for a heuristic method to find the optimum solution in an acceptable amount of time
● Too big for SAT solvers? Constraint programming? (not covered in this module but worth reading about)
In these cases, heuristics come to the rescue. A heuristic is like an algorithm, but it is not guaranteed to find the optimal solution; instead it aims to find a good solution quickly. Heuristics are often used in practice when the exact solution is not required, or when the problem is too large to solve exactly in a reasonable amount of time, but it is important to be aware of their limitations. Some common heuristics include simulated annealing, genetic algorithms, and local search.
What we often find ourselves doing is writing algorithms that try to find a solution efficiently but cannot guarantee to return the true optimum. These processes are called heuristics.
Some common heuristics include:
1. Genetic algorithms: These algorithms use principles of natural selection and genetics to iteratively improve a solution.
2. Simulated annealing: This algorithm involves slowly adjusting a solution to find the global optimum, similar to the way a metal cools and solidifies.
3. Ant colony optimization: This algorithm involves simulating the behavior of ants as they search for food and use pheromone trails to guide each other to the best solution.
4. Hill climbing: This is a simple heuristic that involves starting with a random solution and iteratively improving it by making small changes until no further improvement can be found.
5. Randomized search: This involves randomly generating and evaluating potential solutions until a satisfactory one is found.
While heuristics cannot guarantee to find the optimal
solution, they can often find good solutions quickly, and are often used in
practice when faced with large or complex optimization problems.
An example
Let’s take the vertex cover problem, since we looked at it
last week. We know we almost certainly can’t solve this in polytime, but how
about we try something that might get us a reasonable solution most of the
time.
● Select the node with the largest number of incident edges
● Remove those edges from the graph
● Repeat until there are no more edges left
One heuristic for the vertex cover problem is the following:
1. Start with an empty vertex cover (no vertices selected)
2. Select the vertex with the largest number of neighbors and add it to the vertex cover
3. Remove the selected vertex and its incident edges from the graph
4. Repeat steps 2 and 3 until no edges remain
This heuristic doesn't always produce the optimal vertex cover, but it often returns a solution that is close to the optimal solution and runs in polynomial time.
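The greedy procedure above can be sketched in Python; the edge-list graph representation and the function name are illustrative choices:

```python
# A sketch of the max-degree greedy heuristic for vertex cover.
# The graph is given as a list of edges (pairs of vertex labels).

def greedy_vertex_cover(edges):
    """Repeatedly pick the vertex covering the most uncovered edges."""
    uncovered = {frozenset(e) for e in edges}
    cover = set()
    while uncovered:
        # Count how many uncovered edges each vertex touches.
        degree = {}
        for e in uncovered:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        best = max(degree, key=degree.get)
        cover.add(best)
        uncovered = {e for e in uncovered if best not in e}
    return cover

# The 4-cycle with edges 1-2, 1-3, 2-4, 3-4: greedy finds a
# size-2 cover, which is optimal for this graph.
cover = greedy_vertex_cover([(1, 2), (1, 3), (2, 4), (3, 4)])
print(cover)
```

Whichever vertex the tie-break picks first on this 4-cycle, the second pick always covers the two remaining edges, so the cover has size 2.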
Can you find a simple instance where this works
successfully?
Yes: on many instances the greedy heuristic produces an optimal cover. Because the vertex cover problem is NP-complete, we cannot expect any polynomial-time heuristic to be optimal on every instance, but this one often finds good approximate solutions in a reasonable amount of time.
More importantly, can you find one where it doesn’t?
For example, consider the four-cycle with edges 1-2, 1-3, 2-4 and 3-4:

    1
   / \
  2   3
   \ /
    4

The optimal vertex cover has size 2, and it can be achieved by selecting the vertices 1 and 4; the greedy heuristic also finds a cover of size 2 here.
On the other hand, consider the "spider" graph in which a centre c is joined to a1, a2 and a3, and each ai is joined to a leaf bi:

    b1  b2  b3
     |   |   |
    a1  a2  a3
      \  |  /
         c

The optimal vertex cover is {a1, a2, a3}, which has size 3. The greedy heuristic, however, first selects c (the unique vertex of highest degree), which covers only the three centre edges; it must then pick one endpoint of each remaining edge ai-bi, giving a cover of size 4.
Greedy heuristics
A greedy heuristic is an algorithm that makes the locally
optimal choice at each step, hoping that these choices will lead to a globally
optimal solution. In other words, it makes the most "appealing"
choice at each step, without worrying about the consequences of future choices.
For example, a greedy algorithm for the vertex cover problem might choose the vertex with the most incident edges at each step, hoping that this will lead to the smallest possible vertex cover. This does not always yield the optimal solution, so it is a heuristic rather than a guaranteed method.
The previous example is another greedy algorithm - but in
this case it does not necessarily find the global optimum.
Greedy algorithms are quite common in heuristic design, as
they are simple to write and can often be improved using fairly standard
strategies (next week)
Another example for vertex cover might be:
● Start with all nodes in the vertex cover
● Remove the node with the smallest number of covered, incident edges
○ Check to see if the cover is still valid; if not, replace the node and mark it as 'unremovable'
● Repeat until there are no more removable nodes
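This drop heuristic can be sketched in Python; the edge-list representation and function name are illustrative choices:

```python
# A sketch of the "drop" heuristic: start with every vertex in the cover
# and try to remove vertices one at a time, smallest degree first.

def drop_vertex_cover(edges):
    vertices = {v for e in edges for v in e}
    cover = set(vertices)
    unremovable = set()
    while True:
        candidates = cover - unremovable
        if not candidates:
            return cover
        # Try to drop the candidate with the fewest incident edges.
        v = min(candidates, key=lambda u: sum(u in e for e in map(set, edges)))
        cover.discard(v)
        # The cover stays valid iff every edge still has an endpoint in it.
        if all(a in cover or b in cover for a, b in edges):
            continue
        cover.add(v)          # invalid: put the vertex back
        unremovable.add(v)    # and never try it again

# On the path 1-2-3-4 this finds a cover of size 2, which is optimal here.
print(sorted(drop_vertex_cover([(1, 2), (2, 3), (3, 4)])))
```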
Try both heuristics on these
You could use the integer program from last week to check if
the results are optimal
Construction and perturbation
Construction heuristics involve building up a solution
incrementally, one piece at a time. This can be done either in a greedy manner
(selecting the best option at each step) or by following a specific ordering of
the choices. Perturbation heuristics involve starting with an initial solution
and then making small changes to it to try and improve upon it. This can be done
either randomly or by following a specific set of rules. Both of these
techniques can be effective for finding approximate solutions to optimization
problems, but they cannot guarantee to find the true optimal solution.
There are essentially two kinds of heuristic, both of which
fall under the local search category.
● Construction search starts with an empty solution and adds elements to it until a complete, feasible solution is found. A common example of this is the nearest neighbour algorithm for the travelling salesman problem, where at each step the next closest city is added to the tour.
● Perturbation search starts with a feasible solution and makes small local changes to it in an attempt to improve it, until no more improvements can be found. An example of this is the 2-opt algorithm for the travelling salesman problem, which repeatedly swaps pairs of edges in the tour to try to find a shorter route.
The greedy 'add' heuristics for vertex cover were construction search, while the drop algorithm is a (very simple) form of perturbation search.
Construction heuristics work by building up a solution incrementally,
usually by adding one element at a time. They often work by selecting the most
promising element to add next according to some criterion. Perturbation
heuristics, on the other hand, work by starting with an existing solution and
then making small changes to it in the hope of finding a better solution. This
process is repeated until no further improvements can be found.
Generic algorithms
There are also generic algorithms that can be applied to a
wide range of problems, such as genetic algorithms and simulated annealing.
These algorithms work by starting with a random initial solution and then
iteratively making small changes to it in order to try to improve upon it. The
goal is to try to find a good solution in a reasonable amount of time, even if
it is not guaranteed to be the optimal solution. These algorithms are often
used in practice to solve hard optimization problems where no other efficient
solution is known.
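As a minimal sketch of this pattern, the loop below applies simulated annealing to a toy one-dimensional objective; the objective function, the neighbourhood move, and the cooling schedule are all illustrative choices, not prescribed by these notes:

```python
import math
import random

# A minimal simulated-annealing sketch: minimise a toy function over the
# integers. Worse moves are accepted with a probability that shrinks as
# the temperature decreases, which helps escape local minima early on.

def f(x):
    return (x - 7) ** 2  # toy objective with its minimum at x = 7

def anneal(start, temp=10.0, cooling=0.995, steps=5000, seed=0):
    rng = random.Random(seed)
    x = start
    for _ in range(steps):
        y = x + rng.choice((-1, 1))          # small random perturbation
        delta = f(y) - f(x)
        # Always accept improvements; sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x = y
        temp *= cooling                       # cool down gradually
    return x

print(anneal(start=100))
```

By the final iterations the temperature is close to zero, so the loop behaves like pure descent and settles at (or very near) the minimum.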
Construction search:
● Start with an empty solution
○ Select an element to add, according to some criterion
○ Is the solution feasible yet?
■ If so, stop
■ Else repeat
Construction search involves starting with an empty
solution and adding elements to it iteratively until we have a valid solution.
This can be done in a greedy manner (choosing the best option at each step) or
using a more complex procedure. An example of a construction heuristic for the
vertex cover problem is the greedy algorithm which always adds the vertex with
the most incident edges to the cover. This is not guaranteed to find the
optimal solution, but often performs well in practice.
Perturbation search (iterative improvement):
● Start with a feasible solution (perhaps found by a construction algorithm)
○ Make a small change to the current solution
○ Is it better?
■ If so, keep the new solution
■ Else change back
○ Repeat until no more improvements can be found
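The loop above can be sketched as a generic Python routine; the stopping rule here (give up after a run of failed perturbations) is one illustrative choice among many:

```python
import random

# A generic iterative-improvement loop: keep a perturbed solution only if
# it scores better, and stop after a run of consecutive failed attempts.

def hill_climb(solution, perturb, score, max_fails=200, seed=0):
    rng = random.Random(seed)
    best = score(solution)
    fails = 0
    while fails < max_fails:
        candidate = perturb(solution, rng)    # make a small change
        s = score(candidate)
        if s < best:                           # better: keep the change
            solution, best, fails = candidate, s, 0
        else:                                  # worse: "change back"
            fails += 1
    return solution

# Toy usage: minimise the distance of an integer from 42 by nudging it.
result = hill_climb(
    0,
    perturb=lambda x, rng: x + rng.choice((-1, 1)),
    score=lambda x: abs(x - 42),
)
print(result)
```

The `perturb` and `score` arguments are where a real problem's neighbourhood move and objective function would plug in.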
In perturbation search, we start with an initial solution
and then repeatedly make small changes to it in an attempt to improve it. The
changes made to the solution are called "perturbations." This process
is repeated until no further improvements can be found. Perturbation search is often combined with a construction heuristic: the initial solution is built by construction and then improved through iterative perturbations.
Construction algorithms search the space of partial
solutions, looking for a feasible one, while iterative improvement algorithms
search the space of feasible solutions, looking for an improvement in objective
function value.
Practice
As with many things in life, practice is essential when
learning to do something new. Try to produce a heuristic for each of the
following problems:
● Set cover
● Travelling salesman
● Subset sum
● 3-SAT
Set Cover:
● A greedy heuristic for the set cover problem
could be to repeatedly choose the set that covers the most uncovered elements
and add it to the solution, until all elements are covered. This heuristic is
not guaranteed to find the optimal solution, but it often performs well in
practice.
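A sketch of this greedy set-cover rule in Python (the data representation is an illustrative choice):

```python
# Greedy set cover: repeatedly take the candidate set that covers the
# largest number of still-uncovered elements.

def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda s: len(uncovered & set(s)))
        if not uncovered & set(best):
            raise ValueError("the sets do not cover the universe")
        chosen.append(best)
        uncovered -= set(best)
    return chosen

# Example: cover {1..5} from three candidate sets.
print(greedy_set_cover({1, 2, 3, 4, 5}, [{1, 2, 3}, {2, 4}, {4, 5}]))
```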
Travelling
Salesman:
● A common heuristic for the travelling
salesman problem is the nearest neighbor algorithm, which involves starting at
a random city and choosing the nearest unvisited city as the next destination
until all cities have been visited. The solution is then completed by returning
to the starting city. This heuristic is not guaranteed to find the optimal
solution, but it often performs well in practice.
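A sketch of nearest neighbour for cities given as points in the plane (Euclidean distance is assumed here for illustration):

```python
import math

# Nearest-neighbour construction for TSP: always move to the closest
# city not yet visited; the tour implicitly closes back to the start.

def nearest_neighbour_tour(points, start=0):
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

cities = [(0, 0), (0, 1), (2, 0), (3, 1)]
print(nearest_neighbour_tour(cities))  # visiting order by index
```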
Subset Sum:
● A simple heuristic for the subset sum problem could be to sort the elements in non-increasing order and then consider them in turn, adding each element to the solution if doing so does not exceed the target sum. This heuristic is not guaranteed to find the optimal solution, but it often performs well in practice.
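A sketch of this largest-first rule, with a small instance showing how the greedy choice can miss the optimum:

```python
# Greedy subset sum: consider elements largest first, taking each one
# that still fits under the target.

def greedy_subset_sum(items, target):
    chosen, total = [], 0
    for x in sorted(items, reverse=True):   # largest first
        if total + x <= target:
            chosen.append(x)
            total += x
    return chosen, total

# Greedy takes 8 and gets stuck at 8; the optimum here is 6 + 5 = 11.
print(greedy_subset_sum([8, 6, 5], target=11))
```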
3-SAT:
● A heuristic for 3-SAT could be to repeatedly choose a clause with the fewest unassigned variables and assign values to those variables in a way that satisfies the clause. If no such assignment is possible, the heuristic could flip the value of a variable in an unsatisfied clause. This heuristic is not guaranteed to find a satisfying assignment, but it often performs well in practice.
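As a runnable sketch, the version below simplifies the repair idea to a pure random walk (this variant is an assumption for illustration, not the exact heuristic described above): start from a random assignment and, while some clause is unsatisfied, flip a random variable from one such clause.

```python
import random

# Random-walk 3-SAT sketch. Clauses are tuples of nonzero ints:
# literal 3 means x3, literal -3 means NOT x3.

def random_walk_sat(clauses, n_vars, max_flips=10000, seed=0):
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    sat = lambda lit: assign[abs(lit)] == (lit > 0)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(l) for l in c)]
        if not unsat:
            return assign                     # all clauses satisfied
        lit = rng.choice(rng.choice(unsat))   # variable from an unsat clause
        assign[abs(lit)] = not assign[abs(lit)]
    return None                               # gave up: no guarantee

clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3)]
model = random_walk_sat(clauses, n_vars=3)
print(model)
```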
When you’re done, read up on known perturbation heuristics
such as 2-opt and the Lin-Kernighan algorithm for the TSP.
For the travelling salesman problem, one perturbation heuristic is called 2-opt. This heuristic works by removing two edges from the tour and reconnecting the two resulting paths the other way round, which reverses one segment of the tour. If the new tour is shorter, the change is kept. This process is repeated until no such swap yields an improvement.
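A sketch of 2-opt on points in the plane; it untangles a deliberately crossed tour over the corners of a unit square:

```python
import math

# 2-opt: repeatedly reverse a segment of the tour whenever doing so
# shortens the tour, until no improving reversal remains.

def tour_length(tour, pts):
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, pts):
    improved = True
    while improved:
        improved = False
        for i in range(len(tour) - 1):
            for j in range(i + 2, len(tour)):
                # Reverse the segment between positions i+1 and j.
                candidate = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate, pts) < tour_length(tour, pts):
                    tour, improved = candidate, True
    return tour

# A crossed tour over the corners of a unit square has length 2 + 2*sqrt(2);
# 2-opt uncrosses it down to the perimeter length of 4.
pts = [(0, 0), (1, 1), (1, 0), (0, 1)]
best = two_opt([0, 1, 2, 3], pts)
print(tour_length(best, pts))
```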
Another heuristic for the TSP is the Lin-Kernighan
algorithm. This algorithm works by iteratively replacing a sequence of edges in
the current tour with a shorter alternative route. This process is repeated
until no further improvements can be found.