Problem-Solving Agents
Intelligent agents are supposed to act in such a way that the environment
goes through a sequence of states that maximizes the performance measure.
Unfortunately, this specification is difficult to translate into a
successful agent design. The task is simplified if the agent can adopt
a goal and aim to satisfy it.
Example Suppose the agent is in Auckland and wishes to get to Wellington. There are a number of factors to consider e.g. cost, speed and comfort of journey.
Goals such as this help to organize behaviour by limiting the objectives that the agent is trying to achieve. Goal formulation, based on the current situation is the first step in problem solving. In addition to formulating a goal, the agent may wish to decide on some other factors that affect the desirability of different ways of achieving the goal.
We will consider a goal to be a set of states - just those states in
which the goal is satisfied.
Actions can be viewed as causing transitions between states.
How can the agent decide on what types of actions to consider?
Problem formulation is the process of deciding what actions and states to consider. For now let us assume that the agent will consider actions at the level of driving from one city to another.
The states will then correspond to being in particular towns along the way.
The agent has now adopted the goal of getting to Wellington, so unless it is already there, it must transform the current state into the desired one. Suppose that there are three roads leaving Auckland but that none of them lead directly to Wellington. What should the agent do? If it does not know the geography it can do no better than to pick one of the roads at random.
However, suppose the agent has a map of the area. The purpose of a map is to provide the agent with information about the states it might get itself into and the actions it can take. The agent can use the map to consider subsequent steps of a hypothetical journey that will eventually reach its goal.
In general, an agent with several intermediate options of unknown value can decide what to do by first examining different possible sequences of actions that lead to states of known value and then choosing the best one.
This process is called search. A search algorithm takes a problem
as input and returns a solution in the form of an action sequence.
Once a solution is found, the actions it recommends can be carried out.
This is called the execution phase. Hence, we have a simple "formulate,
search, execute" design for the agent.
function SIMPLE-PROBLEM-SOLVING-AGENT(p) returns an
action
inputs: p, a percept
static: s, an action sequence, initially
empty
state,
some description of the current world state
g,
a goal, initially null,
problem,
a problem formulation
state <- UPDATE-STATE(state, p)
if s is empty then
g <- FORMULATE-GOAL(state)
problem <- FORMULATE-PROBLEM(state,
g)
s <- SEARCH(problem)
action <- RECOMMENDATIONS(s, state)
s<- REMAINDER(s,state)
return action
Formulating Problems
Formulating problems is an art. First, we look at the different amounts
of knowledge that an agent can have concerning its actions and the state
that it is in. This depends on how the agent is connected to its environment.
There are four essentially different types of problems.
First, suppose that the agent's sensors give it enough information to tell exactly which state it is in (i.e., the world is completely accessible) and suppose it knows exactly what each of its actions does.
Then it can calculate exactly what state it will be in after any sequence of actions. For example, if the initial state is 5, then it can calculate the result of the actions sequence {right, suck}.
This simplest case is called a single-state problem.
Now suppose the agent knows all of the effects of its actions, but world access is limited. For example, in the extreme case, the agent has no sensors so it knows only that its initial state is one of the set {1,2,3,4,5,6,7,8}. In this simple world, the agent can succeed even though it has no sensors. It knows that the action {right} will cause it to be in one of the states {2,4,6,8}. In fact, it can discover that the sequence {right, suck, left, suck} is guaranteed to reach a goal state no matter what the initial state is.
In this case, when the world is not fully accessible, the agent must reason about sets of states that it might get to, rather than single states. This is called the multiple-state problem.
The case of ignorance about the effects of actions can be treated similarly.
Example Suppose the suck action sometimes deposits dirt when there is none. Then if the agent is in state 4, sucking will place it in one of {2,4}. There is still a sequence of actions that is guaranteed to reach the goal state.
However, sometimes ignorance prevents the agent from finding a guaranteed solution sequence. Suppose that the agent has the nondeterministic suck action as above and that it has a position sensor and a local dirt sensor. Suppose the agent is in one of the states {1,3}. The agent might formulate the action sequence {suck, right, suck}. The first action would change the state to one of {5,7}; moving right would then change the state to {6,8}. If the agent is, in fact, in state 6, the plan will succeed, otherwise it will fail. It turns out there is no fixed action sequence that guarantees a solution to this problem.
The agent can solve the problem if it can perform sensing actions during execution. For example, starting from one of {1,3}: first suck dirt, then move right, then suck only if there is dirt there. In this case the agent must calculate a whole tree of actions rather than a single sequence, i.e., plans now have conditionals in them that are based on the results of sensing actions.
For this reason, we call this a contingency problem. Many problems in the real world are contingency problems. This is why most people keep their eyes open when walking and driving around.
Single-state and multiple-state problems can be handled by similar search techniques. Contingency problems require more complex algorithms. They also lend them selves to an agent design in which planning and execution are interleaved.
We will only consider cases where guaranteed solutions consist of a single action sequence.
The last type of problem is the exploration problem. In this type of problem, the agent has no information about the effects of its actions. The agent must experiment, gradually discovering what its actions do and what sorts of states exist. This is search in the real world rather than in a model.
Well-defined problems and solutions
We have seen that the basic elements of a problem definition are the states and actions. To capture these ideas more precisely, we need the following:
A path in the state space is simply any sequence of actions leading from one state to another.
To deal with multiple-state problems, a problem consists of an initial state set; a set of operators specifying for each action the set of states reached from any given state; and a goal test and path cost function. The state space is replaced by the state set space.
Measuring problem-solving performance
The effectiveness of a search can be measured in at least three ways:
Toy problems
The 8-puzzle problem
|
|
|
|
|
|
|
|
|
|
|
|
fig Start State
|
|
|
|
|
|
|
|
|
|
|
|
One important trick is to notice that rather than use operators such as "move the 3 tile into the blank space," it is more sensible to have operators such as "the blank space changes places with the tile to its left." This is because there are fewer of the latter kind of operator. This leads to the following formulation:
The Travelling Salesperson Problem is a famous touring problem in which each city in a network of connected cities must be visited exactly once. The goal is to find the shortest tour. This problem is NP-hard.
VLSI layout - a VLSI chips can have millions of gates and the positioning and connections to every gate are crucial to the successful operation of the chip. CAD tools are used in every phase of the process. Two difficult subtasks are cell layout and channel routing. The goal in cell layout is to place every cell on the chip with no overlap and enough room in between to place connecting wires. There are additional constrains such as placing particular cells near each other. Channel routing in the process of placing the connecting wires.
Robot navigation - Guide a robot about Mars.
Assembly sequencing - Robot assembly of complex objects.
Generating action sequences
To solve the route planning example problem, the first step is to test
if the current state is the goal state. If not, we must consider some other
states. This is done by applying the operators to the current state, thereby
generating
a new set of states. This process is called expanding the state.
Whenever there are multiple possibilities, a choice must be made about
which one to consider further.
This is the essence of search - choosing one option and putting the others aside for later, in case the first choice does not lead to a solution. We continue choosing, testing, and expanding until a solution is found or until there are no more states that can be expanded. The choice of which state to expand first is determined by the search strategy.
It is helpful to think of the process as building a search tree that is superimposed over the state space. The root of the tree is a search node corresponding to the initial state. The leaves of the tree are nodes that do not have successors either because they are goal states or because no operators can be applied to them.
The general search algorithm is described as follows:
function GENERAL-SEARCH(problem, strategy) returns
a
solution, or failure
initialise the search tree using the initial state of
the problem
loop do
if there are no candidates
for expansion then return failure
choose a leaf node for expansion
according to strategy
if the node contains
a goal state then return the corresponding solution
else expand the node
and add the resulting nodes to the search tree
end
Data structures for search trees
We will assume a search tree node is a five component structure with
the following components:
The EXPAND function is responsible for calculating each of the five
components above for the nodes that it generates.
We also need to represent the collection of nodes that are waiting
to be expanded - this collection is called the fringe or frontier.
For efficiency, the collection of node on the fringe will be implemented
as a queue using the following operations:
Now we can write a more formal version of the general search algorithm.
function GENERAL-SEARCH(problem, QUEUING-FN) returns
a
solution, or failure
nodes <- MAKE-QUEUE(MAKE-NODE(INITIAL_STATE[problem]))
loop do
if nodes is
empty then return failure
node <- REMOVE-FRONT(nodes)
if GOAL_TEST[problem]
applied to STATE(node) succeeds then return node
nodes <- QUEUING-FN(nodes,
EXPAND(node, OPERATORS[problem]))
end
Even though uninformed search is less effective than heuristic search, uninformed search is still important because there are many problems for which information used to make informed choices is not available.
We now present six uninformed search strategies:
Breadth-first search
In this strategy, the root node is expanded first, then all of the
nodes generated by the root are expanded before any of their successors.
Then these successors are all expanded before any of their successors.
In general, all the nodes at depth d in the search tree are expanded
before any nodes at d+1. This kind of search can be implemented by using
the queuing function ENQUEUE-AT-END.
Breadth-first search is complete and optimal provided the path cost is a nondecreasing function of the depth of the node. Lets look at the amount of time and memory it requires. Consider a hypothetical search tree with b successors for every node. Such a tree is said to have a branching factor of b. If the first solution is a depth d, then the maximum number of nodes expanded before reaching a solution is 1+b+b2+b3+...+bd.
This is not good. For example consider the following table for b=10,
1 node/ms, 100 bytes/node:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Note that memory requirements are a bigger problem here than execution
time.
In general, exponential complexity search problems cannot be solved
for any but the smallest instances.
Uniform cost search (Dijkstra's Algorithm)
This strategy modifies breadth-first search to account for general
path cost. Now the lowest cost node on the fringe is always the first one
expanded (recall that the cost of a node is the cost of the path to that
node).
When certain conditions are met, the first solution found is guaranteed to be the cheapest.
Consider an example
Uniform cost search finds the cheapest solution provided the cost of a path never decreases as it gets longer, i.e., g(SUCCESSOR(n)>=g(n) for every node n.
Depth-first search
This search strategy always expands one node to the deepest level of the tree. Only when a dead-end is encountered does the search backup and expand nodes at shallower levels.
This strategy can be implemented by calling GENERAL-SEARCH with a queuing
function that always puts the newest node on the front ( ENQUEUE-AT-FRONT)
Depth-first search has modest memory requirements, storing only a single
path from root to the current node along with the remaining unexpanded
sibling nodes for each node on the path. For a state space with branching
factor b and maximum depth m, depth first search requires storage of only
bm nodes, in contrast to bd for breadth-first search. Using
the same assumptions as our analysis of breadth-first search, depth-first
search would require 12 kilobytes instead of 111 terabytes at depth 12,
a factor of 10 billion times less space.
For problems that have a lot of solutions, depth-first search is often faster than breadth-first search because it has a good chance of finding a solution after exploring only a small portion of the search space.
However, depth-first search can get stuck on a wrong path in infinite search spaces.
It is common to implement depth-first search with a recursive function that calls itself on each of its children in turn. In this case, the queue is stored implicitly in the local state of each invocation on the calling stack.
Depth-limited search
This strategy avoids the pitfalls of depth-first search by imposing
a cut-off on the maximum depth of a path. Depth-first search is used to
search to the given depth.
Depth-limited search is complete but non optimal and if we choose a
depth-limit that is too shallow its not even complete.
Iterative deepening search
The hard part about depth-limited search is picking the right depth
limit. Most of the time we will not know what to pick for a depth limit.
Iterative deepening search is a strategy that side-steps the issue of choosing
the depth limit by trying all possible limits in increasing order.
function ITERATIVE-DEEPENING-SEARCH(problem) returns
a
solution sequence
inputs: problem, a problem
for depth <- 0 to infiniity
do
if DEPTH-LIMITED-SEARCH(problem,
depth) succeeds then return its result
end
return failure
In effect, this search strategy keeps the best features of both breadth-first and depth-first search. It has the modest memory requirements of depth-first search but is optimal and complete like breadth-first search. Also it does not get stuck in dead ends.
Iterative deepening may seem wasteful because many states are expanded multiple times. For most problems, however, the overhead of this multiple expansion is actually rather small because a majority of the nodes are at the bottom of the tree.
So instead of:
1+b+b2+...+bd-2+bd-1+bd
we have:
(d+1)1+(d)b+(d-1)b2+...+(3)bd-2+(2)bd-1+bd
For b=10 and d=5, the numbers are 111,111 and 123,456. So iterative
deepening is 11% more costly than depth-limited in this case.
In terms of complexity numbers, iterative deepening has the same asymptotic
complexity as breadth-first, but has space complexity O(bd).
In general, iterative deepening is the preferred search method when
there is a large search space and the depth of the solution is unknown.
Bidirectional Search
The idea here is to search forward from the initial state and backward
from the goal state simultaneously. When it works well, the search meets
in the middle. For problems where the branching factor is b in both directions,
bidirectional search can make a big difference.
The solution will be found in O(bd/2) steps because both searches
have to go only half way. For example, if b=10 and d=6, breadth-first generates
1,111,111 nodes, whereas bidirectional search generates only 2,222 nodes.
Even though this sounds great in theory, there are several issues that
must be addressed before this algorithm can be implemented.
| Criterion |
|
|
|
|
|
|
| Time |
|
|
|
|
|
|
| Space |
|
|
|
|
|
|
| Optimal ? |
|
|
|
|
|
|
| Complete ? |
|
|
|
|
|
|
Even when the tree is finite, avoiding repeated states can yield exponential speedups. The classic example is a space that contains only m+1 states. The tree contains every possible path which is 2m branches.
There are three ways to deal with repeated states, in increasing order of effectiveness and complexity:
Example The 8-queens problem can be posed as a CSP in which the variables are the locations of the eight queens; the possible values are squares on the board; and the constraints state that no two queens can be in the same row, column, or diagonal.
Cryptoarithmetic and VLSI layout can be posed as CSPs.
Constraints can be unary, binary, or higher-order:
Each variable vi in a CSP has a domain Di which is the set of possible values that the variable can take on. The domain can be either discrete or continuous.
A unary constraint specifies the allowable subset of a variable's domain and a binary constraint specifies the allowable subset of the cross-product of the two domains. In discrete CSPs with finite domains, constraints can be represented simply by enumerating the allowable combinations of values.
Example in 8-queens, the no attack constraint can be represented by a set of pairs of allowable values: {<1,3>,<1,4>,<1,5>,...,<2,4>,<2,5>,...}.
Constraints involving continuous variables cannot be enumerated in this way. Here we consider only CSPs with discrete, absolute, binary or unary constraints.
The initial state will be the state in which all variables are unassigned. Operators will assign a value to a variable from the set of possible values. The goal test will check if all variables are assigned and all constraints satisfied.
We can use depth-limited search because all solutions are of depth n, where n is the number of variables.
In the most naive implementation, any unassigned variable in a state
can be assigned a value by and operator. Hence, the branching factor is
the sum of the cardinalities of the different domains, i.e.,
,
e.g., 64 in the 8-queens problem. Then the search space would be 648.
But note that the order of assignments makes no difference to the final
solution. So we can choose a value for a single variables at each node.
This results in a search space that is
,
or 88 in 8-queens.
A straightforward depth-limited search will examine all of these possible states. CSP contain some well known NP-complete problems and so in the worst case any algorithm to solve CSPs will be exponential.
In most real problems, however, we can take advantage of problem structure to eliminate large parts of the search space.
In CSPs, the goal test is a set of constraints which can be considered individually, as opposed to all together. Depth-limited search on a CSP wastes time searching when constraints have already been violated. For example, suppose we put the first two queens in the top row. DLS will examine all 86 possibilities for the remaining queens before discovering that no solution exists in that subtree.
A substantial improvement is to insert a test before the successor generation step to check whether any constraint has been violated by the variable assignments up to this point. The resulting algorithm backtracks immediately to try something else. This is called backtracking search.
Backtracking still has some failings. Suppose the squares chosen for the first six queens make it impossible to place the seventh queen, because they attack all 8 squares in the last column. Backtracking will still try all possible placings of the seventh queen. Forward checking avoids this problem by looking ahead to detect unsolvability. Each time a variable is instantiated, forward checking deletes from the domains of the as-yet-uninstantiated variables all of those values that conflict with the variables assigned so far. If any of the domains becomes empty, then the search backtracks immediately.
Forward checking is a special case of arc consistency checking. A state is arc-consistent if every variable has a value in its domain that is consistent with each of the constraints on that variable. This can be achieved by successive deletion of values that are inconsistent with some constraint. As values are deleted, other values may become inconsistent because they relied on the deleted values. This behaviour is called constraint propagation: choices are gradually narrowed down. Sometimes arc consistency is all that is needed to completely solve a problem.
Queueing-Fn <- a function that orders nodes
by EVAL-FN
return GENERAL-SEARCH(problem, Queueing-Fn)
Note that the name best-first search is actually a misnomer because
if it really always expanded the best node first, there would be no search
at all.
What it, in fact, does is choose next the node that appears to
be the best.
Just as there is a whole family of GENERAL-SEARCH algorithms with different
ordering functions, there is a whole family of BEST-FIRST-SEARCH algorithms
with different evaluation functions. These algorithms typically use some
estimated measure of the cost of the solution and try to minimize it. One
such measure is cost of the path so far. We have already used that one.
In order to focus the search, the measure must incorporate some estimate
of the cost of the path from a state to the goal.
Minimize estimated cost to reach a goal: Greedy search
The simplest best-first strategy is to minimize the estimated cost
to reach the goal, i.e., always expand the node that appears to be closest
to the goal. A function that calculates cost estimates is called an heuristic
function.
To get an idea of what an heuristic function looks like, lets look at a particular problem. Here we can use as h the straight-line distance to the goal. To do this, we need the map co-ordinates of each city. This heuristic works because roads tend to head in more or less of a straight line.
This figure shows the progress of a greedy search to find a path from
Arad to Bucharest.
For this problem, greedy search leads to a minimal cost search because
no node off the solution path is expanded. However, it does not find the
optimal path: the path it found via Sibiu and Fagaras to Bucharest is 32
miles longer than the path through Pimnicu Vilcea and Pitesti.
Hence, the algorithm always chooses what looks locally best, rather
than worrying about whether or not it will be best in the long run. (This
is why its called greedy search.)
Greedy search is susceptible to false starts. Consider the problem
of getting from Iasi to Fagaras. h suggests that Neamt be expanded first,
but it is a deadend. The solution is to go first to Vaslui and then continue
to Urziceni, Bucharest and Fagaras.
Note that if we are not careful to detect repeated states, the solution
will never be found - the search will oscillate between Neamt and Iasi.
Greedy search resembles dfs in the way that it prefers to follow a
single path to the goal and backup only when a deadend is encountered.
It suffers from the same defects as dfs - it is not optimal and it is incomplete
because it can start down an infinite path and never try other possibilities.
The worst-case complexity for greedy search is O(bm),
where m is the maximum depth of the search. Its space complexity is the
same as its time complexity, but the worst case can be substantially reduced
with a good heuristic function.
Minimizing the total path cost: A* search
Recall that uniform search minimizes the cost of the path so far, g(n):
it is optimal and complete, but can be very inefficient. We can combine
the use of g(n) and h(n) simply by summing them
Hence, if we are trying to find the cheapest solution, a reasonable thing to try is the node with the lowest value for f. It is possible to show that this strategy is complete and optimal, given a simple restriction on the h function: h must never overestimate the cost to reach the goal. Such an h is called an admissible heuristic.
If h is admissible, f(n) never overestimates the actual cost of the best solution through n.
Best-first search using f and an admissible h is known as A* search.
Notice that hsld (straight-line distance) is an addmissible heuristic because the shortest path between two points is a straight line. The following diagram shows the first few steps of the A* search for Bucharest using hsld.
Notice that A* prefers to expand from Rimnicu Vilcea rather
than Fagaras. Even though Fagaras is closer to Bucharest, the path taken
to get to Fagaras is not as efficient in getting close to Bucharest
as the path taken to get to Rimnicu.
The behaviour of A* search
An important thing to notice about the example A* search
above: along any path from the root, the f-cost never decreases. This fact
holds true for almost all admissible heuristics. An heuristic with this
property is said to exhibit monotonicity.
Monotonicity is not quite the same as admissible. In a few cases, heuristics are admissible, but not monotonic.
When our heuristics are not monotonic, we will make them so, so we can assume that if an heuristics is admissible it is monotonic.
Now, we can take a look at what is going on in A*. If f never decreases along any path out of the root, we can conceptually draw contours in the state space.
Because A* expands the leaf node of lowest f, an A*
search fans out from the start node, adding nodes in concentric bands of
increasing f-cost.
With uniform-cost search (A* using h=0), the bands will
be "circular" around the start node. WIth more accurate heuristics, the
bands will stretch towards the goal state and become narrowly focused around
the optimal path. If we define f* to the the cost of the optimal
solution path, we can say:
A* is optimally efficient for any given admissible heuristic function. This is because any algorithm that does not expand all nodes in the contours between the root and the goal runs the risk of missing the optimal solution.
Complexity of A*
The catch with A* is that even though its complete, optimal
and optimally efficient, it still can't always be used, because for most
problems, the number of nodes within the goal contour search space is still
exponential in the length of the solution.
Similarly to breadth-first search, however, the major difficulty with A* is the amount of space that it uses.
|
|
|
|
|
|
|
|
|
|
|
|
fig Start State
|
|
|
|
|
|
|
|
|
|
|
|
By keeping track of repeated states, this number can be cut down to 9!=362,880 different arrangements of 9 squares. We need a heuristic to further reduce this number.
If we want to find shortest solutions, we need a function that never overestimates the number of steps to the goal. Here are two possibilities:
Usually, the effective branching factor of an heuristic is fairly constant over problem instances and, hence, experimental measurements of b* on a small set of problems provides a good approximation of the heuristic's usefulness.
A well designed heuristic should have a value of b* that
is close to 1. The following is a table of the performance of A*
with h1 and h2 above on 100 randomly generated problems. It shows that
h2 is better than h1 and that both are much better than iterative deepening
search.
|
|
|
|||||
| d |
|
|
|
|
|
|
| 2 |
|
|
|
|
|
|
| 4 |
|
|
|
|
|
|
| 6 |
|
|
|
|
|
|
| 8 |
|
|
|
|
|
|
| 10 |
|
|
|
|
|
|
| 12 |
|
|
|
|
|
|
| 14 |
|
|
|
|
|
|
| 16 |
|
|
|
|
|
|
| 18 |
|
|
|
|
|
|
| 20 |
|
|
|
|
|
|
| 22 |
|
|
|
|
|
|
| 24 |
|
|
|
|
|
|
Note that h2 is always a better heuristic function than h1.
Inventing heuristic functions
Is is possible for heuristic functions to be invented mechanically?
In many cases, the answer is yes. In the 8-puzzle case, h1 and h2 can
be considered perfectly accurate path lengths for simplified versions of
the 8-puzzle problem. That is, if the rules of the puzzle were changed
so that a tile could be moved anywhere instead of just into an opening,
h1 would accurately give the number of steps in the shortest solution.
Is there a simplified problem in which h2 gives correct path lengths?
Yes. If a tile could move a square in any direction, even onto an occupied
square.
A problem with less restrictions on its operators is called a relaxed
problem. It is often the case that the cost of an exact solution to
a relaxed problem is a good heuristic for the original problem.
If a problem definition is written down in a formal language, it is
possible to automatically generate relaxations. Again in 8-puzzle, the
operators can be described as:
It turns out that if there are several choices for heuristic functions with none clearly dominating the others, and they are all admissible, then they can be combined as
Often it is possible to pick out features of a state that contribute to its heuristic evaluation, even if it is hard to say what the contribution should be. For example, the goal in chess is to checkmate and the relevant features include number of pieces of each kind belonging to each side, the number of pieces that are attacked by opponent's pieces, etc.
Usually, the evaluation function is assumed to be a linear combination of the features and even if we don't know the weights, it is possible to use a learning algorithm to acquire them.
We have not considered the cost of computing the heuristic functions in our discussion. So long as its about the cost of expanding a node, this is not an issue, but when the cost is as high as expanding several hundred nodes, it can no longer be ignored.
Heuristics for constraint satisfaction
We have so far examined variants of depth first search for solving
CSPs. Now we extend the analysis by considering heuristics for selecting
a variable to instantiate and for choosing a value for the variable.
Suppose we are trying to colour the following map with three colours
so that no two adjacent regions have the same colour.
Clearly the best next move is to colour region E first because the
only possible colour for it is blue. All the other regions have a choice
and we might make the wrong one requiring us to backtrack.
This idea is called the most-constrained-variable heuristic. It is used with forward checking. At each point in the search, the variable with the fewest possible values is chosen to have a value assigned next. In this way the branching factor in the search tends to be minimized. When this heuristic is used in solving the n-queens problems, the maximum value of n goes from 30 for forward checking to around 100.
The most-constraining-variable heuristic attempts to reduce the branching factor on future choices by assigning a value to the variable that is involved in the largest number of constraints on other unassigned variables.
Once a variable has been selected, a value must still be chosen for it. For example, suppose we decide to assign a value to region C after A and B. Red is a better choice than blue because it leaves more freedom for future choices. This is the least-constraining-value heuristic - choose a value that rules out the smallest number of values in variables connected to the current variable by constraints. When applied to the n-queens problem, it allows problems up to n=1000 to be solved.
function DFS-CONTOUR(node, f-limit) returns
a
solution sequence and a new f-COST limit
inputs: node, a node
f-limit, the current f-COST limit
local variables: next-f, the f-COST limit for
the next contour, initially infinity
if f-COST[node]
> f-limit then return null, f-COST[node]
if GOAL-TEST[problem](STATE[node])
then return node, f-limit
for each node
s in SUCCESSORS(node) do
solution, new-f <-
DFS-CONTOUR(s,
f-limit)
if solution
is non-null then return solution, f-limit
next-f <- MIN(next-f,
new-f)
end
return null, next-f
Uses the same trick we used to make breadth-first search memory efficient.
In this algorithm each iteration is dfs, just as in regular iterative
deepening. However, the search is modified to use an f-cost limit rather
than a depth limit.
Hence, each iteration expands all nodes inside the contour for the
current f-cost, peeping over the contour to find out where the next contour
lies.
Once the search inside a contour is completed, a new iteration is started
using a new f-cost for the next contour.
IDA* is complete and optimal, but because it is depth-first it only requires space proportional to the longest path that it explores. b*d is a good estimate of storage requirements.
The algorithm's time complexity depends on the number of different values that the h function can take on. The city block distance function used in the 8-puzzle takes on one of a small number of integer values. So typically, f only increases two or three times along any solution path. Hence, IDA* only goes through 2 or 3 iterations. Because IDA* does not need to maintain a queue of nodes, its treatment of nodes is faster.
Unfortunately, IDA* has difficulty in more complex domains. In the travelling salesperson problem, for example, the heuristic value is different for every state. As a result, the search is much more like iterative deepening: if A* expands N nodes, IDA* will have to go through N iterations and will therefore expand 1+2+...+N=O(N2) nodes.
One way around this problem is to increase the f-cost limit by a fixed amount [e] on each iteration, so that the total number of iterations is proportional to 1/[e]. This can reduce the search cost at the expense of returning solutions that can be worse than optimal by at most [e]. Such an algorithm is called [e]-admissible.
SMA* search
IDA*s difficulties in some problem spaces can be traced to using too little memory. Between iterations, it retains only a single number, the current f-cost limit.
SMA*(Simplified memory-bounded A*) can make use of all available memory to carry out the search. SMA* has the following properties:
if SUCCESSORS(n)
all in memory then remove n from Queue
if memory is fullthen
delete shallowest,
highest f-cost node in Queue
remove it from its
parent's successor list
insert it's parent on
Queue if necessary
insert s on Queue
end
Example Here is a hand simulation of SMA* on
illustrates how it works.
In this case, it would return D which is a reachable, non-optimal solution.
Given a reasonable amount of memory, SMA* can solve significantly more difficult problems that A* without incurring significant overhead in terms of extra nodes generated. It performs well on problems with highly connected state spaces with real-valued heuristics, on which IDA* has difficulty.
For this kind of algorithm, consider a grid containing all of the problem states. The height of any point corresponds to the value of the evaluation function on the state at that point.
The idea of iterative improvement algorithms is to move around the grid, trying to find the highest peaks, which are optimal solutions.
These algorithms usually keep track only of the current state and do not look ahead beyond immediate neighbors.
There are two major classes of iterative improvement algorithms: Hill-climbing algorithms try to make changes that improve the current state. Simulated annealing algorithms sometimes make changes that make things worse.
Hill-Climbing Search
Always move in a direction of increasing quality. This simple policy has three well-known drawbacks:
Ridges - a ridge can have steeply sloping sides, so that the
search reaches the top with ease, but on the top may slope gently toward
a peak. Unless there happen to be operators that move directly along the
top of the ridge, the search may oscillate from side to side making little
or no progress.
current <- MAKE-NODE(INITIAL-STATE[problem])
loop do
next <- a highest-valued
successor of current
if VALUE[next]
< VALUE[current] then return current
current <- next
end
Simulated Annealing
Allow hill-climbing to take some downhill steps to escape local maxima.
The innermost loop of simulated annealing is quite similar to hill-climbing. Instead of picking the best move, however, it picks a random move. If the move actually improves the situation, it is executed. Otherwise, the algorithm make the move with some probability less than 1. The probability decrease exponentially with the badness of the move - the amount [[Delta]]E by which the evaluation is worsened.
A second parameter T is also used to determine the probability. At higher values of T, bad moves are more likely to be allowed. As T tends towards 0, they become more and more unlikely, until the algorithms behaves like hill-climbing. The schedule input determines the value of T as a function of how many cycles already have been completed.
Simulated annealing was developed from an explicit analogy with annealing - the process of gradually cooling a liquid until it freezes. The value function corresponds to the total energy of the atoms in the material and T corresponds to the temperature. The schedule determines the rate at which the temperature is lowered. Individual moves in the state space correspond to random fluctuations due to thermal noise. One can prove that if the temperature is lowered sufficiently slowly, the material will attain a lowest-energy configuration. This corresponds to the statement that if schedule lowers T slowly enough, the algorithm will find a global optimum.
function SIMULATE-ANNEALING(problem, schedule) returns
a solution state
inputs: problem, a problem
schedule,
a mapping from time to "temperature"
static: current, a node
next,
a node
current <- MAKE-NODE(INITIAL-STATE[problem])
for t <- 1 to infinity do
T <- schedule
[t]
if T
= 0 then return current
next <- a randomly
selected successor of current
deltaE <- VALUE[next]
- VALUE[current]
if deltaE > 0
then current <- next
else current
<- next with probability exp(deltaE/T)
Applications to constraint satisfaction problems
CSPs can be solved by iterative improvement methods by first assigning values to all the variables and then applying modification operators to move the configuration towards a solution. These operators simply assign a different value to a variable, e.g., 8-queens.
Algorithms that solve CSPs in the fashion are called heuristic repair methods. In choosing a new value for a variable, the most obvious heuristic is to select the value that results in the minimum number of conflicts with other variables. This is called the min-conflicts heuristic.