Courses/CS 461/Winter 2006/Rick Strom/Week Nine
From CSWiki
Contents |
[edit] GP Path Finding
I will be attempting to evolve a path finding algorithm with which a unit can intelligently but realisticly move from one point on a map to another, given a set of obstacles and impediments.
I will be using, as a guide, Tobin Ehlis' Application of Genetic Programming to the Snake Game because of its similarities to this project -- in particular, he uses a similar set of terminals and functions.
I also am planning to implement the A* path finding algorithm in this application for easy comparison of the existing program and the evolved program. I will be using the STL A* implementation written by Justin Heyes-Jones. This particular implementation is slightly broken, and so will be modified to work correctly (see below).
[edit] Maps
Maps are based on a 40x40 grid of valid characters stored in a text file.
A few sample maps after loading and displaying graphically:
| http://img2.glowfoto.com/images/2006/03/05-1234108591T.jpg | http://img2.glowfoto.com/images/2006/03/05-1234174024T.jpg |
| A mountain pass. Darker greys indicate steeper areas on the map, and are more costly to cross. | A river and enemy cluster. Crossing water has around half the penalty of passing through enemy territory. |
[edit] Map Symbols
The intelligence will need to find a path along a 40x40 map, where each 1x1 square is marked by one of a set of symbols:
- X: Unpassable (a tree)
- 0-9: Sloped (with 0 being flat, 9 being totally uphill)
- ~: Water
- V: Enemy
Moving to any square adds 1 to the total distance of the path. While squares marked X can not be moved to, all other squares can, with an associated cost. Stepping on a square gives a penalty to the unit except when the path is marked 0 (flat). Water gives a penalty of 5, while enemy squares give a penalty of 9 (the algorithm I'm seeking will control units without human interference, and as such enemy avoidance will be important).
Because the unit is unaware of the map as a whole, I will still say that the best path is the shortest, as it will be the best path possible given the knowledge available. This is based on the assumption that an intelligent agent will attempt to reach his destination in the shortest time possible given the information he has available.
[edit] Map Cycling
While the application allows you to run the process on user defined maps, a truly good program should be successful on any map. To that end, the program can generate random maps, and regenerate the map automatically every x generations. Any well adapted algorithm will need to adapt further upon map cycling. Hopefully, after several map changes, the surviving algorithm should be able to handle any map thrown at it.
[edit] The Existing A* Algorithm
The A* path finding algorithm is the most widely known algorithm, and as such I am assuming it is the one most commonly used in games. If so, that's a shame, for it is a truly terrible algorithm. I hope to evolve an equal or better algorithm (in terms of realism, not speed or efficiency) than A*.
Why is it terrible? Have a look. The following screens are taken from my modified version of Justin Heyes-Jones' A* implementation (modified to work properly, to allow two modes of operation, and to be a well-designed class).
After my modification to the A* template, A* can operate in two modes: smart and stupid. Stupid A* only recognizes two states, passable or unpassable. Consequently the path is short, but not neccessarily efficient given the cost of the squares travelled upon. Smart A* considers those costs, and therefore finds the cheapest route, though not necessarily the shortest. Below are a few examples of why neither are perfect. The red line is the path discovered by A*. Green squares are unpassable. Blue is water, red is enemy territory, and darker greys represent steeper slopes.
| http://img2.glowfoto.com/images/2006/03/05-2139397365T.jpg | http://img2.glowfoto.com/images/2006/03/05-2153387457T.jpg | http://img2.glowfoto.com/images/2006/03/05-2153408038T.jpg | http://img2.glowfoto.com/images/2006/03/05-2204296038T.jpg |
| A* effectively finds a path through this maze. This demonstrates A*'s effectiveness at finding a path in a complex situation. Note the weird choice of the east loop near the end of the path, indicating that the algorithm "leans" towards its goal, even if there is a better path available. | But without considering a move's cost, stupid A* plows through any square it can, leading to a shorter, but more expensive path. | Smart A*, able to consider the cost of each move, finds a less expensive path. It avoids the steep areas if it has a choice between steep terrain and flat land. | But this condition is far too common. A*'s short-sighted decision making leads it away from water, right into a hornet's nest. |
You can see the inherent problems with this algorithm. When it operates efficiently, it operates wholly unrealistically. When it operates "intelligently," it runs the risk of making utterly horrendous decisions that no human intelligence would make (like opting for enemy contact rather than crossing the river).
[edit] How It Works
For a good explanation of how the A* algorithm works, see A* Pathfinding for Beginners by Patrick Lester.
[edit] Limitations
Bryan Stout, former AI programmer at MicroProse, sums up the limitations of A*:
There are situations where A* may not perform very well, for a variety of reasons. The more or less real-time requirements of games, plus the limitations of the available memory and processor time in some of them, may make it hard even for A* to work well. A large map may require thousands of entries in the Open and Closed list, and there may not be room enough for that. Even if there is enough memory for them, the algorithms used for manipulating them may be inefficient. The quality of A*’s search depends on the quality of the heuristic estimate h(n). If h is very close to the true cost of the remaining path, its efficiency will be high; on the other hand, if it is too low, its efficiency gets very bad. In fact, breadth-first search is an A* search, with h being trivially zero for all nodesthis certainly underestimates the remaining path cost, and while it will find the optimum path, it will do so slowly.
The full article is available at Gamasutra (registration required)
[edit] Sensors
The unit can sense the following:
- Water within 3 squares ahead
- Enemies within 1 square ahead
- Trees from 5 squares ahead
- If the forward square is more or less steep than the current square
- If the goal is ahead, behind, to the right or to the left [added March 09]
In-program, these are all boolean functions, and are represented as follows:
- isWaterAhead1, isWaterAhead2, isWaterAhead3
- isEnemyAhead
- isBlockedAhead1, isBlockedAhead2, isBlockedAhead3, isBlockedAhead4, isBlockedAhead5
- isSteeperAhead, isLessSteepAhead
- isGoalLeft, isGoalRight, isGoalAhead, isGoalBehind
These are the functions. I suspect this will need to be modified down the road to provide the programs with a suitable set of sensors.
Note that the isSteeperAhead and isLessSteepAhead essentially compare the cost of the next square relative to the current square (for example, water ahead of level3 slope would return true), and thus should allow a program that closely resembles smart A*. Similarly, the isBlockedAhead1 function allows a program to closely resemble stupid A*.
The isGoal* functions were added March 09 to allow algorithms some sense of direction (this drastically cut down on the aimlessness of the prior algorithms).
[edit] Moves
The unit has the following moves:
- Turn right
- Turn left
- Move forward
As represented in-program, these are:
- turnRight
- turnLeft
- moveForward
These are the terminals.
[edit] Solutions
Solutions are stored in a binary tree, with terminals as leaves, and functions as parent nodes. This will allow easy generation of a path.
Given the 40x40 grid, with a clear start and end position, the solution tree need only be traversed according to the values returned by its function nodes until a terminal is reached. One thing I need to consider here is how to handle a blocked path -- perhaps the result should be NO SOLUTION if the unit decides to take the blocked path? And then, should its fitness score just be some maximum value (say, 40*40)?
| http://img2.glowfoto.com/images/2006/03/06-1910399686T.jpg | http://img2.glowfoto.com/images/2006/03/07-1741123993T.jpg |
| A fairly complex solution displayed in the tree view. This path is complete, and was found by A*. | The path discovered by the current best solution. Notice it fails by crossing into a green X (tree) square. The program has been expanded to show the traversal of the program tree to get from the start position (0,0) to the first move (1,0). In the tree view, the top child is selected if the parent condition is true, and the bottom child if false. This path terminates in a moveForward terminal. |
[edit] Scoring
In the event a program finds a full path from the start point to the end point, the score is determined by adding the number of forward moves (turns are not counted) to the penalty aquired from each square traversed.
In the event a solution fails to find a complete path, the score is as above, for as long as the path reached, plus the manhattan distance from the terminal point of the path to the end point on the map, plus a FAILURE penalty of 20000. In this way, a full collection of failing algorithms will still be comparable according, mostly, to the manhattan distance to the goal. For example, if the path dies immediately upon leaving (0,0), its final score would be 20,000 + (39 + 39) = 20,078. This would be a better score than a solution which stumbled around before dying at (0,0), as such a solution would aquire some penalties and moves before dying in the same place (in other words, at least the first solution had the decency to die efficiently). It is not immediately clear if this is the correct way to score failed programs, but I am going with it for the time being. An alternative would be to disregard penalties until the solution finds a complete path.
[edit] Repopulation
The application contains functions for copying a tree (or subtree) to another location, deleting trees (or subtrees), generating random trees (or subtrees) of fixed depth, finding a random node in a tree and determining the depth of a tree.
Combining these functions allows:
- Mutation
- find random node
- determine depth of this node
- delete node
- generate random tree of this depth (2) at the mutation location (1)
- Crossover
- select Parent A
- select Parent B
- find random node in (1)
- find random node in (2)
- get a pointer to (3)
- copy (4) to (3)
- delete (4)
- copy (5) to (4)
- delete (5)
- Reproduction
Given the above, two parent programs (A, B) can create two child programs (C, D) as follows:
- copy A to C
- copy B to D
- get a random node in C
- get a random node in D
- swap (3) and (4)
Because of the potentially large size of solution trees in memory, population size will be important. I will be experimenting with different size populations on a system with 1GB memory running a 3.4GHz Pentium 4 processor.
[edit] Results
[edit] March 08
A few (non-genetic) test runs seemed to indicate that the best solution was likely to converge to a solution which minimizes to the following:
- if isBlockedAhead1 then turnLeft
- else moveForward
The resulting path is shown in the following image:
http://img2.glowfoto.com/images/2006/03/08-1507172181T.jpg
This has been the outcome of 20 consecutive random runs (where the run ends when a solution is found, with each round testing a new random solution). [note: all members of the population were random. No genetic selection occurred at all.]
This seemed to indicate that functions like isGoalWest, isGoalEast, isGoalSouth and isGoalNorth would be necessary to allow the program a way to think smarter than a wall-avoiding robot. These have not been added yet, but likely will be by the end of the night.
Upon running this with genetic selection, the result was as expected, but with a fairly surpising twist: the algorithm was not only reducable to the single node program listed above, it actually was that program. In other words, while the best algorithm was ultimately stupid, it was at least simplified. There is nothing in the program to encourage it to favor minimized programs over bloated ones, but it seemed to work out that way anyways.
[edit] March 09
The functions isGoalNorth (et al) actually ended up being very stupid functions, due to the fact that the unit in motion has direction (so while it can ask about the NSEW-ness of the goal, it can't make useful decisions based on that information).
Consequently, the four new functions added are instead isGoalForward, isGoalBehind, isGoalLeft and isGoalRight. These return information which takes into account the direction, and thus allows the program to make much more useful decisions.
So far, I am getting some useful paths, running with a population of 200, mutation probability of 8%, reproduction probability of 70%, and crossover probability of 95%. The solutions are relatively smart, and score surprisingly close to smart A*. On my Middle Earth map, in fact, consisting of a river, a mountain, and some trees, A* scores 93 while the evolved program scores 88. On this particular map, I was able to evolve a program which beat A* (although clearly a good algorithm would need to work on any map, not just a fixed map).
| http://img2.glowfoto.com/images/2006/03/09-1659319789T.jpg | http://img2.glowfoto.com/images/2006/03/09-1656276231T.jpg |
| The evolved program scores a 88. The best possible score for this map would be 83. | Smart A* scores an 93 on the same map. |
[edit] March 10
Crucial to finding a good generalized algorithm is making sure the algorithm succeeds and succeeds well on any map, even and especially because the map might change during the course of pathfinding (enemies move, for example).
| http://img2.glowfoto.com/images/2006/03/10-1644447179T.jpg |
| Running on a randomly generated map which cycles every 100 rounds. Best solution in red, second best in blue, all others in black. |
[edit] March 11
I made a small change tonight after class, adding a function doBoth to the function set which executes its left child (which is always a terminal), and continues down the right path. This gives the program memory.
50 generations turned up the following algorithm, which was impossible under the old function set:
| http://img2.glowfoto.com/images/2006/03/11-2212309134T.jpg | http://img2.glowfoto.com/images/2006/03/11-2218555219T.jpg |
| Algorithm is able to analyze its surroundings before making a decision. | Same algorithm on a different map. Notice it makes more turns and behaves much smarter. This one seems to like hugging the wall. |
Ok, I'm a believer now. That path was not possible even by random chance before.

