# 🤖 AI Agents to play The Royal Game of Ur

This file lists all of the artificial intelligence (AI) strategies currently implemented in this project to play The Royal Game of Ur. It describes the algorithm used by each agent, along with statistics that demonstrate each agent's strengths and weaknesses.

## 📈 Current Rankings

We have several AI agents designed to play The Royal Game of Ur. Therefore, it is useful to play them against one another to get an idea of their relative strengths.

| Agent | Win Percentage | Win % when light | Win % when dark |
|-------|----------------|------------------|-----------------|
| Expectimax (Depth 7) | 85.1% | 88.7% | 81.5% |
| Expectimax (Depth 5) | 79.2% | 84.4% | 74.0% |
| Expectimax (Depth 3) | 67.8% | 74.3% | 61.3% |
| Greedy | 53.4% | 55.9% | 50.8% |
| Last-Move | 44.7% | 47.0% | 42.3% |
| Random | 19.6% | 19.9% | 19.3% |
| First-Move | 0.2% | 0.2% | 0.2% |

These results may not show us the best way to play the game yet, but they definitely tell us the worst. If you want to lose every game you play, there is no better strategy than always moving your least advanced piece!

## 🎲 The Random Agent 🎲

The random agent is probably the most self-explanatory agent we have. When it is the random agent's turn, it looks through all of its legal moves and picks one at random. Surprisingly, the random agent remains quite effective against the worst strategies in the game. Unsurprisingly, however, more advanced agents wipe the floor with it.
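As a rough illustration, the core of such an agent can be just a few lines. The `decideMove` method and the generic move type below are hypothetical stand-ins, not the project's actual API:

```java
import java.util.List;
import java.util.Random;

// Minimal sketch of a random strategy. The type parameter M stands in
// for whatever move representation the real project uses.
public final class RandomAgentSketch<M> {
    private final Random random = new Random();

    // Pick one of the legal moves uniformly at random.
    public M decideMove(List<M> legalMoves) {
        return legalMoves.get(random.nextInt(legalMoves.size()));
    }
}
```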

### 📊 Random vs. Expectimax (Depth 5)

Here are the results of games between the random agent and one of our best strategies, expectimax with a depth of 5:

| Agent | Win Percentage |
|-------|----------------|
| Expectimax (Depth 5) | 99.5% |
| Random | 0.5% |

Surprisingly, even against one of our best agents, random still has a chance! Playing randomly, you could expect to get lucky enough to win about 1 game in every 200.

### 📄 Random Agent Source Code

The source code for the random agent is available in RandomAgent.java.

## 👶 The First-Move Agent 👶

The first-move agent is the worst-performing strategy we have for playing The Royal Game of Ur. This agent always picks its least advanced piece to move.
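A minimal sketch of that rule might look like the following, assuming a hypothetical `Move` record that records how far along the path the moved piece currently is (the real project's move type will differ):

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the first-move rule: always move the least advanced piece.
public final class FirstMoveSketch {
    // Hypothetical move type: pathIndex is how far along the path the
    // piece being moved currently sits (0 = not yet on the board).
    public record Move(int pathIndex) {}

    public Move decideMove(List<Move> legalMoves) {
        // The move whose piece has advanced the least is chosen.
        return legalMoves.stream()
                .min(Comparator.comparingInt(Move::pathIndex))
                .orElseThrow();
    }
}
```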

### 📊 First-Move vs. Random

How bad could the first-move agent really be? The answer is very, very bad. Here are some results of the first-move agent playing against a random player:

| Agent | Win Percentage |
|-------|----------------|
| Random | 99.2% |
| First-Move | 0.8% |

As you can see, even against someone playing randomly, this agent is very unlikely to win. This makes sense when you consider that this agent will often get all of its pieces onto the board before it ever starts bearing any of them off. That gives its opponent ample opportunity to capture its pieces before it advances them.

### 📄 First-Move Agent Source Code

The source code for the first-move agent is available in FirstMoveAgent.java.

## 👴 The Last-Move Agent 👴

The last-move agent is a much better alternative to the first-move agent when playing The Royal Game of Ur. This agent always picks its most advanced piece to move. This leads it to perform much better than random; however, it is still not up to par against more advanced agents.
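Since this rule is the mirror image of the first-move rule, the sketch above only needs its comparison flipped (again using the hypothetical `Move` record):

```java
// Last-move rule: the mirror image of the first-move sketch above,
// choosing the maximum path index (the most advanced piece) instead.
public Move decideMove(List<Move> legalMoves) {
    return legalMoves.stream()
            .max(Comparator.comparingInt(Move::pathIndex))
            .orElseThrow();
}
```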

### 📊 Last-Move vs. Random

Here are the results of the last-move agent playing against the random agent:

| Agent | Win Percentage |
|-------|----------------|
| Last-Move | 89.7% |
| Random | 10.3% |

Somewhat surprisingly, the random agent still has a chance against this simple strategy. Against more advanced agents, however, it performs much more poorly.

### 📊 Last-Move vs. Expectimax (Depth 5)

Here are the results of the last-move agent playing against a depth-5 expectimax agent:

| Agent | Win Percentage |
|-------|----------------|
| Expectimax (Depth 5) | 88.7% |
| Last-Move | 11.3% |

This shows us that although the last-move agent's strategy is much better than picking random moves, there are still much better strategies to choose from.

### 📄 Last-Move Agent Source Code

The source code for the last-move agent is available in LastMoveAgent.java.

## 🤑 The Greedy Agent 🤑

The greedy agent is the first agent to develop some semblance of tactics while playing The Royal Game of Ur. It is similar to the last-move agent, except that it prioritises capturing pieces and moving pieces onto rosettes.

Here is the decision-making process of the greedy agent:

1. Can the agent capture any pieces? If it can, pick the most advanced piece that can make a capturing move, and move it.
2. Can the agent move any pieces onto rosette tiles? If it can, pick the most advanced piece that can move onto a rosette, and move it.
3. Otherwise, move the most advanced piece it can!

This helps the greedy agent gain a modest advantage over the last-move agent.
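A sketch of this priority rule, extending the hypothetical `Move` record with flags for captures and rosettes (the real GreedyAgent.java will differ in its details):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Sketch of the greedy priority rule. The Move record and its flags
// are hypothetical stand-ins for the project's move representation.
public final class GreedySketch {
    public record Move(int pathIndex, boolean captures, boolean landsOnRosette) {}

    private static final Comparator<Move> MOST_ADVANCED =
            Comparator.comparingInt(Move::pathIndex);

    public Move decideMove(List<Move> legalMoves) {
        // 1. Prefer the most advanced capturing move.
        Optional<Move> capture = legalMoves.stream()
                .filter(Move::captures).max(MOST_ADVANCED);
        if (capture.isPresent()) return capture.get();

        // 2. Then the most advanced move onto a rosette.
        Optional<Move> rosette = legalMoves.stream()
                .filter(Move::landsOnRosette).max(MOST_ADVANCED);
        if (rosette.isPresent()) return rosette.get();

        // 3. Otherwise, just move the most advanced piece.
        return legalMoves.stream().max(MOST_ADVANCED).orElseThrow();
    }
}
```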

### 📊 Greedy vs. Last-Move

Here are the results of the greedy agent playing against the last-move agent:

| Agent | Win Percentage |
|-------|----------------|
| Greedy | 62.9% |
| Last-Move | 37.1% |

These results demonstrate that the greedy agent does indeed perform better than the last-move agent. The difference between them is modest though, with the greedy agent winning 25.8 percentage points more of the games than the last-move agent.

### 📊 Greedy vs. Expectimax (Depth 5)

Here are the results of the greedy agent playing against a depth-5 expectimax:

| Agent | Win Percentage |
|-------|----------------|
| Expectimax (Depth 5) | 81.8% |
| Greedy | 18.1% |

As these results show, the greedy agent is still easily defeated by the depth-5 expectimax agent. This suggests that longer-term strategy still dominates The Royal Game of Ur, even against an opponent making very sensible individual moves.

### 📄 Greedy Agent Source Code

The source code for the greedy agent is available in GreedyAgent.java.

## 📒 The Expectimax Agent 📒

The expectimax agent is currently our best agent. Expectimax is a variant of minimax that adds chance nodes to account for the dice, and it is our first agent to look ahead at potential future moves, aiming to maximise the expected outcome of the moves it makes.

It does this by considering all the possible moves that could happen up to some depth into the future. It then scores all the possible end-states after these moves, and performs a statistical analysis to determine which move is expected to maximise its own score.

### Scoring End-States

The expectimax agent scores the potential end-states using the formula `Σ(own pieces' advancement) − Σ(opponent's pieces' advancement)`, where a piece's advancement is how many tiles that piece has been moved. For example, if you moved a piece four tiles, you would gain 4 score. The only way to lose score is for the opponent to capture one of your pieces.
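In code, this utility is just a difference of two sums. A minimal sketch, assuming the game state can report each piece's advancement as an integer (the lists below are placeholders for however the real board exposes piece positions):

```java
import java.util.List;

// Sketch of the pieces-advancement utility. The advancement lists are
// placeholders for the project's real game-state representation.
final class AdvancementUtility {
    static int utility(List<Integer> ownAdvancements, List<Integer> oppAdvancements) {
        int own = ownAdvancements.stream().mapToInt(Integer::intValue).sum();
        int opp = oppAdvancements.stream().mapToInt(Integer::intValue).sum();
        // A captured piece returns to advancement 0, which is the only
        // way this score can decrease for a player.
        return own - opp;
    }
}
```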

**Canonicalising Wins:** An enhancement to the pieces-advanced utility function is to canonicalise wins and losses to the maximum and minimum possible scores respectively. This has the advantage that no one win is better than another, and that there is no "good" way to lose. If you win, you get the maximum possible score, whether you win by 1 tile or by 5. Additionally, if the agent is about to lose, this canonicalisation prioritises doing whatever it can to try to win, however slim the chances. Without it, the AI may just try to improve its score so that it doesn't lose by as much.
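A sketch of the canonicalisation, where `MAX_UTILITY` is a hypothetical constant chosen to be at least as large as any achievable raw score:

```java
// Sketch: clamp terminal states to the extreme utilities so that every
// win scores the same and every loss scores the same.
final class CanonicalUtility {
    static final int MAX_UTILITY = 1000;  // hypothetical bound on raw scores

    static int canonicalise(int rawUtility, boolean won, boolean lost) {
        if (won)  return  MAX_UTILITY;  // every win is equally good
        if (lost) return -MAX_UTILITY;  // every loss is equally bad
        return rawUtility;              // non-terminal states keep their raw score
    }
}
```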

| Utility Function | Win Percentage |
|------------------|----------------|
| Canonicalising Wins | 50.8% |
| Pieces Advanced | 49.2% |

### Statistical Analysis of End-States

The score (also termed the utility) of each end-state is collapsed up the tree of potential moves in two stages:

1. The move that is considered best by the active player is chosen, and the utility after that move is propagated up to step 2.
2. The best move after each possible dice roll is determined. These utilities are then combined using a weighted sum based upon the probability of each dice roll.

This is repeated until we get all the way back up to the current state of the game. At this point we have an expected score for each available move, and we can pick the move with the highest score to play.
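A compressed sketch of this two-stage collapse follows. The `GameState` interface is invented for illustration, the roll probabilities assume the game's four binary dice (totals 0 to 4 with weights 1, 4, 6, 4, 1 out of 16), and details like extra turns on rosettes and rolls with no legal move are glossed over; the real ExpectimaxAgent.java handles these properly:

```java
import java.util.List;

// Hypothetical minimal interface standing in for the real game state.
interface GameState {
    boolean isFinished();
    double utility();                     // end-state score from the section above
    List<GameState> movesFor(int roll);   // states reachable with this dice roll
}

final class ExpectimaxSketch {
    // Four binary dice: P(total = r) = WEIGHTS[r] / 16 for r in 0..4.
    private static final int[] WEIGHTS = {1, 4, 6, 4, 1};

    static double expectimax(GameState state, int depth, boolean maximising) {
        if (depth == 0 || state.isFinished()) {
            return state.utility();  // score the end-state
        }
        double expectation = 0;
        for (int roll = 0; roll <= 4; roll++) {
            // Stage 1: the player to move picks their best move for this roll.
            double best = maximising ? Double.NEGATIVE_INFINITY
                                     : Double.POSITIVE_INFINITY;
            for (GameState next : state.movesFor(roll)) {
                double value = expectimax(next, depth - 1, !maximising);
                best = maximising ? Math.max(best, value) : Math.min(best, value);
            }
            // Stage 2: weight the chosen value by the roll's probability.
            expectation += WEIGHTS[roll] / 16.0 * best;
        }
        return expectation;
    }
}
```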

### 📊 Expectimax vs. Expectimax: How does the depth affect its play?

We have already shown above that expectimax solidly beats all of our other strategies for playing The Royal Game of Ur. But how does it fare against itself when it's looking different depths into the future?

Here are some results with expectimax searching to depths of 3, 5, and 7 moves into the future to decide its moves:

| Agent | Win Percentage |
|-------|----------------|
| Expectimax (Depth 7) | 64.2% |
| Expectimax (Depth 5) | 54.0% |
| Expectimax (Depth 3) | 31.8% |

These results show us that the depth to which the agent searches has a huge impact on how well it performs. Unfortunately, further testing at higher and higher depths is infeasible due to limitations on available compute. There is clearly more work to be done on optimisations and on exploring new algorithms before we can crown our perfect strategy!

### 📄 Expectimax Agent Source Code

The source code for the expectimax agent is available in ExpectimaxAgent.java.

## 🐼 The Panda Agent 🐼

One issue with expectimax is speed, as checking all possible moves down to a certain depth is expensive! Therefore, instead of calculating all possible moves, beyond a certain depth the panda agent simply ignores the possibility of rolling 0s and 4s.

This sacrifices some accuracy, but it gains a significant amount in speed.
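Reusing the hypothetical `GameState` interface from the expectimax sketch, the only change is which rolls get expanded past the cutoff, with the surviving weights (4, 6, 4) renormalised over 14 rather than 16. As before, this is a rough sketch, not the real PandaAgent.java:

```java
// Sketch of the panda pruning. Beyond the cutoff depth, rolls of 0 and
// 4 are skipped entirely and the remaining probabilities renormalised.
final class PandaSketch {
    private static final int[] WEIGHTS = {1, 4, 6, 4, 1};

    static double search(GameState state, int depth, int pliesFromRoot,
                         int cutoff, boolean maximising) {
        if (depth == 0 || state.isFinished()) {
            return state.utility();
        }
        boolean pruned = pliesFromRoot >= cutoff;
        int lowestRoll = pruned ? 1 : 0;
        int highestRoll = pruned ? 3 : 4;
        double normaliser = pruned ? 14.0 : 16.0;  // sum of surviving weights
        double expectation = 0;
        for (int roll = lowestRoll; roll <= highestRoll; roll++) {
            double best = maximising ? Double.NEGATIVE_INFINITY
                                     : Double.POSITIVE_INFINITY;
            for (GameState next : state.movesFor(roll)) {
                double value = search(next, depth - 1, pliesFromRoot + 1,
                                      cutoff, !maximising);
                best = maximising ? Math.max(best, value) : Math.min(best, value);
            }
            expectation += WEIGHTS[roll] / normaliser * best;
        }
        return expectation;
    }
}
```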

### 📊 Panda vs. Expectimax

For these tests, the panda is set to ignore rolls of 0 and 4 after a depth of 2.

#### Depth 5

| Agent | Win Percentage | Time per move |
|-------|----------------|---------------|
| Expectimax | 50.0% | 3.19 ms |
| Panda | 50.0% | 6.43 ms |

#### Depth 6

| Agent | Win Percentage | Time per move |
|-------|----------------|---------------|
| Expectimax | 50.9% | 97.5 ms |
| Panda | 49.1% | 37.7 ms |

#### Depth 7

| Agent | Win Percentage | Time per move |
|-------|----------------|---------------|
| Expectimax | 52% | 1504 ms |
| Panda | 48% | 406 ms |

These results show us that beyond a certain depth, ignoring the possibility of rolling 0s or 4s does not hurt the agent's play too much. The Panda agent therefore performs almost as well as expectimax while running roughly 2-4x faster at depths 6 and 7 (at depth 5, where little of the tree lies beyond the cutoff, the pruning does not yet pay off).

**Note:** Do keep in mind that these ms/move statistics will change from computer to computer based on the hardware used. The numbers are really only useful for comparisons between algorithms.

### Why is this agent named Panda? 🐼

This agent is pretty lazy with its move checking, and if there's one thing pandas are famous for, it's their laziness!

### 📄 Panda Agent Source Code

The source code for the panda agent is available in PandaAgent.java.