[Docs] Update game descriptions (#1009)
sotetsuk authored Aug 21, 2023
1 parent 3176da8 commit 29e6ab0
Showing 10 changed files with 193 additions and 34 deletions.
92 changes: 91 additions & 1 deletion docs/bridge_bidding.md
@@ -12,6 +12,96 @@
<img src="https://raw.githubusercontent.com/sotetsuk/pgx/main/docs/assets/bridge_bidding_light.gif" width="30%">
</p>

## Usage

!!! warning "Bridge bidding requires domain knowledge"

To use the bridge bidding environment appropriately, you need a good understanding of the rules of contract bridge.
To prevent misuse, we do not provide `pgx.make("bridge_bidding")`.
Instead, you have to load the `BridgeBidding` class directly.

```py
from pgx.bridge_bidding import BridgeBidding

env = BridgeBidding()
```

In Pgx, we follow `[Tian+20]` and use a pre-computed Double Dummy Solver (DDS) dataset for each hand.
The `BridgeBidding` environment therefore needs to load a pre-computed DDS dataset via `env = BridgeBidding("<path_to_dataset>")`.
Run the following code to download the DDS results provided by Pgx.

```py
from pgx.bridge_bidding import download_dds_results
download_dds_results()
```

You can specify which pre-computed DDS dataset to use by passing its path to the `BridgeBidding` constructor.
Typically, you should use different DDS datasets for training and testing (evaluation).

## Description

TBA
> Contract bridge, or simply bridge, is a trick-taking card game using a standard 52-card deck. In its basic format, it is played by four players in two competing partnerships, with partners sitting opposite each other around a table. Millions of people play bridge worldwide in clubs, tournaments, online and with friends at home, making it one of the world's most popular card games, particularly among seniors.
>
> ...
>
> The game consists of a number of deals, each progressing through four phases. The cards are dealt to the players; then the players call (or bid) in an auction seeking to take the contract, specifying how many tricks the partnership receiving the contract (the declaring side) needs to take to receive points for the deal. During the auction, partners use their bids to exchange information about their hands, including overall strength and distribution of the suits; no other means of conveying or implying any information is permitted. The cards are then played, the declaring side trying to fulfill the contract, and the defenders trying to stop the declaring side from achieving its goal. The deal is scored based on the number of tricks taken, the contract, and various other factors which depend to some extent on the variation of the game being played.
>
> [Contract bridge](https://en.wikipedia.org/wiki/Contract_bridge)

Following previous work `[Rong+19,Gong+19,Tian+20,Lockhart+20]`, we focus only on the bidding phase of contract bridge.
We therefore approximate the playing phase of bridge using the results of DDS (Double Dummy Solver).

## Specs

| Name | Value |
|:---|:----:|
| Version | `v0` |
| Number of players | `4` |
| Number of actions | `38` |
| Observation shape | `(480,)` |
| Observation type | `bool` |
| Rewards | Game payoff |

## Observation
We follow the observation design of `[Lockhart+20]` (OpenSpiel).

| Index | Description |
|:---:|:----|
| `obs[0:4]` | Vulnerability |
| `obs[4:8]` | Per player, did this player pass before the opening bid? |
| `obs[8:20]` | Per player, did this player bid, double, or redouble 1♧? |
| ... | ... |
| `obs[416:428]` | Per player, did this player bid, double, or redouble 7NT? |
| `obs[428:480]` | 13-hot vector indicating the cards held by the observing player |
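
The table implies a regular layout: 8 leading entries, then 12 entries for each of the 35 bids, then 52 card entries. The sketch below computes the slice for a given bid. It assumes that the elided rows follow the same order as the action table (1♧ first, 7NT last); `bid_slice` and the constant names are hypothetical helpers, not part of Pgx.

```py
NUM_BIDS = 35          # 7 levels x 5 strains
ENTRIES_PER_BID = 12   # per player: bid, double, redouble (4 x 3)
BID_OFFSET = 8         # obs[0:4] vulnerability + obs[4:8] opening passes


def bid_slice(bid_index):
    """Return (start, stop) of the observation entries for one bid.

    bid_index is assumed to run from 0 (1♧) to 34 (7NT).
    """
    if not 0 <= bid_index < NUM_BIDS:
        raise ValueError("bid_index must be in [0, 35)")
    start = BID_OFFSET + ENTRIES_PER_BID * bid_index
    return start, start + ENTRIES_PER_BID


# The hand vector starts right after the last bid block.
HAND_SLICE = (BID_OFFSET + ENTRIES_PER_BID * NUM_BIDS, 480)
```

With these definitions, `bid_slice(0)` and `bid_slice(34)` reproduce the first and last bid rows of the table, and `HAND_SLICE` matches `obs[428:480]`.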

## Action
Each action `(0, ..., 37)` corresponds to `(Pass, Double, Redouble, 1♧, 1♢, 1♡, 1♤, 1NT, ..., 7♧, 7♢, 7♡, 7♤, 7NT)`, respectively.

| Index | Description |
|:---:|:----|
| `0` | `Pass` |
| `1` | `Double` |
| `2` | `Redouble` |
| `3, ..., 7` | `1♧, 1♢, 1♡, 1♤, 1NT` |
| ... | ... |
| `33, ..., 37` | `7♧, 7♢, 7♡, 7♤, 7NT` |
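
As a minimal sketch of the mapping in the table above, the following function turns an action index into a readable call. `action_to_call` and `STRAINS` are hypothetical names introduced for illustration and are not part of the Pgx API.

```py
STRAINS = ("♧", "♢", "♡", "♤", "NT")
SPECIAL_CALLS = ("Pass", "Double", "Redouble")


def action_to_call(action):
    """Map an action index in [0, 38) to a call such as 'Pass' or '3NT'."""
    if not 0 <= action < 38:
        raise ValueError("action must be in [0, 38)")
    if action < 3:
        return SPECIAL_CALLS[action]
    # Bids start at index 3 and are grouped by level, five strains per level.
    level, strain = divmod(action - 3, 5)
    return f"{level + 1}{STRAINS[strain]}"
```

For example, `action_to_call(7)` gives `1NT` and `action_to_call(33)` gives `7♧`, matching the table rows.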

## Rewards
Players get the game payoff at the end of the game.

## Termination
The game terminates after three consecutive passes following the last bid.

## Version History

- `v0` : Initial release (v1.0.0)

## Reference

- `[Rong+19]` "Competitive Bridge Bidding with Deep Neural Networks"
- `[Gong+19]` "Simple is Better: Training an End-to-end Contract Bridge Bidding Agent without Human Knowledge"
- `[Tian+20]` "Joint Policy Search for Multi-agent Collaboration with Imperfect Information"
- `[Lockhart+20]` "Human-agent cooperation in bridge bidding"
- `Double Dummy Solver` http://privat.bahnhof.se/wb758135/
- `PBN format` https://www.tistis.nl/pbn/
- `IMP` https://en.wikipedia.org/wiki/International_Match_Points
35 changes: 25 additions & 10 deletions docs/chess.md
@@ -30,13 +30,13 @@ env = Chess()

## Description

TBA
> Chess is a board game for two players, called White and Black, each controlling an army of chess pieces in their color, with the objective to checkmate the opponent's king. It is sometimes called international chess or Western chess to distinguish it from related games such as xiangqi (Chinese chess) and shogi (Japanese chess). The recorded history of chess goes back at least to the emergence of a similar game, chaturanga, in seventh century India. The rules of chess as they are known today emerged in Europe at the end of the 15th century, with standardization and universal acceptance by the end of the 19th century. Today, chess is one of the world's most popular games played by millions of people worldwide.
>
> Chess is an abstract strategy game that involves no hidden information and no elements of chance. It is played on a chessboard with 64 squares arranged in an 8×8 grid. At the start, each player controls sixteen pieces: one king, one queen, two rooks, two bishops, two knights, and eight pawns. White moves first, followed by Black. The game is won by checkmating the opponent's king, i.e. threatening it with inescapable capture. There are also several ways a game can end in a draw.
>
> [Chess - Wikipedia](https://en.wikipedia.org/wiki/Chess)

## Rules

TBA

## Specs

| Name | Value |
@@ -50,14 +50,29 @@ TBA

## Observation
We follow the observation design of AlphaZero `[Silver+18]`.
P1 denotes the current player, and P2 denotes the opponent.

| Index | Description |
|:---:|:----|
| TBA | TBA |
| `[0:6]` | P1 board @ 0-steps before |
| `[6:12]` | P2 board @ 0-steps before |
| `[12:14]` | Repetitions @ 0-steps before |
| ... | (@ 1-7 steps before) |
| `[112]` | Color |
| `[113]` | Total move count |
| `[114:116]` | P1 castling |
| `[116:118]` | P2 castling |
| `[118]` | No progress count |
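
The history part of the table can be indexed arithmetically. The sketch below assumes that the elided rows (1-7 steps before) repeat the same 14-channel layout as step 0, which is consistent with `[112]` being the first non-history channel (8 x 14 = 112). `history_channels` is a hypothetical helper, not a Pgx function.

```py
CHANNELS_PER_STEP = 14   # 6 (P1 pieces) + 6 (P2 pieces) + 2 (repetitions)
NUM_HISTORY_STEPS = 8    # 0 to 7 steps before


def history_channels(steps_before):
    """Return (start, stop) channel ranges for one history step."""
    if not 0 <= steps_before < NUM_HISTORY_STEPS:
        raise ValueError("steps_before must be in [0, 8)")
    base = CHANNELS_PER_STEP * steps_before
    return {
        "p1_board": (base, base + 6),
        "p2_board": (base + 6, base + 12),
        "repetitions": (base + 12, base + 14),
    }
```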

## Action
We also follow the action design of AlphaZero `[Silver+18]`.
There are `4672 = 64 x 73` possible actions.
Each action encodes

TBA
- one of 64 source positions (`action // 73`), and
- one of 73 move types (`action % 73`)

The 73 move types consist of 56 queen moves, 8 knight moves, and 9 underpromotions.
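
The decomposition above is a plain integer division. A minimal sketch follows; `decode_action` is a hypothetical helper, and it makes no assumption about how squares are numbered or how the 73 move types are ordered, since the text does not specify either.

```py
def decode_action(action, num_squares=64, num_move_types=73):
    """Split a flat action index into (source position, move type)."""
    if not 0 <= action < num_squares * num_move_types:
        raise ValueError("action out of range")
    source, move_type = divmod(action, num_move_types)
    return source, move_type
```

The same helper covers other board sizes by changing the two keyword arguments, for example `decode_action(a, num_squares=25, num_move_types=49)` for a 5×5 board.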

## Rewards
Non-zero rewards are given only at the terminal states.
@@ -76,9 +91,9 @@ Termination occurs when one of the following conditions are satisfied:
1. checkmate
2. stalemate
3. insufficient pieces to checkmate
4. `50` halfmoves are elapsed without any captures or pawn moves
4. `512` steps are elapsed (from AlphaZero `[Silver+18]`)

4. threefold repetition
5. `50` halfmoves have elapsed without any capture or pawn move
6. `512` steps have elapsed (from AlphaZero `[Silver+18]`)

## Version History

31 changes: 22 additions & 9 deletions docs/gardner_chess.md
@@ -30,12 +30,11 @@ env = GardnerChess()

## Description

TBA
> A board needs to be five squares wide to contain all kinds of chess pieces on the first row. In 1969, Martin Gardner suggested a chess variant on 5×5 board in which all chess moves, including pawn double-move, en-passant capture as well as castling can be made. Later AISE (Associazione Italiana Scacchi Eterodossi, "Italian Heterodox Chess Association") abandoned pawn double-move and castling. The game was largely played in Italy (including by correspondence) and opening theory was developed.
>
> [Minichess - Wikipedia](https://en.wikipedia.org/wiki/Minichess#5%C3%975_chess)

## Rules

TBA
The Pgx implementation does not support the pawn double-move, en passant capture, or castling.

## Specs

@@ -50,14 +49,27 @@ TBA

## Observation
We follow the observation design of AlphaZero `[Silver+18]`.
P1 denotes the current player, and P2 denotes the opponent.

| Index | Description |
|:---:|:----|
| TBA | TBA |
| `[:, :, 0:6]` | P1 board @ 0-steps before |
| `[:, :, 6:12]` | P2 board @ 0-steps before |
| `[:, :, 12:14]` | Repetitions @ 0-steps before |
| ... | (@ 1-7 steps before) |
| `[:, :, 112]` | Color |
| `[:, :, 113]` | Total move count |
| `[:, :, 114]` | No progress count |

## Action
We also follow the action design of AlphaZero `[Silver+18]`.
There are `1225 = 25 x 49` possible actions.
Each action encodes

- one of 25 source positions (`action // 49`), and
- one of 49 move types (`action % 49`)

TBA
The 49 move types consist of 32 queen moves, 8 knight moves, and 9 underpromotions.
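
The move-type counts follow from the board size in the AlphaZero-style encoding: queen moves cover 8 directions times up to `board_size - 1` squares, there are 8 knight moves, and underpromotions cover 3 pieces times 3 pawn move directions. The sketch below checks this arithmetic; `num_move_types` is a hypothetical helper, not part of Pgx.

```py
def num_move_types(board_size):
    """Count AlphaZero-style move types for a square board."""
    queen = 8 * (board_size - 1)   # 8 directions x maximum travel distance
    knight = 8                     # 8 knight jumps
    underpromotion = 3 * 3         # knight/bishop/rook x (forward, 2 captures)
    return queen + knight + underpromotion


def num_actions(board_size):
    """Total flat action count: source squares x move types."""
    return board_size * board_size * num_move_types(board_size)
```

For the 5×5 board this gives 32 + 8 + 9 = 49 move types and 1225 actions; for the 8×8 board it gives 73 move types and 4672 actions.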

## Rewards
Non-zero rewards are given only at the terminal states.
@@ -76,8 +88,9 @@ Termination occurs when one of the following conditions are satisfied:
1. checkmate
2. stalemate
3. insufficient pieces to checkmate
4. `50` halfmoves are elapsed without any captures or pawn moves
4. `256` steps are elapsed (`512` in full-size chess experiments in AlphaZero `[Silver+18]`)
4. threefold repetition
5. `50` halfmoves have elapsed without any capture or pawn move
6. `256` steps have elapsed (`512` in full-size chess experiments in AlphaZero `[Silver+18]`)


## Version History
3 changes: 1 addition & 2 deletions docs/minatar_asterix.md
@@ -54,8 +54,7 @@ and spawn rate of enemies and treasure.
| `[:, :, 3]` | Gold |

## Action

TBA
No-op (0), left (1), right (2), up (3), or down (4).

## Version History

2 changes: 1 addition & 1 deletion docs/minatar_breakout.md
@@ -56,7 +56,7 @@ the ball hits the bottom of the screen. The balls direction is indicated by a tr

## Action

TBA
No-op (0), left (1), or right (2).

## Version History

3 changes: 1 addition & 2 deletions docs/minatar_freeway.md
@@ -60,8 +60,7 @@ after 2500 frames have elapsed.
| `[:, :, 5]` | Speed 4 |

## Action

TBA
No-op (0), up (1), or down (2).

## Version History

2 changes: 1 addition & 1 deletion docs/minatar_seaquest.md
@@ -70,7 +70,7 @@ active in their previous location to reduce partial observability.

## Action

TBA
No-op (0), up (1), down (2), left (3), right (4), or fire (5).

## Version History

3 changes: 1 addition & 2 deletions docs/minatar_space_invaders.md
@@ -60,8 +60,7 @@ hits the player.
| `[:, :, 5]` | Enemy bullet |

## Action

TBA
No-op (0), left (1), right (2), or fire (3).

## Version History

7 changes: 6 additions & 1 deletion docs/shogi.md
@@ -152,7 +152,12 @@ The reward at terminal state is described in this table:

## Termination

TBA
Termination occurs when

1. either player checkmates the opponent, or
2. `512` steps have elapsed (from AlphaZero `[Silver+18]`)

Fourfold repetition is not implemented in `v0`.

## Version History

49 changes: 44 additions & 5 deletions docs/sparrow_mahjong.md
@@ -19,7 +19,7 @@ It was developed for those unfamiliar with Mahjong,
and requires similar strategic thinking to standard Japanese Mahjong.


### Rules
### Rules of Sparrow Mahjong

<!---
An outline of the rules of Sparrow Mahjong (すずめ雀) is as follows.
@@ -51,11 +51,50 @@ The original rules of Sparrow Mahjong ([すずめ雀](https://sugorokuya.jp/p/su
  * one red dora for each bamboo tile type (9 tiles)
* Furiten: you cannot *ron* with a tile you have discarded, but you can ron with other tiles

### Modifications in Pgx
### Specifications in Pgx

Pgx implementation is simplified as follows:

* Only for 3 players
* If players can win, they automatically win
* Players always keep red doras in their hands (i.e., red doras are not discarded if they have same but non-dora tiles)
* No [Heavenly hand](https://riichi.wiki/Tenhou_and_chiihou) (Tenhou/天和) to avoid the game ends without any action from players
* Actions are only for discarding tiles (11 discrete actions)
* If players can win, they automatically win
* Players always keep red doras in their hands (i.e., a red dora is not discarded while the player holds a non-dora copy of the same tile)
* No [Heavenly hand](https://riichi.wiki/Tenhou_and_chiihou) (Tenhou/天和), so that the game never ends without any action from the players

## Specs

| Name | Value |
|:---|:----:|
| Version | `v0` |
| Number of players | `3` |
| Number of actions | `11` |
| Observation shape | `(15, 11)` |
| Observation type | `bool` |
| Rewards | `[-1, 1]` |

## Observation
There are 15 planes in the observation and each plane consists of 11 tiles.

| Planes | Description |
|:---:|:----|
| 4 | P1 hand |
| 1 | Red dora in P1 hand |
| 1 | Dora |
| 1 | All discarded tiles by P1 |
| 1 | All discarded tiles by P2 |
| 1 | All discarded tiles by P3 |
| 3 | Discarded tiles in the last 3 steps by P2 |
| 3 | Discarded tiles in the last 3 steps by P3 |
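
The plane counts in the table sum to 15. The sketch below derives index ranges for each group, assuming the planes are stacked in the order the table lists them (the text does not state the order explicitly). `PLANE_GROUPS` and `plane_ranges` are hypothetical names used only for illustration.

```py
PLANE_GROUPS = [
    ("P1 hand", 4),
    ("Red dora in P1 hand", 1),
    ("Dora", 1),
    ("All discarded tiles by P1", 1),
    ("All discarded tiles by P2", 1),
    ("All discarded tiles by P3", 1),
    ("Discarded tiles in the last 3 steps by P2", 3),
    ("Discarded tiles in the last 3 steps by P3", 3),
]


def plane_ranges(groups):
    """Return {name: (start, stop)} by accumulating plane counts in order."""
    ranges = {}
    start = 0
    for name, count in groups:
        ranges[name] = (start, start + count)
        start += count
    return ranges


RANGES = plane_ranges(PLANE_GROUPS)
TOTAL_PLANES = sum(count for _, count in PLANE_GROUPS)
```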

## Action
Each action selects the tile to discard.

## Rewards
The game payoff, normalized to `[-1, 1]`.

## Termination
The game terminates when any player wins or the wall becomes empty.

## Version History

- `v0` : Initial release (v1.0.0)
