Mahjong

This is an open-source Python library for a 1-on-1 Mahjong game. It provides a standard API for developing and evaluating learning algorithms in a Mahjong environment.

The game rules

The detailed game rules are given in the appendix of the paper.

Installation

Clone the Git repository and import the Python library. Python 3.x is supported, and the only requirement is NumPy.

git clone https://github.com/yata0/Mahjong.git

API

Initializing mahjong

from p2_mahjong.wrapper import MJWrapper as Wrapper
wrapper = Wrapper()

Interacting with the Environment

The following example shows how to interact with the environment. It plays 100 games, choosing a random legal action at every step.

import random
from p2_mahjong.wrapper import MJWrapper as Wrapper

wrapper = Wrapper()
index = 0  # total number of steps taken across all games
game_count = 100
for game_index in range(game_count):
    is_game_over = False
    wrapper.reset()
    legal_actions = wrapper.get_legal_actions()
    while not is_game_over:
        # Pick a random action among the currently legal ones.
        action_label = random.choice(legal_actions)
        cards, actions, reward, is_game_over, legal_actions = wrapper.step([action_label])
        if is_game_over:
            _, winner_id = wrapper.get_game_status()
            if winner_id is not None:
                # get_payoffs() returns (payoffs, fan_names, fan_score).
                print(game_index, wrapper.get_payoffs())
            else:
                print(game_index, "tie")
        index += 1

Tiles and Actions

Tiles

The 1-on-1 Mahjong game uses 72 tiles of 24 distinct kinds. The tile IDs defined in the source code are listed in the table below.

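| Tile name | ID |
| --- | --- |
| Character 1 | 9 |
| Character 2 | 10 |
| Character 3 | 11 |
| Character 4 | 12 |
| Character 5 | 13 |
| Character 6 | 14 |
| Character 7 | 15 |
| Character 8 | 16 |
| Character 9 | 17 |
| Green | 27 |
| Red | 28 |
| White | 29 |
| East | 30 |
| West | 31 |
| North | 32 |
| South | 33 |
| Spring | 34 |
| Summer | 35 |
| Autumn | 36 |
| Winter | 37 |
| Mei | 38 |
| Lan | 39 |
| Zhu | 40 |
| Ju | 41 |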
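
The tile table can be turned into a small lookup for debugging observations. The sketch below copies the IDs from the table; the names `TILE_NAMES`, `tile_name`, and `copies` are illustrative and not part of the library. The copy counts (four of each Character, dragon, and wind tile, one of each flower tile) are an assumption that happens to match the stated total of 72 tiles.

```python
# Tile IDs copied from the table above; these names are illustrative helpers,
# not part of the p2_mahjong API.
TILE_NAMES = {9 + i: f"Character {i + 1}" for i in range(9)}
TILE_NAMES.update(
    zip(range(27, 34), ["Green", "Red", "White", "East", "West", "North", "South"])
)
FLOWER_NAMES = ["Spring", "Summer", "Autumn", "Winter", "Mei", "Lan", "Zhu", "Ju"]
TILE_NAMES.update(zip(range(34, 42), FLOWER_NAMES))


def tile_name(tile_id):
    """Return the readable name for a tile ID, or 'Unknown(<id>)' for other values."""
    return TILE_NAMES.get(tile_id, f"Unknown({tile_id})")


def copies(tile_id):
    """Assumed number of copies per tile: 1 for flower tiles (34..41), 4 otherwise."""
    return 1 if 34 <= tile_id <= 41 else 4


if __name__ == "__main__":
    print(len(TILE_NAMES), sum(copies(t) for t in TILE_NAMES))
```

Values such as the -1 padding used in observations are not tiles, so `tile_name` reports them as unknown rather than raising.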

Actions

There are 105 distinct actions of 10 types. The action IDs defined in the source code are listed in the table below.

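| action type | auxiliary tiles | target tile | id |
| --- | --- | --- | --- |
| Get Card | - | - | 0 |
| Hu | - | - | 1 |
| Discard | - | Character 1 | 12 |
| Discard | - | Character 2 | 13 |
| Discard | - | Character 3 | 14 |
| Discard | - | Character 4 | 15 |
| Discard | - | Character 5 | 16 |
| Discard | - | Character 6 | 17 |
| Discard | - | Character 7 | 18 |
| Discard | - | Character 8 | 19 |
| Discard | - | Character 9 | 20 |
| Discard | - | Green | 30 |
| Discard | - | Red | 31 |
| Discard | - | White | 32 |
| Discard | - | East | 33 |
| Discard | - | West | 34 |
| Discard | - | North | 35 |
| Discard | - | South | 36 |
| Pong | - | Character 1 | 46 |
| Pong | - | Character 2 | 47 |
| Pong | - | Character 3 | 48 |
| Pong | - | Character 4 | 49 |
| Pong | - | Character 5 | 50 |
| Pong | - | Character 6 | 51 |
| Pong | - | Character 7 | 52 |
| Pong | - | Character 8 | 53 |
| Pong | - | Character 9 | 54 |
| Pong | - | Green | 64 |
| Pong | - | Red | 65 |
| Pong | - | White | 66 |
| Pong | - | East | 67 |
| Pong | - | West | 68 |
| Pong | - | North | 69 |
| Pong | - | South | 70 |
| Gong | - | Character 1 | 80 |
| Gong | - | Character 2 | 81 |
| Gong | - | Character 3 | 82 |
| Gong | - | Character 4 | 83 |
| Gong | - | Character 5 | 84 |
| Gong | - | Character 6 | 85 |
| Gong | - | Character 7 | 86 |
| Gong | - | Character 8 | 87 |
| Gong | - | Character 9 | 88 |
| Gong | - | Green | 98 |
| Gong | - | Red | 99 |
| Gong | - | White | 100 |
| Gong | - | East | 101 |
| Gong | - | West | 102 |
| Gong | - | North | 103 |
| Gong | - | South | 104 |
| Chow | Character 2,3 | Character 1 | 126 |
| Chow | Character 1,3 | Character 2 | 127 |
| Chow | Character 1,2 | Character 3 | 128 |
| Chow | Character 3,4 | Character 2 | 129 |
| Chow | Character 2,4 | Character 3 | 130 |
| Chow | Character 2,3 | Character 4 | 131 |
| Chow | Character 4,5 | Character 3 | 132 |
| Chow | Character 3,5 | Character 4 | 133 |
| Chow | Character 3,4 | Character 5 | 134 |
| Chow | Character 5,6 | Character 4 | 135 |
| Chow | Character 4,6 | Character 5 | 136 |
| Chow | Character 4,5 | Character 6 | 137 |
| Chow | Character 6,7 | Character 5 | 138 |
| Chow | Character 5,7 | Character 6 | 139 |
| Chow | Character 5,6 | Character 7 | 140 |
| Chow | Character 7,8 | Character 6 | 141 |
| Chow | Character 6,8 | Character 7 | 142 |
| Chow | Character 6,7 | Character 8 | 143 |
| Chow | Character 8,9 | Character 7 | 144 |
| Chow | Character 7,9 | Character 8 | 145 |
| Chow | Character 7,8 | Character 9 | 146 |
| Concealed Gong | - | Character 1 | 177 |
| Concealed Gong | - | Character 2 | 178 |
| Concealed Gong | - | Character 3 | 179 |
| Concealed Gong | - | Character 4 | 180 |
| Concealed Gong | - | Character 5 | 181 |
| Concealed Gong | - | Character 6 | 182 |
| Concealed Gong | - | Character 7 | 183 |
| Concealed Gong | - | Character 8 | 184 |
| Concealed Gong | - | Character 9 | 185 |
| Concealed Gong | - | Green | 195 |
| Concealed Gong | - | Red | 196 |
| Concealed Gong | - | White | 197 |
| Concealed Gong | - | East | 198 |
| Concealed Gong | - | West | 199 |
| Concealed Gong | - | North | 200 |
| Concealed Gong | - | South | 201 |
| Pass Hu | - | - | 202 |
| Ting | - | - | 203 |
| Add Gong | - | Character 1 | 213 |
| Add Gong | - | Character 2 | 214 |
| Add Gong | - | Character 3 | 215 |
| Add Gong | - | Character 4 | 216 |
| Add Gong | - | Character 5 | 217 |
| Add Gong | - | Character 6 | 218 |
| Add Gong | - | Character 7 | 219 |
| Add Gong | - | Character 8 | 220 |
| Add Gong | - | Character 9 | 221 |
| Add Gong | - | Green | 231 |
| Add Gong | - | Red | 232 |
| Add Gong | - | White | 233 |
| Add Gong | - | East | 234 |
| Add Gong | - | West | 235 |
| Add Gong | - | North | 236 |
| Add Gong | - | South | 237 |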
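
When inspecting games it helps to turn an action ID back into a readable description. The IDs in the table follow a regular pattern: for Discard, Pong, Gong, Concealed Gong, and Add Gong the action ID equals the target tile ID plus a fixed offset, and the 21 Chow IDs enumerate seven runs with three target positions each. The sketch below relies on that pattern. The offsets are inferred from the table, not taken from the source code, and `decode_action` is a hypothetical helper, not a library function.

```python
# Tiles that can be the target of an action, keyed by tile ID (from the Tiles table).
TARGET_TILES = {9 + i: f"Character {i + 1}" for i in range(9)}
TARGET_TILES.update(
    zip(range(27, 34), ["Green", "Red", "White", "East", "West", "North", "South"])
)

# Actions that have no target tile.
SIMPLE_ACTIONS = {0: "Get Card", 1: "Hu", 202: "Pass Hu", 203: "Ting"}

# action id = tile id + offset. Offsets are inferred from the table above.
TILE_ACTION_OFFSETS = {
    "Discard": 3,
    "Pong": 37,
    "Gong": 71,
    "Concealed Gong": 168,
    "Add Gong": 204,
}

CHOW_BASE = 126   # first Chow id
CHOW_COUNT = 21   # 7 runs (1-2-3 .. 7-8-9) x 3 target positions


def decode_action(action_id):
    """Return (action type, auxiliary Character ranks or None, target tile or None)."""
    if action_id in SIMPLE_ACTIONS:
        return SIMPLE_ACTIONS[action_id], None, None
    if CHOW_BASE <= action_id < CHOW_BASE + CHOW_COUNT:
        start, pos = divmod(action_id - CHOW_BASE, 3)
        ranks = [start + 1, start + 2, start + 3]
        target = ranks.pop(pos)  # the remaining two ranks are the auxiliary tiles
        return "Chow", tuple(ranks), f"Character {target}"
    for name, offset in TILE_ACTION_OFFSETS.items():
        tile_id = action_id - offset
        if tile_id in TARGET_TILES:
            return name, None, TARGET_TILES[tile_id]
    raise ValueError(f"unknown action id: {action_id}")


if __name__ == "__main__":
    for action_id in (0, 12, 64, 127, 203, 237):
        print(action_id, decode_action(action_id))
```

IDs that do not appear in the table raise `ValueError`, which makes gaps in the ID space visible instead of silently mislabeling them.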

Standard methods

Stepping

step(self, action: int) -> Tuple[list of list, list of list, list, bool, list]

Run one step of the environment's dynamics. This function accepts an action ID and returns a tuple (tiles, actions, rewards, is_game_over, legal_actions). Call reset() to reset the environment's state.

Parameters:

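  • action(int): the action chosen by the player. This is an integer and should be one of the IDs listed under Actions. Note that the usage example above passes it wrapped in a single-element list: wrapper.step([action_label]).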

Returns

  • tiles(list of list): the tile part of the player's observation of the current environment. It is a list of tile lists covering the player's hand, the player's Chow, Pong, and Kong, the player's concealed Kong, the player's discards, the opponent's Chow, Pong, and Kong, the opponent's concealed Kong, the opponent's discards, the latest discard on the table, and both players' flowers. Each tile is an integer, one of the IDs listed under Tiles. The details are shown in the table below:

    | class | description | length | remarks |
    | --- | --- | --- | --- |
    | self_hand | the player's hand | 34 | padded with -1 |
    | self_piles | the player's Chow, Pong, and Kong | 34 | padded with -1 |
    | self_hidden_piles | the player's concealed Kong | 34 | padded with -1 |
    | self_history_tiles | the player's discards | 34 | padded with -1 |
    | opp_piles | the opponent's Chow, Pong, and Kong | 34 | padded with -1 |
    | opp_hidden_piles | the opponent's concealed Kong (hidden: each real tile ID is replaced with 34) | 34 | padded with -1 |
    | opp_history_tiles | the opponent's discards | 34 | padded with -1 |
    | last_table_tile | the latest discarded tile on the table | 1 | padded with -1 |
    | self_flower | the player's flowers | 8 | padded with -1 |
    | opp_flower | the opponent's flowers | 8 | padded with -1 |
  • actions(list of list): the action part of the player's observation of the current environment. It is a list of action lists covering the player's action history, the opponent's action history, the player's Ting state, and the opponent's Ting state. The details are shown in the table below:

    | class | description | length | remarks |
    | --- | --- | --- | --- |
    | self_his_actions | the player's action history | 50 | padded with -1 |
    | opp_his_actions | the opponent's action history | 50 | padded with -1 |
    | self_wait | the player's Ting state | 1 | 1 for Ting, 0 for not |
    | opp_wait | the opponent's Ting state | 1 | 1 for Ting, 0 for not |
  • rewards(list): the rewards returned after the previous action. The first item is the reward of player 0 and the second item is the reward of player 1.

  • is_game_over(bool): a signal to check whether the episode has ended.

  • legal_actions(list): the legal actions available at the next step. These actions are also integers.
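
The two observation lists are easier to work with once each sub-list is paired with its field name and the -1 padding is removed. The sketch below assumes that the sub-lists arrive in the same order as the rows of the two tables above; that ordering is an assumption and should be checked against the source code. `name_observation` is a hypothetical helper, not part of the library.

```python
# Field names in table order. That the returned lists follow this order is an
# assumption, not something the README states explicitly.
TILE_FIELDS = [
    "self_hand", "self_piles", "self_hidden_piles", "self_history_tiles",
    "opp_piles", "opp_hidden_piles", "opp_history_tiles",
    "last_table_tile", "self_flower", "opp_flower",
]
ACTION_FIELDS = ["self_his_actions", "opp_his_actions", "self_wait", "opp_wait"]
PAD = -1  # padding value documented in the tables


def name_observation(tiles, actions):
    """Map each field name to its values with the -1 padding removed."""
    named = {}
    for fields, groups in ((TILE_FIELDS, tiles), (ACTION_FIELDS, actions)):
        for field, values in zip(fields, groups):
            named[field] = [v for v in values if v != PAD]
    return named


if __name__ == "__main__":
    # Synthetic observation with the documented lengths, for illustration only.
    tiles = [[9, 10] + [PAD] * 32] + [[PAD] * 34] * 6 + [[PAD]] + [[PAD] * 8] * 2
    actions = [[0, 12] + [PAD] * 48, [PAD] * 50, [1], [0]]
    print(name_observation(tiles, actions))
```

An empty list for `last_table_tile` then means that no discard is currently on the table.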

Get legal actions

get_legal_actions(self) -> list

Returns

  • legal_actions(list)
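
A learned policy usually scores every action ID, so the legal-action list is needed to restrict the choice. The sketch below is one minimal way to do that, assuming `scores` is any sequence indexed by action ID (length 238 covers the IDs 0 to 237 from the Actions table). `best_legal_action` is an illustrative name, not a library function.

```python
def best_legal_action(scores, legal_actions):
    """Return the legal action ID with the highest score.

    scores: sequence indexed by action ID, for example a policy's output.
    legal_actions: list of legal action IDs, for example from get_legal_actions().
    """
    if not legal_actions:
        raise ValueError("no legal actions available")
    return max(legal_actions, key=lambda action_id: scores[action_id])


if __name__ == "__main__":
    scores = [0.0] * 238
    scores[1] = 0.9    # Hu
    scores[12] = 0.5   # Discard Character 1
    print(best_legal_action(scores, [0, 12]))
```

Because only legal IDs are compared, a high score on an illegal action (Hu in the example) cannot be selected.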

Resetting

reset(self) -> Tuple[list of list, list of list, list]

Returns

  • tiles(list of list)
  • actions(list of list)
  • legal_actions(list)

Current player

get_current_player(self) -> int
  • current_player(int): the player who must provide the next action, as an integer.
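
With get_current_player() a game can be driven by two different policies, one per player. The sketch below takes the wrapper as a parameter and uses only the methods documented here. It assumes that the observation returned by reset() and step() belongs to the player who acts next, and it passes the action wrapped in a list as the usage example does. `play_episode` and `random_policy` are illustrative names.

```python
import random


def random_policy(tiles, actions, legal_actions):
    """Baseline policy: ignore the observation and pick a random legal action."""
    return random.choice(legal_actions)


def play_episode(wrapper, policies):
    """Play one game; policies[i] chooses the action whenever player i must act."""
    tiles, actions, legal_actions = wrapper.reset()
    is_game_over = False
    while not is_game_over:
        player = wrapper.get_current_player()
        # Assumption: the current observation belongs to the player about to act.
        action = policies[player](tiles, actions, legal_actions)
        # The usage example wraps the action in a list, so this sketch does too.
        tiles, actions, rewards, is_game_over, legal_actions = wrapper.step([action])
    return wrapper.get_payoffs()
```

A call such as `play_episode(Wrapper(), [my_policy, random_policy])` would then pit a custom policy (here the hypothetical `my_policy`) against the random baseline.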

Current Observation

get_current_obs(self) -> Tuple[list of list, list of list]
  • tiles(list of list)
  • actions(list of list)

Payoffs

get_payoffs(self) -> Tuple[list, list, list]
  • payoffs(list): the payoffs of both players. The first item is player 0's score and the second is player 1's score. The payoff is the same as the reward.

  • fan_names(list): the categories to which the winner's completed legal hand belongs. This is a list of strings.

  • fan_score(list): the scores corresponding one-to-one to fan_names. This is a list of integers.
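
Since fan_names and fan_score correspond one-to-one, they can be zipped into a readable score breakdown. The sketch below works on the returned tuple only; `fan_breakdown` is an illustrative helper, and the fan name in the demo is a made-up placeholder, not a category defined by the library.

```python
def fan_breakdown(payoff_result):
    """Format the (payoffs, fan_names, fan_score) tuple as a list of text lines."""
    payoffs, fan_names, fan_score = payoff_result
    lines = [f"payoffs: player 0 = {payoffs[0]}, player 1 = {payoffs[1]}"]
    for name, score in zip(fan_names, fan_score):
        lines.append(f"  {name}: {score}")
    return lines


if __name__ == "__main__":
    # Placeholder values for illustration; real values come from wrapper.get_payoffs().
    for line in fan_breakdown(([8, -8], ["Example Fan"], [8])):
        print(line)
```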

Game Status

get_game_status(self) -> Tuple[bool, int]
  • is_over(bool): a signal to check whether the episode has ended.
  • winner(int): an integer indicating the winner, 0 for player 0 and 1 for player 1. The usage example above treats a winner of None as a tie.
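
When evaluating over many games, the status tuples can be collected and tallied afterwards. The sketch below counts wins per player and ties, assuming that a winner of None marks a tie as in the usage example. `tally_outcomes` is an illustrative helper, not part of the library.

```python
from collections import Counter


def tally_outcomes(statuses):
    """Count wins per player and ties from (is_over, winner) tuples."""
    tally = Counter()
    for is_over, winner in statuses:
        if not is_over:
            continue  # skip games that have not finished
        # Assumption: winner is None when the game ended in a tie.
        tally["tie" if winner is None else f"player {winner}"] += 1
    return tally


if __name__ == "__main__":
    # Synthetic statuses for illustration; real ones come from wrapper.get_game_status().
    print(tally_outcomes([(True, 0), (True, None), (True, 1), (True, 0)]))
```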
