You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hey there, I was looking at the phantom tic-tac-toe implementation, specifically the information state tensor:
voidPhantomTTTState::InformationStateTensor(Player player,
absl::Span<float> values) const {
SPIEL_CHECK_GE(player, 0);
SPIEL_CHECK_LT(player, num_players_);
// First 27 bits encodes the player's view in the same way as TicTacToe.// Then the action sequence follows (one-hot encoded, per action).// Encoded in the same way as InformationStateAsString, so full sequences// which may contain action value 10 to represent "I don't know."constauto& player_view = player == 0 ? x_view_ : o_view_;
SPIEL_CHECK_EQ(values.size(), kNumCells * kCellStates +
kLongestSequence * (1 + kBitsPerAction));
std::fill(values.begin(), values.end(), 0.);
for (int cell = 0; cell < kNumCells; ++cell) {
values[kNumCells * static_cast<int>(player_view[cell]) + cell] = 1.0;
}
// Now encode the sequence. Each (player, action) pair uses 11 bits:// - first bit is the player taking the action (0 or 1)// - next 10 bits is the one-hot encoded action (10 = "I don't know")int offset = kNumCells * kCellStates;
for (constauto& player_with_action : action_sequence_) {
if (player_with_action.first == player) {
// Always include the observing player's actions.
values[offset] = player_with_action.first; // Player 0 or 1
values[offset + 1 + player_with_action.second] = 1.0;
} elseif (obs_type_ == ObservationType::kRevealNumTurns) {
// If the number of turns are revealed, then each of the other player's// actions will show up as unknowns.
values[offset] = player_with_action.first;
values[offset + 1 + 10] = 1.0; // I don't know.
} else {
// Do not reveal anything about the number of actions taken by opponent.SPIEL_CHECK_EQ(obs_type_, ObservationType::kRevealNothing);
}
offset += (1 + kBitsPerAction);
}
}
Firstly, values[offset + 1 + 10] = 1.0; // I don't know. should be 1 + 9 I believe since actions range from 0-8 and not 1-9.
Second, we offset offset += (1 + kBitsPerAction); in any case. In the default case of ObservationType::kRevealNothing, I think we should only offset in the if and the else if, but now in the else (right now we always offset).
For instance, with ObservationType::kRevealNumTurns we could get, for player 0's info string:
I believe you're right (at least for both issues in Phantom TTT -- haven't looked at Dark Hex). Nice find.
Can you submit a PR fix?
I think the information state size also be smaller then in the case of kRevealNothing (since its upper bound is smaller). And then we can put a check that the offset has reached the exactly the size of the container at the end of the function (like we do in other games).
If/when you're convinced that this is also an issue in Dark Hex, can you open another issue that refrers to this one? (Maybe also listing the code as you did here)
Hey there, I was looking at the phantom tic-tac-toe implementation, specifically the information state tensor:
Firstly,
values[offset + 1 + 10] = 1.0; // I don't know.
should be1 + 9
I believe since actions range from 0-8 and not 1-9.Second, we offset
offset += (1 + kBitsPerAction);
in any case. In the default case ofObservationType::kRevealNothing
, I think we should only offset in the if and the else if, but now in the else (right now we always offset).For instance, with
ObservationType::kRevealNumTurns
we could get, for player 0's info string:(and you can see that the last "i don't know" action isn't set to 1, as per my previous point).
With
ObservationType::kRevealNothing
, we get:So well, we can still see the number of moves the opponent did by counting the lines of zeros.
The text was updated successfully, but these errors were encountered: