Skip to content

狼人杀环境

代码入口

狼人杀环境提供一个策略类桌面游戏环境,在初始化环境时,允许自定义添加狼人和平民数量。游戏执行时,通过内置的不同角色的提示词,非主持人角色在主持人的引导下进行天黑时的私聊和天亮的公开发言。并根据不同的发言和投票等结果,环境内自动计算被淘汰的角色、角色的技能使用情况和存活状态、游戏是否结束、获胜方及原因等。

空间定义

观察空间

定义:

python
from gymnasium import spaces
from metagpt.environment.werewolf.const import STEP_INSTRUCTIONS

space = spaces.Dict(
        {
            "game_setup": spaces.Text(256),
            "step_idx": spaces.Discrete(len(STEP_INSTRUCTIONS)),
            "living_players": spaces.Tuple(
                (spaces.Text(16), spaces.Text(16))
            ),  # TODO should be tuple of variable length
            "werewolf_players": spaces.Tuple(
                (spaces.Text(16), spaces.Text(16))
            ),  # TODO should be tuple of variable length
            "player_hunted": spaces.Text(16),
            "player_current_dead": spaces.Tuple((spaces.Text(16))),  # TODO should be tuple of variable length
            "witch_poison_left": spaces.Discrete(2),
            "witch_antidote_left": spaces.Discrete(2),
            "winner": spaces.Text(16),
            "win_reason": spaces.Text(64),
        }
    )
from gymnasium import spaces
from metagpt.environment.werewolf.const import STEP_INSTRUCTIONS

space = spaces.Dict(
        {
            "game_setup": spaces.Text(256),
            "step_idx": spaces.Discrete(len(STEP_INSTRUCTIONS)),
            "living_players": spaces.Tuple(
                (spaces.Text(16), spaces.Text(16))
            ),  # TODO should be tuple of variable length
            "werewolf_players": spaces.Tuple(
                (spaces.Text(16), spaces.Text(16))
            ),  # TODO should be tuple of variable length
            "player_hunted": spaces.Text(16),
            "player_current_dead": spaces.Tuple((spaces.Text(16))),  # TODO should be tuple of variable length
            "witch_poison_left": spaces.Discrete(2),
            "witch_antidote_left": spaces.Discrete(2),
            "winner": spaces.Text(16),
            "win_reason": spaces.Text(64),
        }
    )

观察值说明

字段说明取值说明
game_setup游戏初始信息文本串最大长度16
step_idx游戏每轮进行到的当前步数值范围[0-18]
living_players当前存活的玩家名列表可多个玩家
werewolf_players狼人玩家名列表可多个狼人。当前未做隔离,可直接从环境中取得该列表,待优化
player_hunted当前被狼人淘汰的玩家名0个或1个玩家
player_current_dead当前被淘汰的玩家名列表可多个玩家
witch_poison_left女巫剩余毒药数量0或者1
witch_antidote_left女巫剩余解药数量0或者1
winner获胜方最大长度16
win_reason获胜原因最大长度64

空间sample示例:

OrderedDict([('game_setup', 'Game setup: xxx'), ('living_players', ('Player1', 'Player2')), ('player_current_dead', ('Player5', 'Player6')), ('player_hunted', 'Player5'), ('step_idx', 7), ('werewolf_players', ('Player3', 'Player4')), ('win_reason', 'xx'), ('winner', 'werewolf'), ('witch_antidote_left', 1), ('witch_poison_left', 1)])
OrderedDict([('game_setup', 'Game setup: xxx'), ('living_players', ('Player1', 'Player2')), ('player_current_dead', ('Player5', 'Player6')), ('player_hunted', 'Player5'), ('step_idx', 7), ('werewolf_players', ('Player3', 'Player4')), ('win_reason', 'xx'), ('winner', 'werewolf'), ('witch_antidote_left', 1), ('witch_poison_left', 1)])

动作空间

定义:

python
from gymnasium import spaces
from metagpt.environment.werewolf.env_space import EnvActionType

space = spaces.Dict(
    {
        "action_type": spaces.Discrete(len(EnvActionType)),
        "player_name": spaces.Text(16),  # the player to do the action
        "target_player_name": spaces.Text(16),  # the target player who take the action
    }
)
from gymnasium import spaces
from metagpt.environment.werewolf.env_space import EnvActionType

space = spaces.Dict(
    {
        "action_type": spaces.Discrete(len(EnvActionType)),
        "player_name": spaces.Text(16),  # the player to do the action
        "target_player_name": spaces.Text(16),  # the target player who take the action
    }
)

动作值说明

字段说明取值说明
action_type动作类型不同动作对应不同IntEnum值,依次None、WOLF_KILL、VOTE_KILL、WITCH_POISON、WITCH_SAVE、GUARD_PROTECT、PROGRESS_STEP
player_name动作的发起方最大长度16
target_player_name动作的被作用方最大长度16

空间sample示例:

OrderedDict([('action_type', 5), ('player_name', 'Player1'), ('target_player_name', 'Player2')])
OrderedDict([('action_type', 5), ('player_name', 'Player1'), ('target_player_name', 'Player2')])

使用

python
from metagpt.environment.werewolf.werewolf_ext_env import WerewolfExtEnv
from metagpt.environment.werewolf.env_space import (
    EnvAction,
    EnvActionType
)

env = WerewolfExtEnv()

obs, _ = env.reset()  # 得到完整观察值

action = EnvAction(action_type=EnvActionType.VOTE_KILL, player_name="Player1", target_player_name="Player2")  # 初始化一组动作值,`Player1`票死`Player2`
obs, _, _, _, info = env.step(action)  # 执行动作并得到新的完整观察
from metagpt.environment.werewolf.werewolf_ext_env import WerewolfExtEnv
from metagpt.environment.werewolf.env_space import (
    EnvAction,
    EnvActionType
)

env = WerewolfExtEnv()

obs, _ = env.reset()  # 得到完整观察值

action = EnvAction(action_type=EnvActionType.VOTE_KILL, player_name="Player1", target_player_name="Player2")  # 初始化一组动作值,`Player1`票死`Player2`
obs, _, _, _, info = env.step(action)  # 执行动作并得到新的完整观察

Released under the MIT License.