mlagents_envs.envs.unity_gym_env
UnityGymException Objects
class UnityGymException(error.Error)
Any error related to the gym wrapper of ml-agents.
UnityToGymWrapper Objects
class UnityToGymWrapper(gym.Env)
Provides Gym wrapper for Unity Learning Environments.
__init__
| __init__(unity_env: BaseEnv, uint8_visual: bool = False, flatten_branched: bool = False, allow_multiple_obs: bool = False, action_space_seed: Optional[int] = None)
Environment initialization
Arguments:
unity_env
: The Unity BaseEnv to be wrapped in the gym. Will be closed when the UnityToGymWrapper closes.
uint8_visual
: Return visual observations as uint8 (0-255) matrices instead of float (0.0-1.0).
flatten_branched
: If True, turn branched discrete action spaces into a Discrete space rather than MultiDiscrete.
allow_multiple_obs
: If True, return a list of np.ndarrays as observations, with the first elements containing the visual observations and the last element containing the array of vector observations. If False, return a single np.ndarray containing either the single visual observation or the array of vector observations.
action_space_seed
: If non-None, will be used to set the random seed on created gym.Space instances.
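For example, a wrapper around a built Unity environment might be created as follows (a minimal sketch; the executable name "3DBall" is a placeholder for your own build):

```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

# Launch the Unity build; "3DBall" is a placeholder path.
unity_env = UnityEnvironment(file_name="3DBall")
env = UnityToGymWrapper(
    unity_env,
    uint8_visual=True,         # visual observations as uint8 (0-255)
    flatten_branched=True,     # MultiDiscrete -> single Discrete action space
    allow_multiple_obs=False,  # single np.ndarray observation
    action_space_seed=42,      # seed the created gym.Space
)
```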
reset
| reset() -> Union[List[np.ndarray], np.ndarray]
Resets the state of the environment and returns an initial observation.
Returns:
observation
object/list - the initial observation of the space.
step
| step(action: List[Any]) -> GymStepResult
Run one timestep of the environment's dynamics. When the end of an
episode is reached, you are responsible for calling reset()
to reset this environment's state.
Accepts an action and returns a tuple (observation, reward, done, info).
Arguments:
action
object/list - an action provided by the agent.
Returns:
observation
object/list - agent's observation of the current environment.
reward
float/list - amount of reward returned after the previous action.
done
boolean/list - whether the episode has ended.
info
dict - contains auxiliary diagnostic information.
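A minimal episode loop against such a wrapper, using a random policy for illustration (it assumes the `env` constructed above):

```python
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()          # random action for illustration
    obs, reward, done, info = env.step(action)  # classic gym 4-tuple
    total_reward += reward
env.close()  # also closes the wrapped Unity environment
```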
render
| render(mode="rgb_array")
Return the latest visual observations. Note that it will not render a new frame of the environment.
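For illustration, assuming the wrapped environment produces visual observations, the latest frame can be read back after a step (the shape shown is hypothetical):

```python
obs = env.reset()
env.step(env.action_space.sample())
frame = env.render()  # most recent visual observation; no new frame is simulated
print(frame.shape)    # e.g. (84, 84, 3) for a hypothetical 84x84 RGB camera
```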
close
| close() -> None
Performs any necessary cleanup and closes the wrapped Unity environment. Environments will automatically close() themselves when they are garbage collected or when the program exits.
seed
| seed(seed: Any = None) -> None
Sets the seed for this env's random number generator(s). Currently not implemented; to seed the created gym.Space instances, pass action_space_seed to __init__ instead.
ActionFlattener Objects
class ActionFlattener()
Flattens branched discrete action spaces into single-branch discrete action spaces.
__init__
| __init__(branched_action_space)
Initialize the flattener.
Arguments:
branched_action_space
: A List containing the sizes of each branch of the action space, e.g. [2,3,3] for three branches with sizes 2, 3, and 3 respectively.
lookup_action
| lookup_action(action)
Convert a scalar discrete action into a unique set of branched actions.
Arguments:
action
: A scalar value representing one of the discrete actions.
Returns:
The List containing the branched actions.
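A short sketch of how the flattener might be used, assuming the flattened space is exposed as the flattener's action_space attribute; the exact ordering of the flattened actions is an implementation detail, so the tuples in the comments are illustrative:

```python
from mlagents_envs.envs.unity_gym_env import ActionFlattener

flattener = ActionFlattener([2, 3, 3])
print(flattener.action_space)       # Discrete(18): 2 * 3 * 3 combinations
print(flattener.lookup_action(0))   # one branched action, e.g. (0, 0, 0)
print(flattener.lookup_action(17))  # another, e.g. (1, 2, 2)
```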