rcognita.actors.ActorProbabilisticEpisodicAC

class rcognita.actors.ActorProbabilisticEpisodicAC(dim_output: int = 5, dim_input: int = 2, action_bounds=None, action_init=None, model=None, **kwargs)

__init__(dim_output: int = 5, dim_input: int = 2, action_bounds=None, action_init=None, model=None, **kwargs)

Initialize an actor that samples actions from a probabilistic model. For each action taken, the actor also stores the corresponding gradient with respect to the model weights.

Parameters

action_bounds (list or ndarray, optional) – Bounds on the action.
action_init (list, optional) – Initial action.
model (Model, optional) – Model object to be used as reference by the Predictor and the Critic.
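
The sample, clip, and store-gradient behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the rcognita implementation: the class name `GaussianActorSketch`, the linear-Gaussian model, the `std` and `seed` parameters, and the scalar action are all assumptions made for illustration only.

```python
import random


class GaussianActorSketch:
    """Hypothetical stand-in for a probabilistic actor (not the rcognita class)."""

    def __init__(self, weights, action_bounds, std=0.1, seed=0):
        self.weights = list(weights)         # one weight per observation component
        self.low, self.high = action_bounds  # bounds on the scalar action
        self.std = std                       # assumed fixed standard deviation
        self.rng = random.Random(seed)
        self.action = None
        self.gradients = []                  # one entry per action taken

    def update_action(self, observation):
        # The mean of the action distribution is linear in the observation.
        mean = sum(w * o for w, o in zip(self.weights, observation))
        sampled = self.rng.gauss(mean, self.std)
        # Gradient of log N(sampled | mean, std^2) with respect to the weights.
        scale = (sampled - mean) / self.std ** 2
        self.gradients.append([scale * o for o in observation])
        # Clip the sampled action to the action bounds.
        self.action = min(max(sampled, self.low), self.high)
        return self.action

    def reset(self):
        # Drop the gradients collected so far.
        self.gradients = []


actor = GaussianActorSketch(weights=[0.5, -0.2], action_bounds=(-1.0, 1.0))
actions = [actor.update_action([1.0, 2.0]) for _ in range(3)]
```

After three calls, `actor.gradients` holds three gradient vectors, one per sampled action, and every returned action lies inside the bounds.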

Methods

`__init__`([dim_output, dim_input, …])	Initialize an actor that samples actions from a probabilistic model.
`accept_or_reject_weights`(weights[, …])	Determine whether the given weights should be accepted or rejected based on the specified constraints.
`cache_weights`([weights])	Cache the current weights of the model of the actor.
`create_observation_constraints`(…)	Create observation (or state) constraints using a predictor over a prediction_horizon.
`get_action`()	Get the current action.
`optimize_weights`()	Optimize the current actor weights.
`receive_observation`(observation)	Update the current observation of the actor.
`reset`()	Reset the actor’s stored gradients and call the base Actor class’s reset method.
`restore_weights`()	Restore the previously cached weights of the model of the actor.
`set_action`(action)	Set the current action of the actor.
`store_gradient`(gradient)	Store the gradient of the model’s weights.
`update`(observation)	Sample an action from the actor’s distribution, update the action and action_old attributes, and store the current gradient in the gradients list.
`update_action`(observation)	Sample an action from the probabilistic model, clip it to the action bounds, and store its gradient.
`update_and_cache_weights`()	Update and cache the weights of the model of the actor.
`update_weights`([weights])	Update the weights of the model of the actor.
`update_weights_by_gradient`(gradient, …)	Update the model weights by subtracting the gradient multiplied by the learning rate and a constant factor.
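
The gradient step summarized for `update_weights_by_gradient` can be sketched as follows. This is a hedged illustration of the rule "subtract the gradient multiplied by the learning rate and a constant factor", not the library's code: the free function `gradient_step`, its argument names, and the default `factor=1.0` are assumptions.

```python
def gradient_step(weights, gradient, learning_rate, factor=1.0):
    """Return new weights after one step: w - learning_rate * factor * g.

    Hypothetical helper for illustration; `factor` stands in for the
    constant factor mentioned in the method summary.
    """
    if len(weights) != len(gradient):
        raise ValueError("weights and gradient must have the same length")
    return [w - learning_rate * factor * g for w, g in zip(weights, gradient)]


old_weights = [1.0, 2.0]
new_weights = gradient_step(old_weights, gradient=[0.5, -1.0],
                            learning_rate=0.1, factor=2.0)
```

The helper returns a new list and leaves `old_weights` untouched, which mirrors the idea of keeping a cached copy of the weights that can be restored if the updated ones are rejected.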