rcognita.critics.CriticTabularVI
- class rcognita.critics.CriticTabularVI(dim_state_space, running_objective, predictor, model, actor_model, discount_factor=1, N_parallel_processes=5, terminal_state=None)
Critic for tabular agents.
- __init__(dim_state_space, running_objective, predictor, model, actor_model, discount_factor=1, N_parallel_processes=5, terminal_state=None)
Initialize a CriticTabularVI object.
- Parameters
dim_state_space (tuple of int) – The dimensions of the state space.
running_objective (callable) – The running objective function.
predictor (any) – The predictor object.
model (Model) – The model object.
actor_model (any) – The actor model object.
discount_factor (float, optional) – The discount factor for the temporal difference. Defaults to 1.
N_parallel_processes (int, optional) – The number of parallel processes to use. Defaults to 5.
terminal_state (int or tuple of int, optional) – The terminal state, if applicable. Defaults to None.
- Returns
None
Methods
__init__(dim_state_space, running_objective, …) – Initialize a CriticTabularVI object.
accept_or_reject_weights(weights[, …]) – Determine whether to accept or reject the given weights based on whether they violate the given constraints.
cache_weights([weights]) – Stores a copy of the current model weights.
initialize_buffers() – Initialize the action and observation buffers with zeros.
objective(observation, action) – Calculate the value of a state given the action taken and the observation of the current state.
optimize_weights([time]) – Compute optimized critic weights, possibly subject to constraints.
reset() – Reset the outcome and current critic loss variables, and re-initialize the buffers.
restore_weights() – Restores the model weights to the cached weights.
update() – Update the value function for all states.
update_and_cache_weights([weights]) – Update the model’s weights and cache the new values.
update_buffers(observation, action) – Updates the buffers of the critic with the given observation and action.
update_outcome(observation, action) – Update the outcome variable based on the running objective and the current observation and action.
update_single_cell(observation) – Update the value function for a single state.
update_target(new_target)
update_weights([weights]) – Update the weights of the critic model.
Attributes
optimizer_engine
Returns the engine used by the optimizer.
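Example
The sketch below illustrates the kind of tabular backup that update and update_single_cell describe. It does not call rcognita and is not the library's implementation. The grid world, the reward convention of running_objective, and the helpers predict and policy (stand-ins for the predictor and actor_model arguments) are all assumptions made for illustration.

```python
from itertools import product

# Assumed setup: a 3 x 3 grid with the bottom-right cell as terminal state.
dim_state_space = (3, 3)
terminal_state = (2, 2)
discount_factor = 1.0


def running_objective(state, action):
    # Hypothetical running objective: reward 0 at the terminal state, -1 elsewhere.
    return 0.0 if state == terminal_state else -1.0


def predict(state, action):
    # Hypothetical deterministic predictor: shift by the action, clipped to the grid.
    return tuple(
        min(max(s + a, 0), n - 1)
        for s, a, n in zip(state, action, dim_state_space)
    )


def policy(state):
    # Hypothetical actor table: move down until the last row, then move right.
    return (1, 0) if state[0] < dim_state_space[0] - 1 else (0, 1)


def update_single_cell(values, state):
    # One backup for a single state under the fixed policy.
    action = policy(state)
    reward = running_objective(state, action)
    if state == terminal_state:
        return reward
    return reward + discount_factor * values[predict(state, action)]


def update(values):
    # Synchronous sweep: recompute every cell from the previous table.
    return {state: update_single_cell(values, state) for state in values}


states = list(product(*(range(n) for n in dim_state_space)))
values = {state: 0.0 for state in states}
for _ in range(10):  # more sweeps than the longest path, so the table settles
    values = update(values)
```

With these assumptions the table settles at minus the number of steps the policy needs to reach the terminal state, so values[(0, 0)] ends up at -4.0 and values[(2, 2)] stays at 0.0. Each cell in a sweep depends only on the previous table, which is one plausible reason a critic of this kind could spread the work over several processes.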