rcognita.critics.CriticCALF

class rcognita.critics.CriticCALF(*args, safe_decay_rate=0.001, is_dynamic_decay_rate=True, predictor=None, observation_init=None, safe_controller=None, penalty_param=0, is_predictive=True, **kwargs)

__init__(*args, safe_decay_rate=0.001, is_dynamic_decay_rate=True, predictor=None, observation_init=None, safe_controller=None, penalty_param=0, is_predictive=True, **kwargs)

Initialize a CriticCALF object.

Parameters

args – Arguments to be passed to the base class CriticOfObservation.
safe_decay_rate – Rate at which the safe set shrinks over time.
is_dynamic_decay_rate – Whether the decay rate is adjusted dynamically rather than held fixed at safe_decay_rate.
predictor – A predictor object to be used to predict future observations.
observation_init – Initial observation to be used to initialize the safe set.
safe_controller – Safe controller object to be used to compute stabilizing actions.
penalty_param – Penalty parameter to be used in the CALF objective.
is_predictive – Whether the safe constraints should be computed based on predictions or not.
kwargs – Keyword arguments to be passed to the base class CriticOfObservation.
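
The interplay of safe_decay_rate and penalty_param can be illustrated with a toy example. The following is only a sketch under stated assumptions, not rcognita's implementation: the quadratic critic, the helper names `critic_value`, `decay_violation` and `penalized_objective`, the undiscounted temporal difference, and the exact form of the decay condition are all hypothetical and chosen for illustration.

```python
def critic_value(weights, observation):
    """Toy critic (assumption): weighted sum of squared observation entries."""
    return sum(w * x * x for w, x in zip(weights, observation))


def decay_violation(weights, obs_prev, obs_curr, safe_decay_rate=0.001):
    """Positive when the critic did not decrease by at least safe_decay_rate."""
    return (
        critic_value(weights, obs_curr)
        - critic_value(weights, obs_prev)
        + safe_decay_rate
    )


def penalized_objective(weights, obs_prev, obs_curr, running_cost,
                        penalty_param=0.0, safe_decay_rate=0.001):
    """Squared temporal difference plus an optional decay-violation penalty."""
    temporal_difference = (
        critic_value(weights, obs_prev)
        - running_cost
        - critic_value(weights, obs_curr)
    )
    loss = temporal_difference ** 2
    if penalty_param != 0:
        # Only a positive violation is penalized (assumed hinge form).
        violation = decay_violation(weights, obs_prev, obs_curr, safe_decay_rate)
        loss += penalty_param * max(0.0, violation)
    return loss


weights = [1.0, 1.0]
obs_small, obs_large = [1.0, 0.0], [2.0, 0.0]

# Moving from obs_small to obs_large makes the toy critic grow,
# so the decay condition is violated and the penalty becomes active.
print(decay_violation(weights, obs_small, obs_large))
print(penalized_objective(weights, obs_small, obs_large,
                          running_cost=0.0, penalty_param=10.0))
```

With penalty_param left at its default of 0, the sketch reduces to the plain squared temporal difference, which mirrors the behaviour described for `objective`.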

Methods

`CALF_critic_lower_bound_constraint`(weights)	Constraint that ensures that the value of the critic is above a certain lower bound.
`CALF_critic_upper_bound_constraint`(weights)	Constraint that ensures that the value of the critic is below a certain upper bound.
`CALF_decay_constraint_no_prediction`(weights)	Constraint that ensures that the CALF value is decreasing by a certain rate.
`CALF_decay_constraint_predicted_on_policy`(weights)	Constraint for ensuring that the CALF function decreases at each iteration.
`CALF_decay_constraint_predicted_safe_policy`(weights)	Calculate the constraint violation for the CALF decay constraint when a predicted safe policy is used.
`__init__`(*args[, safe_decay_rate, …])	Initialize a CriticCALF object.
`accept_or_reject_weights`(weights[, …])	Determine whether to accept or reject the given weights based on whether they violate the given constraints.
`cache_weights`([weights])	Store a copy of the current model weights.
`initialize_buffers`()	Initialize the action and observation buffers with zeros.
`objective`(*args, **kwargs)	Objective of the critic: the sum of the squared temporal difference and, if the penalty parameter is non-zero, the penalty for violating the CALF decay constraint.
`optimize_weights`([time])	Compute optimized critic weights, possibly subject to constraints.
`reset`()	Reset the outcome and current critic loss variables, and re-initialize the buffers.
`restore_weights`()	Restore the model weights to the cached weights.
`update_and_cache_weights`([weights])	Update the model’s weights and cache the new values.
`update_buffers`(observation, action)	Update data buffers and dynamic safe decay rate.
`update_outcome`(observation, action)	Update the outcome variable based on the running objective and the current observation and action.
`update_target`(new_target)
`update_weights`([weights])	Update the weights of the critic model.
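
The weight-handling methods above suggest an accept-or-reject workflow: candidate weights are checked against constraints, kept and cached if they pass, and discarded in favour of the cached weights otherwise. The sketch below illustrates that pattern only. `ToyCritic`, `try_update`, `weight_sum_bound`, the `tolerance` argument, and the string return values are assumptions made for this example; the method names are borrowed from the list above, but their bodies are not taken from rcognita.

```python
class ToyCritic:
    """Hypothetical stand-in for a critic; it only holds a list of weights."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.cached_weights = list(weights)

    def cache_weights(self, weights=None):
        # Store a copy so later changes to the source cannot alter the cache.
        self.cached_weights = list(self.weights if weights is None else weights)

    def restore_weights(self):
        self.weights = list(self.cached_weights)

    def update_and_cache_weights(self, weights):
        self.weights = list(weights)
        self.cache_weights(weights)

    def accept_or_reject_weights(self, weights, constraint_functions=(),
                                 tolerance=1e-10):
        # Accept only if no constraint reports a positive violation.
        violations = [constraint(weights) for constraint in constraint_functions]
        if all(violation <= tolerance for violation in violations):
            return "accepted"
        return "rejected"


def try_update(critic, candidate, constraint_functions):
    """Apply candidate weights if accepted, otherwise fall back to the cache."""
    verdict = critic.accept_or_reject_weights(candidate, constraint_functions)
    if verdict == "accepted":
        critic.update_and_cache_weights(candidate)
    else:
        critic.restore_weights()
    return verdict


def weight_sum_bound(weights):
    """Toy constraint (assumption): violation is positive when sum exceeds 3."""
    return sum(weights) - 3.0


critic = ToyCritic([1.0, 1.0])
print(try_update(critic, [0.5, 0.5], [weight_sum_bound]), critic.weights)
print(try_update(critic, [5.0, 5.0], [weight_sum_bound]), critic.weights)
```

In this toy setting the second candidate breaks the bound, so the critic keeps the weights cached after the first, accepted update.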

Attributes

optimizer_engine

Returns the engine used by the optimizer.