gfn.gym.helpers.box_utils
This file contains utility functions for the Box environment.
Module Contents
Classes
BoxPBEstimator | Estimator for P_B for the Box environment. Uses the QuarterCircle(northeastern=False) distribution.
BoxPBNeuralNet | A deep neural network for the backward policy.
BoxPBUniform | A module to be used to create a uniform PB distribution for the Box environment.
BoxPFEstimator | Estimator for P_F for the Box environment. Uses the BoxForwardDist distribution.
BoxPFNeuralNet | A deep neural network for the forward policy.
BoxStateFlowModule | A deep neural network for the state flow function.
QuarterCircle | Represents distributions on quarter circles (or parts thereof), either the northeastern ones or the southwestern ones, centered at a point in (0, 1)^2.
QuarterCircleWithExit | Extends the QuarterCircle distribution by considering an extra parameter, called exit_probability.
QuarterDisk | Represents a distribution on the northeastern quarter disk centered at (0, 0) of maximal radius delta.
Functions
split_PF_module_output | Splits the module output into the expected parameter sets.
Attributes
CLAMP
PI_2
PI_2_INV
- class gfn.gym.helpers.box_utils.BoxPBEstimator(env, module, n_components, min_concentration=0.1, max_concentration=2.0)
Bases: gfn.modules.GFNModule
Estimator for P_B for the Box environment. Uses the QuarterCircle(northeastern=False) distribution.
- Parameters
env (gfn.gym.Box) –
module (torch.nn.Module) –
n_components (int) –
min_concentration (float) –
max_concentration (float) –
- expected_output_dim()
Expected output dimension of the module.
- Return type
int
- to_probability_distribution(states, module_output)
Transform the output of the module into a probability distribution.
The kwargs modify a base distribution, for example to encourage exploration.
Not all modules must implement this method, but it is required to define a policy from a module’s outputs. See DiscretePolicyEstimator for an example using a categorical distribution, but note this can be done for all continuous distributions as well.
- Parameters
states (gfn.states.States) –
module_output (torchtyping.TensorType[batch_shape, output_dim, float]) –
- Return type
torch.distributions.Distribution
- class gfn.gym.helpers.box_utils.BoxPBNeuralNet(hidden_dim, n_hidden_layers, n_components, **kwargs)
Bases: gfn.utils.NeuralNet
A deep neural network for the backward policy.
- Parameters
hidden_dim (int) –
n_hidden_layers (int) –
n_components (int) –
- n_components
the number of components for each distribution parameter.
- forward(preprocessed_states)
Forward method for the neural network.
- Parameters
preprocessed_states (torchtyping.TensorType[batch_shape, 2, float]) – a batch of states appropriately preprocessed for ingestion by the MLP.
- Return type
torchtyping.TensorType[batch_shape, 3 * n_components]
Returns: out, a set of continuous variables.
- class gfn.gym.helpers.box_utils.BoxPBUniform
Bases: torch.nn.Module
A module to be used to create a uniform PB distribution for the Box environment.
A module that returns (1, 1, 1) for all states. Used with QuarterCircle, it leads to a uniform distribution over parents in the southwestern part of the circle.
- input_dim = 2
- forward(preprocessed_states)
- Parameters
preprocessed_states (torchtyping.TensorType[batch_shape, 2, float]) –
- Return type
torchtyping.TensorType[batch_shape, 3]
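Why a constant (1, 1, 1) output yields a uniform distribution can be illustrated with a small plain-Python sketch. It assumes the three values are read as one mixture logit, one Beta alpha and one Beta beta (an interpretation not stated above), and it uses hypothetical helper names rather than the library's tensor code.

```python
import math


def uniform_pb_output(batch_size):
    """Return one constant (1, 1, 1) triple per state (hypothetical stand-in)."""
    return [[1.0, 1.0, 1.0] for _ in range(batch_size)]


def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x in (0, 1)."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


out = uniform_pb_output(4)

# Assumption: the last two entries act as alpha and beta of a Beta distribution.
# With alpha = beta = 1 the density is flat, i.e. uniform over the angle range.
densities = [beta_pdf(x, out[0][1], out[0][2]) for x in (0.1, 0.5, 0.9)]
```

With a single mixture component, the value of the logit does not matter, so only alpha = beta = 1 determines the shape of the distribution in this sketch.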
- class gfn.gym.helpers.box_utils.BoxPFEstimator(env, module, n_components_s0, n_components, min_concentration=0.1, max_concentration=2.0)
Bases: gfn.modules.GFNModule
Estimator for P_F for the Box environment. Uses the BoxForwardDist distribution.
- Parameters
env (gfn.gym.Box) –
module (torch.nn.Module) –
n_components_s0 (int) –
n_components (int) –
min_concentration (float) –
max_concentration (float) –
- expected_output_dim()
Expected output dimension of the module.
- Return type
int
- to_probability_distribution(states, module_output)
Transform the output of the module into a probability distribution.
The kwargs modify a base distribution, for example to encourage exploration.
Not all modules must implement this method, but it is required to define a policy from a module’s outputs. See DiscretePolicyEstimator for an example using a categorical distribution, but note this can be done for all continuous distributions as well.
- Parameters
states (gfn.states.States) –
module_output (torchtyping.TensorType[batch_shape, output_dim, float]) –
- Return type
torch.distributions.Distribution
- class gfn.gym.helpers.box_utils.BoxPFNeuralNet(hidden_dim, n_hidden_layers, n_components_s0, n_components, **kwargs)
Bases: gfn.utils.NeuralNet
A deep neural network for the forward policy.
- Parameters
hidden_dim (int) –
n_hidden_layers (int) –
n_components_s0 (int) –
n_components (int) –
- n_components_s0
the number of components for each s=0 distribution parameter.
- n_components
the number of components for each s=t>0 distribution parameter.
- PFs0
the parameters for the s=0 distribution.
- forward(preprocessed_states)
Forward method for the neural network.
- Parameters
preprocessed_states (torchtyping.TensorType[batch_shape, 2, float]) – a batch of states appropriately preprocessed for ingestion by the MLP.
- Return type
torchtyping.TensorType[batch_shape, 1 + 5 * n_components]
Returns: out, a set of continuous variables.
- class gfn.gym.helpers.box_utils.BoxStateFlowModule(logZ_value, **kwargs)
Bases: gfn.utils.NeuralNet
A deep neural network for the state flow function.
- Parameters
logZ_value (torch.Tensor) –
- forward(preprocessed_states)
Forward method for the neural network.
- Parameters
preprocessed_states (torchtyping.TensorType[batch_shape, input_dim, float]) – a batch of states appropriately preprocessed for ingestion by the MLP.
- Return type
torchtyping.TensorType[batch_shape, output_dim, float]
Returns: out, a set of continuous variables.
- gfn.gym.helpers.box_utils.CLAMP
- class gfn.gym.helpers.box_utils.DistributionWrapper(states, delta, epsilon, mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta, exit_probability, n_components, n_components_s0)
Bases: torch.distributions.Distribution
- Parameters
states (gfn.states.States) –
delta (float) –
epsilon (float) –
- log_prob(sampled_actions)
- sample(sample_shape=())
- gfn.gym.helpers.box_utils.PI_2
- gfn.gym.helpers.box_utils.PI_2_INV
- class gfn.gym.helpers.box_utils.QuarterCircle(delta, northeastern, centers, mixture_logits, alpha, beta)
Bases: torch.distributions.Distribution
Represents distributions on quarter circles (or parts thereof), either the northeastern ones or the southwestern ones, centered at a point in (0, 1)^2. The distributions are mixtures of Beta distributions over the possible angle range.
When a state is of norm <= delta, and northeastern=False, then the distribution is a Dirac at the state (i.e. the only possible parent is s_0).
Adapted from https://github.com/saleml/continuous-gfn/blob/master/sampling.py
This is useful for the Box environment.
- Parameters
delta (float) –
northeastern (bool) –
centers (torchtyping.TensorType[n_states, 2]) –
mixture_logits (torchtyping.TensorType[n_states, n_components]) –
alpha (torchtyping.TensorType[n_states, n_components]) –
beta (torchtyping.TensorType[n_states, n_components]) –
- get_min_and_max_angles()
- Return type
Tuple[torchtyping.TensorType[n_states], torchtyping.TensorType[n_states]]
- log_prob(sampled_actions)
- Parameters
sampled_actions (torchtyping.TensorType[batch_size, 2]) –
- Return type
torchtyping.TensorType[batch_size]
- sample(sample_shape=torch.Size())
- Parameters
sample_shape (torch.Size) –
- Return type
torchtyping.TensorType[QuarterCircle.sample.sample_shape, 2]
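The idea of a mixture of Betas over a restricted angle range can be sketched in plain Python for the northeastern case. This is a simplified illustration under stated assumptions, not the library's implementation: the function names are hypothetical, the angle bounds are derived here from the requirement that a step of length delta stays inside [0, 1]^2, and mixture logits are turned into weights by exponentiation.

```python
import math
import random


def min_and_max_angles(center, delta):
    """Angle range (radians) in which a northeastern step of length delta stays in [0, 1]^2."""
    x, y = center
    # Moving right by delta * cos(theta) must not cross x = 1.
    min_angle = 0.0 if delta <= 1 - x else math.acos((1 - x) / delta)
    # Moving up by delta * sin(theta) must not cross y = 1.
    max_angle = math.pi / 2 if delta <= 1 - y else math.asin((1 - y) / delta)
    return min_angle, max_angle


def sample_northeastern_step(center, delta, mixture_logits, alpha, beta, rng):
    """Draw one action (dx, dy) of norm delta from a mixture of Betas over the angle range."""
    weights = [math.exp(logit) for logit in mixture_logits]  # unnormalized softmax
    k = rng.choices(range(len(weights)), weights=weights)[0]  # pick a component
    u = rng.betavariate(alpha[k], beta[k])  # u lies in [0, 1]
    lo, hi = min_and_max_angles(center, delta)
    theta = lo + u * (hi - lo)  # rescale to the admissible angles
    return delta * math.cos(theta), delta * math.sin(theta)


rng = random.Random(0)
center = (0.9, 0.3)
dx, dy = sample_northeastern_step(center, 0.25, [0.0, 0.0], [1.0, 2.0], [1.0, 3.0], rng)
```

In this example the state is close to the right wall, so the minimum angle is pushed above zero while the maximum stays at pi/2. Close to the top-right corner the range can become empty, a case this sketch does not handle.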
- class gfn.gym.helpers.box_utils.QuarterCircleWithExit(delta, centers, exit_probability, mixture_logits, alpha, beta, epsilon=0.0001)
Bases: torch.distributions.Distribution
Extends the QuarterCircle distribution by considering an extra parameter, called exit_probability, of shape (n_states,). When sampling, the exit action [-inf, -inf] is drawn with probability exit_probability. The log_prob function changes accordingly.
- Parameters
delta (float) –
centers (torchtyping.TensorType[n_states, 2]) –
exit_probability (torchtyping.TensorType[n_states]) –
mixture_logits (torchtyping.TensorType[n_states, n_components]) –
alpha (torchtyping.TensorType[n_states, n_components]) –
beta (torchtyping.TensorType[n_states, n_components]) –
epsilon (float) –
- log_prob(sampled_actions)
- sample(sample_shape=())
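One way to read "changes accordingly" is that the exit action receives log-probability log(p) and every other action receives log(1 - p) plus the base log-probability. The plain-Python sketch below illustrates that reading for a single state. The helper names are hypothetical, it assumes 0 < p < 1, and it does not model how the library uses epsilon.

```python
import math
import random

# The exit action is encoded as [-inf, -inf], as described above.
EXIT_ACTION = (-math.inf, -math.inf)


def sample_with_exit(exit_probability, sample_base, rng):
    """Return the exit action with probability exit_probability, else a base sample."""
    if rng.random() < exit_probability:
        return EXIT_ACTION
    return sample_base()


def log_prob_with_exit(action, exit_probability, base_log_prob):
    """Log-probability of an action under the exit/no-exit mixture (assumes 0 < p < 1)."""
    if action == EXIT_ACTION:
        return math.log(exit_probability)
    # Not exiting has probability 1 - p; the base distribution handles the rest.
    return math.log1p(-exit_probability) + base_log_prob(action)


rng = random.Random(0)
p = 0.3
lp_exit = log_prob_with_exit(EXIT_ACTION, p, lambda a: 0.0)
lp_move = log_prob_with_exit((0.1, 0.2), p, lambda a: 0.0)
always_exit = sample_with_exit(1.0, lambda: (0.1, 0.2), rng)
never_exit = sample_with_exit(0.0, lambda: (0.1, 0.2), rng)
```

With a base log-probability of zero, the two branches exponentiate to p and 1 - p, which is a quick check that the mixture stays normalized.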
- class gfn.gym.helpers.box_utils.QuarterDisk(delta, mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta)
Bases: torch.distributions.Distribution
Represents a distribution on the northeastern quarter disk centered at (0, 0) of maximal radius delta. The radius and the angle each follow a mixture of Beta distributions.
Adapted from https://github.com/saleml/continuous-gfn/blob/master/sampling.py
This is useful for the Box environment
- Parameters
delta (float) –
mixture_logits (torchtyping.TensorType[n_components]) –
alpha_r (torchtyping.TensorType[n_components]) –
beta_r (torchtyping.TensorType[n_components]) –
alpha_theta (torchtyping.TensorType[n_components]) –
beta_theta (torchtyping.TensorType[n_components]) –
- log_prob(sampled_actions)
- Parameters
sampled_actions (torchtyping.TensorType[batch_size, 2]) –
- Return type
torchtyping.TensorType[batch_size]
- sample(sample_shape=torch.Size())
- Parameters
sample_shape (torch.Size) –
- Return type
torchtyping.TensorType[QuarterDisk.sample.sample_shape, 2]
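Sampling from a quarter disk with Beta-distributed radius and angle can be sketched as follows, in plain Python rather than tensors. The function name is hypothetical, and the sketch assumes that a single mixture component is chosen and used for both the radius and the angle, which the description above does not spell out.

```python
import math
import random


def sample_quarter_disk(delta, mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta, rng):
    """Draw one point (x, y) in the northeastern quarter disk of radius delta."""
    weights = [math.exp(logit) for logit in mixture_logits]  # unnormalized softmax
    k = rng.choices(range(len(weights)), weights=weights)[0]  # shared component index
    r = delta * rng.betavariate(alpha_r[k], beta_r[k])  # radius in [0, delta]
    theta = (math.pi / 2) * rng.betavariate(alpha_theta[k], beta_theta[k])  # angle in [0, pi/2]
    return r * math.cos(theta), r * math.sin(theta)


rng = random.Random(0)
points = [
    sample_quarter_disk(0.25, [0.0, 1.0], [1.0, 2.0], [1.0, 2.0], [1.0, 5.0], [1.0, 2.0], rng)
    for _ in range(100)
]
```

Every sampled point has non-negative coordinates and a norm of at most delta, which matches the support described above.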
- gfn.gym.helpers.box_utils.split_PF_module_output(output, n_comp_max)
Splits the module output into the expected parameter sets.
- Parameters
output (torchtyping.TensorType[batch_shape, output_dim, float]) – the module_output from the P_F model.
n_comp_max (int) – the larger of n_components and n_components_s0.
- Returns
exit_probability: A probability unique to QuarterCircleWithExit.
mixture_logits: Parameters shared by QuarterDisk and QuarterCircleWithExit.
alpha_r: Parameters shared by QuarterDisk and QuarterCircleWithExit.
beta_r: Parameters shared by QuarterDisk and QuarterCircleWithExit.
alpha_theta: Parameters unique to QuarterDisk.
beta_theta: Parameters unique to QuarterDisk.
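The splitting step can be illustrated on a single flat output row of length 1 + 5 * n_comp_max. The sketch below is plain Python with a hypothetical function name. It assumes the values are laid out in the order the return values are listed (exit probability first, then five equal-sized chunks), and it skips any activation or clamping the real function may apply.

```python
def split_flat_output(row, n_comp_max):
    """Split one row of length 1 + 5 * n_comp_max into six parameter groups."""
    expected = 1 + 5 * n_comp_max
    if len(row) != expected:
        raise ValueError(f"expected {expected} values, got {len(row)}")
    exit_probability = row[0]
    # Five consecutive chunks of n_comp_max values each (assumed ordering).
    chunks = [row[1 + i * n_comp_max : 1 + (i + 1) * n_comp_max] for i in range(5)]
    mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta = chunks
    return exit_probability, mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta


# Example with n_comp_max = 2, so the row holds 1 + 5 * 2 = 11 values.
parts = split_flat_output(list(range(11)), 2)
```

Sizing every chunk by n_comp_max lets the same output row serve both the s_0 distribution and the later-state distribution, even when n_components and n_components_s0 differ.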