`gfn.gym.helpers.box_utils`

This file contains utility functions for the Box environment.

Module Contents

Classes

`BoxPBEstimator`	Estimator for P_B for the Box environment. Uses the QuarterCircle(northeastern=False) distribution
`BoxPBNeuralNet`	A deep neural network for the backward policy.
`BoxPBUniform`	A module to be used to create a uniform PB distribution for the Box environment
`BoxPFEstimator`	Estimator for P_F for the Box environment. Uses the BoxForwardDist distribution.
`BoxPFNeuralNet`	A deep neural network for the forward policy.
`BoxStateFlowModule`	A deep neural network for the state flow function.
`DistributionWrapper`
`QuarterCircle`	Represents distributions on quarter circles (or parts thereof), either the northeastern ones or the southwestern ones, centered at a point in (0, 1)^2.
`QuarterCircleWithExit`	Extends the QuarterCircle distribution with an extra parameter, exit_probability, of shape (n_states,).
`QuarterDisk`	Represents a distribution on the northeastern quarter disk centered at (0, 0) of maximal radius delta.

Functions

`split_PF_module_output`(output, n_comp_max)	Splits the module output into the expected parameter sets.

Attributes

`CLAMP`
`PI_2`
`PI_2_INV`

class gfn.gym.helpers.box_utils.BoxPBEstimator(env, module, n_components, min_concentration=0.1, max_concentration=2.0)

Bases: gfn.modules.GFNModule

Estimator for P_B for the Box environment. Uses the QuarterCircle(northeastern=False) distribution

Parameters

env (gfn.gym.Box) –
module (torch.nn.Module) –
n_components (int) –
min_concentration (float) –
max_concentration (float) –

expected_output_dim()

Expected output dimension of the module.

Return type: int

to_probability_distribution(states, module_output)

Transform the output of the module into a probability distribution.

The kwargs modify a base distribution, for example to encourage exploration.

Not all modules must implement this method, but it is required to define a policy from a module’s outputs. See DiscretePolicyEstimator for an example using a categorical distribution, but note this can be done for all continuous distributions as well.

Parameters

states (gfn.states.States) –
module_output (torchtyping.TensorType[batch_shape, output_dim, float]) –

Return type

torch.distributions.Distribution

class gfn.gym.helpers.box_utils.BoxPBNeuralNet(hidden_dim, n_hidden_layers, n_components, **kwargs)

Bases: gfn.utils.NeuralNet

A deep neural network for the backward policy.

Parameters

hidden_dim (int) –
n_hidden_layers (int) –
n_components (int) –

n_components: the number of components for each distribution parameter.

forward(preprocessed_states)

Forward method for the neural network.

Parameters: preprocessed_states (torchtyping.TensorType[batch_shape, 2, float]) – a batch of states appropriately preprocessed for ingestion by the MLP.
Return type: torchtyping.TensorType[batch_shape, 3 * n_components]

Returns: out, a set of continuous variables.
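The `3 * n_components` output width suggests three parameter groups of `n_components` values each. The sketch below shows one way to slice a single state's output row. The group order (mixture logits, then alpha, then beta) is an assumption that mirrors the `QuarterCircle` argument order, and `split_pb_row` is a hypothetical helper, not part of the library.

```python
from typing import List, Tuple


def split_pb_row(row: List[float], n_components: int) -> Tuple[List[float], List[float], List[float]]:
    """Slice one flat output row into three groups of n_components values.

    Hypothetical helper; the group order is an assumption.
    """
    if len(row) != 3 * n_components:
        raise ValueError(f"expected {3 * n_components} values, got {len(row)}")
    k = n_components
    mixture_logits = row[:k]
    alpha = row[k:2 * k]
    beta = row[2 * k:]
    return mixture_logits, alpha, beta


# One state, two mixture components: 3 * 2 = 6 values.
logits, alpha, beta = split_pb_row([0.1, 0.2, 1.0, 1.5, 2.0, 2.5], n_components=2)
print(logits, alpha, beta)
```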

class gfn.gym.helpers.box_utils.BoxPBUniform

Bases: torch.nn.Module

A module to be used to create a uniform PB distribution for the Box environment

A module that returns (1, 1, 1) for all states. Used with QuarterCircle, it leads to a uniform distribution over parents on the southwestern part of the circle.
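To see why a constant (1, 1, 1) output can give a uniform distribution, assume the three values are read as one mixture logit plus Beta parameters alpha = 1 and beta = 1 (an assumption about how the values are interpreted, not something stated above). A Beta(1, 1) density is constant on (0, 1), so every angle in the allowed range is equally likely. The helper `beta_pdf` is illustrative only.

```python
import math


def beta_pdf(x: float, alpha: float, beta: float) -> float:
    """Density of a Beta(alpha, beta) distribution at x in (0, 1)."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1.0) * (1.0 - x) ** (beta - 1.0)


# With alpha = beta = 1 the density does not depend on x.
densities = [beta_pdf(x, 1.0, 1.0) for x in (0.1, 0.5, 0.9)]
print(densities)
```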

input_dim = 2

forward(preprocessed_states)

Parameters: preprocessed_states (torchtyping.TensorType[batch_shape, 2, float]) –
Return type: torchtyping.TensorType[batch_shape, 3]

class gfn.gym.helpers.box_utils.BoxPFEstimator(env, module, n_components_s0, n_components, min_concentration=0.1, max_concentration=2.0)

Bases: gfn.modules.GFNModule

Estimator for P_F for the Box environment. Uses the BoxForwardDist distribution.

Parameters

env (gfn.gym.Box) –
module (torch.nn.Module) –
n_components_s0 (int) –
n_components (int) –
min_concentration (float) –
max_concentration (float) –

expected_output_dim()

Expected output dimension of the module.

Return type: int

to_probability_distribution(states, module_output)

Transform the output of the module into a probability distribution.

The kwargs modify a base distribution, for example to encourage exploration.

Not all modules must implement this method, but it is required to define a policy from a module’s outputs. See DiscretePolicyEstimator for an example using a categorical distribution, but note this can be done for all continuous distributions as well.

Parameters

states (gfn.states.States) –
module_output (torchtyping.TensorType[batch_shape, output_dim, float]) –

Return type

torch.distributions.Distribution

class gfn.gym.helpers.box_utils.BoxPFNeuralNet(hidden_dim, n_hidden_layers, n_components_s0, n_components, **kwargs)

Bases: gfn.utils.NeuralNet

A deep neural network for the forward policy.

Parameters

hidden_dim (int) –
n_hidden_layers (int) –
n_components_s0 (int) –
n_components (int) –

n_components_s0: the number of components for each s=0 distribution parameter.

n_components: the number of components for each s=t>0 distribution parameter.

PFs0: the parameters for the s=0 distribution.

forward(preprocessed_states)

Forward method for the neural network.

Parameters: preprocessed_states (torchtyping.TensorType[batch_shape, 2, float]) – a batch of states appropriately preprocessed for ingestion by the MLP.
Return type: torchtyping.TensorType[batch_shape, 1 + 5 * n_components]

Returns: out, a set of continuous variables.

class gfn.gym.helpers.box_utils.BoxStateFlowModule(logZ_value, **kwargs)

Bases: gfn.utils.NeuralNet

A deep neural network for the state flow function.

Parameters: logZ_value (torch.Tensor) –

forward(preprocessed_states)

Forward method for the neural network.

Parameters: preprocessed_states (torchtyping.TensorType[batch_shape, input_dim, float]) – a batch of states appropriately preprocessed for ingestion by the MLP.
Return type: torchtyping.TensorType[batch_shape, output_dim, float]

Returns: out, a set of continuous variables.

gfn.gym.helpers.box_utils.CLAMP

class gfn.gym.helpers.box_utils.DistributionWrapper(states, delta, epsilon, mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta, exit_probability, n_components, n_components_s0)

Bases: torch.distributions.Distribution

Parameters

states (gfn.states.States) –
delta (float) –
epsilon (float) –

log_prob(sampled_actions)

sample(sample_shape=())

gfn.gym.helpers.box_utils.PI_2

gfn.gym.helpers.box_utils.PI_2_INV

class gfn.gym.helpers.box_utils.QuarterCircle(delta, northeastern, centers, mixture_logits, alpha, beta)

Bases: torch.distributions.Distribution

Represents distributions on quarter circles (or parts thereof), either the northeastern ones or the southwestern ones, centered at a point in (0, 1)^2. Each distribution is a mixture of Beta distributions over the possible angle range.

When a state is of norm <= delta, and northeastern=False, then the distribution is a Dirac at the state (i.e. the only possible parent is s_0).

Adapted from https://github.com/saleml/continuous-gfn/blob/master/sampling.py

This is useful for the Box environment.
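The description above can be illustrated with a minimal standard-library sketch for a single state. It is not the library's implementation: it always uses the full angle range [0, pi/2], so it ignores both the restriction computed by `get_min_and_max_angles` and the Dirac case for states of norm <= delta. The function name `sample_quarter_circle` and the `rng` argument are hypothetical.

```python
import math
import random
from typing import Sequence, Tuple


def sample_quarter_circle(
    center: Tuple[float, float],
    delta: float,
    northeastern: bool,
    mixture_logits: Sequence[float],
    alpha: Sequence[float],
    beta: Sequence[float],
    rng: random.Random,
) -> Tuple[float, float]:
    """Draw one point at distance delta from center (simplified sketch)."""
    # Softmax weights over components (shifted by the max for stability).
    m = max(mixture_logits)
    weights = [math.exp(logit - m) for logit in mixture_logits]
    k = rng.choices(range(len(weights)), weights=weights)[0]
    # Beta sample in [0, 1], rescaled to an angle in [0, pi/2].
    theta = rng.betavariate(alpha[k], beta[k]) * math.pi / 2.0
    # Northeastern arcs add the step, southwestern arcs subtract it.
    sign = 1.0 if northeastern else -1.0
    return (
        center[0] + sign * delta * math.cos(theta),
        center[1] + sign * delta * math.sin(theta),
    )


rng = random.Random(0)
parent = sample_quarter_circle((0.5, 0.5), 0.1, False, [0.0, 0.0], [1.0, 2.0], [1.0, 2.0], rng)
print(parent)
```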

Parameters

delta (float) –
northeastern (bool) –
centers (torchtyping.TensorType[n_states, 2]) –
mixture_logits (torchtyping.TensorType[n_states, n_components]) –
alpha (torchtyping.TensorType[n_states, n_components]) –
beta (torchtyping.TensorType[n_states, n_components]) –

get_min_and_max_angles()

Return type: Tuple[torchtyping.TensorType[n_states], torchtyping.TensorType[n_states]]

log_prob(sampled_actions)

Parameters: sampled_actions (torchtyping.TensorType[batch_size, 2]) –
Return type: torchtyping.TensorType[batch_size]

sample(sample_shape=torch.Size())

Parameters: sample_shape (torch.Size) –
Return type: torchtyping.TensorType[sample_shape, 2]

class gfn.gym.helpers.box_utils.QuarterCircleWithExit(delta, centers, exit_probability, mixture_logits, alpha, beta, epsilon=0.0001)

Bases: torch.distributions.Distribution

Extends the QuarterCircle distribution with an extra parameter, exit_probability, of shape (n_states,). When sampling, the exit action [-inf, -inf] is drawn with probability exit_probability. The log_prob function is adjusted accordingly.
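A rough single-state sketch of how an exit probability can be combined with a base distribution follows. It assumes the standard two-way mixture: exit with probability p, otherwise defer to the base distribution weighted by 1 - p. It also assumes 0 < p < 1 and does not model the `epsilon` parameter. All function names are hypothetical.

```python
import math
import random
from typing import Callable, Tuple

Action = Tuple[float, float]
EXIT_ACTION: Action = (-math.inf, -math.inf)


def sample_with_exit(exit_probability: float, sample_base: Callable[[], Action], rng: random.Random) -> Action:
    """Return the exit action with probability exit_probability, else a base sample."""
    if rng.random() < exit_probability:
        return EXIT_ACTION
    return sample_base()


def log_prob_with_exit(action: Action, exit_probability: float, base_log_prob: Callable[[Action], float]) -> float:
    """Log-probability under the exit/non-exit mixture (assumes 0 < p < 1)."""
    if action == EXIT_ACTION:
        return math.log(exit_probability)
    # Non-exit actions carry the remaining mass times the base density.
    return math.log1p(-exit_probability) + base_log_prob(action)


rng = random.Random(0)
action = sample_with_exit(0.25, lambda: (0.05, 0.05), rng)
print(action, log_prob_with_exit(action, 0.25, lambda a: 0.0))
```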

Parameters

delta (float) –
centers (torchtyping.TensorType[n_states, 2]) –
exit_probability (torchtyping.TensorType[n_states]) –
mixture_logits (torchtyping.TensorType[n_states, n_components]) –
alpha (torchtyping.TensorType[n_states, n_components]) –
beta (torchtyping.TensorType[n_states, n_components]) –
epsilon (float) –

log_prob(sampled_actions)

sample(sample_shape=())

class gfn.gym.helpers.box_utils.QuarterDisk(delta, mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta)

Bases: torch.distributions.Distribution

Represents a distribution on the northeastern quarter disk centered at (0, 0) of maximal radius delta. The radius and the angle each follow a mixture of Beta distributions.

Adapted from https://github.com/saleml/continuous-gfn/blob/master/sampling.py

This is useful for the Box environment.
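As an illustration of the description above, the following standard-library sketch draws one action from a quarter disk. The radius is a Beta sample scaled by delta and the angle is a Beta sample scaled to [0, pi/2]. Using a single component draw for both radius and angle is an assumption, and `sample_quarter_disk` is a hypothetical name.

```python
import math
import random
from typing import Sequence, Tuple


def sample_quarter_disk(
    delta: float,
    mixture_logits: Sequence[float],
    alpha_r: Sequence[float],
    beta_r: Sequence[float],
    alpha_theta: Sequence[float],
    beta_theta: Sequence[float],
    rng: random.Random,
) -> Tuple[float, float]:
    """Draw one point in the northeastern quarter disk of radius delta."""
    # Pick a mixture component from softmax weights.
    m = max(mixture_logits)
    weights = [math.exp(logit - m) for logit in mixture_logits]
    k = rng.choices(range(len(weights)), weights=weights)[0]
    # Radius in [0, delta] and angle in [0, pi/2], both from Beta samples.
    r = delta * rng.betavariate(alpha_r[k], beta_r[k])
    theta = (math.pi / 2.0) * rng.betavariate(alpha_theta[k], beta_theta[k])
    return (r * math.cos(theta), r * math.sin(theta))


rng = random.Random(0)
first_step = sample_quarter_disk(0.25, [0.0, 1.0], [1.0, 2.0], [1.0, 2.0], [1.0, 3.0], [1.0, 3.0], rng)
print(first_step)
```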

Parameters

delta (float) –
mixture_logits (torchtyping.TensorType[n_components]) –
alpha_r (torchtyping.TensorType[n_components]) –
beta_r (torchtyping.TensorType[n_components]) –
alpha_theta (torchtyping.TensorType[n_components]) –
beta_theta (torchtyping.TensorType[n_components]) –

log_prob(sampled_actions)

Parameters: sampled_actions (torchtyping.TensorType[batch_size, 2]) –
Return type: torchtyping.TensorType[batch_size]

sample(sample_shape=torch.Size())

Parameters: sample_shape (torch.Size) –
Return type: torchtyping.TensorType[sample_shape, 2]

gfn.gym.helpers.box_utils.split_PF_module_output(output, n_comp_max)

Splits the module output into the expected parameter sets.

Parameters

output (torchtyping.TensorType[batch_shape, output_dim, float]) – the module_output from the P_F model.
n_comp_max (int) – the larger number of the two n_components and n_components_s0.

Returns

exit_probability: A probability unique to QuarterCircleWithExit.
mixture_logits: Parameters shared by QuarterDisk and QuarterCircleWithExit.
alpha_r: Parameters shared by QuarterDisk and QuarterCircleWithExit.
beta_r: Parameters shared by QuarterDisk and QuarterCircleWithExit.
alpha_theta: Parameters unique to QuarterDisk.
beta_theta: Parameters unique to QuarterDisk.
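The six parameter sets named in the Returns entry suggest a layout of one exit value followed by five groups of `n_comp_max` values, which gives a width of 1 + 5 * n_comp_max. The sketch below slices a single flat row under that assumption. The real function operates on batched tensors, and both the ordering used here and the helper name `split_pf_row` are assumptions for illustration.

```python
from typing import List, Tuple


def split_pf_row(row: List[float], n_comp_max: int) -> Tuple[float, List[float], List[float], List[float], List[float], List[float]]:
    """Slice one flat row into an exit value and five groups (assumed layout)."""
    expected = 1 + 5 * n_comp_max
    if len(row) != expected:
        raise ValueError(f"expected {expected} values, got {len(row)}")
    exit_probability = row[0]
    n = n_comp_max
    # Five consecutive groups of n values after the first entry.
    groups = [row[1 + i * n: 1 + (i + 1) * n] for i in range(5)]
    mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta = groups
    return exit_probability, mixture_logits, alpha_r, beta_r, alpha_theta, beta_theta


# n_comp_max = 2 gives 1 + 5 * 2 = 11 values per row.
parts = split_pf_row([float(i) for i in range(11)], n_comp_max=2)
print(parts)
```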
