examples.train_box

This script reproduces some of the published results on the Box environment. Run one of the following commands to reproduce results from [A theory of continuous generative flow networks](https://arxiv.org/abs/2301.12594):

python train_box.py --delta {0.1, 0.25} --tied {--uniform_pb} --loss {TB, DB}
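As a rough illustration of how the flags in the command above could be declared, here is a minimal `argparse` sketch. The argument types, defaults, and the restriction to the listed values are assumptions made for this example. The script's actual `parser` attribute may accept more options or use different defaults.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Train on the Box environment (sketch)")
    # Assumption: --delta is a float restricted to the two values listed above.
    parser.add_argument("--delta", type=float, choices=[0.1, 0.25], default=0.25)
    # Assumption: --tied and --uniform_pb are boolean switches.
    parser.add_argument("--tied", action="store_true")
    parser.add_argument("--uniform_pb", action="store_true")
    # Assumption: only the two losses listed above are accepted.
    parser.add_argument("--loss", choices=["TB", "DB"], default="TB")
    return parser


# Equivalent to: python train_box.py --delta 0.1 --tied --loss DB
args = build_parser().parse_args(["--delta", "0.1", "--tied", "--loss", "DB"])
```

Passing an explicit argument list to `parse_args` makes the sketch easy to try in an interpreter without touching `sys.argv`.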

Module Contents

Functions

estimate_jsd(kde1, kde2)

Estimate Jensen-Shannon divergence between two distributions defined by KDEs

get_test_states([n, maxi])

Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.

main(args)

sample_from_reward(env, n_samples)

Samples states from the true reward distribution

Attributes

parser

examples.train_box.estimate_jsd(kde1, kde2)

Estimate Jensen-Shannon divergence between two distributions defined by KDEs

Returns

A float value of the estimated JSD
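To make the quantity concrete, the sketch below computes a Jensen-Shannon divergence from two arrays of log-densities evaluated on a shared set of test points. This is an assumption about how a KDE-based estimate can be discretized, not the function's confirmed implementation. The helper names `normalize_log_density` and `jsd_from_log_densities` are hypothetical, and the step of evaluating the KDEs themselves is left out.

```python
import numpy as np


def normalize_log_density(log_dens):
    """Shift log-densities so that exp(result) sums to 1 over the test points."""
    log_dens = np.asarray(log_dens, dtype=float)
    peak = log_dens.max()
    # Numerically stable log-sum-exp.
    log_norm = peak + np.log(np.exp(log_dens - peak).sum())
    return log_dens - log_norm


def jsd_from_log_densities(log_p, log_q):
    """Discrete JSD (in nats) between two distributions given as log-densities."""
    log_p = normalize_log_density(log_p)
    log_q = normalize_log_density(log_q)
    p, q = np.exp(log_p), np.exp(log_q)
    # Log of the mixture m = (p + q) / 2, computed without leaving log space.
    log_m = np.logaddexp(log_p, log_q) - np.log(2.0)
    kl_pm = np.sum(p * (log_p - log_m))
    kl_qm = np.sum(q * (log_q - log_m))
    return float(0.5 * (kl_pm + kl_qm))
```

With natural logarithms the result lies between 0 (identical distributions) and log 2 (disjoint supports).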

examples.train_box.get_test_states(n=100, maxi=1.0)

Create a list of states from [0, 1]^2 by discretizing it into an n x n grid.

Returns

A numpy array of shape (n^2, 2) containing the test states.
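One way to build such a grid with NumPy is sketched below. The function name `make_grid_states` is hypothetical, and the choice to include both endpoints 0 and `maxi` is an assumption. The actual function may offset the lower bound or space the points differently.

```python
import numpy as np


def make_grid_states(n=100, maxi=1.0):
    """Return an (n * n, 2) array of grid points covering [0, maxi]^2."""
    # Assumption: points are evenly spaced and include both endpoints.
    axis = np.linspace(0.0, maxi, n)
    xx, yy = np.meshgrid(axis, axis)
    # Flatten both coordinate grids and pair them up row by row.
    return np.stack([xx.ravel(), yy.ravel()], axis=-1)
```

For example, `make_grid_states(n=4)` yields 16 distinct points, one per grid cell corner.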

examples.train_box.main(args)
examples.train_box.parser
examples.train_box.sample_from_reward(env, n_samples)

Samples states from the true reward distribution

Implements rejection sampling with a uniform proposal distribution over [0, 1]^2.

Returns

A numpy array of shape (n_samples, 2) containing the sampled states
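The rejection sampling scheme described above can be sketched in plain NumPy as follows. This is an illustration, not the function itself: `rejection_sample`, `toy_reward`, and the `reward_max` argument are hypothetical stand-ins, since how the reward and its upper bound are obtained from `env` is not shown here.

```python
import numpy as np


def rejection_sample(reward_fn, reward_max, n_samples, seed=0):
    """Draw n_samples points from [0, 1]^2 with density proportional to reward_fn."""
    rng = np.random.default_rng(seed)
    accepted = []
    n_accepted = 0
    while n_accepted < n_samples:
        # Uniform proposals over the unit square.
        proposals = rng.uniform(0.0, 1.0, size=(n_samples, 2))
        # Accept each proposal with probability reward / reward_max.
        thresholds = rng.uniform(0.0, reward_max, size=n_samples)
        keep = proposals[thresholds < reward_fn(proposals)]
        accepted.append(keep)
        n_accepted += len(keep)
    return np.concatenate(accepted)[:n_samples]


def toy_reward(states):
    """Hypothetical reward: 1 everywhere, 2 in the corner where x, y > 0.5."""
    in_corner = np.all(states > 0.5, axis=1)
    return 1.0 + in_corner.astype(float)


samples = rejection_sample(toy_reward, reward_max=2.0, n_samples=2000)
```

The bound `reward_max` must be at least the largest reward value on the square; a looser bound stays correct but rejects more proposals.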

Parameters