
SNLE and SNRE fail with a 1D prior #1283

Open
paarth-dudani opened this issue Sep 24, 2024 · 8 comments
Labels: bug (Something isn't working), urgent


@paarth-dudani commented Sep 24, 2024

Describe the bug
I am running SNRE and SNLE (from the implemented algorithms) on a simple exponential simulator with additive noise and a prior. Both algorithms work fine with a uniform prior, but with an exponential prior they fail with the error 'number of categories cannot exceed 2^24'.

To Reproduce

  1. Versions
    Python version: 3.9.13
    SBI version: 0.23.1

  2. Code for the SNLE implementation; I get the same error for SNRE as well (with the inference object NRE):

import numpy as np
import torch
from torch.distributions import Exponential
from sbi.inference import NLE

## Function to introduce noise into the exponential:
def noisy_exponential(scale, input):
    # Evaluate the exponential decay at each time point, add Gaussian noise,
    # and clip negative values to zero.
    x_noise = np.zeros(len(input))
    for i in range(len(input)):
        value = 1/scale * np.exp(-input[i]/scale) + 0.05*np.random.normal(0, 0.5)
        if value < 0:
            x_noise[i] = 0
        else:
            x_noise[i] = value
    return x_noise
n = 200
t = np.linspace(0,8,n)
# Define the prior
num_dims = 1
num_sims = 100
num_rounds = 2
theta_true = torch.tensor([2.0])


# prior = BoxUniform(low=0*torch.zeros(num_dims), high=4*torch.ones(num_dims))
exp_prior = Exponential(torch.tensor([2.0]))
simulator = lambda theta: ((1/theta)*np.exp(-t/theta) + 0.05*np.random.normal(0,0.5)).float()
x_o = torch.tensor(noisy_exponential(theta_true,t)).float()
# Inference 
inference = NLE(exp_prior)
proposal = exp_prior
for _ in range(num_rounds):
    theta = proposal.sample((num_sims,))
    x = simulator(theta)
    _ = inference.append_simulations(theta, x).train()
    posterior = inference.build_posterior(mcmc_method="slice_np_vectorized",
                                          mcmc_parameters={"num_chains": 2,
                                                           "thin": 5})
    proposal = posterior.set_default_x(x_o)

  3. RuntimeError: number of categories cannot exceed 2^24 (raised after the neural network successfully converges):
RuntimeError                              Traceback (most recent call last)
Input In [43], in <cell line: 17>()
     16 proposal = exp_prior
     17 for _ in range(num_rounds):
---> 18     theta = proposal.sample((num_sims,))
     19     x = simulator(theta)
     20     _ = inference.append_simulations(theta, x).train()

File ~/anaconda3/lib/python3.9/site-packages/sbi/inference/posteriors/mcmc_posterior.py:306, in MCMCPosterior.sample(self, sample_shape, x, method, thin, warmup_steps, num_chains, init_strategy, init_strategy_parameters, init_strategy_num_candidates, mcmc_parameters, mcmc_method, sample_with, num_workers, mp_context, show_progress_bars)
    303 init_strategy = _maybe_use_dict_entry(init_strategy, "init_strategy", m_p)
    304 self.potential_ = self._prepare_potential(method)  # type: ignore
--> 306 initial_params = self._get_initial_params(
    307     init_strategy,  # type: ignore
    308     num_chains,  # type: ignore
    309     num_workers,
    310     show_progress_bars,
    311     **init_strategy_parameters,
    312 )
    313 num_samples = torch.Size(sample_shape).numel()
    315 track_gradients = method in ("hmc_pyro", "nuts_pyro", "hmc_pymc", "nuts_pymc")

File ~/anaconda3/lib/python3.9/site-packages/sbi/inference/posteriors/mcmc_posterior.py:618, in MCMCPosterior._get_initial_params(self, init_strategy, num_chains, num_workers, show_progress_bars, **kwargs)
    615     initial_params = torch.cat(initial_params)  # type: ignore
    616 else:
...
File ~/anaconda3/lib/python3.9/site-packages/sbi/samplers/mcmc/init_strategy.py:112
--> 112 idxs = torch.multinomial(probs, 1, replacement=False)
    113 # Return transformed sample.
    114 return transform(init_param_candidates[idxs, :])

RuntimeError: number of categories cannot exceed 2^24

Expected behavior
I expect the network to undergo multiple rounds of training (2 in this example), or at least to give me a pairplot after one round of training (pairplot code not shown above).

@paarth-dudani added the bug label on Sep 24, 2024
@michaeldeistler (Contributor)

Hi there,

thanks a lot for reporting this! The following will fix it:

class WrappedExponential(Exponential):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def log_prob(self, *args, **kwargs):
        # Squeeze out the trailing dimension so log-probs have shape (num_samples,).
        log_probs = Exponential.log_prob(self, *args, **kwargs)
        return log_probs.squeeze()


exp_prior = WrappedExponential(torch.tensor([2.0]))

The reason the issue was happening is that Exponential.log_prob returns a tensor of shape (num_samples, 1), whereas any multivariate torch distribution (e.g., MultivariateNormal) returns a tensor of shape (num_samples,).
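
Concretely, a quick shape check with the wrapper above (a minimal sketch reusing the exp_prior just defined):

samples = exp_prior.sample((10,))  # (10, 1): sample dimension plus event dimension
Exponential(torch.tensor([2.0])).log_prob(samples).shape  # torch.Size([10, 1])
exp_prior.log_prob(samples).shape  # torch.Size([10]) after the squeeze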

I will leave this issue open though because we should deal with this in process_prior, but we currently do not.

I hope that the above fixes the issue! Let me know if you have any more questions!

All the best
Michael

@michaeldeistler changed the title from "SNLE and SNRE work fine with a uniform prior but give error with an exponential prior" to "SNLE and SNRE fail with a 1D prior" on Sep 24, 2024
@michaeldeistler (Contributor) commented Sep 24, 2024

More notes for future fixing on our side:

This is only an issue for 1D pytorch distributions. The problem is that, e.g., torch.distributions.Exponential, torch.distributions.Normal, and torch.distributions.Uniform all return either 1) no sample dimension and no log-prob dimension, or 2) both a sample dimension and a log-prob dimension. However, for sbi, we need a sample dimension but no log-prob dimension.

from torch.distributions import Exponential, MultivariateNormal

prior = Exponential(torch.tensor(2.0))
samples = prior.sample((10,))  # (10,)
log_probs = prior.log_prob(samples)  # (10,)
# `sbi` raises an error because one must have a sample dimension.

prior = Exponential(torch.tensor([2.0]))
samples = prior.sample((10,))  # (10, 1)
log_probs = prior.log_prob(samples)  # (10, 1)
# `sbi` fails because the log_prob dimension contains the data dim.

prior = MultivariateNormal(torch.tensor([2.0]), torch.tensor([[2.0]]))  # covariance must be 2D
samples = prior.sample((10,))  # (10, 1)
log_probs = prior.log_prob(samples)  # (10,)
# `sbi` works.

IMO, the easiest fix would be to introduce the following:

from typing import Dict

import torch
from torch import Tensor
from torch.distributions import constraints


class OneDimDistributionWrapper(torch.distributions.Distribution):
    def __init__(self, dist, *args, **kwargs):
        super().__init__()
        self.dist = dist

    def sample(self, *args, **kwargs):
        return self.dist.sample(*args, **kwargs)

    def log_prob(self, *args, **kwargs):
        # Remove the trailing (event) dimension from the log-prob.
        return self.dist.log_prob(*args, **kwargs)[..., 0]

    @property
    def arg_constraints(self) -> Dict[str, constraints.Constraint]:
        return self.dist.arg_constraints

    @property
    def support(self):
        return self.dist.support

    @property
    def mean(self) -> Tensor:
        return self.dist.mean

    @property
    def variance(self) -> Tensor:
        return self.dist.variance

We could then use this to wrap 1D distributions that have a sample dimension, as sketched below.
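
For instance, a minimal usage sketch (assuming the wrapper above and a 1D Exponential prior that carries a sample dimension):

base = Exponential(torch.tensor([2.0]))
wrapped_prior = OneDimDistributionWrapper(base)

samples = wrapped_prior.sample((10,))        # (10, 1)
log_probs = wrapped_prior.log_prob(samples)  # (10,): trailing dimension removed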

@paarth-dudani (Author)

Hi Michael!

I am pleasantly surprised by your prompt response!

I made the first change you suggested and it did resolve the issue! However, I am now facing new issues: the network sometimes fails to converge, seemingly at random.

Thanks for the help!!

Result for SNLE:

[screenshot: SNLE posterior pairplot]

Result for SNRE (around 1/3rd of the time):

[screenshot: SNRE posterior pairplot]

@michaeldeistler (Contributor)

I don't understand what is being shown in the plot.

@paarth-dudani (Author) commented Sep 25, 2024

The plot shows the posterior over the scale parameter of the exponential, learned by SNLE/SNRE (SNLE code below), together with the true value of theta.

class WrappedExponential(Exponential):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def log_prob(self, *args, **kwargs):
        log_probs = Exponential.log_prob(self, *args, **kwargs)
        return log_probs.squeeze()


exp_prior_mod = WrappedExponential(torch.tensor([2.0]))
simulator = lambda theta: ((1/theta)*np.exp(-t/theta) + 0.05*np.random.normal(0,0.5)).float()

x_o = torch.tensor(noisy_exponential(theta_true,t)).float()
# Inference with the relevant SNE method: Here, SNLE

inference = NLE(exp_prior_mod)
proposal = exp_prior_mod
for _ in range(num_rounds):
    theta = proposal.sample((num_sims,))
    x = simulator(theta)
    _ = inference.append_simulations(theta, x).train(validation_fraction=0.1, retrain_from_scratch=True)
    posterior = inference.build_posterior(mcmc_method="slice_np_vectorized",
                                          mcmc_parameters={"num_chains": 5,
                                                           "thin": 5})
    proposal = posterior.set_default_x(x_o)

# Sample from the posterior and plot

posterior_samples_exp = posterior.sample((200,), x=x_o)

_ = pairplot(
    posterior_samples_exp, limits=[[0, 10]], figsize=(5, 5),
    labels=[r"$\theta$"],
    points=theta_true # add ground truth thetas
)

An example of a desired result comes from the SNPE_C implementation, as shown below:

[screenshot: SNPE_C posterior pairplot]

@michaeldeistler (Contributor)

I looked into this and found three issues with your model:

  1. Your simulator has a bug: 0.05*np.random.normal(0,0.5) draws a single scalar, so the same noise value is added for every theta, which is not what you want. I changed the line to
simulator = lambda theta: ((1/theta)*torch.exp(-t/theta) + 0.5 * torch.randn_like(theta)).float()
  2. x[:, 0] can have very large values, sometimes up to thousands. These kinds of extreme outliers will destroy neural network training. I fixed this with:
theta = proposal.sample((num_sims,))
x = simulator(theta)
x[x > 5.0] = 5.0
  3. With n=200, the posterior is very, very narrow. I could imagine that the MCMC methods used in SNLE and SNRE struggle with this. I used n=10. A sketch combining all three fixes follows this list.
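
A minimal sketch of the corrected loop under these assumptions (WrappedExponential prior exp_prior from above; x_o regenerated at n=10):

n = 10
t = torch.linspace(0, 8, n)
# Noise is now drawn per theta via torch.randn_like instead of once per call.
simulator = lambda theta: ((1/theta)*torch.exp(-t/theta) + 0.5 * torch.randn_like(theta)).float()
x_o = torch.tensor(noisy_exponential(theta_true, t.numpy())).float()

inference = NLE(exp_prior)
proposal = exp_prior
for _ in range(num_rounds):
    theta = proposal.sample((num_sims,))
    x = simulator(theta)
    x[x > 5.0] = 5.0  # clip extreme outliers before training
    _ = inference.append_simulations(theta, x).train()
    posterior = inference.build_posterior(mcmc_method="slice_np_vectorized",
                                          mcmc_parameters={"num_chains": 2,
                                                           "thin": 5})
    proposal = posterior.set_default_x(x_o)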

With all of these fixes, I get the results below:
[image: posterior pairplot after the fixes]

As you can see, even with n=10, the posterior is very narrow.

Hope this helps
Michael

@paarth-dudani (Author)

Thank you so much for your suggestions!

I did try them out. However, the estimates I am getting from each of the three algorithms (SNPE_C, SNLE, SNRE_B) are somewhat unreliable: I intermittently get bad posterior estimates, and the good estimates are not reproducible.

Below are some examples of pathological posterior estimates:

  1. SNPE

[screenshots: three SNPE posterior pairplots]

  2. SNLE

[screenshots: three SNLE posterior pairplots]

  3. SNRE

[screenshots: three SNRE posterior pairplots]

It would be great if you could give some general suggestions, as my ultimate aim is to implement these algorithms for a 21-parameter ODE model.

@paarth-dudani (Author) commented Oct 3, 2024

I am sorry for not attaching the code for this issue, so here goes:

  1. SNRE:
exp_prior_mod = WrappedExponential(torch.tensor([2.0]))
simulator = lambda theta: ((1/theta)*np.exp(-t/theta) + 0.05*torch.rand_like(theta)).float()
x_o = torch.tensor(noisy_exponential(theta_true,t)).float()


from sbi.inference import NRE
# torch.manual_seed(0)
# np.random.seed(0)
# inference = NRE(prior)
# proposal = prior
inference = NRE(exp_prior_mod)
proposal = exp_prior_mod
for _ in range(num_rounds):
    theta = proposal.sample((num_sims,))
    print(np.shape(theta))
    x = simulator(theta)
    x[x>10] = 10
    _ = inference.append_simulations(theta, x).train()
    posterior = inference.build_posterior(mcmc_method="slice_np_vectorized",
                                          mcmc_parameters={"num_chains": 10,
                                                           "thin": 5})
    proposal = posterior.set_default_x(x_o)
  2. SNLE: With the same prior and simulator model:
inference = NLE(exp_prior_mod)
proposal = exp_prior_mod
for _ in range(num_rounds):
    theta = proposal.sample((num_sims,))
    x = simulator(theta)
    x[x>10] = 10
    _ = inference.append_simulations(theta, x).train(validation_fraction=0.1, retrain_from_scratch=False)
    posterior = inference.build_posterior(mcmc_method="slice_np_vectorized",
                                          mcmc_parameters={"num_chains": 10,
                                                           "thin": 5})
    proposal = posterior.set_default_x(x_o)
  3. SNPE: With the same prior and simulator model:
inference = NPE(prior=exp_prior_mod, density_estimator='mdn')
proposal = exp_prior_mod

# Later rounds use the APT loss.
for _ in range(num_rounds):
    theta = proposal.sample((num_sims,))
    x = simulator(theta)
    x[x>10] = 10
    _ = inference.append_simulations(theta, x, proposal=proposal).train(validation_fraction=0.1, use_combined_loss=True, retrain_from_scratch=True)
    posterior = inference.build_posterior().set_default_x(x_o)
    proposal = posterior

I had another question:
What changes can I make to the code to deal with the posterior estimate's overconfident underestimation of the standard deviation? Setting retrain_from_scratch=True was one of my attempts at that.

All three setups give pathological outputs, as shown in the previous comment. I would appreciate any suggestions!
