Simuk is a Python library for simulation-based calibration (SBC) and the generation of synthetic data.
Prior Simulation-Based Calibration (Prior SBC) is a method for validating Bayesian inference by checking whether the posterior distributions align with the expected theoretical results derived from the prior.
Posterior Simulation-Based Calibration (Posterior SBC) is a method for validating Bayesian inference by checking whether the posterior distributions conditioned on the augmented data (original + posterior predictive) align with the expected theoretical results derived from the posterior.
For Prior SBC, Simuk works with PyMC, Bambi and NumPyro models. For Posterior SBC, Simuk only works with PyMC models for now.
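To make the Prior SBC idea concrete, here is a minimal NumPy-only sketch of the rank check on a conjugate normal model, where the posterior is known exactly. It does not use Simuk or reflect its internals; the model, the variable names, and the settings are illustrative assumptions. Each simulation draws a parameter from the prior, simulates data from it, samples the posterior, and records the rank of the prior draw among the posterior draws. With correct inference these ranks are uniformly distributed.

```python
import numpy as np

rng = np.random.default_rng(0)

num_simulations = 1000  # number of prior draws, one simulated dataset each
num_draws = 99          # posterior draws per simulation, so ranks fall in 0..99
n_obs = 10              # observations per simulated dataset
prior_mu, prior_sigma = 0.0, 1.0  # assumed prior: theta ~ Normal(0, 1)
obs_sigma = 2.0                   # assumed likelihood: y ~ Normal(theta, 2)

ranks = np.empty(num_simulations, dtype=int)
for i in range(num_simulations):
    # 1. Draw a "true" parameter from the prior.
    theta_true = rng.normal(prior_mu, prior_sigma)
    # 2. Simulate a dataset from that parameter.
    y = rng.normal(theta_true, obs_sigma, size=n_obs)
    # 3. Sample the posterior (exact here, thanks to conjugacy).
    post_prec = 1.0 / prior_sigma**2 + n_obs / obs_sigma**2
    post_mean = (prior_mu / prior_sigma**2 + y.sum() / obs_sigma**2) / post_prec
    post_sd = post_prec**-0.5
    draws = rng.normal(post_mean, post_sd, size=num_draws)
    # 4. Record the rank of the prior draw among the posterior draws.
    ranks[i] = np.sum(draws < theta_true)

# With correct inference, each of the 100 possible ranks is equally likely.
counts = np.bincount(ranks, minlength=num_draws + 1)
print(counts)
```

In practice the posterior is not available in closed form, so step 3 is replaced by running the sampler, which is why running many simulations takes time.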
Simuk can be installed with pip:

    pip install simuk

- Define a PyMC or Bambi model. For example, the centered eight schools model:

      import numpy as np
      import pymc as pm
      from arviz_plots import plot_ecdf_pit

      data = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
      sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

      with pm.Model() as centered_eight:
          mu = pm.Normal('mu', mu=0, sigma=5)
          tau = pm.HalfCauchy('tau', beta=5)
          theta = pm.Normal('theta', mu=mu, sigma=tau, shape=8)
          y_obs = pm.Normal('y', mu=theta, sigma=sigma, observed=data)

- Pass the model to the SBC class and run the simulations. This will take a while, as it fits the model many times.

      import simuk

      sbc = simuk.SBC(
          centered_eight,
          num_simulations=100,  # ideally this should be higher, like 1000
          sample_kwargs={'draws': 100, 'tune': 100},
      )
      sbc.run_simulations()

      79%|███████▉ | 79/100 [05:36<01:29, 4.27s/it]

- Plot the empirical CDF (ECDF) for the difference between prior and posterior. If the inference is well calibrated, the lines stay close to uniform and within the oval envelope.

      plot_ecdf_pit(sbc.simulations, visuals={"xlabel": False})

We see that due to the funnel neck in the eight schools model, the inference algorithm is not well-calibrated, as indicated by the red points.
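For intuition about how to read such a plot, the following NumPy-only sketch computes the quantity being checked: the ECDF of a set of PIT values minus the uniform CDF. It is not how plot_ecdf_pit is implemented; the helper ecdf_difference and both samples are hypothetical, and the confidence envelope is left out. Calibrated PIT values give a difference close to zero everywhere, while an overconfident posterior pushes PIT values toward 0 and 1 and produces a clear departure.

```python
import numpy as np

rng = np.random.default_rng(1)


def ecdf_difference(pit, grid):
    """Return the ECDF of the PIT values minus the uniform CDF on `grid`."""
    pit = np.sort(pit)
    # Fraction of PIT values less than or equal to each grid point.
    ecdf = np.searchsorted(pit, grid, side="right") / pit.size
    return ecdf - grid


grid = np.linspace(0.0, 1.0, 101)

# Calibrated inference: PIT values are uniform on [0, 1].
calibrated = rng.uniform(size=5000)
# Overconfident inference (illustrative): PIT values pile up near 0 and 1.
overconfident = rng.beta(0.5, 0.5, size=5000)

max_dev_calibrated = np.abs(ecdf_difference(calibrated, grid)).max()
max_dev_overconfident = np.abs(ecdf_difference(overconfident, grid)).max()

print(f"max |ECDF - uniform|, calibrated:    {max_dev_calibrated:.3f}")
print(f"max |ECDF - uniform|, overconfident: {max_dev_overconfident:.3f}")
```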
Posterior SBC evaluates validity locally, conditional on observed data. It is
currently implemented for PyMC. This requires storing observed data in
pm.Data containers, using dims instead of static shapes, and resizing
covariates and coords in an update_data callback to match the augmented data.
- Define the model with pm.Data and dims:

      import numpy as np
      import pymc as pm

      data = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
      sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

      with pm.Model(coords={"school": np.arange(8)}) as centered_eight:
          school_idx = pm.Data("school_idx", np.arange(8))
          y_data = pm.Data("y_data", data)
          sigma_data = pm.Data("sigma_data", sigma)

          mu = pm.Normal("mu", mu=0, sigma=5)
          tau = pm.HalfCauchy("tau", beta=5)
          theta = pm.Normal("theta", mu=mu, sigma=tau, dims="school")
          y_obs = pm.Normal("y", mu=theta[school_idx], sigma=sigma_data, observed=y_data)

- Sample once to obtain the original trace:

      with centered_eight:
          idata = pm.sample(progressbar=False)

- Define update_data to resize covariates and run Posterior SBC:

      import simuk
      from arviz_plots import plot_ecdf_pit

      def update_data(model, augmented_data, simulation_idx):
          # The augmented data is twice as long as the original data,
          # so the covariates are repeated to match.
          with model:
              pm.set_data({
                  "sigma_data": np.concatenate([sigma, sigma]),
                  "school_idx": np.concatenate([np.arange(8), np.arange(8)])
              })

      post_sbc = simuk.SBC(
          centered_eight,
          method="posterior",
          trace=idata,
          update_data=update_data,
          num_simulations=50,
          sample_kwargs={"draws": 100, "tune": 100},
          progress_bar=False
      )
      post_sbc.run_simulations()

      plot_ecdf_pit(post_sbc.simulations, group="posterior_sbc", visuals={"xlabel": False})

We see that the funnel neck in the eight schools model is avoided and the inference algorithm is well-calibrated locally for the observed data, as indicated by the absence of red points.
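The Posterior SBC procedure can also be sketched by hand on a conjugate normal model. This is a NumPy-only illustration under assumed settings, not Simuk's implementation; the helper conjugate_posterior and the stand-in observed data are hypothetical. Each simulation draws a parameter from the posterior given the observed data, simulates a posterior predictive dataset from it, conditions on the augmented data (observed plus simulated), and records the rank of the drawn parameter among the new posterior draws. With correct inference these ranks are uniform.

```python
import numpy as np

rng = np.random.default_rng(2)

prior_mu, prior_sigma = 0.0, 1.0  # assumed prior: theta ~ Normal(0, 1)
obs_sigma = 2.0                   # assumed likelihood: y ~ Normal(theta, 2)


def conjugate_posterior(y):
    """Exact posterior mean and sd of theta given data y (normal-normal model)."""
    prec = 1.0 / prior_sigma**2 + y.size / obs_sigma**2
    mean = (prior_mu / prior_sigma**2 + y.sum() / obs_sigma**2) / prec
    return mean, prec**-0.5


# Stand-in for the observed data that the check is conditioned on.
y_obs = rng.normal(0.7, obs_sigma, size=8)
mean_obs, sd_obs = conjugate_posterior(y_obs)

num_simulations, num_draws = 1000, 99
ranks = np.empty(num_simulations, dtype=int)
for i in range(num_simulations):
    # 1. Draw a parameter from the posterior given the observed data.
    theta = rng.normal(mean_obs, sd_obs)
    # 2. Simulate a posterior predictive dataset of the same size.
    y_rep = rng.normal(theta, obs_sigma, size=y_obs.size)
    # 3. Condition on the augmented data: observed plus simulated.
    y_aug = np.concatenate([y_obs, y_rep])
    mean_aug, sd_aug = conjugate_posterior(y_aug)
    draws = rng.normal(mean_aug, sd_aug, size=num_draws)
    # 4. Record the rank of the drawn parameter among the new posterior draws.
    ranks[i] = np.sum(draws < theta)

print(np.bincount(ranks, minlength=num_draws + 1))
```

Step 3 is where the data doubles in length, which is why the PyMC model above needs resizable pm.Data containers and an update_data callback.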
- Talts, S., Betancourt, M., Simpson, D., Vehtari, A., and Gelman, A. (2018). Validating Bayesian Inference Algorithms with Simulation-Based Calibration.
- Modrák, M., Moon, A., Kim, S., Bürkner, P.-C., Huurre, N., Faltejsková, K., Gelman, A., and Vehtari, A. (2023). Simulation-based calibration checking for Bayesian computation: The choice of test quantities shapes sensitivity. Bayesian Analysis.
- Säilynoja, T., Schmitt, M., Bürkner, P.-C., and Vehtari, A. (2025). Posterior SBC: Simulation-Based Calibration Checking Conditional on Data.

