Nested Sampling Tutorial
This tutorial describes the available options when running Nested Sampling with mc3. The following sections make up a script meant to be run from the Python interpreter or as a Python script. At the bottom of this page you can see the entire script.
Preamble
mc3 implements Nested Sampling through the dynesty package [Speagle2019]. Thus, make sure to install this Python package, and cite it if you use this sampler.
In this tutorial, we will use the same Preamble setup as in the MCMC tutorial, fitting a quadratic polynomial. Likewise, most of the input arguments follow the same format as an MCMC run, including: Input Data, Modeling Function, Parameter Priors, Parameter Names, Optimization, Outputs, and Logging.
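For reference, below is the same preamble as in the MCMC tutorial, creating a synthetic dataset from a known quadratic model (this code is repeated in the full script at the bottom of this page):

import numpy as np

def quad(p, x):
    """Quadratic polynomial: y(x) = p0 + p1*x + p2*x^2"""
    return p[0] + p[1]*x + p[2]*x**2.0

# Create a synthetic dataset:
np.random.seed(3)
x = np.linspace(0, 10, 100)
p0 = [3, -2.4, 0.5]
y = quad(p0, x)
uncert = np.sqrt(np.abs(y))
data = y + np.random.normal(0.0, uncert)
# Modeling function and its list of additional arguments:
func = quad
indparams = [x]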
Sampler Algorithm
Set the sampler argument to 'dynesty' to run nested sampling with dynesty:
# Sampler algorithm, choose from: 'snooker', 'demc', 'mrw', or 'dynesty'.
sampler = 'dynesty'
Fitting Parameters
The params argument (required) is a 1D float ndarray containing the initial-guess values for the model fitting parameters.
A nested-sampling run requires a proper domain (i.e., bounded); thus, the pmin and pmax arguments are required and must have finite values.
The pstep argument sets the sampling behavior of the fitting parameters. A positive pstep value leaves a parameter free, a pstep value of zero keeps the parameter fixed, whereas a negative pstep value makes the parameter share its value with another parameter (see Stepping Behavior).
The prior, priorlow, and priorup arguments set the type of prior (uniform or Gaussian) and their values. See details in Parameter Priors.
# Array of initial-guess values of fitting parameters:
params = np.array([ 10.0, -2.0, 0.1])
# Lower and upper boundaries for the parameter exploration:
pmin = np.array([-10.0, -20.0, -10.0])
pmax = np.array([ 40.0, 20.0, 10.0])
# Parameters' stepping behavior:
pstep = np.array([1.0, 0.5, 0.1])
# Two-sided Gaussian prior on first parameter, uniform priors on rest:
prior = np.array([3.5, 0.0, 0.0])
priorlow = np.array([0.1, 0.0, 0.0])
priorup = np.array([0.3, 0.0, 0.0])
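For illustration only (the runs in this tutorial use the pstep above), a hypothetical pstep array that keeps the second parameter fixed and makes the third parameter share the value of the first:

# Hypothetical alternative: first parameter free, second parameter fixed,
# third parameter sharing the value of the first one (a negative pstep
# points to the 1-based index of the parameter to share with):
pstep_alternative = np.array([1.0, 0.0, -1.0])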
Sampling Configuration
The following arguments set the nested-sampling configuration:
# Optimization before sampling, choose from: 'lm' or 'trf':
leastsq = 'lm'
chisqscale = False
# NS setup:
ncpu = 7
thinning = 1
The leastsq argument (optional, default: None) allows running a least-squares optimization before the sampling (see Optimization for details).
The ncpu argument (optional, default: nchains) sets the number of CPUs to use for the sampling. When ncpu > 1, mc3 will run on parallel processors through the multiprocessing Python Standard-Library package (there is no need to set a pool input).
The thinning argument (optional, default: 1) sets the posterior thinning factor (discarding all but every thinning-th sample) to reduce memory usage.
Further dynesty Configuration
An mc3.sample() run with dynesty nested sampling can also receive arguments accepted by dynesty.DynamicNestedSampler() or its run_nested() method.
However, note that if you pass a prior_transform argument, mc3 won't be able to compute the log(posterior) (implementation is TBD). Likewise, if you pass a loglikelihood or prior_transform argument, mc3 won't be able to run an optimization (implementation is TBD).
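For example, here is a sketch of a call that forwards a few dynesty keyword arguments through mc3.sample(); the specific values for bound, sample, and nlive_init below are illustrative assumptions, not recommendations:

# Illustrative sketch: forward dynesty-specific arguments.
# bound and sample go to dynesty.DynamicNestedSampler(), while
# nlive_init goes to its run_nested() method:
mc3_output = mc3.sample(
    data=data, uncert=uncert, func=func, params=params,
    indparams=indparams, pmin=pmin, pmax=pmax, pstep=pstep,
    sampler='dynesty', ncpu=7,
    bound='multi',
    sample='rwalk',
    nlive_init=500,
)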
Nested-sampling Run
Putting it all together, here's a Python script to run an mc3 nested-sampling retrieval:
import numpy as np
import mc3


def quad(p, x):
    """
    Quadratic polynomial function.

    Parameters
        p: Polynomial constant, linear, and quadratic coefficients.
        x: Array of dependent variables where to evaluate the polynomial.

    Returns
        y: Polynomial evaluated at x: y(x) = p0 + p1*x + p2*x^2
    """
    y = p[0] + p[1]*x + p[2]*x**2.0
    return y
# ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
# Preamble (create a synthetic dataset, in a real scenario you would
# get your dataset from your own data analysis pipeline):
np.random.seed(3)
x = np.linspace(0, 10, 100)
p0 = [3, -2.4, 0.5]
y = quad(p0, x)
uncert = np.sqrt(np.abs(y))
data = y + np.random.normal(0, uncert)
# ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
# Modeling function:
func = quad
# List of additional arguments of func (if necessary):
indparams = [x]
# Array of initial-guess values of fitting parameters:
params = np.array([ 10.0, -2.0, 0.1])
# Lower and upper boundaries for the parameter exploration:
pmin = np.array([-10.0, -20.0, -10.0])
pmax = np.array([ 40.0, 20.0, 10.0])
# Parameters' stepping behavior:
pstep = np.array([1.0, 0.5, 0.1])
# Two-sided Gaussian prior on first parameter, uniform priors on rest:
prior = np.array([3.5, 0.0, 0.0])
priorlow = np.array([0.1, 0.0, 0.0])
priorup = np.array([0.3, 0.0, 0.0])
# Parameter names:
pnames = ['y0', 'alpha', 'beta']
texnames = [r'$y_{0}$', r'$\alpha$', r'$\beta$']
# Sampler algorithm, choose from: 'snooker', 'demc', 'mrw', or 'dynesty'.
sampler = 'dynesty'
# Optimization before sampling, choose from: 'lm' or 'trf':
leastsq = 'lm'
chisqscale = False
# NS setup:
ncpu = 7
thinning = 1
# Logging:
log = 'NS_tutorial.log'
# File outputs:
savefile = 'NS_tutorial.npz'
plots = True
rms = True
# Run the nested-sampling:
mc3_output = mc3.sample(
    data=data, uncert=uncert, func=func, params=params,
    indparams=indparams, pmin=pmin, pmax=pmax, pstep=pstep,
    pnames=pnames, texnames=texnames,
    prior=prior, priorlow=priorlow, priorup=priorup,
    sampler=sampler, ncpu=ncpu, thinning=thinning,
    leastsq=leastsq, chisqscale=chisqscale,
    plots=plots, rms=rms, log=log, savefile=savefile,
)
A nested-sampling run returns a dictionary with the same outputs as an MCMC run (see Outputs), except that instead of an acceptance_rate, it contains the sampling efficiency eff and the dynesty sampler object dynesty_sampler. The screen output should look like this:
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Multi-core Markov-chain Monte Carlo (mc3).
Version 3.0.0.
Copyright (c) 2015-2019 Patricio Cubillos and collaborators.
mc3 is open-source software under the MIT license (see LICENSE).
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Least-squares best-fitting parameters:
[ 3.47315859 -2.71597145 0.53126294]
Running dynesty dynamic nested-samping run:
iter: 23417|eff(%): 26.047|logl*: -63.0< -56.3< -57.3|logz: -69.6+/-0.2
Nested Sampling Summary:
------------------------
Number of evaluated samples: 23417
Thinning factor: 1
NS sample size (thinned): 23417
Sampling efficiency: 26.05%
Param name Best fit Lo HPD CR Hi HPD CR Mean Std dev S/N
----------- ----------------------------------- ---------------------- ---------
y0 3.4732e+00 -1.1103e-01 1.6907e-01 3.5370e+00 1.4736e-01 23.6
alpha -2.7160e+00 -1.2635e-01 8.2435e-02 -2.7514e+00 1.0505e-01 25.9
beta 5.3126e-01 -1.4149e-02 2.0976e-02 5.3508e-01 1.7523e-02 30.3
Best-parameter's chi-squared: 113.6593
Best-parameter's -2*log(posterior): 113.7313
Bayesian Information Criterion: 127.4748
Reduced chi-squared: 1.1717
Standard deviation of residuals: 3.0212
Output sampler files:
'NS_tutorial.npz'
'NS_tutorial_trace.png'
'NS_tutorial_pairwise.png'
'NS_tutorial_posterior.png'
'NS_tutorial_RMS.png'
'NS_tutorial_model.png'
'NS_tutorial.log'
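Finally, a minimal sketch of how one could inspect the returned dictionary; the eff and dynesty_sampler keys are described above, while posterior and bestp are assumed here to follow the standard mc3 outputs (see Outputs):

# Sampling efficiency (percent) and the underlying dynesty sampler object:
eff = mc3_output['eff']
dynesty_sampler = mc3_output['dynesty_sampler']
# Standard mc3 outputs (assumed, see the Outputs section):
posterior = mc3_output['posterior']  # thinned posterior sample
bestp = mc3_output['bestp']          # best-fit parameters
print(f'Sampling efficiency: {eff:.2f}%')
print(f'Best-fit parameters: {bestp}')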