Nested Sampling Tutorial

This tutorial describes the available options when running Nested Sampling with MC3. The following sections build up a script that can be run from the Python interpreter or as a standalone Python script; the entire script is shown at the bottom of this page.

Preamble

MC3 implements Nested Sampling through the dynesty package [Speagle2019]. Thus, make sure to install dynesty (e.g., with pip) and cite it if you use this sampler.

In this tutorial, we will use the same Preamble setup as in the MCMC tutorial, fitting a quadratic polynomial. Likewise, most of the input arguments follow the same format as an MCMC run, including: Input Data, Modeling Function, Parameter Priors, Parameter Names, Optimization, Outputs, and Logging.

Sampler Algorithm

Set the sampler argument to dynesty for a nested-sampling run with dynesty:

# Sampler algorithm, choose from: 'snooker', 'demc', 'mrw', or 'dynesty'.
sampler = 'dynesty'

Fitting Parameters

The params argument (required) is a 1D float ndarray containing the initial-guess values for the model fitting parameters.

A nested-sampling run requires a proper domain (i.e., bounded); thus, the pmin and pmax arguments are required, and must have finite values.

The pstep argument sets the sampling behavior of the fitting parameters. A positive pstep value leaves a parameter free, a pstep value of zero keeps the parameter fixed, whereas a negative pstep value makes the parameter share its value with another parameter (see Stepping Behavior).
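
For example, here is a minimal sketch of the non-default behaviors (assuming, per the Stepping Behavior section, that a negative pstep holds the one-based index of the parameter to share the value with):

# Hypothetical stepping setup (not the one used in this tutorial):
# first parameter free, second parameter fixed at its initial-guess
# value, third parameter sharing the value of the first parameter:
pstep = np.array([1.0, 0.0, -1])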

The prior, priorlow, and priorup arguments set the type of prior (uniform or Gaussian) and their values. See details in Parameter Priors.

# Array of initial-guess values of fitting parameters:
params = np.array([ 10.0, -2.0, 0.1])
# Lower and upper boundaries for the parameter exploration:
pmin = np.array([-10.0, -20.0, -10.0])
pmax = np.array([ 40.0,  20.0,  10.0])
# Parameters' stepping behavior:
pstep = np.array([1.0, 0.5, 0.1])

# Two-sided Gaussian prior on first parameter, uniform priors on rest:
prior    = np.array([3.5, 0.0, 0.0])
priorlow = np.array([0.1, 0.0, 0.0])
priorup  = np.array([0.3, 0.0, 0.0])
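
For reference, a two-sided Gaussian prior applies a different standard deviation below and above the prior mean. Here is a minimal sketch of such a log-prior for the first parameter above (assuming this functional form; see Parameter Priors for the exact definition used by MC3):

def two_sided_gauss_logprior(p, mean=3.5, sigma_lo=0.1, sigma_up=0.3):
    # Hypothetical helper, not part of MC3: Gaussian log-prior with
    # sigma_lo below the mean and sigma_up above it (up to a constant):
    sigma = sigma_lo if p < mean else sigma_up
    return -0.5 * ((p - mean) / sigma)**2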

Sampling Configuration

The following arguments set the nested-sampling configuration:

# Optimization before sampling, choose from: 'lm' or 'trf':
leastsq    = 'lm'
chisqscale = False
# NS setup:
ncpu     = 7
thinning = 1

The leastsq argument (optional, default: None) allows you to run a least-squares optimization before the sampling, and the chisqscale argument (optional, default: False) is a flag to scale the data uncertainties to enforce a reduced chi-squared equal to one (see Optimization for details).

The ncpu argument (optional, default: nchains) sets the number of CPUs to use for the sampling. When ncpu > 1, MC3 will run in parallel through the multiprocessing Python Standard-Library package (no need to set a pool input).

The thinning argument (optional, default: 1) sets the posterior thinning factor (keeping only every thinning-th sample) to reduce memory usage; for example, thinning=10 keeps one out of every 10 samples.

Further dynesty Configuration

An mc3.sample() run with dynesty nested sampling can also receive arguments accepted by dynesty.DynamicNestedSampler() or its run_nested() method.

However, note that if you pass prior_transform, MC3 won’t be able to compute the log(posterior) (implementation is TBD). Likewise, if you pass loglikelihood or prior_transform, MC3 won’t be able to run an optimization (implementation is TBD).
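
For example, a hypothetical call could forward dynesty-specific keywords through mc3.sample() (these particular values are illustrative choices, not recommendations; see the dynesty documentation for the available options):

# 'bound' and 'sample' are dynesty.DynamicNestedSampler() arguments,
# 'nlive_init' and 'dlogz_init' are run_nested() arguments:
mc3_output = mc3.sample(
    data=data, uncert=uncert, func=func, params=params,
    indparams=indparams, pmin=pmin, pmax=pmax, pstep=pstep,
    sampler='dynesty',
    bound='multi', sample='rwalk',
    nlive_init=500, dlogz_init=0.05)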

Nested-sampling Run

Putting it all together, here’s a Python script to run an MC3 nested-sampling retrieval:

import numpy as np
import mc3


def quad(p, x):
    """
    Quadratic polynomial function.

    Parameters
        p: Polynomial constant, linear, and quadratic coefficients.
        x: Array of independent-variable values where to evaluate the polynomial.
    Returns
        y: Polynomial evaluated at x:  y(x) = p0 + p1*x + p2*x^2
    """
    y = p[0] + p[1]*x + p[2]*x**2.0
    return y

# ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
# Preamble (create a synthetic dataset, in a real scenario you would
# get your dataset from your own data analysis pipeline):
np.random.seed(3)
x  = np.linspace(0, 10, 100)
p0 = [3, -2.4, 0.5]
y  = quad(p0, x)
uncert = np.sqrt(np.abs(y))
data   = y + np.random.normal(0, uncert)
# ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

# Modeling function:
func = quad
# List of additional arguments of func (if necessary):
indparams = [x]

# Array of initial-guess values of fitting parameters:
params = np.array([ 10.0, -2.0, 0.1])
# Lower and upper boundaries for the parameter exploration:
pmin = np.array([-10.0, -20.0, -10.0])
pmax = np.array([ 40.0,  20.0,  10.0])
# Parameters' stepping behavior:
pstep = np.array([1.0, 0.5, 0.1])

# Two-sided Gaussian prior on first parameter, uniform priors on rest:
prior    = np.array([3.5, 0.0, 0.0])
priorlow = np.array([0.1, 0.0, 0.0])
priorup  = np.array([0.3, 0.0, 0.0])

# Parameter names:
pnames   = ['y0', 'alpha', 'beta']
texnames = [r'$y_{0}$', r'$\alpha$', r'$\beta$']

# Sampler algorithm, choose from: 'snooker', 'demc', 'mrw', or 'dynesty'.
sampler = 'dynesty'

# Optimization before sampling, choose from: 'lm' or 'trf':
leastsq    = 'lm'
chisqscale = False
# NS setup:
ncpu     = 7
thinning = 1

# Logging:
log = 'NS_tutorial.log'

# File outputs:
savefile = 'NS_tutorial.npz'
plots    = True
rms      = True

# Run the nested sampling:
mc3_output = mc3.sample(data=data, uncert=uncert, func=func, params=params,
     indparams=indparams, pmin=pmin, pmax=pmax, pstep=pstep,
     pnames=pnames, texnames=texnames,
     prior=prior, priorlow=priorlow, priorup=priorup,
     sampler=sampler, ncpu=ncpu, thinning=thinning,
     leastsq=leastsq, chisqscale=chisqscale,
     plots=plots, rms=rms, log=log, savefile=savefile)

A nested-sampling run returns a dictionary with the same outputs as an MCMC run (see Outputs), except that instead of an acceptance_rate it contains the sampling efficiency eff and the dynesty sampler object dynesty_sampler. The screen output should look like this:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Multi-core Markov-chain Monte Carlo (MC3).
  Version 3.0.0.
  Copyright (c) 2015-2019 Patricio Cubillos and collaborators.
  MC3 is open-source software under the MIT license (see LICENSE).
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Least-squares best-fitting parameters:
  [ 3.47315859 -2.71597145  0.53126294]

Running dynesty dynamic nested-samping run:
iter: 23417|eff(%): 26.047|logl*:  -63.0< -56.3< -57.3|logz:  -69.6+/-0.2

Nested Sampling Summary:
------------------------
  Number of evaluated samples:  23417
  Thinning factor:                  1
  NS sample size (thinned):     23417
  Sampling efficiency:  26.05%

Param name     Best fit   Lo HPD CR   Hi HPD CR        Mean    Std dev       S/N
----------- ----------------------------------- ---------------------- ---------
y0           3.4732e+00 -1.1103e-01  1.6907e-01  3.5370e+00 1.4736e-01      23.6
alpha       -2.7160e+00 -1.2635e-01  8.2435e-02 -2.7514e+00 1.0505e-01      25.9
beta         5.3126e-01 -1.4149e-02  2.0976e-02  5.3508e-01 1.7523e-02      30.3

  Best-parameter's chi-squared:       113.6593
  Best-parameter's -2*log(posterior): 113.7313
  Bayesian Information Criterion:     127.4748
  Reduced chi-squared:                  1.1717
  Standard deviation of residuals:  3.0212

Output sampler files:
  'NS_tutorial.npz'
  'NS_tutorial_trace.png'
  'NS_tutorial_pairwise.png'
  'NS_tutorial_posterior.png'
  'NS_tutorial_RMS.png'
  'NS_tutorial_model.png'
  'NS_tutorial.log'
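
Once the run completes, the returned dictionary can be inspected as in the MCMC tutorial. Here is a minimal sketch (the eff and dynesty_sampler keys are described above; posterior and bestp follow the standard MC3 Outputs):

# Posterior samples and best-fit parameters (see Outputs):
posterior = mc3_output['posterior']
bestp = mc3_output['bestp']
# Sampling efficiency (in percent, as in the screen output above):
print(mc3_output['eff'])
# The dynesty sampler object gives access to dynesty's native results,
# e.g., the final log-evidence estimate:
results = mc3_output['dynesty_sampler'].results
print(results.logz[-1])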