Getting Started

System Requirements

MC3 is compatible with Python 2.7 and 3.6+, and has been tested to work on Unix/Linux and OS X machines, with the following software:

  • Numpy (version 1.8.2+)
  • Scipy (version 0.17.1+)
  • Matplotlib (version 1.3.1+)

MC3 may work with earlier versions of these packages; however, we neither guarantee nor provide support for that.
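If you are unsure which versions are installed in your environment, a quick check from the Python interpreter (just a convenience snippet, not part of MC3):

import numpy
import scipy
import matplotlib

print(numpy.__version__, scipy.__version__, matplotlib.__version__)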

Install

To install MC3, just run the following command (if you use conda, see instructions below):

pip install mc3

Alternatively (for conda users and for developers), clone the repository to your local machine and install it with the following terminal commands:

# Clone the repository to your working directory:
git clone https://github.com/pcubillos/mc3
cd mc3
python setup.py install
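Either way, you can verify the installation from the Python interpreter (a quick check; the __version__ attribute is assumed to be defined, as in recent MC3 releases):

import mc3
print(mc3.__version__)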

MC3 provides MCMC and nested-sampling posterior sampling, optimization, and other lower-level statistical and plotting routines. See the full documentation in the API section or through the Python interpreter:

import mc3
# MCMC or nested sampling:
help(mc3.sampling)
# Optimization:
help(mc3.fit)
# Assorted stats:
help(mc3.stats)
# Plotting utilities:
help(mc3.plots)

Example 1: Interactive Run

The following example shows a basic MCMC run from the Python interpreter, for a quadratic-polynomial fit to a noisy dataset:

import numpy as np
import mc3


def quad(p, x):
    """
    Quadratic polynomial function.

    Parameters
        p: Polynomial constant, linear, and quadratic coefficients.
        x: Array of independent-variable values at which to evaluate
           the polynomial.
    Returns
        y: Polynomial evaluated at x:  y = p0 + p1*x + p2*x^2
    """
    y = p[0] + p[1]*x + p[2]*x**2.0
    return y

# For the sake of example, create a noisy synthetic dataset; in a real
# scenario you would get your dataset from your data-analysis pipeline:
np.random.seed(3)
x = np.linspace(0, 10, 100)
p_true = [3.0, -2.4, 0.5]
y = quad(p_true, x)
uncert = np.sqrt(np.abs(y))
data = y + np.random.normal(0, uncert)

# Initial guess for the fitting parameters (params) and parameter
# stepping behavior (pstep):
params = np.array([10.0, -2.0, 0.1])
pstep  = np.array([0.03, 0.03, 0.05])

# Run the MCMC:
func = quad
mc3_results = mc3.sample(data, uncert, func, params, indparams=[x],
    pstep=pstep, sampler='snooker', nsamples=1e5, burnin=1000, ncpu=7)

That’s it. The code returns a dictionary with the MCMC results, including the posterior sample (posterior), the best-fitting values (bestp), the lower and upper boundaries of the 68%-credible region relative to bestp (CRlo and CRhi), and the standard deviation of the marginal posteriors (stdp), among other variables.
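For example, these quantities can be pulled straight out of the returned dictionary (a minimal sketch using the keys listed above):

bestp = mc3_results['bestp']   # Best-fitting values
stdp  = mc3_results['stdp']    # Standard deviation of the marginal posteriors
CRlo  = mc3_results['CRlo']    # Lower 68%-credible-region boundary (relative to bestp)
CRhi  = mc3_results['CRhi']    # Upper 68%-credible-region boundary (relative to bestp)
for name, val, lo, hi in zip(['constant', 'linear', 'quadratic'], bestp, CRlo, CRhi):
    print('{:10s} {:9.4f} {:+9.4f} {:+9.4f}'.format(name, val, lo, hi))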

MC3 also prints a progress report to screen every 10% of the MCMC run, showing the time, the number of times a parameter tried to step outside its boundaries, the current best-fitting values and lowest chi-squared, and the Gelman-Rubin statistic of the free parameters; for example:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
  Multi-core Markov-chain Monte Carlo (MC3).
  Version 3.0.0.
  Copyright (c) 2015-2019 Patricio Cubillos and collaborators.
  MC3 is open-source software under the MIT license (see LICENSE).
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Yippee Ki Yay Monte Carlo!
Start MCMC chains  (Thu Aug  8 11:07:01 2019)

[:         ]  10.0% completed  (Thu Aug  8 11:07:01 2019)
Out-of-bound Trials:
[0 0 0]
Best Parameters: (chisq=112.6268)
[ 3.12818712 -2.53735522  0.51348382]
Gelman-Rubin statistics for free parameters:
[1.01018891 1.00754117 1.01020828]

...

[::::::::::] 100.0% completed  (Thu Aug  8 11:07:03 2019)
Out-of-bound Trials:
[0 0 0]
Best Parameters: (chisq=112.5932)
[ 3.07622279 -2.50383404  0.50898544]
Gelman-Rubin statistics for free parameters:
[1.00065724 1.00044997 1.00034891]
All parameters converged to within 1% of unity.

MCMC Summary:
-------------
  Number of evaluated samples:        100002
  Number of parallel chains:               7
  Average iterations per chain:        14286
  Burned-in iterations per chain:       1000
  Thinning factor:                         1
  MCMC sample size (thinned, burned):  93002
  Acceptance rate:   27.68%

Param name     Best fit   Lo HPD CR   Hi HPD CR        Mean    Std dev       S/N
----------- ----------------------------------- ---------------------- ---------
Param 1      3.0762e+00 -3.9009e-01  3.8081e-01  3.0778e+00 3.8496e-01       8.0
Param 2     -2.5038e+00 -2.2503e-01  2.1973e-01 -2.5003e+00 2.2042e-01      11.4
Param 3      5.0899e-01 -2.7631e-02  2.6352e-02  5.0836e-01 2.6877e-02      18.9

  Best-parameter's chi-squared:       112.5932
  Best-parameter's -2*log(posterior): 112.5932
  Bayesian Information Criterion:     126.4088
  Reduced chi-squared:                  1.1608
  Standard deviation of residuals:  3.00568

At the end of the MCMC run, MC3 displays a summary of the MCMC sample, best-fitting parameters, credible-region boundaries, posterior mean and standard deviation, among other statistics.
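For reference, the reduced chi-squared and the Bayesian Information Criterion quoted in the summary follow the standard definitions; a quick sanity check for this run (100 data points, 3 free parameters):

import numpy as np

ndata, nfree = 100, 3
chisq = 112.5932
print(chisq / (ndata - nfree))         # Reduced chi-squared: ~1.1608
print(chisq + nfree * np.log(ndata))   # BIC: ~126.4088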

Additionally, the user can generate several plots of the MCMC sample: the best-fitting model over the data, the parameter traces, and the marginal and pairwise posteriors. The plots sub-package provides the plotting functions:

import mc3.plots as mp
import mc3.utils as mu

# Output dict contains the entire sample (posterior), need to remove burn-in:
posterior, zchain, zmask = mu.burn(mc3_results)
bestp = mc3_results['bestp']
# Set parameter names:
pnames = ["constant", "linear", "quadratic"]

# Evaluate the model at the best-fitting parameters and plot it over the binned data:
best_model = quad(bestp, x)
mp.modelfit(data, uncert, x, best_model, savefile="quad_bestfit.png")

# Plot trace plot:
mp.trace(posterior, zchain, pnames=pnames, savefile="quad_trace.png")

# Plot pairwise posteriors:
mp.pairwise(posterior, pnames=pnames, bestp=bestp, savefile="quad_pairwise.png")

# Plot marginal posterior histograms (with 68% highest-posterior-density credible regions):
mp.histogram(posterior, pnames=pnames, bestp=bestp, quantile=0.683,
    savefile="quad_hist.png")

Note

These plots can also be automatically generated along with the MCMC run (see Outputs).

Example 2: Shell Run

The following example shows a basic MCMC run from the terminal using a configuration file. First, create a Python file (‘quadratic.py’) with the modeling function:

def quad(p, x):
    y = p[0] + p[1]*x + p[2]*x**2.0
    return y

Then, generate a data set and store it into files, e.g., with the following Python script:

import numpy as np
import mc3
from quadratic import quad

# Create synthetic dataset:
x  = np.linspace(0, 10, 1000)         # Independent model variable
p0 = [3, -2.4, 0.5]                   # True-underlying model parameters
y  = quad(p0, x)                      # Noiseless model
uncert = np.sqrt(np.abs(y))           # Data points uncertainty
error  = np.random.normal(0, uncert)  # Noise for the data
data   = y + error                    # Noisy data set
# Store data set and other inputs:
mc3.utils.savebin([data, uncert], 'data.npz')
mc3.utils.savebin([x],            'indp.npz')
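To double-check the stored inputs, they can be read back with loadbin, the counterpart of savebin (a quick sketch, assuming loadbin returns the arrays in the order they were saved):

import mc3.utils as mu

data, uncert = mu.loadbin('data.npz')
x, = mu.loadbin('indp.npz')
print(data.shape, uncert.shape, x.shape)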

Now, create a configuration file with the MC3 setup (‘MCMC.cfg’); note that func takes the name of the modeling function followed by the module that defines it (here, quad from quadratic.py):

[MCMC]
data      = data.npz
indparams = indp.npz

func     = quad quadratic
params   =  10.0   -2.0   0.1
pmin     = -25.0  -10.0 -10.0
pmax     =  30.0   10.0  10.0
pstep    =   0.3    0.3   0.05

nsamples = 1e5
burnin   = 1000
ncpu     = 7
sampler  = snooker
grtest   = True
plots    = True
savefile = output_demo.npz

Finally, call the MC3 entry point, providing the configuration file as a command-line argument:

mc3 -c MCMC.cfg
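Once the run completes, the results saved to output_demo.npz can be loaded back for further analysis; a minimal sketch, assuming the savefile is a standard NumPy npz archive of the output dictionary:

import numpy as np

saved = np.load('output_demo.npz')
print(saved['bestp'])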

Troubleshooting

There may be an issue with the most recent version of the multiprocessing package (version 2.6.2.1). If the MCMC breaks with an “AttributeError: __exit__” error message pointing to the multiprocessing module, try installing an earlier version with this shell command:

pip install --upgrade 'multiprocessing<2.6.2'