py21cmmc.mcmc.run_mcmc

py21cmmc.mcmc.run_mcmc(core_modules, likelihood_modules, params, datadir='.', model_name='21CMMC', continue_sampling=True, reuse_burnin=True, log_level_21CMMC=None, sampler_cls=<class 'py21cmmc.cosmoHammer.CosmoHammerSampler'>, use_multinest=False, use_zeus=False, use_ultranest=False, **mcmc_options) → CosmoHammerSampler

Run an MCMC chain.

Parameters:
  • core_modules (list) – A list of objects which define the necessary methods to be core modules (see core).

  • likelihood_modules (list) – A list of objects which define the necessary methods to be likelihood modules (see likelihood).

  • params (dict) – Parameters which will be sampled by the chain. Each entry’s key specifies the name of the parameter, and its value is an iterable (val, min, max, width), with val the initial guess, min and max the hard boundaries on the parameter’s value, and width determining the size of the initial ball of walker positions for the parameter.

  • datadir (str, optional) – Directory to which MCMC info will be written (e.g. logs and chain files).

  • model_name (str, optional) – Name of the model, which determines filenames of outputs.

  • continue_sampling (bool, optional) – If an output chain file can be found that matches these inputs, sampling can be continued from its last iteration, up to the number of iterations specified. If set to False, any output file which matches these parameters will have its samples over-written.

  • reuse_burnin (bool, optional) – If a pre-computed chain file is found, and continue_sampling=False, setting reuse_burnin will salvage the burnin part of the chain for re-use, but re-compute the samples themselves.

  • log_level_21CMMC (int or str, optional) – The logging level of the cosmoHammer log file.

  • use_multinest (bool, optional) – If true, use the MultiNest sampler instead.

  • use_zeus (bool, optional) – If true, use the zeus sampler instead.

  • use_ultranest (bool, optional) – If true, use the UltraNest sampler instead.
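As a minimal sketch of the params format described above, the following checks that each entry unpacks to (val, min, max, width) and that the initial guess lies within the hard boundaries. The helper check_params, the parameter names and the numerical values are illustrative assumptions and are not part of py21cmmc; requiring a positive width is likewise an assumption.

```python
def check_params(params):
    """Validate a dict mapping name -> (val, min, max, width).

    Hypothetical helper for illustration only; run_mcmc may apply
    its own, different checks.
    """
    for name, entry in params.items():
        val, lo, hi, width = entry  # raises ValueError unless exactly four items
        if not lo <= val <= hi:
            raise ValueError(f"{name}: initial guess {val} is outside [{lo}, {hi}]")
        if width <= 0:  # assumption: the initial ball needs a positive size
            raise ValueError(f"{name}: width must be positive, got {width}")
    return params


# Illustrative names and values (assumptions, not recommendations).
params = check_params(
    {
        "HII_EFF_FACTOR": (30.0, 10.0, 50.0, 3.0),
        "ION_Tvir_MIN": (4.7, 4.0, 6.0, 0.1),
    }
)
```

Running a check like this before starting a chain catches a misordered tuple early, before any expensive likelihood evaluation.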

Other Parameters:
  • **mcmc_options – All other parameters are passed directly to CosmoHammerSampler. These include important options such as walkersRatio (the number of walkers is walkersRatio*nparams), sampleIterations, burninIterations, pool, log_level_stream and threadCount. If use_multinest is set, the MultiNest parameters listed below should be provided here.

  • n_live_points (int, optional) – number of live points

  • importance_nested_sampling (bool, optional) – If True, MultiNest will use Importance Nested Sampling (INS).

  • sampling_efficiency (float, optional) – Defines the sampling efficiency. Values of 0.8 and 0.3 are recommended for parameter estimation and evidence evaluation, respectively.

  • evidence_tolerance (float, optional) – A value of 0.5 should give good enough accuracy.

  • max_iter (int, optional) – maximum number of iterations. 0 is unlimited.

  • multimodal (bool, optional) – Whether to detect multiple modes.

  • write_output (bool, optional) – Whether to write output files. This is required for analysis.
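The MultiNest keywords above could be collected into a dictionary along the following lines. This is only a sketch: the helper multinest_options and its goal argument are hypothetical, n_live_points=400 is an arbitrary illustrative value, and only the 0.8/0.3 efficiencies, the 0.5 tolerance and the meaning of max_iter=0 come from the descriptions above.

```python
def multinest_options(goal="parameters", n_live_points=400):
    """Collect MultiNest keywords for forwarding via **mcmc_options.

    Hypothetical helper; `goal` is either "parameters" or "evidence".
    """
    efficiencies = {"parameters": 0.8, "evidence": 0.3}
    if goal not in efficiencies:
        raise ValueError(f"goal must be one of {sorted(efficiencies)}")
    return dict(
        n_live_points=n_live_points,         # illustrative value (assumption)
        importance_nested_sampling=True,
        sampling_efficiency=efficiencies[goal],
        evidence_tolerance=0.5,              # stated above to be accurate enough
        max_iter=0,                          # 0 means unlimited
        multimodal=False,
        write_output=True,                   # required for later analysis
    )


options = multinest_options(goal="evidence")
# These would then be forwarded as, for example:
# run_mcmc(core_modules, likelihood_modules, params,
#          use_multinest=True, **options)
```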

    If use_zeus is set, the zeus parameters listed below should be provided here.

  • nsteps (int) – number of steps per iteration (Default is 100)

  • ndim (int) – number of dimensions to sample over (Default is number of supplied parameters)

  • nwalkers (int) – number of walkers (Default is 2*ndim)

  • tolerance (float, optional) – Tuning optimization tolerance (Default is 0.05).

  • patience (int, optional) – Number of tuning steps to wait to make sure that tuning is done (Default is 5).

  • maxsteps (int, optional) – Number of maximum stepping-out steps (Default is 10^4).

  • mu (float, optional) – Scale factor (default is 1.0). It is tuned if tune=True.

  • maxiter (int, optional) – Number of maximum Expansions/Contractions (Default is 10^4).

  • pool (bool, optional) – External pool of workers to distribute workload to multiple CPUs (default is None).

  • vectorize (bool, optional) – If true (default is False), logprob_fn receives not just one point but an array of points, and returns an array of log-probabilities.

  • blobs_dtype (list, optional) – List containing names and dtypes of blobs metadata, e.g. [("log_prior", float), ("mean", float)]. It’s useful when you want to save multiple species of metadata. Default is None.

  • verbose (bool, optional) – If True (default) print log statements.

  • check_walkers (bool, optional) – If True (default) then check that nwalkers >= 2*ndim and even.

  • shuffle_ensemble (bool, optional) – If True (default) then shuffle the ensemble of walkers in every iteration before splitting it.

  • light_mode (bool, optional) – If True (default is False) then no expansions are performed after the tuning phase. This can significantly reduce the number of log likelihood evaluations but works best in target distributions that are approximately Gaussian.
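The walker-count rule described for check_walkers (nwalkers >= 2*ndim and even, with a default of 2*ndim) can be sketched as follows. The helper zeus_walkers is hypothetical and only mirrors the rule as stated above; zeus performs its own check internally.

```python
def zeus_walkers(ndim, nwalkers=None):
    """Return a walker count satisfying nwalkers >= 2*ndim and even.

    Hypothetical helper mirroring the documented check_walkers rule.
    """
    if nwalkers is None:
        nwalkers = 2 * ndim  # documented default
    if nwalkers < 2 * ndim or nwalkers % 2 != 0:
        raise ValueError(
            f"nwalkers={nwalkers} must be even and at least 2*ndim={2 * ndim}"
        )
    return nwalkers


ndim = 2  # e.g. two sampled parameters
zeus_options = dict(
    nsteps=100,                    # documented default
    ndim=ndim,
    nwalkers=zeus_walkers(ndim),
    tolerance=0.05,                # documented default
    patience=5,                    # documented default
    light_mode=False,
)
# These would then be forwarded as, for example:
# run_mcmc(core_modules, likelihood_modules, params,
#          use_zeus=True, **zeus_options)
```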

    If use_ultranest is set, the UltraNest parameters listed below should be provided here.

  • log_dir (str) – where to store output files

  • resume ('resume', 'resume-similar', 'overwrite' or 'subfolder') – If 'overwrite', overwrite previous data. If 'subfolder', create a fresh subdirectory in log_dir. If 'resume' or True, continue the previous run if available; this only works when the dimensionality, transform and likelihood are consistent. If 'resume-similar', continue the previous run if available; this only works when the dimensionality and transform are consistent. If a likelihood difference is detected, the existing likelihoods are updated until the live point order differs; otherwise it behaves like 'resume'.

  • run_num (int or None) – If resume='subfolder', this is the subfolder number. Automatically increments if set to None.

  • num_test_samples (int) – Test the transform and likelihood on this number of random points first, to catch errors early. Useful for finding bugs.

  • vectorized (bool) – If true, loglike and transform function can receive arrays of points.

  • draw_multiple (bool) – If efficiency goes down, dynamically draw more points from the region between ndraw_min and ndraw_max. If set to False, few points are sampled at once.

  • ndraw_min (int) – Minimum number of points to simultaneously propose. Increase this if your likelihood makes vectorization very cheap.

  • ndraw_max (int) – Maximum number of points to simultaneously propose. Increase this if your likelihood makes vectorization very cheap. Memory allocation may be slow for extremely high values.

  • num_bootstraps (int) – number of logZ estimators and MLFriends region bootstrap rounds.

  • warmstart_max_tau (float) – Maximum disorder to accept when resume='resume-similar'. Live points are reused as long as the live point order is below this normalised Kendall tau distance. Values range from 0 (highly conservative) to 1 (extremely negligent).

  • update_interval_volume_fraction (float) – Update region when the volume shrunk by this amount.

  • log_interval (int) – Update stdout status line every log_interval iterations.

  • show_status (bool) – Show integration progress as a status line. If no output desired, set to False.

  • dlogz (float) – Target evidence uncertainty. This is the std between bootstrapped logz integrators.

  • dKL (float) – Target posterior uncertainty. This is the Kullback-Leibler divergence in nats between bootstrapped integrators.

  • frac_remain (float) – Integrate until this fraction of the integral is left in the remainder. Set to a low number (1e-2 … 1e-5) to make sure peaks are discovered. Set to a higher number (0.5) if you know the posterior is simple.

  • Lepsilon (float) – Terminate when live point likelihoods are all the same, within Lepsilon tolerance. Increase this when your likelihood function is inaccurate, to avoid unnecessary search.

  • min_ess (int) – Target number of effective posterior samples.

  • max_iters (int) – maximum number of integration iterations.

  • max_ncalls (int) – stop after this many likelihood evaluations.

  • max_num_improvement_loops (int) – run() tries to assess iteratively where more samples are needed. This number limits the number of improvement loops.

  • min_num_live_points (int) – minimum number of live points throughout the run

  • cluster_num_live_points (int) – require at least this many live points per detected cluster

  • insertion_test_zscore_threshold (float) – z-score used as a threshold for the insertion order test. Set to infinity to disable.

  • insertion_test_window (int) – Number of iterations after which the insertion order test is reset.

  • region_class (MLFriends or RobustEllipsoidRegion or SimpleRegion) – Whether to use an MLFriends+ellipsoidal+tellipsoidal region (better for multi-modal problems), just ellipsoidal sampling (faster for high-dimensional, Gaussian-like problems), or an axis-aligned ellipsoid (fastest, to be combined with slice sampling).
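A few of the UltraNest keywords above could be assembled like this. The sketch is hedged: the helper ultranest_options and its simple_posterior flag are hypothetical, and the frac_remain values are simply picked from the ranges quoted above (0.5 for a simple posterior, 1e-2 otherwise).

```python
VALID_RESUME = ("resume", "resume-similar", "overwrite", "subfolder")


def ultranest_options(log_dir, resume="subfolder", simple_posterior=False):
    """Collect a subset of UltraNest keywords for **mcmc_options.

    Hypothetical helper; only keywords documented above are used.
    """
    if resume is not True and resume not in VALID_RESUME:
        raise ValueError(
            f"resume must be True or one of {VALID_RESUME}, got {resume!r}"
        )
    return dict(
        log_dir=log_dir,
        resume=resume,
        # A larger remainder is acceptable when the posterior is known to be simple.
        frac_remain=0.5 if simple_posterior else 1e-2,
        show_status=False,  # suppress the status line
    )


options = ultranest_options("ultranest_output", resume="overwrite")
# These would then be forwarded as, for example:
# run_mcmc(core_modules, likelihood_modules, params,
#          use_ultranest=True, **options)
```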

Returns:

sampler (CosmoHammerSampler instance) – The sampler object, from which the chain itself may be accessed (via the samples attribute). If use_multinest is set, the MultiNest sampler is returned; if use_zeus is set, the zeus sampler is returned.
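Putting the pieces together, a call with the default sampler might look like the sketch below. The wrapper run_chain, the helper n_walkers, the model name and all option values are illustrative assumptions. The core and likelihood modules are taken as arguments because their construction is outside the scope of this page, and py21cmmc is imported inside the wrapper so that the walker arithmetic can be tried without it installed.

```python
def n_walkers(params, walkers_ratio):
    """Number of walkers, walkersRatio * nparams, as stated above."""
    return walkers_ratio * len(params)


def run_chain(core_modules, likelihood_modules, params, datadir=".", **mcmc_options):
    """Thin illustrative wrapper around run_mcmc (hypothetical helper)."""
    from py21cmmc import mcmc  # requires py21cmmc to be installed

    return mcmc.run_mcmc(
        core_modules,
        likelihood_modules,
        params,
        datadir=datadir,
        model_name="example",     # illustrative; determines output filenames
        continue_sampling=False,  # overwrite any matching chain file
        **mcmc_options,
    )


# Illustrative values (assumptions, not recommendations).
params = {
    "HII_EFF_FACTOR": (30.0, 10.0, 50.0, 3.0),
    "ION_Tvir_MIN": (4.7, 4.0, 6.0, 0.1),
}
options = dict(walkersRatio=3, burninIterations=0, sampleIterations=100, threadCount=2)

print(n_walkers(params, options["walkersRatio"]))  # 3 walkers per parameter, 2 parameters

# With real module instances available, the chain would be run and read as:
# sampler = run_chain(core_modules, likelihood_modules, params, **options)
# samples = sampler.samples
```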