Summary

class liesel.goose.Summary(results, additional_chain=None, quantiles=(0.05, 0.5, 0.95), hdi_prob=0.9, selected=None, deselected=None, per_chain=False, which=('mean', 'sd', 'var', 'quantiles', 'hdi', 'ess_bulk', 'ess_tail', 'rhat', 'mcse_mean', 'mcse_sd'))

Bases: object

Posterior summary and diagnostics for SamplingResults.

Offers two main use cases:

  1. View an overall summary by printing a summary instance, including a summary table of the posterior samples and a summary of sampling errors.

  2. Programmatically access summary statistics via quantities[quantity_name][var_name]. Please refer to the documentation of the attribute quantities for details.

Additionally, the summary object can be turned into a DataFrame using to_dataframe().

If per_chain=False, statistics are computed over all posterior chains and draws. If per_chain=True, each chain is summarized separately.
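The difference between the two settings can be illustrated without Liesel. The following is a minimal sketch in plain Python, assuming draws for one scalar parameter are arranged as a list of chains; the data and the helper names pooled_mean and per_chain_means are hypothetical and not part of the Liesel API.

```python
import statistics

# Hypothetical draws for one scalar parameter: 2 chains x 4 draws.
chains = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
]


def pooled_mean(chains):
    """Mean over all chains and draws (analogous to per_chain=False)."""
    flat = [x for chain in chains for x in chain]
    return statistics.fmean(flat)


def per_chain_means(chains):
    """One mean per chain (analogous to per_chain=True)."""
    return [statistics.fmean(chain) for chain in chains]


pooled = pooled_mean(chains)        # 3.5
separate = per_chain_means(chains)  # [1.5, 5.5]
```

A large gap between the per-chain values, as in this toy data, is the kind of disagreement that a pooled summary alone would hide.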

The low-level computations for HDIs, effective sample sizes, R-hat, and Monte Carlo standard errors are delegated to ArviZ.

By default, the summary contains the following statistics:

  • mean: Posterior mean.

  • sd: Posterior standard deviation.

  • var: Posterior variance.

  • quantiles: Posterior quantiles at the probabilities given by quantiles. These are stored as "quantile" in quantities and become columns named q_<probability> in to_dataframe().

  • hdi: Highest density interval with probability mass hdi_prob. This is the narrowest posterior interval reported by ArviZ at that probability level. In to_dataframe(), it becomes hdi_low and hdi_high.

  • ess_bulk: Bulk effective sample size, a diagnostic for Monte Carlo precision in the central part of the posterior distribution.

  • ess_tail: Tail effective sample size, a diagnostic for Monte Carlo precision in the posterior tails.

  • rhat: Rank-normalized split R-hat, a between-chain convergence diagnostic. Values close to 1 indicate better agreement between chains. This statistic is only computed when more than one chain is summarized together.

  • mcse_mean: Monte Carlo standard error of the posterior mean.

  • mcse_sd: Monte Carlo standard error of the posterior standard deviation.

Use which to compute only a subset of these statistics.
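To make the "narrowest interval" description of hdi in the list above concrete, here is a minimal sketch in plain Python. It is not ArviZ's implementation, which the class delegates to; the function name narrowest_interval and the sample data are hypothetical, and the sketch assumes a unimodal sample for a single scalar parameter.

```python
import math


def narrowest_interval(samples, prob=0.9):
    """Return the narrowest interval covering roughly `prob` of the samples."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    xs = sorted(samples)
    n = len(xs)
    # Index offset between the lower and upper endpoint of each candidate.
    span = math.floor(prob * n)
    if span < 1 or span >= n:
        raise ValueError("not enough samples for the requested probability")
    # Width of every candidate interval [xs[i], xs[i + span]].
    widths = [xs[i + span] - xs[i] for i in range(n - span)]
    start = widths.index(min(widths))
    return xs[start], xs[start + span]


# Hypothetical right-skewed draws for a single scalar parameter.
draws = [0.0, 0.1, 0.2, 0.3, 0.4, 1.0, 2.0, 4.0, 8.0, 16.0]
low, high = narrowest_interval(draws, prob=0.5)  # (0.0, 1.0)
```

For skewed samples like these, the narrowest interval sits in the dense region near zero, whereas an equal-tailed quantile interval at the same probability would be pulled toward the long right tail.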

Parameters:
  • results (SamplingResults) – The sampling results to summarize.

  • additional_chain (Position (dict[str, Any]) | None, default: None) – Additional position chain whose parameters are added to the summary output. Its chain and time dimensions must match those of the posterior chain returned by get_posterior_samples().

  • quantiles (Sequence[float], default: (0.05, 0.5, 0.95)) – Posterior quantile probabilities to compute when "quantiles" is included in which.

  • hdi_prob (float, default: 0.9) – Posterior probability mass of the highest density interval to compute when "hdi" is included in which.

  • selected (list[str] | None, default: None) – Restricts the summary to the given subset of position keys.

  • deselected (list[str] | None, default: None) – Excludes the given position keys from the summary.

  • per_chain (bool, default: False) – If True, the summary is calculated on a per-chain basis. Certain measures like rhat are not available if per_chain is True.

  • which (Sequence[Literal['mean', 'sd', 'var', 'quantiles', 'hdi', 'ess_bulk', 'ess_tail', 'rhat', 'mcse_mean', 'mcse_sd']], default: ('mean', 'sd', 'var', 'quantiles', 'hdi', 'ess_bulk', 'ess_tail', 'rhat', 'mcse_mean', 'mcse_sd')) – Names of the summary statistics to compute. Supported values are "mean", "sd", "var", "quantiles", "hdi", "ess_bulk", "ess_tail", "rhat", "mcse_mean", and "mcse_sd".
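As background on what rhat compares, the following is a minimal sketch of the classic split R-hat in plain Python. It is a simplification: the class reports the rank-normalized variant computed by ArviZ, and this sketch omits the rank normalization. The function name split_rhat and the deterministic toy chains are hypothetical.

```python
import statistics


def split_rhat(chains):
    """Classic split R-hat for one scalar parameter (no rank normalization)."""
    # Split every chain into two halves of equal length.
    halves = []
    for chain in chains:
        half = len(chain) // 2
        halves.append(chain[:half])
        halves.append(chain[half : 2 * half])
    n = len(halves[0])
    half_means = [statistics.fmean(h) for h in halves]
    # Average within-half variance.
    within = statistics.fmean([statistics.variance(h) for h in halves])
    # Variance of the half means, i.e. between-half variance divided by n.
    between_over_n = statistics.variance(half_means)
    var_plus = (n - 1) / n * within + between_over_n
    return (var_plus / within) ** 0.5


# Deterministic toy chains: each cycles through the values 0..9.
chain_a = [(7 * i) % 10 for i in range(40)]
chain_b = [(3 * i) % 10 for i in range(40)]
# Same shape as chain_b, but centered at a different location.
chain_shifted = [x + 10 for x in chain_b]

rhat_agree = split_rhat([chain_a, chain_b])
rhat_disagree = split_rhat([chain_a, chain_shifted])
```

Here rhat_agree stays near 1 because all chain halves share the same mean, while rhat_disagree is well above 1 because the shifted chain inflates the between-half variance.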

Notes

This class is considered experimental; its API may still undergo major changes.

Methods

acceptance_prob_df()

Returns an overview of acceptance probabilities as a dataframe.

aggregate_diagnostics([by])

Aggregates effective sample sizes (ESS) and rhat.

error_df([per_chain])

Returns an overview of the errors recorded during sampling as a dataframe.

to_dataframe()

Turns the Summary object into a DataFrame.

Attributes

per_chain

Whether results are summarized for individual chains (True), or aggregated over chains (False).

quantities

Dict of summarizing quantities.

config

A dictionary of config settings for this summary object.

sample_info

Dictionary of meta-information about the MCMC samples used to create this summary object.

error_summary

Contains error information for each kernel.

kernels_by_pos_key

A dict linking parameter names (the keys) to kernel identifiers (the values).

liesel_version

The specific version of Liesel used to produce the results.