API Reference
- dendros.open_outputs(path, output_root='Outputs')[source]
Open a Galacticus output collection.
- Parameters:
path (str | Path | List[str | Path] | Mapping[str, str | Path | List[str | Path]]) –
One of:
A single filename – e.g. "galacticus.hdf5". If sibling MPI-rank files (*_MPI:????) exist they are included automatically.
A glob string – e.g. "run*/galacticus*.hdf5".
An explicit list of filenames.
A dict {label: path-or-paths} to open several models for side-by-side comparison. Equivalent to open_models(path); returns a ModelCollection.
output_root (str) – Top-level HDF5 group containing the Output* groups. Defaults to "Outputs". Pass "Lightcone" for lightcone runs or any other custom group name as needed.
- Returns:
Collection – For the single-, glob-, or list-of-files forms (one model, possibly MPI-split).
ModelCollection – For the dict form (one entry per model, suitable for multi-model comparison plots).
- Raises:
FileNotFoundError – If no files are found matching path.
TypeError – If path is not a str, pathlib.Path, list, or dict.
- Return type:
Examples
Open a single file:
c = open_outputs("galacticus.hdf5")
Auto-detect MPI-split files (given any one rank’s file):
c = open_outputs("galacticus_MPI:0000.hdf5")
Open via glob:
c = open_outputs("run001/galacticus*.hdf5")
Open an explicit list:
c = open_outputs(["file_a.hdf5", "file_b.hdf5"])
Open several models for comparison plots (see also open_models()):
m = open_outputs({"Fiducial": "fid.hdf5", "Variant": "var.hdf5"})
figs = m.plot_analyses()
Lightcone mode:
c = open_outputs("lightcone.hdf5", output_root="Lightcone")
- dendros.open_models(models, output_root='Outputs')[source]
Open several Galacticus runs as a labelled ModelCollection.
Each entry is opened with open_outputs(), so a single filename auto-detects its *_MPI:???? peers and a list-of-files entry is accepted as-is. The returned ModelCollection can be passed directly to plot_analyses() to overlay analyses from every model on a single figure.
- Parameters:
models (Mapping[str, str | Path | List[str | Path]] | Sequence[str | Path | List[str | Path]]) – Either a dict {label: path-or-paths}, whose keys become legend labels, or a list of path-or-paths, in which case default labels are derived from each model’s primary file stem (with any :MPIxxxx suffix stripped).
output_root (str) – Forwarded to open_outputs().
- Return type:
- Raises:
ValueError – If a list form produces duplicate default labels. Pass a dict with explicit labels to disambiguate.
Examples
Compare two models with explicit labels:
with open_models({"Fiducial": "fid.hdf5", "Variant": "var.hdf5"}) as m:
    figs = dendros.plot_analyses(m)
Or with labels derived from filenames:
models = open_models(["fid.hdf5", "var.hdf5"])
- class dendros.Collection(files, output_root='Outputs')[source]
Bases: object
A collection of one or more Galacticus HDF5 output files.
Prefer constructing instances through open_outputs() rather than calling this constructor directly.
- Parameters:
Examples
>>> from dendros import open_outputs
>>> with open_outputs("galacticus.hdf5") as c:
...     c.validate_completion()
...     print(c.list_outputs())
...     data = c.read("Output1", ["nodeData/haloMass"])
- validate_completion(mode='error')[source]
Check that all files report a successful completion status.
Galacticus writes a statusCompletion attribute to the root of the HDF5 file when it finishes. This method verifies that the attribute equals "complete" for every file in the collection.
- Parameters:
mode (str) –
What to do when an incomplete file is found:
"error" (default) – raise RuntimeError.
"warn" – emit a UserWarning and continue.
"ignore" – do nothing.
- Raises:
ValueError – If mode is not one of the accepted values.
RuntimeError – If mode is "error" and at least one file is incomplete.
- Return type:
None
- property outputs: OutputIndex
An OutputIndex for this collection.
The index is scanned lazily on first access and then cached.
- list_outputs(format='astropy')[source]
Return a table of available outputs.
Scans all Output* groups inside /{output_root}/ and extracts outputTime and outputExpansionFactor attributes. Redshift is computed as z = 1/a − 1.
- Parameters:
format (str) – "astropy" (default) returns an astropy.table.Table; "pandas" returns a pandas.DataFrame; "tabulate" returns a str formatted using the tabulate library.
- Return type:
astropy.table.Table, pandas.DataFrame, or tabulate-formatted string
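The relation z = 1/a − 1 that list_outputs() applies to outputExpansionFactor can be checked by hand. A minimal sketch; the helper name is hypothetical and is not part of dendros:

```python
def redshift_from_expansion_factor(a: float) -> float:
    """Return z = 1/a - 1 (hypothetical helper, not a dendros function)."""
    if a <= 0.0:
        raise ValueError("expansion factor must be positive")
    return 1.0 / a - 1.0


# a = 1 is the present day (z = 0); smaller a means earlier times.
z_values = [redshift_from_expansion_factor(a) for a in (1.0, 0.5, 0.25)]
```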
- list_properties(output, format='astropy')[source]
Return a table of datasets available in the nodeData group.
- Parameters:
- Return type:
astropy.table.Table, pandas.DataFrame, or tabulate-formatted string
- read(output, datasets, where=None)[source]
Read one or more datasets from an output group.
For multi-file collections, arrays from all files are concatenated along axis 0 before any selection is applied.
- Parameters:
output (str | int) – Output name (e.g. "Output1") or 1-based integer index.
datasets (List[str] | Dict[str, str]) – Either a list of relative dataset paths under the output group (e.g. ["nodeData/haloMass"]), in which case the same strings are used as dict keys in the return value; or a dict mapping user-chosen labels to relative paths.
where – None reads all rows. A boolean mask array of length N_total or an integer index array selects a subset.
- Returns:
Mapping from dataset name / label to numpy.ndarray.
- Return type:
Notes
The unitsInSI attribute is preserved in the raw array values but not yet applied. Future versions will optionally return astropy.units.Quantity objects.
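Because concatenation happens before selection, a boolean mask passed as where must span the rows of all files together. The sketch below models that ordering with plain lists rather than NumPy arrays; the function name is hypothetical and the code is an illustration of the documented semantics, not the library's implementation:

```python
from typing import List, Optional, Sequence, Union


def concatenate_then_select(
    per_file_rows: Sequence[Sequence[float]],
    where: Optional[Sequence[Union[bool, int]]] = None,
) -> List[float]:
    """Concatenate per-file rows, then apply a boolean mask or an index list."""
    combined = [value for rows in per_file_rows for value in rows]
    if where is None:
        return combined
    if all(isinstance(w, bool) for w in where):
        # A boolean mask refers to the concatenated rows, so it needs length N_total.
        if len(where) != len(combined):
            raise ValueError("boolean mask must have length N_total")
        return [value for value, keep in zip(combined, where) if keep]
    # Otherwise treat `where` as integer positions into the concatenated rows.
    return [combined[i] for i in where]


file_a, file_b = [1.0, 2.0], [3.0]
everything = concatenate_then_select([file_a, file_b])
masked = concatenate_then_select([file_a, file_b], [True, False, True])
indexed = concatenate_then_select([file_a, file_b], [2, 0])
```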
- trace_history(ids, properties, outputs=None, *, id_dataset='nodeData/nodeUniqueIDBranchTip', on_duplicate_file_match='error', int_sentinel=-1)[source]
Trace the history of specified galaxies across outputs.
Convenience wrapper around dendros.trace_galaxy_history(). See that function for full parameter and return-value documentation.
- list_analyses(format='astropy')[source]
Return a table of function1D analyses in /analyses.
Convenience wrapper around dendros.list_analyses().
- Parameters:
format (str)
- plot_analyses(name=None, output_directory=None, *, show_target=True, figsize=(7.0, 5.0), dpi=120, file_format='pdf')[source]
Plot function1D analyses from /analyses.
Convenience wrapper around dendros.plot_analyses().
- class dendros.ModelCollection[source]
Bases: dict
A dict mapping label → Collection, one entry per model.
Returned by open_models() and accepted by plot_analyses() so analyses from several Galacticus runs can be overlaid on a single figure for comparison.
Acts as a regular dict but also supports the context-manager protocol: __exit__ closes every contained Collection.
- close()[source]
Close every contained Collection.
Each close is attempted independently; if one fails the others are still closed and a UserWarning is emitted naming the failing model so the problem is visible.
- Return type:
None
- list_analyses(format='astropy')[source]
Return the union of function1D analyses across all models.
Each row carries an extra models column listing the labels of the models that contain the analysis (sorted, comma-separated). Per-row metadata (description, axis labels, log flags, target presence) is taken from the first model that supplies the analysis.
- Parameters:
format (str)
- plot_analyses(name=None, output_directory=None, *, show_target=True, figsize=(7.0, 5.0), dpi=120, file_format='pdf')[source]
Plot function1D analyses overlaid across every model.
Convenience wrapper around dendros.plot_analyses() applied to this ModelCollection. Labels come from this dict’s keys; the target overlay is drawn once.
- class dendros.OutputIndex(collection)[source]
Bases: object
Index of all Output* groups found in a Collection.
Instances are obtained via outputs.
- Parameters:
collection (Collection) – The parent Collection.
- table(format='astropy')[source]
Return a table of output metadata.
- Parameters:
format (str) – "astropy" (default) returns an astropy.table.Table; "pandas" returns a pandas.DataFrame.
- Return type:
astropy.table.Table or pandas.DataFrame
- class dendros.OutputMeta(name, path, index, time, scale_factor, redshift)[source]
Bases: object
Metadata for a single Galacticus output snapshot.
- Parameters:
- dendros.sfh_collapse_metallicities(dataset)[source]
Collapse a formation history over metallicity.
Collapses (sums) star formation histories over the metallicity axis. If fixed times were used, a 2D numpy.ndarray is returned and any empty entries are filled with zeros. Otherwise, a list of 1D numpy.ndarray objects is returned.
- Parameters:
dataset (DatasetProxy) – The dataset containing the star formation history data.
- dendros.sfh_times(dataset)[source]
Return times associated with a star formation history.
Returns None if no fixed times are associated with this star formation history.
- Parameters:
dataset (DatasetProxy) – The dataset containing the star formation history data.
- dendros.trace_galaxy_history(collection, ids, properties, outputs=None, *, id_dataset='nodeData/nodeUniqueIDBranchTip', on_duplicate_file_match='error', int_sentinel=-1)[source]
Extract per-galaxy property histories across Galacticus outputs.
Galaxies are traced across Output* groups via an integer branch-tip identifier (usually nodeUniqueIDBranchTip) that is constant over time for a given object and unique within a single HDF5 file. For each requested property and each chosen output, this function locates every requested ID in every file of the collection, assembles the per-galaxy slice, and stacks the results along a trailing “output” axis.
Slots where a galaxy is absent at a given output are filled with numpy.nan (floating-point properties, and the time, redshift and expansion_factor metadata arrays), with int_sentinel (integer properties), or with False (boolean properties). The returned present mask is the canonical indicator of presence/absence and should be preferred to sentinel checks.
- Parameters:
collection (Collection) – An open Collection.
ids – Array-like of integer nodeUniqueIDBranchTip values to trace. Coerced to numpy.ndarray of int64. Input order is preserved along the first axis of every returned array.
properties (Union[List[str], Dict[str, str]]) – Either a list of relative dataset paths under each Output* group (e.g. ["nodeData/basicMass"]), matching Collection.read(), or a dict mapping user-chosen labels to relative paths.
outputs (Optional[Sequence[Union[int, str]]]) – Optional iterable selecting a subset of outputs to include. Each element may be a 1-based integer (e.g. 3) or a group name (e.g. "Output3"). A range is accepted. Defaults to all outputs in the collection, in temporal order.
id_dataset (str) – Relative path of the tracing ID dataset under each Output* group. Defaults to "nodeData/nodeUniqueIDBranchTip".
on_duplicate_file_match (str) –
What to do if the same ID is found in more than one file at the same output (IDs are only unique within a file in multi-file collections):
"error" (default) – raise ValueError.
"warn" – emit a UserWarning and keep the first file’s match.
"first" – silently keep the first file’s match.
int_sentinel (int) – Missing-slot value used for integer-typed properties. Defaults to -1.
- Returns:
Contains:
one entry per property – numpy.ndarray of shape (n_galaxies,) + per_galaxy_tail + (n_outputs,). A 1-D source dataset yields a 2-D (n_galaxies, n_outputs) array; a 2-D dataset of shape (N, W) yields a 3-D (n_galaxies, W, n_outputs) array; and so on.
"time" – float array (n_galaxies, n_outputs) of output times, NaN where the galaxy is absent.
"redshift" – float array (n_galaxies, n_outputs) of redshifts, NaN where the galaxy is absent.
"expansion_factor" – float array (n_galaxies, n_outputs) of expansion factors, NaN where the galaxy is absent.
"present" – bool array (n_galaxies, n_outputs) that is True exactly where the galaxy was located.
"output_names" – 1-D object array of output group names in temporal order.
"ids" – 1-D int64 array of normalized input IDs.
- Return type:
- Raises:
KeyError – If id_dataset is not present in any chosen output of any file (e.g. the Galacticus run did not emit nodeUniqueIDBranchTip), or if a requested property is missing from a chosen output.
ValueError – If properties contains a reserved label, if outputs is empty, if the tail shape of a property differs between outputs, or (by default) if an ID appears in more than one file at the same output.
NotImplementedError – If a property has a dtype other than integer, floating, or boolean.
Notes
A galaxy need not be present at every output (it may have formed later or merged earlier); ragged histories are expected. Requesting IDs that are never found anywhere produces a UserWarning rather than an error, since exploratory workflows often probe IDs of uncertain provenance.
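The fill-and-mask behaviour described for trace_galaxy_history() can be pictured with a small stand-in. The sketch below assumes each output has already been reduced to an {id: value} lookup for one floating-point property; the function and variable names are hypothetical, and nested lists stand in for NumPy arrays:

```python
import math
from typing import Dict, List, Sequence, Tuple


def assemble_history(
    ids: Sequence[int],
    per_output: Sequence[Dict[int, float]],
) -> Tuple[List[List[float]], List[List[bool]]]:
    """Build (n_galaxies, n_outputs) value and presence tables.

    per_output[j] maps galaxy ID -> property value at output j, a stand-in
    for what would be read from each Output* group.
    """
    values = [[math.nan] * len(per_output) for _ in ids]
    present = [[False] * len(per_output) for _ in ids]
    for j, lookup in enumerate(per_output):
        for i, galaxy_id in enumerate(ids):
            if galaxy_id in lookup:
                values[i][j] = lookup[galaxy_id]
                present[i][j] = True
    return values, present


# Galaxy 9 does not exist yet at the first output, so its slot stays NaN.
values, present = assemble_history(
    ids=[7, 9],
    per_output=[{7: 1.0e10}, {7: 2.0e10, 9: 5.0e9}],
)
```

Checking present rather than testing for NaN or the integer sentinel avoids confusing a genuinely stored value with a missing slot.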
- dendros.list_analyses(collection, format='astropy')[source]
Return a table of function1D analyses available in the collection.
- Parameters:
collection (Collection) – A Collection. Only the primary file is consulted: for MPI runs, the /analyses data has been reduced over all ranks and is identical in every file.
format (str) – "astropy" (default), "pandas", or "tabulate".
- Return type:
astropy.table.Table, pandas.DataFrame, or tabulate-formatted string
- Raises:
KeyError – If the file has no top-level /analyses group.
- dendros.plot_analyses(collection, name=None, output_directory=None, *, labels=None, show_target=True, figsize=(7.0, 5.0), dpi=120, file_format='pdf')[source]
Plot one, several, or all function1D analyses.
A single Collection produces one model curve per figure (legacy behaviour). A list, dict, or ModelCollection of Collections overlays one curve per model on each figure, plotting the target/observational overlay once (since it is shared across models). The union of analyses discovered across models is plotted; figures whose analysis is absent from a given model simply do not include its curve.
- Parameters:
collection (_MultiInput) – A Collection; a sequence of Collections; or a mapping {label: Collection} (e.g. one returned by open_models()).
name (Union[None, str, List[str]]) – None (default) plots every function1D analysis discovered across all models. A single name (str) or list of names plots only those.
output_directory (Union[None, str, 'Path']) – If given, each figure is also saved as <output_directory>/<safe_name>.<file_format>. The directory is created if it does not exist.
labels (Optional[Sequence[str]]) – Optional sequence of legend labels, one per Collection, used only when collection is a list/tuple of Collections. When omitted, each model is labelled by its primary file’s stem (with any :MPIxxxx suffix stripped). Cannot be combined with a dict input.
show_target (bool) – If True (default), overlay target/observational data when present. For multi-model plots the target is plotted only once, from the first model that has it.
dpi (int) – Forwarded to matplotlib.
file_format (str) – Forwarded to matplotlib.
- Returns:
Mapping from analysis name to matplotlib.figure.Figure.
- Return type:
- Raises:
KeyError – If a model has no /analyses group, or if a requested name is missing from every model.
ImportError – If matplotlib is not installed; install with pip install 'dendros[plot]'.
MCMC
Entry point
- dendros.open_mcmc(config_path)[source]
Open an MCMC run by parsing its config XML.
- Parameters:
config_path (str | Path) – Path to the Galacticus MCMC <parameters> XML file.
- Return type:
Examples
>>> from dendros import open_mcmc
>>> with open_mcmc("mcmcConfig.xml") as run:
...     print(run.parameters)
...     chains = run.chains
- class dendros.MCMCRun(config)[source]
Bases: object
An MCMC run, parsed from its config file and lazily backed by chain data.
Construct via open_mcmc(). The chain files are not read until chains is first accessed; subsequent accesses return the cached ChainSet.
- Parameters:
config (MCMCConfig) – Parsed MCMCConfig.
- property config: MCMCConfig
The parsed
MCMCConfig.
- property parameters: Tuple[ModelParameter, ...]
Active model parameters, in chain-file column order.
- gelman_rubin(*, drop_chains=(), step_grid=None, n_grid=200, min_steps=10, alpha_interval=0.15)[source]
Convenience wrapper around
dendros.gelman_rubin().
- convergence_step(*, threshold=1.1, sustained_for=1, drop_chains=(), step_grid=None, n_grid=200, min_steps=10)[source]
First simulation-step count at which max-Rhat is sustained below threshold.
Computes a Gelman-Rubin trace via gelman_rubin() and returns the RhatResult.steps value at which convergence is first declared. Returns None if convergence is never reached on the chosen grid.
- geweke(*, first=0.1, last=0.5)[source]
Convenience wrapper around
dendros.geweke().
- outlier_chains(*, alpha=0.05, max_outliers=10, parameters=None)[source]
Convenience wrapper around
dendros.outlier_chains().
- acceptance_rate(*, post_burn=None)[source]
Convenience wrapper around
dendros.acceptance_rate().
- acceptance_rate_trace(*, window=30, post_burn=0)[source]
Convenience wrapper around
dendros.acceptance_rate_trace().
- autocorrelation_time(*, post_burn=None, c=5.0)[source]
Convenience wrapper around
dendros.autocorrelation_time().
- effective_sample_size(*, post_burn=None, c=5.0)[source]
Convenience wrapper around
dendros.effective_sample_size().
- maximum_posterior(*, drop_chains=())[source]
Convenience wrapper around
dendros.maximum_posterior().
- maximum_likelihood(*, drop_chains=())[source]
Convenience wrapper around
dendros.maximum_likelihood().
- posterior_samples(n, *, post_burn=None, drop_chains=(), rng=None, replace=None)[source]
Convenience wrapper around
dendros.posterior_samples().
- projection_pursuit(*, post_burn=None, drop_chains=())[source]
Convenience wrapper around dendros.projection_pursuit().
- Parameters:
- Return type:
- multivariate_normal_fit(*, post_burn=None, drop_chains=())[source]
Convenience wrapper around
dendros.multivariate_normal_fit().
- write_parameter_file(state, out_path, *, likelihood_index=0)[source]
Emit a single Galacticus parameter file for one likelihood leaf.
Reads leaves[likelihood_index].base_parameters_file, applies the leaf’s parameter_map (or the full state when no map is set), and writes the result to out_path.
- Parameters:
state – (n_params,) state vector in physical (model) space. Galacticus stores chain rows in physical space, so a row from maximum_posterior() / posterior_samples() can be passed directly.
out_path – Output path. Parent directory is created if missing.
likelihood_index (int) – Which leaf of the likelihood tree to use. Defaults to 0.
- Returns:
Resolved output path.
- Return type:
- corner_plot(*, parameters=None, post_burn=None, drop_chains=(), labels=None, **corner_kwargs)[source]
Convenience wrapper around
dendros.corner_plot().
- write_parameter_files(state, out_dir, *, name_format=None)[source]
Emit one parameter file per likelihood leaf into out_dir.
For independentLikelihoods configs each leaf has its own baseParametersFileName and parameterMap; this writes one file per leaf, with each file’s filename derived from the base file’s stem.
- Parameters:
state – (n_params,) state vector in physical space.
out_dir – Output directory (created if missing).
name_format (str | None) – Output-filename format string accepting leaf_index and stem. Defaults to "{stem}.xml" for a single leaf and "{leaf_index:02d}_{stem}.xml" for multiple.
- Returns:
One entry per leaf, in document order.
- Return type:
list of (leaf_index, pathlib.Path)
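The effect of name_format can be previewed without writing any files. The following is a sketch of the documented defaults only; the helper name is hypothetical, and numbering leaves from 0 is an assumption suggested by the likelihood_index default rather than something stated here:

```python
from pathlib import Path
from typing import List, Optional, Sequence, Tuple


def leaf_output_paths(
    base_files: Sequence[str],
    out_dir: str,
    name_format: Optional[str] = None,
) -> List[Tuple[int, Path]]:
    """Derive one output path per leaf from each base file's stem."""
    if name_format is None:
        # Documented defaults: plain stem for one leaf, index prefix for several.
        if len(base_files) == 1:
            name_format = "{stem}.xml"
        else:
            name_format = "{leaf_index:02d}_{stem}.xml"
    return [
        # Assumption: leaf_index counts from 0 in document order.
        (i, Path(out_dir) / name_format.format(leaf_index=i, stem=Path(base).stem))
        for i, base in enumerate(base_files)
    ]


several = leaf_output_paths(["base/smf.xml", "base/hmf.xml"], "out")
single = leaf_output_paths(["base/smf.xml"], "out")
```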
Configuration
- dendros.parse_mcmc_config(path)[source]
Parse a Galacticus MCMC <parameters> XML file.
- Parameters:
- Return type:
- Raises:
FileNotFoundError – If path does not exist.
ValueError – If the file’s root element is not <parameters>, or if the required posteriorSampleSimulation / logFileRoot elements are missing.
- class dendros.MCMCConfig(config_path, log_file_root, simulation_kind, parameters, likelihood)[source]
Parsed Galacticus MCMC configuration.
- Parameters:
config_path (Path)
log_file_root (Path)
simulation_kind (str)
parameters (Tuple[ModelParameter, ...])
likelihood (Likelihood | None)
- config_path
Absolute path to the parsed XML file.
- Type:
- log_file_root
Resolved chain log-file root (relative paths resolved against config_path’s directory). Per-rank chain files are at f"{log_file_root}_{rank:04d}.log".
- Type:
- simulation_kind
Value attribute of posteriorSampleSimulation, e.g. "differentialEvolution" or "particleSwarm". Determines whether chain rows carry trailing per-particle velocity columns.
- Type:
- parameters
Tuple of active ModelParameter entries in document order. This is the canonical ordering used by chain-file columns.
- Type:
Tuple[dendros._mcmc._config.ModelParameter, …]
- likelihood
Root of the posteriorSampleLikelihood tree, or None if the config lacks a likelihood block.
- Type:
- state_indices_for(leaf)[source]
Indices of the global state vector applicable to leaf.
For a leaf inside an independentLikelihoods subtree, returns the positions in parameters named in the leaf’s parameter_map. For other leaves, returns (0, 1, ..., n_params - 1) (identity).
- Parameters:
leaf (Likelihood) – A Likelihood from Likelihood.leaves().
- Raises:
KeyError – If a name in parameter_map isn’t among the active parameters.
- Return type:
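The mapping from a leaf's parameter_map to positions in the global state vector amounts to a name lookup. A minimal sketch under the assumption that a missing map (None) stands for the identity case; the function name is hypothetical and it operates on plain name sequences rather than Likelihood objects:

```python
from typing import Optional, Sequence, Tuple


def state_indices(
    parameter_names: Sequence[str],
    parameter_map: Optional[Sequence[str]],
) -> Tuple[int, ...]:
    """Positions in the global state vector that apply to one leaf."""
    if parameter_map is None:
        # No map: identity over all active parameters.
        return tuple(range(len(parameter_names)))
    position = {name: i for i, name in enumerate(parameter_names)}
    for name in parameter_map:
        if name not in position:
            raise KeyError(f"not an active parameter: {name}")
    return tuple(position[name] for name in parameter_map)


names = ["alpha", "beta", "gamma"]
identity = state_indices(names, None)
subset = state_indices(names, ["gamma", "alpha"])
```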
- class dendros.ModelParameter(name, label=None, prior=None, mapper='identity', perturber=None)[source]
A single <modelParameter value="active"> entry from the config.
- Parameters:
name (str)
label (str | None)
prior (PriorSpec | None)
mapper (str)
perturber (PerturberSpec | None)
- label
Optional LaTeX label for plotting. None when the config omits the <label> sub-element. Use display_label to obtain a plottable string regardless.
- Type:
str | None
- prior
Parsed distributionFunction1DPrior block, if present.
- Type:
- perturber
Parsed distributionFunction1DPerturber block, if present.
- Type:
- class dendros.Likelihood(kind, base_parameters_file=None, parameter_map=None, children=<factory>)[source]
A node in the posteriorSampleLikelihood tree.
- Parameters:
- base_parameters_file
Resolved path to the baseParametersFileName element’s value when present. None for non-leaf nodes (e.g. independentLikelihoods without a base file of its own).
- Type:
pathlib.Path | None
- parameter_map
For children of posteriorSampleLikelihoodIndependentLikelihoods, the parsed <parameterMap value="space separated names"/> for this child. Each entry is a parameter name from the active model parameters. None outside of an independentLikelihoods context, in which case identity mapping (all active parameters) is implied.
- Type:
Tuple[str, …] | None
- children
Tuple of child Likelihood instances. Empty for leaves.
- Type:
Tuple[dendros._mcmc._config.Likelihood, …]
- leaves()[source]
Flatten the tree to its leaf likelihoods (in document order).
- Return type:
Tuple[Likelihood, …]
Chains
- dendros.read_chains(config)[source]
Discover and read all per-rank chain files for config.
- Parameters:
config (MCMCConfig) – Parsed MCMCConfig.
- Return type:
- Raises:
FileNotFoundError – If no chain files are found at config.log_file_root.
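The per-rank file names that read_chains() looks for follow the pattern documented on MCMCConfig.log_file_root. A small sketch of that pattern; the helper is hypothetical, and numbering ranks from 0 is an assumption:

```python
from pathlib import Path


def chain_log_path(log_file_root: Path, rank: int) -> Path:
    """Per-rank chain file following the documented <root>_<rank:04d>.log pattern."""
    return Path(f"{log_file_root}_{rank:04d}.log")


# Assumption: MPI ranks count from 0.
paths = [chain_log_path(Path("chains/run"), rank) for rank in range(3)]
```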
- class dendros.Chain(chain_index, path, step, eval_time, converged, log_posterior, log_likelihood, state, velocity=None)[source]
One MPI rank’s MCMC chain.
- Parameters:
- path
Source log-file path.
- Type:
- step
Integer simulation-step index, one per row.
- Type:
- eval_time
Wall-clock evaluation time per step, in seconds.
- Type:
- converged
Boolean flag indicating whether the simulation had declared convergence at this step.
- Type:
- log_posterior
Log posterior probability per step.
- Type:
- log_likelihood
Log likelihood per step.
- Type:
- state
(n_steps, n_params) array of parameter values, in MCMCConfig.parameters order. Values are in physical (model) space: Galacticus applies the inverse of operatorUnaryMapper before writing.
- Type:
- velocity
(n_steps, n_params) array of per-parameter particle velocities for particleSwarm simulations; None for differential-evolution and other state-only simulations.
- Type:
numpy.ndarray | None
- class dendros.ChainSet(config, chains)[source]
An ordered collection of Chain objects from one MCMC run.
Iteration yields chains in MPI-rank order.
- Parameters:
config (MCMCConfig) – The parsed MCMCConfig the chains correspond to.
chains (Sequence[Chain]) – The per-rank chains.
Convergence
- dendros.gelman_rubin(chains, *, drop_chains=(), step_grid=None, n_grid=200, min_steps=10, alpha_interval=0.15)[source]
Brooks-Gelman corrected Rhat as a function of simulation step.
For each chosen truncation point s the first s rows of every surviving chain are used to compute the standard between-chain (B) and within-chain (W) variances and the Brooks-Gelman corrected potential-scale reduction factor \(\hat{R}_c\). The non-parametric interval-length ratio \(R_{\rm interval}\) (Brooks & Gelman 1998 section 1.3) is also computed at the same evaluation points.
- Parameters:
chains (ChainSet) – ChainSet to evaluate. Must contain at least two non-dropped chains and at least min_steps rows per chain.
drop_chains (Sequence[int]) – Iterable of chain_index values to exclude before computing. Use this with the indices returned by outlier_chains().
step_grid (Sequence[int] | None) – Optional explicit 1-D iterable of truncation step counts (1-based). When given, n_grid and min_steps are ignored.
n_grid (int) – Number of evenly-spaced evaluation points to use when step_grid is None. Capped at the shortest surviving chain length minus min_steps + 1.
min_steps (int) – Smallest truncation step count to evaluate. Must be >= 2.
alpha_interval (float) – Two-sided significance level for R_interval (default 0.15, i.e. 85 % credible intervals, matching the Galacticus Perl reference).
- Return type:
- Raises:
ValueError – If fewer than two chains survive drop_chains or min_steps is too small.
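For orientation, the between-chain and within-chain variances combine as in the sketch below. This is the plain, uncorrected Gelman-Rubin statistic for a single parameter, not the Brooks-Gelman corrected \(\hat{R}_c\) or the interval ratio that gelman_rubin() reports; the function name is hypothetical and every chain is assumed to have at least s rows:

```python
import math
from statistics import mean, variance
from typing import Sequence


def plain_rhat(chains: Sequence[Sequence[float]], s: int) -> float:
    """Uncorrected Rhat for one parameter from the first s rows of each chain."""
    truncated = [list(chain[:s]) for chain in chains]
    if len(truncated) < 2 or s < 2:
        raise ValueError("need at least two chains and two rows per chain")
    # Within-chain variance W: mean of the per-chain sample variances.
    w = mean([variance(chain) for chain in truncated])
    # Between-chain term B / s: sample variance of the per-chain means.
    b_over_s = variance([mean(chain) for chain in truncated])
    # Pooled variance estimate; a zero W (constant chains) is not handled here.
    var_plus = (s - 1) / s * w + b_over_s
    return math.sqrt(var_plus / w)


mixed = plain_rhat([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]], 4)
separated = plain_rhat([[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]], 4)
```

Chains that sample the same region give a value near or below 1, while chains stuck in different regions give a value far above the usual 1.1 threshold.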
- class dendros.RhatResult(steps, Rhat_c, R_interval, parameter_names, alpha_interval, chains_used)[source]
Result of gelman_rubin().
- Parameters:
- steps
(n_eval,) 1-D array of truncation step counts at which Rhat was computed (i.e. each entry s means “use the first s rows of every chain”). These are 1-based step counts so the smallest value is the chosen min_steps.
- Type:
- Rhat_c
(n_eval, n_params) array of Brooks-Gelman corrected potential-scale reduction factors.
- Type:
- R_interval
(n_eval, n_params) array of non-parametric interval-length ratios (mixed-chain credible interval / mean per-chain credible interval) at the chosen alpha_interval.
- Type:
- chains_used
chain_index values of the chains that contributed (after drop_chains was applied).
- Type:
Tuple[int, …]
- Rhat_c_max
Per-step max-over-parameters of Rhat_c, useful as the input to convergence_step().
- dendros.convergence_step(rhat_max, *, threshold=1.1, sustained_for=1)[source]
Index into the Rhat grid at which convergence is first declared.
Searches for the smallest index i such that every entry of rhat_max[i : i + sustained_for] is at or below threshold.
- Parameters:
rhat_max (ndarray) – 1-D array of (max-over-parameters) Rhat values, e.g. RhatResult.Rhat_c_max().
threshold (float) – Convergence threshold. Defaults to 1.1.
sustained_for (int) – Number of consecutive grid points that must all be below the threshold before convergence is declared. Defaults to 1 (strict first crossing).
- Returns:
Grid index at which convergence is first sustained, or None if the threshold is never met.
- Return type:
int or None
Notes
Use RhatResult.steps to translate the returned grid index to a simulation-step count.
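The search rule, and the translation from grid index to step count, can be sketched as follows. The function name is hypothetical, the steps list is made-up illustration data, and requiring a full window of sustained_for points at the end of the grid is an assumption about edge behaviour:

```python
from typing import Optional, Sequence


def first_sustained_index(
    rhat_max: Sequence[float],
    threshold: float = 1.1,
    sustained_for: int = 1,
) -> Optional[int]:
    """Smallest i with every entry of rhat_max[i : i + sustained_for] <= threshold."""
    if sustained_for < 1:
        raise ValueError("sustained_for must be at least 1")
    # Assumption: only full windows count, so the last start is len - sustained_for.
    for i in range(len(rhat_max) - sustained_for + 1):
        if all(value <= threshold for value in rhat_max[i : i + sustained_for]):
            return i
    return None


rhat_max = [1.50, 1.05, 1.20, 1.08, 1.02]
steps = [10, 20, 30, 40, 50]  # illustrative stand-in for RhatResult.steps

strict = first_sustained_index(rhat_max)                      # first crossing
sustained = first_sustained_index(rhat_max, sustained_for=2)  # two points in a row
converged_step = steps[sustained] if sustained is not None else None
```

With sustained_for=1 the brief dip at index 1 counts as convergence; asking for two consecutive points skips it and lands on index 3 instead.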
- dendros.geweke(chains, *, first=0.1, last=0.5)[source]
Per-chain Geweke z-scores comparing the means of two chain segments.
For each chain and each parameter, returns
\[z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2_1/n_1 + s^2_2/n_2}}\]
where segment 1 covers the leading fraction of the chain given by first and segment 2 covers the trailing fraction given by last. Large |z| for any parameter suggests the chain has not yet reached a stationary distribution; this is useful when the chains were started from an under-dispersed state (which makes Gelman-Rubin uninformative).
- Parameters:
- Returns:
(n_chains, n_params) z-score array. Chains shorter than 4 rows in either segment yield NaN.
- Return type:
np.ndarray
Notes
The variance estimator used here is the simple sample variance, which treats each draw as independent. Autocorrelated chains will produce artificially large |z|; once a proper integrated-autocorrelation-time estimator lands (Phase 3) this can be inflated by the ACL to recover the classical spectral-density-at-zero variant.
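For a single parameter of a single chain, the z-score above reduces to a few lines. A minimal sketch with a hypothetical function name; rounding the segment lengths to the nearest row is an assumption, and a zero combined variance is not handled:

```python
import math
from statistics import mean, variance
from typing import Sequence


def geweke_z(chain: Sequence[float], first: float = 0.1, last: float = 0.5) -> float:
    """Geweke z-score for one parameter of one chain, using simple sample variances."""
    n = len(chain)
    n1 = int(round(first * n))  # rows in the leading segment (rounding is an assumption)
    n2 = int(round(last * n))   # rows in the trailing segment
    if n1 < 4 or n2 < 4:
        return math.nan  # mirrors the documented short-segment behaviour
    seg1, seg2 = chain[:n1], chain[n - n2:]
    standard_error = math.sqrt(variance(seg1) / n1 + variance(seg2) / n2)
    return (mean(seg1) - mean(seg2)) / standard_error


stationary = [float(i % 2) for i in range(40)]  # same mean in both segments
drifting = [float(i) for i in range(40)]        # mean still moving

z_stationary = geweke_z(stationary)
z_drifting = geweke_z(drifting)
```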
- dendros.outlier_chains(chains, *, alpha=0.05, max_outliers=10, parameters=None)[source]
Iterative two-sided Grubbs test on each chain’s final state.
Each chain contributes its last row (the most recent state) as a single multivariate point. The Grubbs test is applied iteratively over the active chains, dropping the chain whose maximum per-parameter deviation exceeds the critical value at each step, until none exceed it or max_outliers chains have been removed.
- Parameters:
chains (ChainSet) – ChainSet. Must contain at least three chains.
alpha (float) – Two-sided significance level. Defaults to 0.05 to match the Galacticus Perl reference’s hard-coded value.
max_outliers (int) – Maximum number of chains to declare as outliers.
parameters (Iterable[str] | None) – Optional iterable of parameter names to restrict the test to a subset. Unknown names raise KeyError.
- Returns:
chain_index values of the chains flagged as outliers, in the order they were removed.
- Return type:
Mixing diagnostics
- dendros.autocorrelation_function(chains, *, post_burn=0, max_lag=None)[source]
Per-chain, per-parameter normalized autocorrelation function.
- Parameters:
- Returns:
Array of shape (n_chains, max_lag + 1, n_params). Chains are truncated to a common length (the shortest post-burn chain) so this is rectangular.
- Return type:
np.ndarray
- dendros.autocorrelation_time(chains, *, post_burn=None, c=5.0)[source]
Integrated autocorrelation time per parameter, in steps.
Implements the standard Sokal automatic-windowing estimator over the chain-averaged autocovariance. For each parameter, the per-chain autocovariances are averaged before integrating, which is more stable than averaging per-chain τ_int estimates.
- Parameters:
chains (ChainSet) – ChainSet. All chains are truncated to the shortest post-burn length.
post_burn (int | None) – Number of leading rows to skip from each chain. None triggers automatic detection via gelman_rubin() / convergence_step(); if convergence is not reached a UserWarning is emitted and 0 is used.
c (float) – Sokal window constant. Defaults to 5.0.
- Returns:
(n_params,) array of integrated autocorrelation times in steps.
- Return type:
np.ndarray
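The windowing idea can be sketched for one parameter in pure Python: average the autocovariance over chains, accumulate τ = 1 + 2 Σ ρ(t), and stop at the first window M with M ≥ c·τ. This is an illustration of Sokal's criterion under those assumptions, not the library's code; the function names and the synthetic chains are hypothetical:

```python
import random
from typing import List, Sequence


def integrated_autocorrelation_time(chains: Sequence[Sequence[float]], c: float = 5.0) -> float:
    """Sokal-windowed integrated autocorrelation time for one parameter."""
    n = min(len(chain) for chain in chains)  # truncate to the shortest chain
    centred: List[List[float]] = []
    for chain in chains:
        rows = list(chain[:n])
        mu = sum(rows) / n
        centred.append([x - mu for x in rows])

    def mean_autocovariance(lag: int) -> float:
        # Average the per-chain autocovariance at this lag before normalising.
        total = 0.0
        for rows in centred:
            total += sum(rows[t] * rows[t + lag] for t in range(n - lag)) / n
        return total / len(centred)

    c0 = mean_autocovariance(0)  # zero for constant chains; not handled here
    tau = 1.0
    for window in range(1, n):
        tau += 2.0 * mean_autocovariance(window) / c0
        if window >= c * tau:  # Sokal's automatic windowing criterion
            break
    return tau


def ar1_chain(phi: float, length: int, rng: random.Random) -> List[float]:
    """Synthetic AR(1) series used only to exercise the estimator."""
    x, out = 0.0, []
    for _ in range(length):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


rng = random.Random(0)
white = [ar1_chain(0.0, 2000, rng) for _ in range(2)]       # independent draws
correlated = [ar1_chain(0.9, 2000, rng) for _ in range(2)]  # strongly correlated

tau_white = integrated_autocorrelation_time(white)
tau_correlated = integrated_autocorrelation_time(correlated)
```

Independent draws should give τ close to 1, while the correlated series should give a much larger value, which in turn shrinks the effective sample size N_total / τ.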
- dendros.effective_sample_size(chains, *, post_burn=None, c=5.0)[source]
Effective sample size per parameter.
Defined as N_total / τ_int where N_total is the total number of post-burn samples summed across all chains and τ_int is the chain-averaged integrated autocorrelation time from autocorrelation_time().
- Parameters:
post_burn (int | None) – See autocorrelation_time().
c (float) – Sokal window constant.
- Returns:
(n_params,) array of effective sample sizes.
- Return type:
np.ndarray
- dendros.acceptance_rate(chains, *, post_burn=None)[source]
Per-chain post-burn acceptance rate.
A step is “accepted” iff any parameter component differs from the previous step. Galacticus emits the same row when a proposal is rejected, so this is the canonical acceptance count.
- Parameters:
post_burn (int | None) – Number of leading rows to skip in each chain. None triggers automatic detection via gelman_rubin() / convergence_step(); if convergence is not reached a UserWarning is emitted and 0 is used.
- Returns:
(n_chains,) array. NaN for any chain with fewer than two post-burn rows.
- Return type:
np.ndarray
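The acceptance rule (a step counts as accepted when any parameter differs from the previous row) can be written out for a single chain. A sketch with a hypothetical function name; dividing by the number of transitions, rather than the number of rows, is an assumption:

```python
import math
from typing import Sequence


def chain_acceptance_rate(rows: Sequence[Sequence[float]], post_burn: int = 0) -> float:
    """Fraction of post-burn transitions whose state differs from the previous row."""
    kept = rows[post_burn:]
    if len(kept) < 2:
        return math.nan  # no transition to count
    accepted = sum(1 for prev, cur in zip(kept, kept[1:]) if list(prev) != list(cur))
    # Assumption: normalise by the number of transitions, len(kept) - 1.
    return accepted / (len(kept) - 1)


# A rejected proposal repeats the previous row, so rows 1 and 3 are rejections.
rows = [[1.0, 2.0], [1.0, 2.0], [1.0, 3.0], [1.0, 3.0], [2.0, 3.0]]
rate = chain_acceptance_rate(rows)
too_short = chain_acceptance_rate(rows, post_burn=4)
```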
- dendros.acceptance_rate_trace(chains, *, window=30, post_burn=0)[source]
Sliding-window acceptance rate as a function of step.
For each chain, returns a 1-D array whose i-th entry is the fraction of the previous window transitions that were accepted (i.e. changed at least one parameter). The first window entries are filled with numpy.nan because the window is not yet full.
- Parameters:
- Returns:
One 1-D array per chain, each of length n_steps_post_burn. Returned as a list because chains may have different post-burn lengths.
- Return type:
list of np.ndarray
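A sliding-window trace of this kind can be computed with a cumulative sum. This is a sketch for one chain under an assumed indexing convention (entry i averages the window transitions that end at row i); the exact convention used by acceptance_rate_trace() is not stated here, and acceptance_trace_sketch is a hypothetical name.

```python
import numpy as np

def acceptance_trace_sketch(states, window=30):
    """Sliding-window acceptance fraction; the first `window` entries are NaN."""
    states = np.asarray(states)
    n = states.shape[0]
    trace = np.full(n, np.nan)
    if n <= window:
        return trace  # the window never fills
    # accepted[k] is 1.0 when row k + 1 differs from row k.
    accepted = np.any(states[1:] != states[:-1], axis=1).astype(float)
    csum = np.concatenate(([0.0], np.cumsum(accepted)))
    idx = np.arange(window, n)
    # Mean of accepted[i - window : i], the transitions ending at row i.
    trace[window:] = (csum[idx] - csum[idx - window]) / window
    return trace

# Every second proposal is rejected, so a full window of 2 gives 0.5.
states = np.array([[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]])
trace = acceptance_trace_sketch(states, window=2)
```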
Posterior analyses
- dendros.maximum_posterior(chains, *, drop_chains=())[source]
State vector at the maximum log posterior across all surviving chains.
- dendros.maximum_likelihood(chains, *, drop_chains=())[source]
State vector at the maximum log likelihood across all surviving chains.
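Both functions amount to a global argmax over the chains that are not dropped. A minimal sketch with hypothetical in-memory arrays standing in for the chain data (the variable names and layout are assumptions, not the ChainSet API):

```python
import numpy as np

# Hypothetical per-chain log posteriors and (n_steps, n_params) states.
log_post = [np.array([-5.0, -3.0, -4.0]), np.array([-6.0, -2.5, -2.9])]
states = [np.array([[0.1, 1.0], [0.2, 1.1], [0.3, 1.2]]),
          np.array([[0.4, 0.9], [0.5, 0.8], [0.6, 0.7]])]
drop_chains = ()

best = None  # (log posterior, chain index, row, state)
for k, (lp, st) in enumerate(zip(log_post, states)):
    if k in drop_chains:
        continue
    row = int(np.argmax(lp))
    if best is None or lp[row] > best[0]:
        best = (lp[row], k, row, st[row])
```

Here the maximum sits in chain 1 at row 1; swapping the log-posterior arrays for log-likelihood arrays gives the maximum-likelihood analogue.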
- class dendros.MaxResult(state, log_posterior, log_likelihood, chain_index, step, parameter_names)[source]
Result of maximum_posterior() or maximum_likelihood().
- Parameters:
- state
(n_params,) parameter vector at the maximizing step.
- Type:
- step
Simulation-step value of the maximizing row (1-based, matching Chain.step).
- Type:
- dendros.posterior_samples(chains, n, *, post_burn=None, drop_chains=(), rng=None, replace=None)[source]
Draw n uniformly-random rows from the post-burn concatenated chain.
- Parameters:
n (int) – Number of samples to draw. Must be positive.
post_burn (int | None) – Number of leading rows to skip in each chain. None triggers automatic detection via gelman_rubin() / convergence_step(); if convergence is not reached a UserWarning is emitted and 0 is used.
drop_chains (Sequence[int]) – Iterable of chain_index values to exclude.
rng (Generator | None) – numpy.random.Generator. Defaults to numpy.random.default_rng(), which seeds from system entropy. Pass an explicit generator for reproducibility.
replace (bool | None) – Whether to sample with replacement. None (default) means “with replacement only when n exceeds the available pool”, matching the common case where a small n is desired and identical rows would be misleading.
- Return type:
PosteriorSamples
- Raises:
ValueError – If n is non-positive, if all chains are dropped, or if replace=False and n exceeds the pool size.
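The replace=None rule and the error conditions can be sketched with numpy.random.Generator.choice. The helper draw_rows below is hypothetical and operates on a plain array standing in for the concatenated post-burn pool; it is an illustration of the documented behaviour, not the dendros code.

```python
import numpy as np

def draw_rows(pool, n, rng=None, replace=None):
    """Draw n uniformly-random rows from pool, mirroring the documented rules."""
    pool = np.asarray(pool)
    if n <= 0:
        raise ValueError("n must be positive")
    if pool.shape[0] == 0:
        raise ValueError("the pool is empty")
    rng = np.random.default_rng() if rng is None else rng
    if replace is None:
        # With replacement only when the request exceeds the pool.
        replace = n > pool.shape[0]
    if not replace and n > pool.shape[0]:
        raise ValueError("n exceeds the pool size and replace=False")
    idx = rng.choice(pool.shape[0], size=n, replace=replace)
    return pool[idx]

pool = np.arange(20).reshape(10, 2)                       # 10 distinct rows
small = draw_rows(pool, 5, rng=np.random.default_rng(0))   # no duplicates
large = draw_rows(pool, 15, rng=np.random.default_rng(0))  # duplicates unavoidable
```

Passing a seeded generator, as here, makes the draw reproducible.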
- class dendros.PosteriorSamples(state, log_posterior, log_likelihood, chain_index, step, parameter_names)[source]
Sampled rows from the post-burn concatenated chain.
- Parameters:
- state
(n_samples, n_params) parameter values.
- Type:
- log_posterior
(n_samples,) log posterior at each sample.
- Type:
- log_likelihood
(n_samples,) log likelihood at each sample.
- Type:
- chain_index
(n_samples,) source chain indices.
- Type:
- step
(n_samples,) source simulation-step values.
Notes
Adjacent steps in an MCMC chain are correlated, so N draws here represent significantly fewer independent samples from the posterior. Use effective_sample_size() to estimate the equivalent count, and thin by the integrated autocorrelation time if independence is required.
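Thinning by the integrated autocorrelation time means keeping roughly one row per τ_int steps of an ordered chain. A minimal sketch with a hypothetical chain array and an assumed τ_int value:

```python
import math
import numpy as np

chain = np.arange(100).reshape(100, 1)  # hypothetical post-burn chain, one row per step
tau_int = 7.3                           # assumed integrated autocorrelation time, in steps
stride = max(1, math.ceil(tau_int))     # keep one row per autocorrelation time
thinned = chain[::stride]
```

Thinning applies to rows in their original step order; it is not meaningful on rows that have already been drawn at random.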
- dendros.projection_pursuit(chains, *, post_burn=None, drop_chains=())[source]
Find the linear combinations of parameters best constrained by the data.
Each post-burn parameter column is mapped via its operatorUnaryMapper, normalised by sqrt(prior variance), and mean-centred. The covariance matrix of the resulting samples is eigendecomposed, and the eigenvalues/eigenvectors are returned sorted by ascending eigenvalue, so eigenvectors[:, 0] is the linear combination most tightly constrained relative to the prior.
- Parameters:
- Return type:
ProjectionPursuitResult
- Raises:
NotImplementedError – If any active parameter uses a prior or mapper not yet supported by projection_pursuit() (currently uniform/normal priors and identity mapper only).
ValueError – If the post-burn pool is empty or contains fewer than two rows.
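The rescale, centre, and eigendecompose sequence can be sketched in NumPy for the identity-mapper case. The sample array and prior_sigma values below are synthetic assumptions chosen so that the sum of the two parameters is tightly constrained while their difference is not.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=5000)
# Hypothetical post-burn samples: x1 + x2 is well constrained, x1 - x2 is not.
samples = np.column_stack([a + 0.05 * rng.normal(size=5000),
                           -a + 0.05 * rng.normal(size=5000)])
prior_sigma = np.array([2.0, 2.0])  # assumed sqrt(prior variance) per parameter

z = samples / prior_sigma           # normalise by the prior width
z = z - z.mean(axis=0)              # mean-centre
cov = np.cov(z, rowvar=False)
# numpy.linalg.eigh returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
best = eigenvectors[:, 0]           # most tightly constrained direction
```

The leading eigenvector comes out close to (1, 1)/sqrt(2) up to an overall sign, which is arbitrary in any eigendecomposition.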
- class dendros.ProjectionPursuitResult(eigenvalues, eigenvectors, parameter_names, parameter_labels, prior_sigma)[source]
Result of projection_pursuit().
- Parameters:
- eigenvalues
(n_params,) ascending eigenvalues of the rescaled-sample covariance matrix. Smaller is “better constrained”.
- Type:
- eigenvectors
(n_params, n_params) matrix whose [:, k] column is the eigenvector for eigenvalues[k], expressed in rescaled-mapped space (i.e. in the same coordinates used for the eigendecomposition).
- Type:
- parameter_labels
ModelParameter.display_label strings parallel to parameter_names.
- Type:
Tuple[str, …]
- prior_sigma
(n_params,) square root of the prior variance for each parameter, the rescaling that was applied before eigendecomposition.
- direction:
Components of one eigenvector that exceed a contribution threshold.
- latex_summary:
LaTeX-rendered summary line for a chosen direction.
- dendros.multivariate_normal_fit(chains, *, post_burn=None, drop_chains=())[source]
Fit a multivariate normal to the post-burn concatenated chain.
- Parameters:
post_burn (int | None) – Number of leading rows to skip per chain. None triggers automatic detection via gelman_rubin() / convergence_step().
drop_chains (Sequence[int]) – chain_index values to exclude.
- Return type:
MVNFit
- Raises:
ValueError – If fewer than n_params + 1 post-burn samples remain (in which case the sample covariance would be rank-deficient).
np.linalg.LinAlgError – If the sample covariance is not positive-definite (which can happen for parameters that are degenerate post-burn). Drop the offending parameter or supply more samples.
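The fit itself is a sample mean, a symmetrised sample covariance, and a Cholesky factorisation. A minimal sketch on synthetic samples (the array below is an assumption standing in for the post-burn pool), including the two failure modes listed above:

```python
import numpy as np

rng = np.random.default_rng(1)
true_cov = np.array([[1.0, 0.6], [0.6, 2.0]])
samples = rng.multivariate_normal([0.5, -1.0], true_cov, size=4000)

n_samples, n_params = samples.shape
if n_samples < n_params + 1:
    raise ValueError("too few samples for a full-rank covariance")

mean = samples.mean(axis=0)
cov = np.cov(samples, rowvar=False)
cov = 0.5 * (cov + cov.T)  # symmetrise against round-off
# numpy.linalg.cholesky raises LinAlgError if cov is not positive-definite.
chol = np.linalg.cholesky(cov)
```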
- class dendros.MVNFit(mean, covariance, cholesky, parameter_names)[source]
Multivariate-normal fit to post-burn samples.
- Parameters:
- mean
(n_params,) sample mean.
- Type:
- covariance
(n_params, n_params) sample covariance, symmetrised.
- Type:
- cholesky
(n_params, n_params) lower-triangular Cholesky factor L of covariance. Satisfies L @ L.T == covariance.
- Type:
- write_reparameterization_config:
Emit a Galacticus-style XML config that re-parameterizes the active parameters in terms of independent unit-normal meta parameters.
- write_reparameterization_config(out_path, *, n_sigma=5.0, perturber_scale=1e-05)[source]
Write a Galacticus reparameterization XML config.
For an n-parameter MVN fit with mean \(\mu\) and Cholesky factor \(L\), the emitted config declares n active metaParameter{i} parameters with truncated unit-normal priors (limits \(\pm n_\sigma\)), and n derived parameters expressing the original active parameters as
\[x_i = \mu_i + \sum_j L_{ij} \, m_j .\]
Re-running the MCMC against this config samples in coordinates where the posterior is approximately spherical.
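The mapping between meta parameters m and original parameters x can be checked numerically. In this sketch the mean and covariance are made-up values; it shows that unit-normal m map to x with covariance L L^T, and that the inverse map recovers m.

```python
import numpy as np

mu = np.array([0.5, -1.0])                 # assumed MVN mean
cov = np.array([[1.0, 0.6], [0.6, 2.0]])   # assumed MVN covariance
L = np.linalg.cholesky(cov)

rng = np.random.default_rng(2)
m = rng.standard_normal(size=(20000, 2))   # independent unit-normal meta parameters
x = mu + m @ L.T                           # row-wise x_i = mu_i + sum_j L_ij m_j
m_back = np.linalg.solve(L, (x - mu).T).T  # inverse map: m = L^{-1} (x - mu)
```

Truncating m at ±n_sigma, as the emitted priors do, changes these statistics only slightly for n_sigma = 5.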
- Parameters:
- Returns:
Resolved path of the written file.
- Return type:
Parameter-file emission
- dendros.read_parameter_file(path)[source]
Parse a Galacticus parameter XML file.
- Parameters:
- Return type:
- dendros.resolve_parameter_path(root, path)[source]
Locate the XML element identified by a Galacticus parameter path.
- Parameters:
root (Element) – Root xml.etree.ElementTree.Element to search.
path (str) – Slash- or ::-separated parameter path. Each segment is an element name, optionally followed by a [N] integer (1-based) instance selector or a [@value='x'] attribute filter, matching the Galacticus parameter-file convention.
- Return type:
- Raises:
KeyError – If any segment does not match an element under the current node.
ValueError – If a path segment is malformed.
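The path grammar described above can be sketched with a regular expression and xml.etree.ElementTree. resolve_path_sketch is a hypothetical helper, not the dendros implementation; it assumes each segment is matched against the direct children of the current node, starting from the children of root.

```python
import re
import xml.etree.ElementTree as ET

# name, optionally followed by [N] or [@value='x']
_SEGMENT = re.compile(
    r"^(?P<name>[^\[\]]+)"
    r"(?:\[(?:(?P<index>\d+)|@value='(?P<value>[^']*)')\])?$"
)

def resolve_path_sketch(root, path):
    node = root
    for segment in re.split(r"::|/", path):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"malformed path segment: {segment!r}")
        candidates = node.findall(match["name"])
        if match["value"] is not None:
            candidates = [c for c in candidates if c.get("value") == match["value"]]
        index = int(match["index"]) if match["index"] else 1  # 1-based selector
        if not 1 <= index <= len(candidates):
            raise KeyError(segment)
        node = candidates[index - 1]
    return node

root = ET.fromstring(
    "<parameters><a value='1'><b value='x'/><b value='y'/></a></parameters>"
)
second_b = resolve_path_sketch(root, "a/b[2]")
filtered_b = resolve_path_sketch(root, "a::b[@value='x']")
```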
- dendros.apply_state(tree, parameters, state, *, parameter_map=None)[source]
Set value= attributes in tree from a state vector.
- Parameters:
tree (ElementTree) – Parsed XML tree to modify in place.
parameters (Sequence[ModelParameter]) – Active model parameters (the same ordering used by chain columns).
state (ndarray) – (n_params,) state vector aligned with parameters.
parameter_map (Iterable[str] | None) – Optional iterable of parameter names to apply. None applies every entry of parameters; non-None applies only those named and is the typical case for an independentLikelihoods leaf, whose base parameter file mentions only that leaf’s parameters.
- Raises:
KeyError – If a parameter’s path does not resolve in tree, or if a name in parameter_map is not among parameters.
ValueError – If state doesn’t match the length of parameters.
- Return type:
None
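The effect of parameter_map can be illustrated on a toy tree. This sketch makes a simplifying assumption that each parameter name is also its element path directly under the root; real ModelParameter objects carry their own paths, and the value formatting used by dendros is not specified here.

```python
import xml.etree.ElementTree as ET

tree = ET.ElementTree(
    ET.fromstring("<parameters><alpha value='0'/><beta value='0'/></parameters>")
)
names = ["alpha", "beta"]   # hypothetical active parameters, in chain-column order
state = [1.5, -2.0]
parameter_map = ["beta"]    # apply only this subset; None would apply all

if len(state) != len(names):
    raise ValueError("state does not match the number of parameters")

selected = names if parameter_map is None else list(parameter_map)
for name in selected:
    if name not in names:
        raise KeyError(name)
    element = tree.getroot().find(name)
    if element is None:
        raise KeyError(name)
    # Modify the tree in place by overwriting the value= attribute.
    element.set("value", repr(float(state[names.index(name)])))
```

After the loop only beta has been updated; alpha keeps the value from the base file.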
- dendros.emit_parameter_files(state, config, out_dir, *, name_format=None)[source]
Write one Galacticus parameter file per leaf of config.likelihood.
Reads each leaf’s base_parameters_file, applies the subset of state selected by that leaf’s parameter_map (or the full state when no map is set), and writes the modified XML into out_dir.
- Parameters:
state (ndarray) – (n_params,) state vector (in physical / model space, as stored in the chain log file; no mapper inversion is applied).
config – MCMCConfig.
out_dir (str | Path) – Output directory (created if missing).
name_format (str | None) – Format string for output filenames; receives leaf_index and stem (the base file’s stem). Defaults to "{stem}.xml" for a single leaf and "{leaf_index:02d}_{stem}.xml" for multiple, so per-leaf files don’t collide when several leaves share a base stem.
- Returns:
One tuple per leaf, in document order.
- Return type:
list of (leaf_index, written_path)
- Raises:
ValueError – If config.likelihood is None, or if any leaf lacks a base_parameters_file.
KeyError – If a parameter’s path does not resolve in the corresponding base file, or a parameter_map references an unknown parameter.
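The default name_format behaviour can be reproduced with str.format. The base file paths below are hypothetical, and leaf_index is assumed to start at 0, which the description above does not state.

```python
from pathlib import Path

# Two hypothetical leaves whose base files share the stem "base".
bases = [Path("models/base.xml"), Path("other/base.xml")]

fmt = "{stem}.xml" if len(bases) == 1 else "{leaf_index:02d}_{stem}.xml"
names = [fmt.format(leaf_index=i, stem=p.stem) for i, p in enumerate(bases)]

# The single-leaf default simply ignores leaf_index.
single = "{stem}.xml".format(leaf_index=0, stem=bases[0].stem)
```

With the multi-leaf default the two outputs get distinct names even though both base files are called base.xml.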
Corner plots
- dendros.corner_plot(chains, *, parameters=None, post_burn=None, drop_chains=(), labels=None, **corner_kwargs)[source]
Render a corner plot of post-burn chain samples.
- Parameters:
chains (ChainSet) – ChainSet whose post-burn samples will be plotted.
parameters (Iterable[str] | None) – Optional iterable of parameter names to restrict the plot to a subset (in the order given). None plots every active parameter.
post_burn (int | None) – Number of leading rows to skip per chain. None triggers automatic detection via gelman_rubin() / convergence_step().
drop_chains (Sequence[int]) – Iterable of chain_index values to exclude.
labels (Sequence[str] | None) – Optional axis labels. None uses each parameter’s LaTeX ModelParameter.display_label, wrapped in $...$ so corner renders them in math mode.
**corner_kwargs – Additional keyword arguments forwarded to corner.corner().
- Return type:
matplotlib.figure.Figure
- Raises:
ImportError – If the optional corner package is not installed. Install via pip install 'dendros[mcmc]'.
KeyError – If a name in parameters is not among the active parameters.
ValueError – If the post-burn pool is empty.
Internal helpers
- class dendros._collection.GroupProxy(collection, path)[source]
Read-only h5py-like proxy for an HDF5 group.
- Parameters:
collection (Collection) – Parent Collection.
path (str) – HDF5 path to the group within the file.
- class dendros._collection.DatasetProxy(collection, path)[source]
Read-only h5py-like proxy for an HDF5 dataset.
For multi-file Collection instances, read() concatenates data from all files along axis 0.
- Parameters:
collection (Collection) – Parent Collection.
path (str) – HDF5 path to the dataset within the file.
- property dtype
NumPy dtype of the dataset.
- read(where=None)[source]
Read the dataset into a numpy.ndarray.
For multi-file collections the arrays from all files are concatenated along axis 0 before the optional where selection is applied.
- Parameters:
where – None reads everything. A boolean mask or integer index array is applied after concatenation.
- Return type: