On Twitter, @yebai suggested adding integration with InferenceObjects to MCMCChains: https://twitter.com/Hong_Ge2/status/1560343482216103938. I'm opening this issue for further discussion.
InferenceObjects.InferenceData is the storage format for Monte Carlo draws used by ArviZ.jl. Along with Python's arviz.InferenceData, it follows the cross-language InferenceData schema. PyMC uses Python's implementation as its official sample storage format. InferenceData can be serialized to NetCDF to standardize communicating results of Bayesian analyses across languages and PPLs. In Julia, it is built on DimensionalData. See example usage and plotting examples (using the Tables interface).
@yebai's suggestion is ultimately to deprecate Chains and use InferenceData instead. I see several upsides of this approach:
- Chains is based on the somewhat outdated AxisArrays, while DimensionalData is more modern.
- Chains flattens all draws and sampling statistics into a single 3D float array, which discards a lot of the structure of the sampled types (which may themselves be multidimensional or have non-float eltypes, such as Int or even Cholesky).
- InferenceData's features are a superset of Chains. It can get closer to the original structure of the user's samples with named dimensions, but it also supports storing other metadata and can store prior, predictive, log-likelihood, and warmup draws, as well as the original data.
- InferenceObjects is a relatively light dependency (~0.120-0.2s load time on Julia v1.7-1.8 vs MCMCChains with 1.7-3.6s) so would not add much to MCMCChains's load time.
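To make the flattening point concrete, here is a minimal sketch. The parameters `Σ` (a 2×2 matrix) and `k` (an integer) are hypothetical, as is the manual packing step, which only imitates what a sampler front end has to do before it can build a Chains.

```julia
using MCMCChains

# Hypothetical sampler output: 100 draws from 4 chains of a 2x2 matrix
# parameter Σ and an integer parameter k.
ndraws, nchains = 100, 4
Σ = randn(ndraws, nchains, 2, 2)
k = rand(1:5, ndraws, nchains)

# Chains stores one (draws, parameters, chains) array of reals, so every
# matrix entry becomes its own column and the integers are stored as Float64.
vals = Array{Float64}(undef, ndraws, 5, nchains)
vals[:, 1:4, :] = permutedims(reshape(Σ, ndraws, nchains, 4), (1, 3, 2))
vals[:, 5, :] = k

# Column-major order of the flattened matrix entries, then k.
param_names = [Symbol("Σ[1,1]"), Symbol("Σ[2,1]"), Symbol("Σ[1,2]"), Symbol("Σ[2,2]"), :k]
chn = Chains(vals, param_names)
```

After this step the matrix shape of `Σ` survives only in the parameter names, and `k` is no longer stored as an `Int`.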
Currently ArviZ.jl has a converter from_mcmcchains, which is used to convert Chains to InferenceData. Integration between Chains and InferenceData might look like the following steps:
- Move ArviZ.from_mcmcchains here (with a better name).
- Make InferenceData a supported chain_type for AbstractMCMC.sample (https://beta.turing.ml/AbstractMCMC.jl/dev/api/#Chains), which would bypass Chains's flattening entirely. I'm not sure this should live here, but it should not live in InferenceObjects.
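For reference, the existing conversion path looks roughly like the sketch below. It assumes ArviZ.jl is installed and still exports `from_mcmcchains`. The chain is built from random numbers with made-up parameter names, purely for illustration.

```julia
using MCMCChains
using ArviZ  # assumed to provide from_mcmcchains, as it does at the time of writing

# Illustrative chain: 100 draws, 3 flattened parameters, 4 chains.
vals = randn(100, 3, 4)
chn = Chains(vals, [Symbol("x[1]"), Symbol("x[2]"), :s])

# Convert the flattened Chains into an InferenceData.
idata = from_mcmcchains(chn)
```

With the second step in place, this round trip through the flattened Chains would no longer be needed, because `sample` would return an InferenceData directly.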