-
Notifications
You must be signed in to change notification settings - Fork 31
Import and extend PosteriorStats #431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| changerates, mvchangerate = changerate(chains) | ||
|
|
||
| # Summarize the results in a named tuple. | ||
| nt = (; zip(names_of_params, changerates)..., multivariate = mvchangerate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lacking a parameter column meant the show method was broken. But since there is a changerate for every parameter, it makes more sense to do the same thing as gelmandiag_multivariate and return a SummaryStats for the marginal values and return the multivariate changerate separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Genuine question: what is the change-rate in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From inspecting the code, it's, for each parameter and chain, the fraction of draws that are different from the previous draw. I suppose it's similar to "acceptance rate."
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit
JuliaFormatter v1.0.62
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
MCMCChains.jl/test/summarize_tests.jl
Lines 67 to 71 in e31f4f9
| three_parms_df = DataFrame(summarize( | |
| chns[[:a, :b, :c]], | |
| mean, std; | |
| sections = [:parameters, :internals], | |
| )) |
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
MCMCChains.jl/test/summarize_tests.jl
Lines 75 to 80 in e31f4f9
| three_parms_df_2 = DataFrame(summarize( | |
| chns[[:a, :b, :g]], | |
| :mymean => mean, | |
| :mystd => std; | |
| sections = [:parameters, :internals], | |
| )) |
| @test chn3.info == chn.info | ||
|
|
||
| @test all(MCMCChains.indiscretesupport(chn) .== [false, false, false, true]) | ||
| @test setinfo(chn, NamedTuple{(:A, :B)}((1,2))).info == NamedTuple{(:A, :B)}((1,2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
| @test setinfo(chn, NamedTuple{(:A, :B)}((1,2))).info == NamedTuple{(:A, :B)}((1,2)) | |
| @test setinfo(chn, NamedTuple{(:A, :B)}((1, 2))).info == NamedTuple{(:A, :B)}((1, 2)) |
| end | ||
|
|
||
| @testset "function tests" begin | ||
| tchain = Chains(rand(niter, nparams, nchains), ["a", "b", "c"], Dict(:internals => ["c"])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
| tchain = Chains(rand(niter, nparams, nchains), ["a", "b", "c"], Dict(:internals => ["c"])) | |
| tchain = | |
| Chains(rand(niter, nparams, nchains), ["a", "b", "c"], Dict(:internals => ["c"])) |
| lags = MCMCChains._default_lags(c, append_chains) | ||
| @test lags == filter!(x -> x < n, [1, 5, 10, 50]) | ||
|
|
||
| acor = autocor(c; append_chains=append_chains) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
| acor = autocor(c; append_chains=append_chains) | |
| acor = autocor(c; append_chains = append_chains) |
| x = rand(10_000, 40, 10) | ||
| chain = Chains(x) | ||
|
|
||
| for autocov_method in (AutocovMethod(), FFTAutocovMethod(), BDAAutocovMethod()), kind in (:bulk, :basic), f in (ess, ess_rhat, rhat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
| for autocov_method in (AutocovMethod(), FFTAutocovMethod(), BDAAutocovMethod()), kind in (:bulk, :basic), f in (ess, ess_rhat, rhat) | |
| for autocov_method in (AutocovMethod(), FFTAutocovMethod(), BDAAutocovMethod()), | |
| kind in (:bulk, :basic), | |
| f in (ess, ess_rhat, rhat) |
|
|
||
| # analyze array | ||
| ess_array, rhat_array = ess_rhat( | ||
| permutedims(x, (1, 3, 2)); autocov_method = autocov_method, kind = kind, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
| permutedims(x, (1, 3, 2)); autocov_method = autocov_method, kind = kind, | |
| permutedims(x, (1, 3, 2)); | |
| autocov_method = autocov_method, | |
| kind = kind, |
|
|
||
| # analyze array | ||
| mcse_array = mcse( | ||
| PermutedDimsArray(x, (1, 3, 2)); autocov_method = autocov_method, kind = kind, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
| PermutedDimsArray(x, (1, 3, 2)); autocov_method = autocov_method, kind = kind, | |
| PermutedDimsArray(x, (1, 3, 2)); | |
| autocov_method = autocov_method, | |
| kind = kind, |
| rf_1 = rafterydiag(chn) | ||
| rf_2 = rafterydiag(chn_m) | ||
|
|
||
| @testset "diagnostics missing tests" for i in 1:nchains |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[JuliaFormatter v1.0.62] reported by reviewdog 🐶
| @testset "diagnostics missing tests" for i in 1:nchains | |
| @testset "diagnostics missing tests" for i = 1:nchains |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
| - `ci_fun` (default: `eti`): The function used to compute the credible intervals. | ||
| (Can be [`eti`](@ref) or [`hdi`](@ref)) | ||
| - `ci_probs` (default: `[$DEFAULT_CI_PROB, 0.8]`): The probability mass(es) of the credible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are now quite close; perhaps the 0.8 should be decreased. For reference, ArviZ uses [0.89, 0.5] as the default probs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, happy to change it to 0.5.
|
This is now up-to-date with PosteriorStats and ready for review. The remaining formatting suggestions seem to all be in lines not touched by this PR but close to lines touched by this PR. Note that |
penelopeysm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @sethaxen!
I'm probably not going to review line by line, but I can tell that the tests have largely been preserved, so I'm going to trust that all the minutiae have been ironed out as part of making CI pass and focus on broader things.
My first question (not sure if there are more) is to do with the SummaryStats struct. What's the best way to index into it? Is there a way to use the label :a here?
julia> ss = mean(Chains(ones(3, 2), [:a, :b]))
Mean
mean
a 1.00
b 1.00
julia> ss[:mean][1]
1.0
As documented here it can currently be treated like an When I eventually add an official indexing API, it won't require a breaking release here or downstream. |
|
Thanks for the explanation! Just the disclaimer to begin, I have absolutely no intention of reviewing this unfairly, but I'm also cognisant that there's some level of conflict of interest because I'm also separately working on FlexiChains (and trying to solve all these same tricky decisions with APIs!), and I can't prove that I'm fully disinterested. Feel free to say if you'd rather someone else review, I won't be offended in the least. Whilst I'm very happy with the general aim of centralising functionality and reusing code, I think as it stands it's a loss in terms of usability if it's not possible to use the parameter name somehow. IMO, I should be able to combine the three things julia> using MCMCChains; ss = mean(Chains(rand(3, 2), [:a, :b]))
Mean
parameters mean
Symbol Float64
a 0.6386
b 0.5754
julia> ss[:a, :mean]
0.6385968125591274For julia> using MCMCChains; ss = mean(Chains(rand(3, 2), [:a, :b]))
Mean
mean
a 0.460
b 0.647
julia> ss[:mean][findfirst(isequal(:a), ss[:label])]
0.46036692554959524Once there is an interface like that, I'd be very happy to approve this PR. I'm not particularly wedded to the interface being Regarding documentation in general, I just wanted to explain my general stance. I do indeed see that you're documenting things in PosteriorStats.jl (which is great!) but if we bring PosteriorStats into MCMCChains, we also need to worry about it in MCMCChains and (perhaps more importantly) the main Turing docs because this is very user-facing. Currently the interface of MCMCChains is quite underdocumented (it essentially amounts to 'learn by reading the examples in the docs'). This existing situation is of course not your fault at all, but I'd like to make sure that when we add new functionality, someone also ensures that it's explained in the main Turing docs. |
Can we consider this as part 1 of a larger change, and then have part 2 catch up with the interface? @sethaxen, do you think you will be happy with that? Or would you rather have another Turing.jl team member (perhaps @shravanngoswamii) work on part 2? |
|
I would be happy to continue part 2 of this PR, @sethaxen, please let me know any comments or your thoughts that I should keep in mind! |
|
To clarify, no changes would be needed at all to MCMCChains to support a new indexing syntax for |
|
@sethaxen, a gentle reminde on this -- it would be a great to complete this before 2026! |
This PR makes the following replacements everywhere:
hpd->PosteriorStats.hdi(hpdretains its old behavior and is now deprecated)summarize->PosteriorStats.summarizeChainDataFrame->PosteriorStats.SummaryStatsOther changes:
PosteriorStats.etisummarize's defaults), adding new API for setting interval probability and CI typeThe replacement of
ChainDataFrameand theslightchange in API and behavior of the methods makes this a breaking change.Implements #430
e.g.
Closes #491