Commit 3150eb8

Armavica authored and ricardoV94 committed
Add example and explanation for pm.Deterministic
1 parent 181040e commit 3150eb8

1 file changed: +43 -2 lines changed
pymc/model.py

Lines changed: 43 additions & 2 deletions
@@ -1937,11 +1937,52 @@ def Point(*args, filter_model_vars=False, **kwargs) -> Dict[str, np.ndarray]:
 
 
 def Deterministic(name, var, model=None, dims=None, auto=False):
-    """Create a named deterministic variable
+    """Create a named deterministic variable.
+
+    Deterministic nodes are only deterministic given all of their inputs, i.e.
+    they don't add randomness to the model. They are generally used to record
+    an intermediary result.
+
+    Indeed, PyMC allows for arbitrary combinations of random variables, for
+    example in the case of a logistic regression
+
+    .. code:: python
+
+        with pm.Model():
+            alpha = pm.Normal("alpha", 0, 1)
+            intercept = pm.Normal("intercept", 0, 1)
+            p = pm.math.invlogit(alpha * x + intercept)
+            outcome = pm.Bernoulli("outcome", p, observed=outcomes)
+
+
+    but doesn't memorize the fact that the expression ``pm.math.invlogit(alpha *
+    x + intercept)`` has been assigned to the variable ``p``. If the quantity
+    ``p`` is important and one would like to track its value in the sampling
+    trace, then one can use a deterministic node:
+
+    .. code:: python
+
+        with pm.Model():
+            alpha = pm.Normal("alpha", 0, 1)
+            intercept = pm.Normal("intercept", 0, 1)
+            p = pm.Deterministic("p", pm.math.invlogit(alpha * x + intercept))
+            outcome = pm.Bernoulli("outcome", p, observed=outcomes)
+
+    These two models are strictly equivalent from a mathematical point of view.
+    However, in the first case, the inference data will only contain values for
+    the variables ``alpha``, ``intercept`` and ``outcome``. In the second, it
+    will also contain sampled values of ``p`` for each of the observed points.
 
     Notes
     -----
-    Deterministic nodes are ones that given all the inputs are not random variables
+    Even though adding a Deterministic node forces PyMC to compute this
+    expression, which could have been optimized away otherwise, this doesn't come
+    with a performance cost. Indeed, Deterministic nodes are computed outside
+    the main computation graph, which can be optimized as though there were no
+    Deterministic nodes. Whereas the optimized graph can be evaluated thousands
+    of times during a NUTS step, the Deterministic quantities are just
+    computed once at the end of the step, with the final values of the other
+    random variables.
 
     Parameters
     ----------
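
The Notes section of the new docstring says that the log-density graph may be evaluated many times per step, while Deterministic quantities are computed once per recorded draw. The sketch below illustrates that idea only. It is not PyMC code and does not reflect how PyMC is implemented: it uses the standard library, a random-walk Metropolis sampler in place of NUTS, and made-up data. All names (`logp`, `deterministic_p`, `sample`, `calls`) are hypothetical.

```python
import math
import random

# Hypothetical toy data for a one-predictor logistic regression.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
outcomes = [0, 0, 1, 1, 1]

# Counters showing how often each kind of computation runs.
calls = {"logp": 0, "deterministic": 0}


def invlogit(z):
    return 1.0 / (1.0 + math.exp(-z))


def logp(alpha, intercept):
    """Unnormalized log posterior: Normal(0, 1) priors + Bernoulli likelihood."""
    calls["logp"] += 1
    total = -0.5 * (alpha**2 + intercept**2)
    for xi, yi in zip(x, outcomes):
        z = alpha * xi + intercept
        # log(p) and log(1 - p) written in a numerically stable form.
        total += -math.log1p(math.exp(-z)) if yi else -math.log1p(math.exp(z))
    return total


def deterministic_p(alpha, intercept):
    """The tracked quantity p, computed from the final values of a draw."""
    calls["deterministic"] += 1
    return [invlogit(alpha * xi + intercept) for xi in x]


def sample(n_draws, inner_steps=10, seed=0):
    rng = random.Random(seed)
    alpha, intercept = 0.0, 0.0
    current = logp(alpha, intercept)
    trace = {"alpha": [], "intercept": [], "p": []}
    for _draw in range(n_draws):
        # The log density is evaluated several times per recorded draw ...
        for _step in range(inner_steps):
            a_new = alpha + rng.gauss(0.0, 0.5)
            b_new = intercept + rng.gauss(0.0, 0.5)
            proposed = logp(a_new, b_new)
            if rng.random() < math.exp(min(0.0, proposed - current)):
                alpha, intercept, current = a_new, b_new, proposed
        trace["alpha"].append(alpha)
        trace["intercept"].append(intercept)
        # ... while the deterministic quantity is computed once, at the end.
        trace["p"].append(deterministic_p(alpha, intercept))
    return trace


trace = sample(n_draws=200)
print(calls)
```

In this toy setup the sampler itself never needs `p` as a separate quantity, so recording it adds one cheap evaluation per draw rather than one per log-density evaluation. The trace then holds one value of `p` per observed point for every draw, which mirrors what the docstring describes for the second model.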
