Skip to content

Commit f9b3ff9

Browse files
authored
Merge pull request #55 from esahon/rjt
Add rooted junction tree models. Note: the documentation must still be updated before a release
2 parents dd5896b + 3f06a3b commit f9b3ff9

21 files changed

+655
-97
lines changed

Project.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@ JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
1111
PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
1212
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
1313
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
14+
OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
1415

1516
[compat]
1617
DataFrames = "1.3"

docs/src/api.md

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,7 @@ UtilityMatrix(::InfluenceDiagram, ::Name)
5858
add_utilities!
5959
generate_arcs!
6060
generate_diagram!
61+
RJT
6162
indices
6263
I_j_indices
6364
indices_in_vector
@@ -93,10 +94,18 @@ conditional_value_at_risk(::Model, ::InfluenceDiagram, ::PathCompatibilityVariab
9394

9495
### Decision Strategy from Variables
9596
```@docs
96-
LocalDecisionStrategy(::Node, ::Vector{VariableRef})
97+
LocalDecisionStrategy(::Node, ::Array{VariableRef})
9798
DecisionStrategy(::InfluenceDiagram, ::OrderedDict{Name, DecisionProgramming.DecisionVariable})
9899
```
99100

101+
### RJT model
102+
```@docs
103+
RJTVariables
104+
expected_value(::Model, ::InfluenceDiagram, ::DecisionProgramming.RJTVariables)
105+
conditional_value_at_risk(::Model, ::InfluenceDiagram, ::DecisionProgramming.RJTVariables, ::Float64)
106+
generate_model
107+
```
108+
100109
## `heuristics.jl`
101110
### Single policy update
102111
```@docs
Lines changed: 108 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,108 @@
1+
# [RJT model](@id RJT-model)
2+
## Introduction
3+
Influence diagrams can be represented as directed rooted trees composed of clusters, which can be transformed into gradual rooted junction trees (RJTs) by imposing additional constraints. These can then be used to formulate an optimization model. Solving for optimal decision strategies using these formulations can be done with significantly less computing time than for full path based formulations. Using RJT based formulations is thus generally preferable.
4+
5+
The explanations for RJT construction and RJT based model formulation largely follow that of Herrala et al. (2024). [^1]
6+
7+
## Converting influence diagrams to RJTs
8+
9+
An influence diagram $G = (N, A)$ can be represented as a directed rooted tree $\mathscr{G} = (\mathscr{V}, \mathscr{A})$ composed of *clusters* $C \isin V$, which are subsets of the nodes of the ID, that is, $C \subset \mathscr{V}$. Both $G$ and $\mathscr{G}$ are directed acyclic graphs whose vertices are connected with directed arcs in $A$ and $\mathscr{A}$ , respectively. The main difference between these diagrams lies in the nature of the vertices. In an influence diagram, the set of nodes $N$ consists of individual chance events, decisions and consequences, while the clusters in $\mathscr{V}$ comprise multiple nodes, hence the notational distinction between $N$ and $\mathscr{V}$.
10+
11+
In order to reformulate this tree into a MIP model, additional constraints need to be imposed, making $\mathscr{G}$ a *gradual rooted junction tree*. A directed rooted tree $\mathscr{G} = (\mathscr{V}, \mathscr{A})$ consisting of clusters $C \in \mathscr{V}$ of nodes $j \in N$ is a gradual rooted junction tree corresponding to the influence diagram $G$ if:
12+
13+
1. Given two clusters $C_1$ and $C_2$ in the junction tree, any cluster $C$ on the unique undirected path between $C_1$ and $C_2$ satisfies $C_1 \cap C_2 \subset C$.
14+
2. Each cluster $C \in \mathscr{V}$ is the root cluster of exactly one node $j \in N$ (that is, the root of the subgraph induced by the clusters with node $j$) and all nodes $j \in N$ appear in at least one of the clusters.
15+
3. For each cluster, $I(j) \in C_j$, where $C_j$ is the _root cluster_ of $j \in N$.
16+
17+
A rooted tree satisfying condition (1) is said to satisfy the *running intersection property*. This condition is sufficient for making $\mathscr{G}$ a rooted junction tree (RJT). In addition, as a consequence of condition (2), we see that a gradual RJT has as many clusters as the original influence diagram has nodes, and each node $j \in N$ can be thought as corresponding to one of the clusters $C \in \mathscr{V}$. Because of this, we refer to clusters using the corresponding nodes $j \in N$ in the influence diagram as the *root cluster* of node $j \in N$, which is denoted as $C_j \in \mathscr{V}$.
18+
19+
An example of influence diagram (upper figure) conversion to RJT (lower figure) for pig breeding problem with $N=4$ is shown below.
20+
21+
<img src="figures/pig_breeding_N=4.jpg" width="400"> <br>
22+
23+
<img src="figures/pig_breeding_rjt_N=4.jpg" width="330">
24+
25+
## Formulating an optimization problem based on gradual RJT
26+
27+
Formulating an optimization model based on the gradual RJT representation starts by introducing a vector of moments $\mu_{C_j}$ for each cluster $C_j, \ j \in N$. Parmentier et al. (2020) [^2] show that for RJTs, we can impose constraints so that these become moments of a distribution $\mu_N$ that factorizes according to $G(N,A)$. The joint distribution $\mathbb{P}$ is said to factorize [^3] according to $G$ if $\mathbb{P}$ can be expressed as
28+
29+
$$\mathbb{P}(X_N = s_N) = \prod_{j \in N}\mathbb{P}(X_j=s_j \mid X_{I(j)}=s_{I(j)}).\tag{1}$$
30+
31+
In the formulation, $\mu_{C_j}(s_{C_j})$ represents the probability of the nodes within the cluster $C_j$ being in states $s_{C_j}$ and condition (3) of definition above ensures that $\mathbb{P}(X_j=s_j \mid X_{I(j)}=s_{I(j)})$ can thus be obtained from $\mu_{C_j}(s_{C_j})$ for each $j \in N$.
32+
33+
Variable definitions are given in the influence diagram section.
34+
35+
## MIP model formulation
36+
37+
### Objective function
38+
39+
Based on the observations above, the MIP model can be formulated. The objective function becomes:
40+
41+
$${\text{maximize}} \sum_{j \in N^V} \sum_{s_{C_j} \in S_{C_j}} \mu_{C_j}(s_{C_j}) u_{C_j}(s_{C_j})\tag{2}$$
42+
43+
This represents an expected utility maximization problem where $u_{C_j}$ represent the utility values associated with different realizations of the nodes within the cluster $C_j$.
44+
45+
### Constraints
46+
#### Decision variables
47+
48+
Only one positive decision can be taken given the same information:
49+
50+
$$\sum_{s_j \in S_{I(j)}}\delta (s_j \mid s_{I(j)})=1, \ \forall j \in N^D, s_{I(j)} \in S_{I(j)}\tag{3}$$
51+
52+
#### μ-variables
53+
54+
μ-variables need to represent valid probability distributions with probabilities summing up to 1. Thus, we get constraint:
55+
56+
$$\sum_{s_{C_j} \in S_{C_j}} \mu_{C_j}(s_{C_j}) = 1, \ \forall j \in N\\\tag{3}$$
57+
58+
We use moments $\mu_{\overline{C}_j}$ (μ_bar in the code) to ease the notation. These are defined as
59+
60+
$$\mu_{\overline{C}_j}(s_{\overline{C}_j}) = \sum_{s_j \in S_j} \mu_{C_j}(s_{C_j})\tag{4}$$
61+
62+
63+
representing the marginal distribution for cluster $C_j$ with the node $j$ marginalized out. $\overline{C}_j = C_j \setminus j$ is used for notational brevity.
64+
65+
66+
#### Local consistency
67+
68+
The following constraint enforces local consistency between adjacent clusters, meaning that for a pair $C_i, C_j$ of adjacent clusters, the marginal distribution for the nodes in both $C_i$ and $C_j$ (that is, $C_i \cap C_j$) must be the same when obtained from either $C_i$ or $C_j$. This is formulated as:
69+
70+
$$\sum_{\substack{s_{C_i} \in S_{C_i}, \\ s_{C_i \cap C_j} = s^*_{C_i \cap C_j}}} \mu_{C_i}(s_{C_i})= \sum_{\substack{s_{C_j} \in S_{C_j}, \\ s_{C_i \cap C_j} = s^*_{C_i \cap C_j}}} \mu_{C_j}(s_{C_j}), \nonumber\\ \tag{5}$$
71+
72+
$$\qquad\qquad\qquad\qquad \ \forall (C_i,C_j) \in \mathscr{A}, s^*_{C_i \cap C_j} \in S_{C_i \cap C_j}\\$$
73+
74+
#### Conditional probabilities and decision strategies
75+
76+
Moments $\mu_{\overline{C}_j}$ defined above are used here. We get constraints:
77+
78+
$$\mu_{C_j}(s_{C_j}) = \mu_{\overline{C}_j}(s_{\overline{C}_j}) p(s_j \mid s_{I(j)}), \ \forall j \in N^C \cup N^V, s_{C_j} \in S_{C_j} \tag{6}\\$$
79+
80+
$$\mu_{C_j}(s_{C_j}) \le \delta(s_j \mid s_{I(j)}), \ \forall j \in N^D, s_{C_j} \in S_{C_j} \tag{7}\\$$
81+
82+
The value $p(s_j \mid s_{I(j)})$ is the conditional probability of a state $s_j$ given the information state $s_{I(j)}$ and $\delta(s_j \mid s_{I(j)})$ the decision strategy in node $j \in N^D$.
83+
84+
#### Non-negativity and integer constraints
85+
86+
All probability and decision variables have to be non-negative. Decision variables are binary.
87+
88+
$$\mu_{C_j}(s_{C_j}) \ge 0, \ \forall j \in N, s_{C_j} \in S_{C_j}\tag{8}$$
89+
90+
$$\delta(s_j \mid s_{I(j)}) \in \{0,1\}, \ \forall j \in N^D, s_j \in S_j, s_{I(j)} \in S_{I(j)}\tag{9}$$
91+
92+
## Limitations
93+
94+
Currently, the RJT formulation commands in the package do not support forbidden path or fixed path features.
95+
96+
## Computational considerations
97+
98+
Around 2-3 magnitudes faster solving times are expected using RJT formulations. [^1] In problems with small treewidths, such as the pig breeding problem, the solution times hardly even change when increasing the number of nodes.
99+
100+
## References
101+
102+
[^1]: Herrala, O., Terho, T., Oliveira, F., 2024. Risk-averse decision strategies for influence diagrams using rooted junction trees. Retrieved from [https://arxiv.org/abs/2401.03734]
103+
104+
[^2]: Parmentier, A., Cohen, V., Leclere, V., Obozinski, G., Salmon, J., 2020. `
105+
Integer programming on the junction tree polytope for influence diagrams. INFORMS Journal on Optimization 2, 209–228.
106+
107+
[^3]: Koller, D., Friedman, N., 2009. Probabilistic graphical models: principles
108+
and techniques. MIT press

docs/src/decision-programming/decision-model.md

Lines changed: 41 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -74,6 +74,10 @@ $$\mathcal{U}^-(𝐬) = \mathcal{U}(𝐬) - \max_{𝐬∈𝐒} \mathcal{U}(𝐬)
7474
## Conditional Value-at-Risk
7575
The section [Measuring Risk](@ref) explains and visualizes the relationships between the formulation of expected value, value-at-risk and conditional value-at-risk for discrete probability distribution.
7676

77+
In this section, CVaR models are defined for both path-based and RJT models.
78+
79+
### Path-based model
80+
7781
Given decision strategy $Z,$ we define the cumulative distribution of compatible paths' probabilities as
7882

7983
$$F_Z(t) = ∑_{𝐬∈𝐒∣\mathcal{U}(𝐬)≤t} x(𝐬) p(𝐬).$$
@@ -138,10 +142,44 @@ We can express the conditional value-at-risk objective as
138142

139143
$$\operatorname{CVaR}_α(Z)=\frac{1}{α}∑_{𝐬∈𝐒}\bar{ρ}(𝐬) \mathcal{U}(𝐬)\tag{25}.$$
140144

145+
### RJT model
146+
147+
CVaR formulation for the RJT model is close to that of path-based model. A diagram can have only a single value node, when using RJT-based CVaR. Trying to call the RJT-based CVaR function using a diagram with more than one value node results in an error.
148+
149+
We denote the possible utility values with $u ∈ U$ and suppose we can define the probability $p(u)$ of attaining a given utility value. In the presence of a single value node, we define $p(u) = ∑_{s_{C_v}∈ \text{\{} S_{C_v} \vert U(s_{C_v})=u \text{\}} }µ(s_{C_v})$. We can then pose the constraints
150+
151+
$$η-u≤M λ(u),\quad ∀u∈U \tag{26}$$
152+
153+
$$η-u≥(M+ϵ) λ(u) - M,\quad ∀u∈U \tag{27}$$
154+
155+
$$η-u≤(M+ϵ) \bar{λ}(u) - ϵ,\quad ∀u∈U \tag{28}$$
156+
157+
$$η-u≥M (\bar{λ}(u) - 1),\quad ∀u∈U \tag{29}$$
158+
159+
$$\bar{ρ}(u) ≤ \bar{λ}(u),\quad ∀u∈U \tag{30}$$
160+
161+
$$p(u) - (1 - λ(u)) ≤ ρ(u) ≤ λ(u),\quad ∀u∈U \tag{31}$$
162+
163+
$$ρ(u) ≤ \bar{ρ}(u) ≤ p(u),\quad ∀u∈U \tag{32}$$
164+
165+
$$∑_{u∈U}\bar{ρ}(u) = α \tag{33}$$
166+
167+
$$\bar{λ}(u), λ(u)∈\{0, 1\},\quad ∀u∈U \tag{34}$$
168+
169+
$$\bar{ρ}(u),ρ(u)∈[0, 1],\quad ∀u∈U \tag{35}$$
170+
171+
$$η∈\mathbb{R} \tag{36}$$
172+
173+
where where α is the probability level in VaR<sub>α</sub>.
174+
175+
$CVaR_α$ can be obtained as $1/α ∑_{u∈U} \bar{ρ}(u)u$.
176+
177+
More details, including explanations of variables and constraints, can be found from Herrala et al. (2024)[^4].
178+
141179
## Convex Combination
142180
We can combine expected value and conditional value-at-risk using a convex combination at a fixed probability level $α∈(0, 1]$ as follows
143181

144-
$$w \operatorname{E}(Z) + (1-w) \operatorname{CVaR}_α(Z), \tag{26}$$
182+
$$w \operatorname{E}(Z) + (1-w) \operatorname{CVaR}_α(Z), \tag{37}$$
145183

146184
where the parameter $w∈[0, 1]$ expresses the decision maker's **risk tolerance**.
147185

@@ -152,3 +190,5 @@ where the parameter $w∈[0, 1]$ expresses the decision maker's **risk tolerance
152190
[^2]: Hölsä, O. (2020). Decision Programming Framework for Evaluating Testing Costs of Disease-Prone Pigs. Retrieved from [http://urn.fi/URN:NBN:fi:aalto-202009295618](http://urn.fi/URN:NBN:fi:aalto-202009295618)
153191

154192
[^3]: Hankimaa, H., Herrala, O., Oliveira, F., Tollander de Balsch, J. (2023). DecisionProgramming.jl -- A framework for modelling decision problems using mathematical programming. Retrieved from [https://arxiv.org/abs/2307.13299](https://arxiv.org/abs/2307.13299)
193+
194+
[^4]: Herrala, O., Terho, T., Oliveira, F., 2024. Risk-averse decision strategies for influence diagrams using rooted junction trees. Retrieved from [https://arxiv.org/abs/2401.03734]
29.3 KB
Loading
32.6 KB
Loading

docs/src/examples/n-monitoring.md

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -189,8 +189,19 @@ The expected utility is used as the objective and the problem is solved using Gu
189189
```julia
190190
EV = expected_value(model, diagram, x_s)
191191
@objective(model, Max, EV)
192+
```
192193

194+
Alternatively, RJT formulation can be used by replacing commands on path compatibility variables and objective function creation with commands
193195

196+
```julia
197+
μ_s = RJTVariables(model, diagram, z)
198+
EV = expected_value(model, diagram, μ_s)
199+
@objective(model, Max, EV)
200+
```
201+
202+
and then solving using the solver. Significantly faster solving times are expected using RJT formulation.
203+
204+
```julia
194205
optimizer = optimizer_with_attributes(
195206
() -> HiGHS.Optimizer()
196207
)

docs/src/examples/pig-breeding.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -185,6 +185,16 @@ EV = expected_value(model, diagram, x_s)
185185

186186
and set up the solver.
187187

188+
Alternatively, RJT formulation can be used by replacing commands on path compatibility variables and objective function creation with commands
189+
190+
```julia
191+
μ_s = RJTVariables(model, diagram, z)
192+
EV = expected_value(model, diagram, μ_s)
193+
@objective(model, Max, EV)
194+
```
195+
196+
and then solving using the solver. Significantly faster solving times are expected using RJT formulation.
197+
188198
```julia
189199
optimizer = optimizer_with_attributes(
190200
() -> HiGHS.Optimizer()
@@ -198,6 +208,15 @@ spu = singlePolicyUpdate(diagram, model, z, x_s)
198208
optimize!(model)
199209
```
200210

211+
<!-- Onko tämä hyvä, voisi tehdä kunnon esimerkin myös CVaRista, mutta onko nyt tarpeen? -->
212+
213+
CVaR model can be created by adding the following constraint to the model. The model has to be built so that there is only one value node. The constraint with this specific numerical value here is tested and meaningful for N = 6.
214+
215+
```
216+
α = 0.05
217+
CVaR = conditional_value_at_risk(model, diagram, μ_s, α; probability_scale_factor = 1.0)
218+
@constraint(model, CVaR>=300.0)
219+
```
201220

202221
## Analyzing results
203222

docs/src/examples/used-car-buyer.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -139,6 +139,16 @@ EV = expected_value(model, diagram, x_s)
139139
@objective(model, Max, EV)
140140
```
141141

142+
Alternatively, RJT formulation can be used by replacing commands on path compatibility variables and objective function creation with commands
143+
144+
```julia
145+
μ_s = RJTVariables(model, diagram, z)
146+
EV = expected_value(model, diagram, μ_s)
147+
@objective(model, Max, EV)
148+
```
149+
150+
and then solving using the solver. Significantly faster solving times are expected using RJT formulation.
151+
142152
We can perform the optimization using an optimizer such as HiGHS.
143153

144154
```julia

examples/CHD_preventative_care.jl

Lines changed: 2 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -171,21 +171,12 @@ Y_HB["no CHD", "treatment"] = 7.64528451705134
171171
Y_HB["no CHD", "no treatment"] = 7.70088349200034
172172
add_utilities!(diagram, "HB", Y_HB)
173173

174-
generate_diagram!(diagram)
175-
176-
177-
@info("Creating the decision model.")
178-
model = Model()
179-
z = DecisionVariables(model, diagram)
180-
181174
# Defining forbidden paths to include all those where a test is repeated twice
182175
forbidden_tests = ForbiddenPath(diagram, ["T1","T2"], [("TRS", "TRS"),("GRS", "GRS"),("no test", "TRS"), ("no test", "GRS")])
183176
fixed_R0 = FixedPath(diagram, Dict("R0" => chosen_risk_level))
184-
scale_factor = 10000.0
185-
x_s = PathCompatibilityVariables(model, diagram, z; fixed = fixed_R0, forbidden_paths = [forbidden_tests], probability_cut=false)
186177

187-
EV = expected_value(model, diagram, x_s)
188-
@objective(model, Max, EV)
178+
@info("Creating the decision model.")
179+
model, z, x_s = generate_model(diagram, model_type="DP", forbidden_paths=[forbidden_tests], fixed=fixed_R0)
189180

190181
@info("Starting the optimization process.")
191182
optimizer = optimizer_with_attributes(

0 commit comments

Comments
 (0)