Commit fbbd5a4

Merge pull request #70 from TensorBFS/fix-62-69-66
Fix issues: 62 69 66

2 parents b9797d4 + ea48d21

File tree: 7 files changed, +278 −15 lines changed

docs/make.jl

Lines changed: 5 additions & 2 deletions

@@ -48,13 +48,16 @@ makedocs(;
     ),
     pages=[
         "Home" => "index.md",
-        "Background" => "background.md",
+        "Background" => [
+            "Probabilistic Inference" => "probabilisticinference.md",
+            "Tensor Networks" => "tensornetwork.md",
+            "UAI file formats" => "uai-file-formats.md"
+        ],
         "Examples" => [
             "Overview" => "examples-overview.md",
             "Asia Network" => "generated/asia/main.md",
             "Hard-core Lattice Gas" => "generated/hard-core-lattice-gas/main.md",
         ],
-        "UAI file formats" => "uai-file-formats.md",
         "Performance tips" => "generated/performance.md",
         "API" => [
             "Public" => "api/public.md",
docs/src/assets/preambles/the-tensor-network.tex

Lines changed: 24 additions & 0 deletions (new file)

@@ -0,0 +1,24 @@
\usepackage{tikz}
\usepackage{xcolor}
\usetikzlibrary{positioning}

\definecolor{c01}{HTML}{5790fc}
\definecolor{c02}{HTML}{f89c20}
\definecolor{c03}{HTML}{e42536}
\definecolor{c04}{HTML}{964a8b}
\definecolor{c05}{HTML}{9c9ca1}
\definecolor{c06}{HTML}{7a21dd}

\tikzset {
  mytensor/.style={
    circle,
    thick,
    fill=white,
    draw=black!100,
    font=\small,
    minimum size=0.5cm
  },
  myedge/.style={
    line width=0.80pt,
  }
}

docs/src/index.md

Lines changed: 3 additions & 2 deletions

@@ -59,9 +59,10 @@ more complex, real-world models.
 ## Outline
 ```@contents
 Pages = [
-    "background.md",
-    "examples-overview.md",
+    "probabilisticinference.md",
+    "tensornetwork.md",
     "uai-file-formats.md",
+    "examples-overview.md",
     "performance.md",
     "api/public.md",
     "api/internal.md",

docs/src/background.md renamed to docs/src/probabilisticinference.md

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-# Background
+# Probabilistic inference

 *TensorInference* implements efficient methods to perform Bayesian inference in
 *probabilistic graphical models*, such as Bayesian Networks or Markov random

docs/src/tensornetwork.md

Lines changed: 229 additions & 0 deletions (new file)

@@ -0,0 +1,229 @@
# Tensor networks

We now introduce the core ideas of tensor networks, highlighting their
connections with probabilistic graphical models (PGMs) to align the
terminology between the two fields.

For our purposes, a tensor is equivalent to the concept of a factor as
defined in the PGM domain, which we detail more formally below.

## What is a tensor?

*Definition*: A tensor $T$ is defined as:
```math
T: \prod_{V \in \bm{V}} \mathcal{D}_{V} \rightarrow \texttt{number}.
```
Here, the function $T$ maps each possible instantiation of the random
variables in its scope $\bm{V}$ to a generic number type. In the context of
tensor networks, a minimum requirement is that the number type forms a
commutative semiring. To define a commutative semiring with the addition
operation $\oplus$ and the multiplication operation $\odot$ on a set $S$, the
following relations must hold for any three elements $a, b, c \in S$.
```math
\newcommand{\mymathbb}[1]{\mathbb{#1}}
\begin{align*}
(a \oplus b) \oplus c = a \oplus (b \oplus c) & \hspace{5em}\text{$\triangleright$ commutative monoid $\oplus$ with identity $\mymathbb{0}$}\\
a \oplus \mymathbb{0} = \mymathbb{0} \oplus a = a &\\
a \oplus b = b \oplus a &\\
&\\
(a \odot b) \odot c = a \odot (b \odot c) & \hspace{5em}\text{$\triangleright$ commutative monoid $\odot$ with identity $\mymathbb{1}$}\\
a \odot \mymathbb{1} = \mymathbb{1} \odot a = a &\\
a \odot b = b \odot a &\\
&\\
a \odot (b\oplus c) = a\odot b \oplus a\odot c & \hspace{5em}\text{$\triangleright$ left and right distributive}\\
(a\oplus b) \odot c = a\odot c \oplus b\odot c &\\
&\\
a \odot \mymathbb{0} = \mymathbb{0} \odot a = \mymathbb{0}
\end{align*}
```
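As a concrete illustration, consider the max-plus (tropical) semiring, where
$\oplus = \max$, $\odot = +$, the additive identity is $-\infty$, and the
multiplicative identity is $0$; a minimal sketch checking the axioms
numerically:
```julia
# The max-plus (tropical) semiring on Float64: ⊕ is max, ⊙ is +,
# the additive identity 𝟘 is -Inf, and the multiplicative identity 𝟙 is 0.0.
⊕(a, b) = max(a, b)
⊙(a, b) = a + b

a, b, c = 1.0, 2.0, 3.0
@assert (a ⊕ b) ⊕ c == a ⊕ (b ⊕ c)        # ⊕ is associative
@assert a ⊕ b == b ⊕ a                     # ⊕ is commutative
@assert a ⊕ -Inf == a                      # -Inf is the identity of ⊕
@assert a ⊙ 0.0 == a                       # 0.0 is the identity of ⊙
@assert a ⊙ (b ⊕ c) == (a ⊙ b) ⊕ (a ⊙ c)  # ⊙ distributes over ⊕
@assert a ⊙ -Inf == -Inf                   # 𝟘 annihilates under ⊙
```
Evaluating a tensor network over this semiring instead of the usual
$(+, \times)$ turns a sum of products into a maximum of sums, which is why
MAP-style queries can reuse the same contraction machinery.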
Tensors are represented using multidimensional arrays of nonnegative numbers
with labeled dimensions. These labels correspond to the array's indices, which
in turn represent the set of random variables that the tensor is a function
of. Thus, in this context, the terms **label**, **index**, and **variable**
are synonymous and hence used interchangeably.
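For instance, a tensor over two binary variables $A$ and $B$ is naturally
stored as a $2 \times 2$ array; a minimal sketch:
```julia
# A tensor (factor) over two binary variables A and B, stored as a 2×2 array.
# The entry T[a, b] is the number assigned to the instantiation (A = a, B = b),
# i.e. what the PGM literature calls a factor value.
T = [0.9 0.1;
     0.2 0.8]
T[1, 2]  # the value assigned to (A = 1, B = 2), here 0.1
```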
## What is a tensor network?

We now turn our attention to defining a **tensor network**, a mathematical
object used to represent a multilinear map between tensors. This concept is
widely employed in fields like condensed matter physics
[^Orus2014][^Pfeifer2014], quantum simulation [^Markov2008][^Pan2022], and
even in solving combinatorial optimization problems [^Liu2023]. It's worth
noting that we use a generalized version of the conventional notation, most
commonly known through the
[einsum](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html)
function, which is widely used in high-performance computing. Packages that
implement this conventional notation include
- [numpy](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html)
- [OMEinsum.jl](https://github.com/under-Peter/OMEinsum.jl)
- [PyTorch](https://pytorch.org/docs/stable/generated/torch.einsum.html)
- [TensorFlow](https://www.tensorflow.org/api_docs/python/tf/einsum)

This approach allows us to represent a broader range of sum-product
multilinear operations between tensors, thus meeting the requirements of the
PGM field.

*Definition*[^Liu2023]: A tensor network is a multilinear map represented by
the triple $\mathcal{N} = (\Lambda, \mathcal{T}, \bm{\sigma}_0)$, where:
- $\Lambda$ is the set of variables present in the network $\mathcal{N}$.
- $\mathcal{T} = \{ T^{(k)}_{\bm{\sigma}_k} \}_{k=1}^{M}$ is the set of input
  tensors, where each tensor $T^{(k)}_{\bm{\sigma}_k}$ is identified by a
  superscript $(k)$ and has an associated scope $\bm{\sigma}_k$.
- $\bm{\sigma}_0$ specifies the scope of the output tensor.

More specifically, each tensor $T^{(k)}_{\bm{\sigma}_k} \in \mathcal{T}$ is
labeled by a string $\bm{\sigma}_k \in \Lambda^{r \left(T^{(k)} \right)}$,
where $r \left(T^{(k)} \right)$ is the rank of $T^{(k)}$. The multilinear map,
also known as the `contraction`, applied to this triple is defined as
```math
\texttt{contract}(\Lambda, \mathcal{T}, \bm{\sigma}_0) = \sum_{\bm{\sigma}_{\Lambda
\setminus [\bm{\sigma}_0]}} \prod_{k=1}^{M} T^{(k)}_{\bm{\sigma}_k}.
```
Notably, the summation extends over all instantiations of the variables that
are not part of the output tensor.

As an example, consider matrix multiplication, which can be specified as a
tensor network contraction:
```math
(AB)_{ik} = \texttt{contract}\left(\{i,j,k\}, \{A_{ij}, B_{jk}\}, ik\right).
```
Here, the matrices $A$ and $B$ are input tensors labeled by the strings
$ij, jk \in \{i, j, k\}^2$, and the output tensor is labeled by the string
$ik$. The summation runs over the indices $\Lambda \setminus [ik] = \{j\}$,
so the contraction corresponds to
```math
\texttt{contract}\left(\{i,j,k\}, \{A_{ij}, B_{jk}\}, ik\right) = \sum_j
A_{ij}B_{jk}.
```
In the einsum notation commonly used in various programming languages, this is
equivalent to `ij, jk -> ik`.
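For instance, with the OMEinsum.jl package listed above, this contraction can
be written as a one-liner; a minimal sketch:
```julia
using OMEinsum

A, B = randn(2, 3), randn(3, 4)
C = ein"ij,jk->ik"(A, B)  # sum over the shared index j
@assert C ≈ A * B         # agrees with ordinary matrix multiplication
```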
Diagrammatically, a tensor network can be represented as an *open hypergraph*:
each tensor maps to a vertex, and each variable maps to a hyperedge. Tensors
sharing the same variable are connected by the same hyperedge for that
variable. The diagrammatic representation of matrix multiplication is:
```@eval
using TikzPictures

tp = TikzPicture(
    L"""
    \matrix[row sep=0.8cm,column sep=0.8cm,ampersand replacement= \& ] {
        \node (1) {}; \&
        \node (a) [mytensor] {$A$}; \&
        \node (b) [mytensor] {$B$}; \&
        \node (2) {}; \&
        \\
    };
    \draw [myedge, color=c01] (1) edge node[below] {$i$} (a);
    \draw [myedge, color=c02] (a) edge node[below] {$j$} (b);
    \draw [myedge, color=c03] (b) edge node[below] {$k$} (2);
    """,
    options="every node/.style={scale=2.0}",
    preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "the-tensor-network") * "}",
)
save(SVG("the-tensor-network1"), tp)
```

```@raw html
<img src="the-tensor-network1.svg" style="margin-left: auto; margin-right: auto; display: block; width: 50%">
```

In this diagram, we use different colors to denote different hyperedges. The
hyperedges for $i$ and $k$ are left open to denote the variables in the output
string $\bm{\sigma}_0$. The reason we use hyperedges rather than regular edges
will become clear in the following star contraction example.
```math
\texttt{contract}(\{i,j,k,l\}, \{A_{il}, B_{jl}, C_{kl}\}, ijk) = \sum_{l}A_{il}
B_{jl} C_{kl}
```
The equivalent einsum notation employed by many programming languages is
`il, jl, kl -> ijk`.
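A minimal sketch of this star contraction, again assuming OMEinsum.jl:
```julia
using OMEinsum

A, B, C = randn(2, 5), randn(3, 5), randn(4, 5)
T = ein"il,jl,kl->ijk"(A, B, C)  # l is the shared hyperedge and is summed over
@assert T[1, 2, 3] ≈ sum(A[1, l] * B[2, l] * C[3, l] for l in 1:5)
```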
Since the variable $l$ is shared across all three tensors, a simple graph
can't capture the diagram's complexity. The more appropriate hypergraph
representation is shown below.
```@eval
using TikzPictures

tp = TikzPicture(
    L"""
    \matrix[row sep=0.4cm,column sep=0.4cm,ampersand replacement= \& ] {
        \&
        \&
        \node[color=c01] (j) {$j$}; \&
        \&
        \&
        \\
        \&
        \&
        \node (b) [mytensor] {$B$}; \&
        \&
        \&
        \\
        \node[color=c03] (i) {$i$}; \&
        \node (a) [mytensor] {$A$}; \&
        \node[color=c02] (l) {$l$}; \&
        \node (c) [mytensor] {$C$}; \&
        \node[color=c04] (k) {$k$}; \&
        \\
    };
    \draw [myedge, color=c01] (j) edge (b);
    \draw [myedge, color=c02] (b) edge (l);
    \draw [myedge, color=c03] (i) edge (a);
    \draw [myedge, color=c02] (a) edge (l);
    \draw [myedge, color=c02] (l) edge (c);
    \draw [myedge, color=c04] (c) edge (k);
    """,
    options="every node/.style={scale=2.0}",
    preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "the-tensor-network") * "}",
)
save(SVG("the-tensor-network2"), tp)
```

```@raw html
<img src="the-tensor-network2.svg" style="margin-left: auto; margin-right: auto; display: block; width: 50%">
```

As a final note, our definition of a tensor network allows for repeated
indices within the same tensor, which translates to self-loops in their
corresponding diagrams.
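For instance, the einsum string `ii->` repeats the index $i$ on a single
matrix, forming a self-loop that sums over the diagonal; a minimal sketch,
again with OMEinsum.jl:
```julia
using OMEinsum
using LinearAlgebra: tr

A = randn(4, 4)
t = ein"ii->"(A)     # the repeated index i forms a self-loop
@assert t[] ≈ tr(A)  # the result is the matrix trace
```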
## Tensor network contraction orders

The performance of a tensor network contraction depends strongly on the order
in which the tensors are contracted. A contraction order is usually specified
by a binary tree: the leaves are the input tensors, each internal node
represents the pairwise contraction of its two children, and the root
corresponds to the output tensor.

Numerous approaches have been proposed to determine efficient contraction
orderings, including:
- Greedy algorithms
- Breadth-first search and dynamic programming [^Pfeifer2014]
- Graph bipartitioning [^Gray2021]
- Local search [^Kalachev2021]

Some of these have been implemented in the
[OMEinsum](https://github.com/under-Peter/OMEinsum.jl) package. Please check
[Performance Tips](@ref) for more details.
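As a rough sketch of what finding an ordering looks like in practice (the
names `optimize_code`, `uniformsize`, `TreeSA`, and `contraction_complexity`
are assumed from OMEinsum's contraction-order interface and may differ across
versions):
```julia
using OMEinsum

code = ein"il,jl,kl->ijk"                       # the star contraction from above
sizes = uniformsize(code, 2)                    # assume every index has size 2
optcode = optimize_code(code, sizes, TreeSA())  # local-search based optimizer
contraction_complexity(optcode, sizes)          # time and space cost of the order
```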
## References

[^Orus2014]:
    Orús R. A practical introduction to tensor networks: Matrix product states and projected entangled pair states. Annals of Physics, 2014, 349: 117-158.

[^Markov2008]:
    Markov I L, Shi Y. Simulating quantum computation by contracting tensor networks. SIAM Journal on Computing, 2008, 38(3): 963-981.

[^Pfeifer2014]:
    Pfeifer R N C, Haegeman J, Verstraete F. Faster identification of optimal contraction sequences for tensor networks. Physical Review E, 2014, 90(3): 033315.

[^Gray2021]:
    Gray J, Kourtis S. Hyper-optimized tensor network contraction. Quantum, 2021, 5: 410.

[^Kalachev2021]:
    Kalachev G, Panteleev P, Yung M H. Multi-tensor contraction for XEB verification of quantum circuits. arXiv:2108.05665, 2021.

[^Pan2022]:
    Pan F, Chen K, Zhang P. Solving the sampling problem of the Sycamore quantum circuits. Physical Review Letters, 2022, 129(9): 090502.

[^Liu2023]:
    Liu J G, Gao X, Cain M, et al. Computing solution space properties of combinatorial optimization problems via generic tensor networks. SIAM Journal on Scientific Computing, 2023, 45(3): A1239-A1270.

examples/asia/main.jl

Lines changed: 3 additions & 2 deletions

@@ -60,8 +60,9 @@ tn = TensorNetworkModel(model)

 # ---

-# Calculate the ``\log_{10}`` partition function
-probability(tn) |> first |> log10
+# Calculate the partition function.
+# Since the factors in this model are normalized, the partition function equals the total probability, $1$.
+probability(tn) |> first

 # ---
examples/hard-core-lattice-gas/main.jl

Lines changed: 13 additions & 8 deletions

@@ -26,19 +26,23 @@ using GenericTensorNetworks.Graphs: edges, nv
 graph = unit_disk_graph(vec(sites), blockade_radius)
 show_graph(graph; locs=sites, texts=fill("", length(sites)))

-# These constraints defines a independent set problem that characterized by the following energy based model.
-# Let $G = (V, E)$ be a graph, where $V$ is the set of vertices and $E$ be the set of edges. The energy model for the hard-core lattice gas problem is
+# These constraints define an independent set problem characterized by the following energy-based model.
+# Let $G = (V, E)$ be a graph, where $V$ is the set of vertices and $E$ is the set of edges.
+# The energy model for the hard-core lattice gas problem is
 # ```math
-# E(\mathbf{n}) = -\sum_{i \in V}w_i n_i + \infty \sum_{(i, j) \in E} n_i n_j
+# E(\mathbf{n}) = -\sum_{i \in V}w_i n_i + U \sum_{(i, j) \in E} n_i n_j
 # ```
 # where $n_i \in \{0, 1\}$ is the number of particles at site $i$, and $w_i$ is the weight associated with it. For unweighted graphs, the weights are uniform.
-# The solution space hard-core lattice gas is equivalent to that of an independent set problem. The independent set problem involves finding a set of vertices in a graph such that no two vertices in the set are adjacent (i.e., there is no edge connecting them).
+# $U$ is the repulsive interaction strength between two particles.
+# To represent the independence constraint, we let $U = \infty$, i.e. the coexistence of two particles at sites connected by an edge is completely forbidden.
+# The solution space of the hard-core lattice gas is equivalent to that of an independent set problem.
+# The independent set problem involves finding a set of vertices in a graph such that no two vertices in the set are adjacent (i.e., there is no edge connecting them).
 # One can create a tensor-network-based model of an independent set problem with the package [`GenericTensorNetworks.jl`](https://github.com/QuEraComputing/GenericTensorNetworks.jl).
 using GenericTensorNetworks
 problem = IndependentSet(graph; optimizer=GreedyMethod());

-# There has been a lot of discussions related to solution space properties in the `GenericTensorNetworks` [documentaion page](https://queracomputing.github.io/GenericTensorNetworks.jl/dev/generated/IndependentSet/).
-# In this example, we show how to use `TensorInference` to use probabilistic inference for understand the finite temperature properties of this statistic physics model.
+# There are plenty of discussions related to solution space properties in the `GenericTensorNetworks` [documentation page](https://queracomputing.github.io/GenericTensorNetworks.jl/dev/generated/IndependentSet/).
+# In this example, we show how to use `TensorInference` to perform probabilistic inference to understand the finite-temperature properties of this statistical model.
 # We use [`TensorNetworkModel`](@ref) to convert a combinatorial optimization problem to a probabilistic model.
 # Here, we let the inverse temperature be $\beta = 3$.
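As an aside, the energy model introduced in this hunk can be written directly
in Julia; a minimal sketch using a hypothetical `energy` helper (not part of
the example file):
```julia
using GenericTensorNetworks.Graphs: edges, nv

# E(n) = -sum_i w_i*n_i + U * sum_{(i,j) in E} n_i*n_j, for a 0/1 vector n.
function energy(graph, w, U, n)
    overlaps = sum(n[e.src] * n[e.dst] for e in edges(graph))
    # Guard the U = Inf case: in Julia, Inf * 0 would give NaN.
    -sum(w[i] * n[i] for i in 1:nv(graph)) + (overlaps == 0 ? 0.0 : U * overlaps)
end

energy(graph, ones(nv(graph)), Inf, zeros(Int, nv(graph)))  # the empty configuration has energy 0.0
```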

@@ -62,7 +66,8 @@ pmodel2 = TensorNetworkModel(problem, β; mars=[[e.src, e.dst] for e in edges(gr
 mars = marginals(pmodel2);

 # We show the probability that both sites on an edge are not occupied.
-show_graph(graph; locs=sites, edge_colors=[(b = mars[[e.src, e.dst]][1, 1]; (1-b, 1-b, 1-b)) for e in edges(graph)], texts=fill("", nv(graph)), edge_line_width=5)
+show_graph(graph; locs=sites, edge_colors=[(b = mars[[e.src, e.dst]][1, 1]; (1-b, 1-b, 1-b)) for e in edges(graph)], texts=fill("", nv(graph)),
+    edge_line_widths=[8 * mars[[e.src, e.dst]][1, 1] for e in edges(graph)])

 # ## The most likely configuration
 # The MAP and MMAP can be used to get the most likely configuration given the evidence.

@@ -90,5 +95,5 @@ sum(config2)
 # One can use [`sample`](@ref) to generate samples from the hard-core lattice gas at finite temperature.
 # The return value is a matrix, with the columns corresponding to different samples.
 configs = sample(pmodel3, 1000)
-sizes = sum(configs; dims=1)
+sizes = sum.(configs)
 [count(==(i), sizes) for i=0:34] # counting sizes
