
Commit 8608240

committed
fix 66 - the documentation part
1 parent e2a235b commit 8608240

5 files changed: +249 additions, -10 deletions

docs/make.jl

Lines changed: 10 additions & 7 deletions
@@ -48,13 +48,16 @@ makedocs(;
     ),
     pages=[
         "Home" => "index.md",
-        "Background" => "background.md",
-        "Examples" => [
-            "Overview" => "examples-overview.md",
-            "Asia Network" => "generated/asia/main.md",
-            "Hard-core Lattice Gas" => "generated/hard-core-lattice-gas/main.md",
-        ],
-        "UAI file formats" => "uai-file-formats.md",
+        "Background" => [
+            "Probabilistic Inference" => "probabilisticinference.md",
+            "Tensor Networks" => "tensornetwork.md",
+            "UAI file formats" => "uai-file-formats.md"
+        ],
+        #"Examples" => [
+        # "Overview" => "examples-overview.md",
+        # "Asia Network" => "generated/asia/main.md",
+        # "Hard-core Lattice Gas" => "generated/hard-core-lattice-gas/main.md",
+        # ],
         "Performance tips" => "generated/performance.md",
         "API" => [
             "Public" => "api/public.md",
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
\usepackage{tikz}
\usepackage{xcolor}
\usetikzlibrary{positioning}

\definecolor{c01}{HTML}{5790fc}
\definecolor{c02}{HTML}{f89c20}
\definecolor{c03}{HTML}{e42536}
\definecolor{c04}{HTML}{964a8b}
\definecolor{c05}{HTML}{9c9ca1}
\definecolor{c06}{HTML}{7a21dd}

\tikzset {
    mytensor/.style={
        circle,
        thick,
        fill=white,
        draw=black!100,
        font=\small,
        minimum size=0.5cm
    },
    myedge/.style={
        line width=0.80pt,
    }
}

docs/src/index.md

Lines changed: 3 additions & 2 deletions
@@ -59,9 +59,10 @@ more complex, real-world models.
 ## Outline
 ```@contents
 Pages = [
-    "background.md",
-    "examples-overview.md",
+    "probabilisticinference.md",
+    "tensornetwork.md",
     "uai-file-formats.md",
+    "examples-overview.md",
     "performance.md",
     "api/public.md",
     "api/internal.md",

docs/src/background.md renamed to docs/src/probabilisticinference.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# Background
+# Probabilistic inference

 *TensorInference* implements efficient methods to perform Bayesian inference in
 *probabilistic graphical models*, such as Bayesian Networks or Markov random

docs/src/tensornetwork.md

Lines changed: 211 additions & 0 deletions
@@ -0,0 +1,211 @@
# Tensor networks

We now introduce the core ideas of tensor networks, highlighting their
connections with the probabilistic graphical models (PGM) domain to align the terminology between them.

For our purposes, a **tensor** is equivalent to the concept of a factor
presented in the previous section, which we detail more formally below.
## What is a tensor?
*Definition*: A tensor $T$ is defined as a map
```math
T: \prod_{V \in \bm{V}} \mathcal{D}_{V} \rightarrow \texttt{number},
```
where $\bm{V}$ is the set of variables in the scope of $T$ and $\mathcal{D}_{V}$ is the domain of the variable $V$.
That is, the function $T$ maps each possible instantiation of the random
variables in its scope $\bm{V}$ to a generic number type. In the context of tensor networks,
a minimum requirement is that the number type forms a commutative semiring.
To define a commutative semiring with the addition operation $\oplus$ and the multiplication operation $\odot$ on a set $S$, the following relations must hold for arbitrary elements $a, b, c \in S$.
```math
\newcommand{\mymathbb}[1]{\mathbb{#1}}
\begin{align*}
(a \oplus b) \oplus c = a \oplus (b \oplus c) & \hspace{5em}\text{$\triangleright$ commutative monoid $\oplus$ with identity $\mymathbb{0}$}\\
a \oplus \mymathbb{0} = \mymathbb{0} \oplus a = a &\\
a \oplus b = b \oplus a &\\
&\\
(a \odot b) \odot c = a \odot (b \odot c) & \hspace{5em}\text{$\triangleright$ commutative monoid $\odot$ with identity $\mymathbb{1}$}\\
a \odot \mymathbb{1} = \mymathbb{1} \odot a = a &\\
a \odot b = b \odot a &\\
&\\
a \odot (b\oplus c) = a\odot b \oplus a\odot c & \hspace{5em}\text{$\triangleright$ left and right distributive}\\
(a\oplus b) \odot c = a\odot c \oplus b\odot c &\\
&\\
a \odot \mymathbb{0} = \mymathbb{0} \odot a = \mymathbb{0}
\end{align*}
```
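
The usual probabilities with `+` and `*` form such a semiring, but other algebras do too. As a concrete illustration, the sketch below defines a minimal max-plus ("tropical") semiring, where $\oplus$ is `max` with identity $-\infty$ and $\odot$ is `+` with identity $0$; contracting with such numbers replaces summation by maximization, which is one way MAP-style queries can reuse the same contraction machinery. The `MaxPlus` type is written for this page only and is not part of the TensorInference API.

```julia
# Minimal sketch of the max-plus ("tropical") semiring.
# `MaxPlus` is a hypothetical type used only for illustration.
struct MaxPlus
    v::Float64
end

Base.:+(a::MaxPlus, b::MaxPlus) = MaxPlus(max(a.v, b.v))  # ⊕ is max, identity is -Inf
Base.:*(a::MaxPlus, b::MaxPlus) = MaxPlus(a.v + b.v)      # ⊙ is +,   identity is 0.0
Base.zero(::Type{MaxPlus}) = MaxPlus(-Inf)                # the 𝟘 element
Base.one(::Type{MaxPlus})  = MaxPlus(0.0)                 # the 𝟙 element

# The semiring laws hold, e.g. distributivity: a ⊙ (b ⊕ c) == (a ⊙ b) ⊕ (a ⊙ c)
a, b, c = MaxPlus(1.0), MaxPlus(2.0), MaxPlus(3.0)
a * (b + c) == a * b + a * c   # true
```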
Tensors are represented using multidimensional arrays of nonnegative numbers
with labeled dimensions. These labels correspond to the array's indices, which
in turn represent the set of random variables that the tensor is a function
of. Thus, in this context, the terms **label**, **index**, and
**variable** are synonymous and hence used interchangeably.
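
To make this correspondence concrete, the sketch below pairs a plain Julia array with the variables that label its dimensions. The `LabeledTensor` type and the variable names are made up for this illustration; they are not the TensorInference API.

```julia
# Hypothetical minimal "labeled array" used only for illustration.
struct LabeledTensor{T, N}
    vars::NTuple{N, Symbol}  # one label (random variable) per array dimension
    values::Array{T, N}      # T evaluated at every instantiation of the variables
end

# A conditional probability table P(B | A) for binary variables A and B:
# dimension 1 is labeled :A, dimension 2 is labeled :B.
p_b_given_a = LabeledTensor((:A, :B), [0.9 0.1; 0.2 0.8])

# Evaluating the tensor at the instantiation (A = 2, B = 1) is plain array indexing.
p_b_given_a.values[2, 1]  # 0.2
```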
## What is a tensor network?
We now turn our attention to defining a **tensor network**.
A tensor network is a mathematical object that can be used to represent a multilinear map between tensors. It is widely used in condensed matter physics [^Orus2014][^Pfeifer2014] and quantum simulation [^Markov2008][^Pan2022]. It is also a powerful tool for solving combinatorial optimization problems [^Liu2023].
It is important to note that we use a generalized version of the conventional
notation, which is also known as the [einsum](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html) function widely used in high performance computing.
Packages that implement the conventional notation include
- [numpy](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html)
- [OMEinsum.jl](https://github.com/under-Peter/OMEinsum.jl)
- [PyTorch](https://pytorch.org/docs/stable/generated/torch.einsum.html)
- [TensorFlow](https://www.tensorflow.org/api_docs/python/tf/einsum)

This approach allows us to represent a more extensive set of sum-product multilinear operations between tensors, meeting the requirements of the PGM field.
*Definition*[^Liu2023]: A tensor network is a multilinear map represented by the triple
$\mathcal{N} = (\Lambda, \mathcal{T}, \bm{\sigma}_0)$, where:
- $\Lambda$ is the set of variables present in the network $\mathcal{N}$.
- $\mathcal{T} = \{ T^{(k)}_{\bm{\sigma}_k} \}_{k=1}^{M}$ is the set of input tensors, where each tensor $T^{(k)}_{\bm{\sigma}_k}$ is identified by a superscript $(k)$ and has an associated scope $\bm{\sigma}_k$.
- $\bm{\sigma}_0$ specifies the scope of the output tensor.

More specifically, each tensor $T^{(k)}_{\bm{\sigma}_k} \in \mathcal{T}$ is
labeled by a string $\bm{\sigma}_k \in \Lambda^{r \left(T^{(k)} \right)}$, where
$r \left(T^{(k)} \right)$ is the rank of $T^{(k)}$. The multilinear map, or
the `contraction`, applied to this triple is defined as
```math
\texttt{contract}(\Lambda, \mathcal{T}, \bm{\sigma}_0) = \sum_{\bm{\sigma}_{\Lambda
\setminus [\bm{\sigma}_0]}} \prod_{k=1}^{M} T^{(k)}_{\bm{\sigma}_k},
```
where the summation extends over all instantiations of the variables that
are not part of the output tensor.
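
For readers who prefer code to notation, here is a brute-force sketch of the `contract` map above, written in plain Julia. The function `naive_contract` and its argument layout are made up for this example; real packages use far more efficient pairwise contraction strategies.

```julia
# Brute-force reference implementation of the contraction defined above.
# `labels[k]` holds the variable labels σ_k of `tensors[k]`; `output` holds σ₀.
# `naive_contract` is a hypothetical helper for illustration, not a library function.
function naive_contract(labels::Vector{Vector{Symbol}},
                        tensors::Vector{<:AbstractArray},
                        output::Vector{Symbol})
    # Collect the dimension of every variable from the tensors it appears in.
    dims = Dict{Symbol,Int}()
    for (lab, t) in zip(labels, tensors), (v, d) in zip(lab, size(t))
        dims[v] = d
    end
    vars = collect(keys(dims))
    out = zeros(eltype(first(tensors)), (dims[v] for v in output)...)
    # Sum over every instantiation of all variables, multiplying the matching entries.
    for inst in Iterators.product((1:dims[v] for v in vars)...)
        assign = Dict(zip(vars, inst))
        term = prod(t[(assign[v] for v in lab)...] for (lab, t) in zip(labels, tensors))
        out[(assign[v] for v in output)...] += term
    end
    return out
end

# Matrix multiplication as a contraction: contract({i,j,k}, {A_ij, B_jk}, ik).
A, B = randn(2, 3), randn(3, 4)
naive_contract([[:i, :j], [:j, :k]], [A, B], [:i, :k]) ≈ A * B  # true
```

This exponential-time loop is only meant to pin down the semantics; the point of the contraction orders discussed below is to avoid it by contracting tensors pairwise.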
As an example, matrix multiplication can be specified as a tensor network
contraction
```math
(AB)_{ik} = \texttt{contract}\left(\{i,j,k\}, \{A_{ij}, B_{jk}\}, ik\right),
```
where matrices $A$ and $B$ are input tensors labeled by strings $ij, jk \in
\{i, j, k\}^2$. The output tensor is labeled by the string $ik$. The
summation runs over the indices $\Lambda \setminus [ik] = \{j\}$. The contraction
corresponds to
```math
\texttt{contract}\left(\{i,j,k\}, \{A_{ij}, B_{jk}\}, ik\right) = \sum_j
A_{ij}B_{jk}.
```
In programming languages, this is equivalent to the einsum notation `ij, jk -> ik`.
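
The same einsum string can be executed directly with one of the packages listed above. A small sketch using OMEinsum.jl's `ein` string macro (the array sizes are arbitrary):

```julia
using OMEinsum

A, B = randn(2, 3), randn(3, 4)
ein"ij,jk->ik"(A, B) ≈ A * B  # true: this einsum string performs matrix multiplication
```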
Diagrammatically, a tensor network can be represented as an *open hypergraph*. In the tensor network diagram, a tensor is mapped to a vertex,
and a variable is mapped to a hyperedge. Two tensors are connected by the same hyperedge
if and only if they share the corresponding variable. The diagrammatic
representation of matrix multiplication is shown below.
```@eval
using TikzPictures

tp = TikzPicture(
L"""
\matrix[row sep=0.8cm,column sep=0.8cm,ampersand replacement= \& ] {
\node (1) {}; \&
\node (a) [mytensor] {$A$}; \&
\node (b) [mytensor] {$B$}; \&
\node (2) {}; \&
\\
};
\draw [myedge, color=c01] (1) edge node[below] {$i$} (a);
\draw [myedge, color=c02] (a) edge node[below] {$j$} (b);
\draw [myedge, color=c03] (b) edge node[below] {$k$} (2);
""", options="scale=3.8",
preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "the-tensor-network") * "}",
)
save(SVG("the-tensor-network1"), tp)
```

```@raw html
<img src="the-tensor-network1.svg" style="margin-left: auto; margin-right: auto; display:block; width: 50%">
```
Here, we use different colors to denote different hyperedges. Hyperedges for
$i$ and $k$ are left open to denote variables in the output string
$\bm{\sigma}_0$. The reason why we use hyperedges rather than regular edges
will become clear in the following star contraction example:
```math
\texttt{contract}(\{i,j,k,l\}, \{A_{il}, B_{jl}, C_{kl}\}, ijk) = \sum_{l}A_{il}
B_{jl} C_{kl}.
```
In programming languages, this is equivalent to the einsum notation `il, jl, kl -> ijk`.
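
This contraction can again be checked numerically; a sketch with OMEinsum.jl (array sizes chosen arbitrarily):

```julia
using OMEinsum

A, B, C = randn(2, 5), randn(3, 5), randn(4, 5)
out = ein"il,jl,kl->ijk"(A, B, C)   # the index l is shared by all three tensors
size(out)                            # (2, 3, 4)
out[1, 2, 3] ≈ sum(A[1, l] * B[2, l] * C[3, l] for l in 1:5)  # true
```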
Among the variables, $l$ is shared by all three tensors, hence the diagram
cannot be represented as a simple graph. The hypergraph representation is shown
below.
```@eval
using TikzPictures

tp = TikzPicture(
L"""
\matrix[row sep=0.4cm,column sep=0.4cm,ampersand replacement= \& ] {
\&
\&
\node[color=c01] (j) {$j$}; \&
\&
\&
\\
\&
\&
\node (b) [mytensor] {$B$}; \&
\&
\&
\\
\node[color=c03] (i) {$i$}; \&
\node (a) [mytensor] {$A$}; \&
\node[color=c02] (l) {$l$}; \&
\node (c) [mytensor] {$C$}; \&
\node[color=c04] (k) {$k$}; \&
\\
};
\draw [myedge, color=c01] (j) edge (b);
\draw [myedge, color=c02] (b) edge (l);
\draw [myedge, color=c03] (i) edge (a);
\draw [myedge, color=c02] (a) edge (l);
\draw [myedge, color=c02] (l) edge (c);
\draw [myedge, color=c04] (c) edge (k);
""", options="",
preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "the-tensor-network") * "}",
)
save(SVG("the-tensor-network2"), tp)
```

```@raw html
<img src="the-tensor-network2.svg" style="margin-left: auto; margin-right: auto; display:block; width: 50%">
```
As a final comment, repeated indices within the same tensor are not forbidden
by the definition of a tensor network, hence self-loops are also allowed in a
tensor network diagram.
## Tensor network contraction orders
The performance of a tensor network contraction depends on the order in which
the tensors are contracted. The order of contraction is usually specified by a
binary tree, where the leaves are the input tensors and each internal node
represents an intermediate tensor obtained by contracting its children. The root of the tree is the output tensor.

Many algorithms have been proposed to find an optimal or near-optimal contraction order, including
- Greedy algorithms
- Breadth-first search and dynamic programming [^Pfeifer2014]
- Graph bipartitioning [^Gray2021]
- Local search [^Kalachev2021]

Some of these algorithms are already included in the [OMEinsum](https://github.com/under-Peter/OMEinsum.jl) package. Please check [Performance Tips](@ref) for more details.
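
As a rough sketch of what a contraction-order search looks like with OMEinsum, the snippet below optimizes a small chain of matrix products. The helpers `uniformsize`, `optimize_code`, `TreeSA`, and `contraction_complexity` come from the OMEinsum ecosystem, but the options shown here are illustrative assumptions rather than the settings used by TensorInference; see the Performance Tips page for the latter.

```julia
using OMEinsum

# A small einsum code whose contraction order we want to optimize.
code = ein"ij,jk,kl,lm->im"

# Assume every index has dimension 2 (only used to estimate contraction cost).
sizes = uniformsize(code, 2)

# Search for a contraction order with the TreeSA (simulated annealing) optimizer;
# GreedyMethod() is a cheaper alternative.
optcode = optimize_code(code, sizes, TreeSA())

# Inspect the estimated time/space cost of the optimized order.
contraction_complexity(optcode, sizes)
```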
## References

[^Orus2014]:
    Orús R. A practical introduction to tensor networks: Matrix product states and projected entangled pair states. Annals of Physics, 2014, 349: 117-158.

[^Markov2008]:
    Markov I L, Shi Y. Simulating quantum computation by contracting tensor networks. SIAM Journal on Computing, 2008, 38(3): 963-981.

[^Pfeifer2014]:
    Pfeifer R N C, Haegeman J, Verstraete F. Faster identification of optimal contraction sequences for tensor networks. Physical Review E, 2014, 90(3): 033315.

[^Gray2021]:
    Gray J, Kourtis S. Hyper-optimized tensor network contraction. Quantum, 2021, 5: 410.

[^Kalachev2021]:
    Kalachev G, Panteleev P, Yung M H. Multi-tensor contraction for XEB verification of quantum circuits. arXiv:2108.05665, 2021.

[^Pan2022]:
    Pan F, Chen K, Zhang P. Solving the sampling problem of the sycamore quantum circuits. Physical Review Letters, 2022, 129(9): 090502.

[^Liu2023]:
    Liu J G, Gao X, Cain M, et al. Computing solution space properties of combinatorial optimization problems via generic tensor networks. SIAM Journal on Scientific Computing, 2023, 45(3): A1239-A1270.
