Commit 89b7ea4

More doc updates
1 parent 29c97f1 commit 89b7ea4

3 files changed: +182 −10 lines

docs/Manual/Deployment/Kubernetes/Storage.md

Lines changed: 62 additions & 8 deletions
@@ -10,13 +10,45 @@ In the `ArangoDeployment` resource, one can specify the type of storage
used by groups of servers using the `spec.<group>.storageClassName`
setting.

This is an example of a `Cluster` deployment that stores its agent & dbserver
data on `PersistentVolumes` that use the `my-local-ssd` `StorageClass`.

```yaml
apiVersion: "database.arangodb.com/v1alpha"
kind: "ArangoDeployment"
metadata:
  name: "cluster-using-local-ssd"
spec:
  mode: Cluster
  agents:
    storageClassName: my-local-ssd
  dbservers:
    storageClassName: my-local-ssd
```

The amount of storage needed is configured using the
`spec.<group>.resources.requests.storage` setting.

Note that configuring storage is done per group of servers.
It is not possible to configure storage per individual
server.

This is an example of a `Cluster` deployment that requests volumes of 80GB
for every dbserver, resulting in a total storage capacity of 240GB (with 3 dbservers).

```yaml
apiVersion: "database.arangodb.com/v1alpha"
kind: "ArangoDeployment"
metadata:
  name: "cluster-using-local-ssd"
spec:
  mode: Cluster
  dbservers:
    resources:
      requests:
        storage: 80Gi
```

## Local storage

For optimal performance, ArangoDB should be configured with locally attached
@@ -26,6 +58,28 @@ The easiest way to accomplish this is to deploy an
[`ArangoLocalStorage` resource](./StorageResource.md).
The ArangoDB Storage Operator will use it to provide `PersistentVolumes` for you.

This is an example of an `ArangoLocalStorage` resource that will result in
`PersistentVolumes` created on any node of the Kubernetes cluster
under the directory `/mnt/big-ssd-disk`.

```yaml
apiVersion: "storage.arangodb.com/v1alpha"
kind: "ArangoLocalStorage"
metadata:
  name: "example-arangodb-storage"
spec:
  storageClass:
    name: my-local-ssd
  localPath:
    - /mnt/big-ssd-disk
```

Note that using local storage requires `VolumeScheduling` to be enabled in your
Kubernetes cluster. On Kubernetes 1.10 this is enabled by default; on version
1.9 you have to enable it with a `--feature-gates` setting.
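
On 1.9 this amounts to passing the gate to the relevant cluster components; a minimal sketch, illustrative only (which components need the flag depends on how your cluster is set up):

```bash
# Illustrative sketch: the gate is passed via the --feature-gates flag of the
# components involved in volume scheduling, e.g. the kube-scheduler and kubelet.
kube-scheduler --feature-gates=VolumeScheduling=true <other-flags...>
kubelet --feature-gates=VolumeScheduling=true <other-flags...>
```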

### Manually creating `PersistentVolumes`

The alternative is to create `PersistentVolumes` manually, for all servers that
need persistent storage (single, agents & dbservers).
E.g. for a `Cluster` with 3 agents and 5 dbservers, you must create 8 volumes.
@@ -54,14 +108,14 @@ metadata:
        ]}
    }'
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-ssd
  local:
    path: /mnt/disks/ssd1
```

For Kubernetes 1.9 and up, you should create a `StorageClass` which is configured
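
A `StorageClass` along those lines might look like the sketch below (not shown in this commit; `kubernetes.io/no-provisioner` and `WaitForFirstConsumer` are assumptions, being the usual choices for manually created local volumes):

```yaml
# Illustrative sketch of a StorageClass for manually created local volumes
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```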

docs/Manual/Deployment/Kubernetes/Tls.md

Lines changed: 8 additions & 2 deletions
@@ -23,7 +23,8 @@ kubectl get secret <deploy-name>-ca --template='{{index .data "ca.crt"}}' | base

### Windows

To install a CA certificate in Windows, follow the
[procedure described here](http://wiki.cacert.org/HowTo/InstallCAcertRoots).

### MacOS

@@ -41,4 +42,9 @@ sudo /usr/bin/security remove-trusted-cert -d ca.crt

### Linux

To install a CA certificate on Linux (Ubuntu), run:

```bash
sudo cp ca.crt /usr/local/share/ca-certificates/<some-name>.crt
sudo update-ca-certificates
```
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
# Troubleshooting

While Kubernetes and the ArangoDB Kubernetes operator will automatically
resolve a lot of issues, there are always cases where human attention
is needed.

This chapter gives you tips & tricks to help you troubleshoot deployments.

## Where to look

In Kubernetes, all resources can be inspected using `kubectl`, with either
the `get` or the `describe` command.

To get all details of the resource (both specification & status),
run the following command:

```bash
kubectl get <resource-type> <resource-name> -n <namespace> -o yaml
```

For example, to get the entire specification and status
of an `ArangoDeployment` resource named `my-arango` in the `default` namespace,
run:

```bash
kubectl get ArangoDeployment my-arango -n default -o yaml
# or shorter
kubectl get arango my-arango -o yaml
```

Several types of resources (including all ArangoDB custom resources) support
events. These events show what happened to the resource over time.

To show the events (and the most important resource data) of a resource,
run the following command:

```bash
kubectl describe <resource-type> <resource-name> -n <namespace>
```

## Getting logs

Another invaluable source of information is the log of containers being run
in Kubernetes.
These logs are accessible through the `Pods` that group these containers.

To fetch the logs of the default container running in a `Pod`, run:

```bash
kubectl logs <pod-name> -n <namespace>
# or with the follow option, to keep inspecting logs while they are written
kubectl logs <pod-name> -n <namespace> -f
```

To inspect the logs of a specific container in a `Pod`, add `-c <container-name>`.
You can find the names of the containers in the `Pod` using `kubectl describe pod ...`.

{% hint 'info' %}
Note that the ArangoDB operators are themselves deployed as a Kubernetes `Deployment`
with 2 replicas. This means that you will have to fetch the logs of 2 `Pods` running
those replicas.
{% endhint %}
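
For example, a minimal sketch of fetching the logs of both operator replicas (the
`name=arango-deployment-operator` label selector is an assumption; check the labels your
operator `Deployment` actually uses):

```bash
# List the operator Pods (the label selector below is an assumed example)
kubectl get pods -n <namespace> -l name=arango-deployment-operator
# Fetch the logs of each replica, using the Pod names printed above
kubectl logs <operator-pod-1> -n <namespace>
kubectl logs <operator-pod-2> -n <namespace>
```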

## What if

### The `Pods` of a deployment stay in `Pending` state

There are two common causes for this (a short diagnostic sketch follows the list below).

1) The `Pods` cannot be scheduled because there are not enough nodes available.
   This is usually only the case with a `spec.environment` setting that has a value of `Production`.

   Solution: Add more nodes.

2) There are no `PersistentVolumes` available to be bound to the `PersistentVolumeClaims`
   created by the operator.

   Solution: Use `kubectl get persistentvolumes` to inspect the available `PersistentVolumes`
   and, if needed, use the [`ArangoLocalStorage` operator](./StorageResource.md) to provision `PersistentVolumes`.
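
To see which of the two applies, the scheduling events of the pending `Pod` and the list of
available volumes are usually enough; a minimal sketch (resource names are placeholders):

```bash
# Show the scheduling events of the pending Pod; the last events explain why it
# cannot be scheduled (no suitable node, or no matching PersistentVolume)
kubectl describe pod <pod-name> -n <namespace>
# List the PersistentVolumes with their status and claims
kubectl get persistentvolumes
```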

### When restarting a `Node`, the `Pods` scheduled on that node remain in `Terminating` state

When a `Node` no longer makes regular calls to the Kubernetes API server, it is
marked as not available. Depending on specific settings in your `Pods`, Kubernetes
will at some point decide to terminate the `Pod`. As long as the `Node` is not
completely removed from the Kubernetes API server, Kubernetes will try to use
the `Node` itself to terminate the `Pod`.

The `ArangoDeployment` operator recognizes this condition and will try to replace those
`Pods` with `Pods` on different nodes. The exact behavior differs per type of server.

### What happens when a `Node` with local data is broken

When a `Node` with `PersistentVolumes` hosted on that `Node` is broken and
cannot be repaired, the data in those `PersistentVolumes` is lost.

If an `ArangoDeployment` of type `Single` was using one of those `PersistentVolumes`,
the database is lost and must be restored from a backup.

If an `ArangoDeployment` of type `ActiveFailover` or `Cluster` was using one of
those `PersistentVolumes`, it depends on the type of server that was using the volume.

- If an `Agent` was using the volume, it can be repaired as long as 2 other agents are still healthy.
- If a `DBServer` was using the volume, and the replication factor of all database
  collections is 2 or higher, and the remaining dbservers are still healthy,
  the cluster will duplicate the remaining replicas to
  bring the number of replicas back to the original number.
- If a `DBServer` was using the volume, and the replication factor of a database
  collection is 1 and happens to be stored on that dbserver, the data is lost.
- If a single server of an `ActiveFailover` deployment was using the volume, and the
  other single server is still healthy, the other single server will become leader.
  After replacing the failed single server, the new follower will synchronize with
  the leader.
