@@ -13,66 +13,7 @@ To learn more about static and dynamic settings, see [Configuring OpenSearch]({{

## Discovery settings

The discovery process is used when a cluster is formed. It consists of discovering nodes and electing a cluster manager node.

### Static discovery settings

The following **static** discovery settings must be configured before a cluster starts:

- `discovery.seed_hosts` (Static, list): Provides a list of the addresses of the cluster-manager-eligible nodes in the cluster. Each address has the format `host:port` or `host`. If a hostname resolves to multiple addresses via DNS, OpenSearch uses all of them. This setting is essential in order for nodes to find each other during cluster formation. Default is `["127.0.0.1", "[::1]"]`.

- `discovery.seed_providers` (Static, list): Specifies which types of seed hosts provider to use to obtain the addresses of the seed nodes used to start the discovery process. By default, this uses the settings-based seed hosts provider, which obtains seed node addresses from the `discovery.seed_hosts` setting.

- `discovery.type` (Static, string): Specifies whether OpenSearch should form a multiple-node cluster or operate as a single node. When set to `single-node`, OpenSearch forms a single-node cluster and suppresses certain timeouts. This setting is useful for development and testing environments. Valid values are `multi-node` (default) and `single-node`.

- `cluster.initial_cluster_manager_nodes` (Static, list): Establishes the initial set of cluster-manager-eligible nodes in a new cluster. This setting is required when bootstrapping a cluster for the first time and should contain the node names (as defined by `node.name`) of the initial cluster-manager-eligible nodes. This list should be empty for nodes joining an existing cluster. Default is `[]` (empty list).


### Dynamic discovery settings

The following **dynamic** discovery settings can be updated while the cluster is running:

- `cluster.auto_shrink_voting_configuration` (Dynamic, Boolean): Controls whether the voting configuration automatically shrinks when nodes are removed from the cluster. If `true`, the voting configuration adjusts to maintain optimal cluster manager election behavior by removing nodes that are no longer part of the cluster. If `false`, you must remove the nodes that are no longer part of the cluster using the [Voting Configuration Exclusions API]({{site.url}}{{site.baseurl}}/api-reference/cluster-api/cluster-voting-configuration-exclusions/). Default is `true`.

- `cluster.max_voting_config_exclusions` (Dynamic, integer): Sets the maximum number of voting configuration exclusions that can be in place simultaneously during cluster manager node operations. This setting is used during node removal and cluster maintenance operations to temporarily exclude nodes from voting. Default is `10`.

### Static cluster coordination settings

The following cluster coordination settings control cluster formation and node joining behavior:

- `cluster.join.timeout` (Static, time unit): The amount of time a node waits after sending a request to join a cluster before it considers the request to have failed and retries. This timeout does not apply when `discovery.type` is set to `single-node`. Default is `60s`.

- `cluster.publish.info_timeout` (Static, time unit): The amount of time the cluster manager node waits for each cluster state update to be completely published to all nodes before logging a message indicating that some nodes are responding slowly. This setting helps identify slow-responding nodes during cluster state updates. Default is `10s`.

### Cluster election settings

The following settings control cluster manager election behavior:

- `cluster.election.back_off_time` (Static, time unit): Sets the incremental delay added to election retry attempts after each failure. Uses linear backoff, in which each failed election increases the wait time by this amount before the next attempt. Default is `100ms`. **Warning**: Changing this setting from the default may cause your cluster to fail to elect a cluster manager node.

- `cluster.election.duration` (Static, time unit): Sets how long each election is allowed to take before a node considers it to have failed and schedules a retry. This controls the maximum duration of the election process. Default is `500ms`. **Warning**: Changing this setting from the default may cause your cluster to fail to elect a cluster manager node.

- `cluster.election.initial_timeout` (Static, time unit): Sets the upper bound for how long a node will wait initially, or after the elected cluster manager fails, before attempting its first election. This controls the initial election delay. Default is `100ms`. **Warning**: Changing this setting from the default may cause your cluster to fail to elect a cluster manager node.

- `cluster.election.max_timeout` (Static, time unit): Sets the maximum upper bound for how long a node will wait before attempting an election, preventing excessively sparse elections during long network partitions. This caps the maximum election delay. Default is `10s`. **Warning**: Changing this setting from the default may cause your cluster to fail to elect a cluster manager node.

### Expert-level discovery settings

The following discovery settings are for expert-level configuration. **Warning**: Changing these settings from their defaults may cause cluster instability:

- `discovery.cluster_formation_warning_timeout` (Static, time unit): Sets how long a node will try to form a cluster before logging a warning that the cluster did not form. If a cluster has not formed after this timeout has elapsed, the node will log a warning message that starts with the phrase "cluster manager not discovered" and describes the current state of the discovery process. Default is `10s`.

- `discovery.find_peers_interval` (Static, time unit): Sets how long a node will wait before attempting another discovery round. This controls the frequency of peer discovery attempts during cluster formation. Default is `1s`.

- `discovery.probe.connect_timeout` (Static, time unit): Sets how long to wait when attempting to connect to each address during node discovery. This timeout applies to the initial connection attempt to potential cluster members. Default is `3s`.

- `discovery.probe.handshake_timeout` (Static, time unit): Sets how long to wait when attempting to identify the remote node via a handshake during the discovery process. This timeout applies to the node identification phase after a successful connection. Default is `1s`.

- `discovery.request_peers_timeout` (Static, time unit): Sets how long a node will wait after asking its peers for information before considering the request to have failed. This timeout applies to peer information requests during the discovery process. Default is `3s`.

- `discovery.seed_resolver.max_concurrent_resolvers` (Static, integer): Specifies how many concurrent DNS lookups to perform when resolving the addresses of seed nodes during cluster discovery. This setting controls the parallelism of DNS resolution for seed hosts. Default is `10`.

- `discovery.seed_resolver.timeout` (Static, time unit): Specifies how long to wait for each DNS lookup performed when resolving the addresses of seed nodes. This timeout applies to individual DNS resolution operations during cluster discovery. Default is `5s`.
The discovery process is used when a cluster is formed. It consists of discovering nodes and electing a cluster manager node. For comprehensive information about discovery and cluster formation settings, see [Discovery and cluster formation settings]({{site.url}}{{site.baseurl}}/tuning-your-cluster/discovery-cluster-formation/settings/).


## Gateway settings
222 changes: 222 additions & 0 deletions _tuning-your-cluster/discovery-cluster-formation/bootstrapping.md
@@ -0,0 +1,222 @@
---
layout: default
title: Cluster bootstrapping
parent: Discovery and cluster formation
nav_order: 40
---

# Cluster bootstrapping

When starting an OpenSearch cluster for the very first time, you must explicitly define the initial set of cluster-manager-eligible nodes that will participate in the first cluster manager election. This process is called _cluster bootstrapping_ and is critical for preventing split-brain scenarios during initial cluster formation.

Cluster bootstrapping is required in the following situations:

- You are starting a brand-new cluster for the first time.
- No cluster state exists on any node.
- The initial cluster manager election has not yet taken place.

Bootstrapping is not required in the following situations:

- Nodes joining an existing cluster: they receive their configuration from the current cluster manager.
- Cluster restarts: nodes that have previously joined a cluster store the necessary information on disk.
- Full cluster restarts: the existing cluster state is preserved and used for recovery.

## Configuring the bootstrap nodes

Use the `cluster.initial_cluster_manager_nodes` setting to define which nodes should participate in the initial cluster manager election. Set this configuration in `opensearch.yml` on each cluster-manager-eligible node:

```yaml
cluster.initial_cluster_manager_nodes:
- cluster-manager-1
- cluster-manager-2
- cluster-manager-3
```
{% include copy.html %}

Alternatively, you can specify the bootstrap configuration when starting OpenSearch:

```bash
./bin/opensearch -Ecluster.initial_cluster_manager_nodes=cluster-manager-1,cluster-manager-2,cluster-manager-3
```
{% include copy.html %}

You can identify nodes in the bootstrap configuration using any of the following methods:

1. Use the value of `node.name` (recommended):

```yaml
cluster.initial_cluster_manager_nodes:
- cluster-manager-1
- cluster-manager-2
```
{% include copy.html %}

2. Use the node's hostname if `node.name` is not explicitly set:

```yaml
cluster.initial_cluster_manager_nodes:
- server1.example.com
- server2.example.com
```
{% include copy.html %}

3. Use the node's publish IP address:

```yaml
cluster.initial_cluster_manager_nodes:
- 192.168.1.10
- 192.168.1.11
```
{% include copy.html %}

4. Use the node's IP address and port when multiple nodes share the same IP:

```yaml
cluster.initial_cluster_manager_nodes:
- 192.168.1.10:9300
- 192.168.1.10:9301
```
{% include copy.html %}

## Critical bootstrapping requirements

Proper bootstrapping ensures that all cluster-manager-eligible nodes start with a consistent and accurate configuration, preventing cluster splits and ensuring a stable initial election process.

### Identical configuration across all nodes

All cluster-manager-eligible nodes must have the same `cluster.initial_cluster_manager_nodes` setting. This ensures that only one cluster forms during bootstrapping.

**Correct configuration**:

```yaml
# Node 1
cluster.initial_cluster_manager_nodes:
- cluster-manager-1
- cluster-manager-2
- cluster-manager-3

# Node 2
cluster.initial_cluster_manager_nodes:
- cluster-manager-1
- cluster-manager-2
- cluster-manager-3

# Node 3
cluster.initial_cluster_manager_nodes:
- cluster-manager-1
- cluster-manager-2
- cluster-manager-3
```
{% include copy.html %}

**Incorrect configuration**:

```yaml
# Node 1: different list
cluster.initial_cluster_manager_nodes:
- cluster-manager-1
- cluster-manager-2

# Node 2: different list
cluster.initial_cluster_manager_nodes:
- cluster-manager-2
- cluster-manager-3
```

When nodes have inconsistent bootstrap lists, multiple independent clusters may form.

### Exact name matching

Node names in the bootstrap configuration must exactly match each node's `node.name` value.

**Common naming issues**:

* If a node's name is `server1.example.com`, the bootstrap list must also use `server1.example.com`, not `server1`.
* Node names are case-sensitive.
* The names must match exactly, with no added characters or white space.


If a node's name does not exactly match an entry in the bootstrap configuration, the log will contain an error message. In this example, the node name `cluster-manager-1.example.com` does not match the bootstrap entry `cluster-manager-1`:

```
[cluster-manager-1.example.com] cluster manager not discovered yet, this node has
not previously joined a bootstrapped cluster, and this node must discover
cluster-manager-eligible nodes [cluster-manager-1, cluster-manager-2] to
bootstrap a cluster: have discovered [{cluster-manager-2.example.com}...]
```

## Naming your cluster

Choose a descriptive cluster name to distinguish your cluster from others:

```yaml
cluster.name: production-search-cluster
```
{% include copy.html %}

When naming your cluster, follow these guidelines:

- Each cluster must have a unique name to avoid conflicts.

- All nodes verify that the cluster name matches before joining; a node cannot join a cluster whose name differs from its configured `cluster.name`.

- Avoid the default `opensearch` name in production environments.

- Choose descriptive names that reflect the cluster's purpose.

## Development mode auto-bootstrapping

OpenSearch can automatically bootstrap clusters in development environments under the following conditions:

- No discovery settings are explicitly configured.
- Multiple nodes are running on the same machine.
- OpenSearch detects that it is running in a development environment.

### Settings that disable auto-bootstrapping

If any of the following settings are configured, auto-bootstrapping is disabled and you must explicitly configure `cluster.initial_cluster_manager_nodes`:

- `discovery.seed_providers`
- `discovery.seed_hosts`
- `cluster.initial_cluster_manager_nodes`
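In production, the discovery settings above are typically configured explicitly and together, which also disables auto-bootstrapping. The following `opensearch.yml` fragment is a minimal sketch (all addresses and node names are hypothetical):

```yaml
# Explicitly configuring discovery disables development-mode auto-bootstrapping.
discovery.seed_hosts:
- 192.168.1.10
- 192.168.1.11
- 192.168.1.12
cluster.initial_cluster_manager_nodes:
- cluster-manager-1
- cluster-manager-2
- cluster-manager-3
```
{% include copy.html %}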

### Auto-bootstrapping limitations

Auto-bootstrapping is intended only for development. Do not use it in production because:

- Nodes may not discover each other quickly enough, leading to delays.

- Network conditions can cause discovery to fail.

- Behavior can be unpredictable and is not guaranteed.

- There is a risk of forming multiple clusters, resulting in split-brain scenarios.

## Troubleshooting bootstrap issues

If you accidentally start nodes on different hosts without proper configuration, they may form separate clusters. You can detect this by checking cluster UUIDs:

```bash
curl -X GET "localhost:9200/"
```
{% include copy.html %}

If each node reports a different `cluster_uuid`, they belong to separate clusters. To correct this and form a single cluster, use the following steps:

1. Stop all nodes.
2. Delete all data from each node's data directory.
3. Configure proper bootstrap settings.
4. Restart all nodes and verify single cluster formation.
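To compare UUIDs programmatically, extract the `cluster_uuid` field from each node's root endpoint response. The following sketch parses a captured (hypothetical, abridged) response; in practice, pipe the output of `curl -s "localhost:9200/"` from each node:

```bash
# A hypothetical, abridged response from GET /
response='{"cluster_name":"production-search-cluster","cluster_uuid":"abc123"}'
# Extract the cluster_uuid field using Python's standard json module
uuid=$(echo "$response" | python3 -c 'import sys, json; print(json.load(sys.stdin)["cluster_uuid"])')
echo "$uuid"  # abc123
```
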

## Bootstrap verification

After starting your cluster, verify successful bootstrap using the [monitoring commands]({{site.url}}{{site.baseurl}}/tuning-your-cluster/discovery-cluster-formation/#monitoring-discovery-and-cluster-formation) for checking cluster health and formation:

- Verify cluster health status and node count.
- Confirm that one node is elected as cluster manager.
- Ensure that all nodes report the same cluster UUID.
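For example, the following requests, shown in the console format used elsewhere in the OpenSearch documentation, cover these checks; `GET /` reports the node's `cluster_uuid`:

```json
GET _cluster/health
GET _cat/cluster_manager?v
GET /
```
{% include copy.html %}

`_cat/cluster_manager` assumes a version that supports the inclusive endpoint name; earlier versions expose the same information under `_cat/master`.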

## Related documentation

- [Voting configuration management]({{site.url}}{{site.baseurl}}/tuning-your-cluster/discovery-cluster-formation/voting-configuration/): How OpenSearch manages voting after bootstrap
- [Discovery and cluster formation settings]({{site.url}}{{site.baseurl}}/tuning-your-cluster/discovery-cluster-formation/settings/): Complete settings reference
- [Creating a cluster]({{site.url}}{{site.baseurl}}/tuning-your-cluster/): Step-by-step cluster setup guide