You may have several [Active-Active databases]({{< relref "/operate/rs/databases/active-active" >}})
39
+
or independent Redis servers that are all suitable to serve your app.
40
+
Typically, you would prefer to use some database endpoints over others for a particular
41
+
instance of your app (perhaps the ones that are closest geographically to the app server
42
+
to reduce network latency). However, if the best endpoint is not available due
43
+
to a failure, it is generally better to switch to another, suboptimal endpoint
44
+
than to let the app fail completely.
45
+
46
+
*Failover* is the technique of actively checking for connection failures or
47
+
unacceptably slow connections and automatically switching to the best available endpoint
48
+
when they occur. This requires you to specify a list of endpoints to try, ordered by priority. The diagram below shows this process:
49
+
50
+
{{< image filename="images/failover/failover-client-reconnect.svg" alt="Failover and client reconnection" >}}
51
+
52
+
The complementary technique of *failback* then involves periodically checking the health
53
+
of all endpoints that have failed. If any endpoints recover, the failback mechanism
54
+
automatically switches the connection to the one with the highest priority.
55
+
This could potentially be repeated until the optimal endpoint is available again.
56
+
57
+
{{< image filename="images/failover/failover-client-failback.svg" alt="Failback: client switches back to original server" width="75%" >}}
58
+
59
+
### Detecting connection problems
60
+
61
+
Redis clients use a [circuit breaker design pattern](https://en.wikipedia.org/wiki/Circuit_breaker_design_pattern) to detect connection problems.
62
+
63
+
The circuit breaker is a software component that tracks the sequence of recent
64
+
Redis connection attempts and commands, recording which ones have succeeded and
65
+
which have failed.
66
+
(Note that many command failures are caused by transient errors such as timeouts,
67
+
so before recording a failure, the first response should usually be just to retry
68
+
the command a few times.)
69
+
70
+
The status of the attempted command calls is kept in a "sliding window", which
71
+
is simply a buffer where the least recent item is dropped as each new
72
+
one is added. You can configure the failure threshold as a fixed number of failures and/or a failure ratio (specified as a percentage), both measured over a time window.
73
+
74
+
{{< image filename="images/failover/failover-sliding-window.svg" alt="Sliding window of recent connection attempts" >}}
75
+
76
+
When the number of failures in the window exceeds a configured
77
+
threshold, the circuit breaker declares the server to be unhealthy and triggers
78
+
a failover.
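
The sliding-window idea described above can be sketched roughly as follows. This is a minimal illustration, not the client library's implementation: the class `SlidingWindowBreaker` and its parameters are hypothetical names, the window is count-based rather than time-based for brevity, and the sketch assumes the server is declared unhealthy only when both the failure count and the failure percentage reach their thresholds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; this is not the Jedis circuit breaker.
final class SlidingWindowBreaker {
    // Each entry records one attempt: true means the attempt failed.
    private final Deque<Boolean> window = new ArrayDeque<>();
    private final int windowSize;
    private final int minFailures;
    private final double failureRatePercent;
    private int failures = 0;

    SlidingWindowBreaker(int windowSize, int minFailures, double failureRatePercent) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
        this.minFailures = minFailures;
        this.failureRatePercent = failureRatePercent;
    }

    // Records one outcome, dropping the least recent entry when the window is full.
    // Returns true if the server should now be treated as unhealthy.
    boolean record(boolean failed) {
        if (window.size() == windowSize && window.removeFirst()) {
            failures--;
        }
        window.addLast(failed);
        if (failed) {
            failures++;
        }
        return isUnhealthy();
    }

    // Assumption: both the count threshold and the percentage threshold must be met.
    boolean isUnhealthy() {
        if (window.isEmpty()) {
            return false;
        }
        double ratePercent = 100.0 * failures / window.size();
        return failures >= minFailures && ratePercent >= failureRatePercent;
    }
}
```

In a sketch like this, the caller would retry a failing command a few times first and call `record(true)` only if the retries are exhausted, then trigger a failover when `record()` returns `true`.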
79
+
80
+
### Selecting a failover target
81
+
82
+
Since you may have multiple Redis servers available to fail over to, the client
83
+
lets you configure a list of endpoints to try, ordered by priority or
84
+
"weight". When a failover is triggered, the client selects the highest-weighted
85
+
endpoint that is still healthy and uses it for the temporary connection.
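
As a rough sketch of the selection rule just described, the code below picks the healthy endpoint with the highest weight. The `FailoverSelector` and `Endpoint` names are hypothetical and exist only for this illustration; a real client tracks endpoint health through its circuit breaker and health checks rather than a plain flag.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not part of any Redis client API.
final class FailoverSelector {
    static final class Endpoint {
        final String name;
        final float weight;
        // Assumption: health is a simple flag that some other component updates.
        volatile boolean healthy = true;

        Endpoint(String name, float weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    // Returns the healthy endpoint with the highest weight,
    // or an empty Optional if no endpoint is currently healthy.
    static Optional<Endpoint> selectTarget(List<Endpoint> endpoints) {
        return endpoints.stream()
                .filter(e -> e.healthy)
                .max(Comparator.comparingDouble((Endpoint e) -> e.weight));
    }
}
```

The same function also covers failback: once a higher-weighted endpoint is marked healthy again, calling `selectTarget()` returns it in preference to the current failover target.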
86
+
87
+
### Health checks
88
+
89
+
Given that the original endpoint had some geographical or other advantage
90
+
over the failover target, you will generally want to fail back to it as soon
91
+
as it recovers. In the meantime, another server might recover that is
92
+
still better than the current failover target, so it might be worth
93
+
failing back to that server even if it is not optimal.
94
+
95
+
Clients periodically run a "health check" on each server to see if it has recovered.
96
+
The health check can be as simple as sending a Redis
97
+
[`PING`]({{< relref "/commands/ping" >}}) or
98
+
[`ECHO`]({{< relref "/commands/echo" >}}) command and ensuring that it gives the
99
+
expected response.
100
+
101
+
You can also configure the client to run health checks on the current target
102
+
server during periods of inactivity, even if no failover has occurred. This can
103
+
help to detect problems even if your app is not actively using the server.
The circuit breaker is a software component that tracks the sequence of recent
52
-
Redis connection attempts and commands, recording which ones have succeeded and
53
-
which have failed.
54
-
(Note that many command failures are caused by transient errors such as timeouts,
55
-
so before recording a failure, the first response should usually be just to retry
56
-
the command a few times.)
57
-
58
-
The status of the attempted command calls is kept in a "sliding window", which
59
-
is simply a buffer where the least recent item is dropped as each new
60
-
one is added. The buffer can be configured to have a fixed number of failures and/or a failure ratio (specified as a percentage), both based on a time window.
61
-
62
-
{{< image filename="images/failover/failover-sliding-window.svg" alt="Sliding window of recent connection attempts" >}}
63
-
64
-
When the number of failures in the window exceeds a configured
65
-
threshold, the circuit breaker declares the server to be unhealthy and triggers
66
-
a failover.
67
-
68
-
### Selecting a failover target
69
-
70
-
Since you may have multiple Redis servers available to fail over to, Jedis
71
-
lets you configure a list of endpoints to try, ordered by priority or
72
-
"weight". When a failover is triggered, Jedis selects the highest-weighted
73
-
endpoint that is still healthy and uses it for the temporary connection.
74
-
75
-
### Health checks
76
-
77
-
Given that the original endpoint had some geographical or other advantage
78
-
over the failover target, you will generally want to fail back to it as soon
79
-
as it recovers. In the meantime, another server might recover that is
80
-
still better than the current failover target, so it might be worth
81
-
failing back to that server even if it is not optimal.
82
-
83
-
Jedis periodically runs a "health check" on each server to see if it has recovered.
84
-
The health check can be as simple as
85
-
sending a Redis [`PING`]({{< relref "/commands/ping" >}}) command and ensuring
86
-
that it gives the expected response.
87
-
88
-
You can also configure Jedis to run health checks on the current target
89
-
server during periods of inactivity, even if no failover has occurred. This can
90
-
help to detect problems even if your app is not actively using the server.
91
-
92
-
## Failover configuration
93
-
94
38
The example below shows a simple case with a list of two servers,
95
39
`redis-east` and `redis-west`, where `redis-east` is the preferred
96
40
target. If `redis-east` fails, Jedis should fail over to
Supply the weighted list of endpoints using the `MultiDbConfig` builder.
97
+
Supply the weighted list of endpoints using the `MultiDbConfig` builder
98
+
(see [Selecting a failover target]({{< relref "/develop/clients/failover#selecting-a-failover-target" >}}) for a full description of how
99
+
the weighted list is used).
154
100
Use the `weight` option to order the endpoints, with the highest
155
101
weight being tried first.
156
102
@@ -203,7 +149,8 @@ but will also handle the connection management and failover transparently.
203
149
### Circuit breaker configuration
204
150
205
151
The `MultiDbConfig.CircuitBreakerConfig` builder lets you pass several options to configure
206
-
the circuit breaker:
152
+
the circuit breaker (see [Detecting connection problems]({{< relref "/develop/clients/failover#detecting-connection-problems" >}}) for more information on how the
There are several strategies available for health checks that you can configure using the
279
-
`MultiDbConfig` builder. The sections below explain these strategies
280
-
in more detail.
225
+
Each health check consists of one or more separate "probes", each of which is a simple
226
+
test (such as a [`PING`]({{< relref "/commands/ping" >}}) command) to determine if the database is available. The results of the separate probes are combined
227
+
using a configurable policy to determine if the database is healthy.
228
+
229
+
There are several strategies available for health checks that you can deploy using the
230
+
`MultiDbConfig` builder. Each strategy is a class that implements the `HealthCheckStrategy`
231
+
interface. Use the constructor of a `HealthCheckStrategy` implementation to pass
232
+
a `HealthCheckStrategy.Config` object to configure the health check behavior.
233
+
The methods of the base `HealthCheckStrategy.Config` builder are shown below.
234
+
Note that some strategies (including your own custom strategies) may use a
235
+
subclass of `HealthCheckStrategy.Config` to provide extra options.
236
+
237
+
| Builder method | Default value | Description|
238
+
| --- | --- | --- |
239
+
|`interval()`|`1000`| Interval in milliseconds between health checks. |
240
+
|`timeout()`|`1000`| Timeout in milliseconds for health check requests. |
241
+
|`numProbes()`|`3`| Number of probes to perform during each health check. |
242
+
|`delayInBetweenProbes()`|`100`| Delay in milliseconds between probes during a health check. |
243
+
|`policy()`|`ProbingPolicy.BuiltIn.ALL_SUCCESS`| Policy to determine if the database is healthy based on the probe results. The options are `ALL_SUCCESS` (all probes must succeed), `ANY_SUCCESS` (at least one probe must succeed), and `MAJORITY_SUCCESS` (majority of probes must succeed). |
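
To make the three policies in the table concrete, the sketch below combines a list of probe outcomes into a single health verdict. The `ProbePolicySketch` enum is a hypothetical stand-in written for this illustration and is not the library's `ProbingPolicy` type. It also assumes all probes have already run, whereas a real implementation might stop early once the outcome is decided.

```java
import java.util.List;

// Hypothetical stand-in for the policies in the table above; not the Jedis ProbingPolicy type.
enum ProbePolicySketch {
    ALL_SUCCESS, ANY_SUCCESS, MAJORITY_SUCCESS;

    // Combines individual probe outcomes (true = probe succeeded) into one verdict.
    boolean isHealthy(List<Boolean> probeResults) {
        int total = probeResults.size();
        if (total == 0) {
            return false; // Assumption: no probes means health cannot be confirmed.
        }
        long successes = probeResults.stream().filter(Boolean::booleanValue).count();
        switch (this) {
            case ALL_SUCCESS:
                return successes == total;
            case ANY_SUCCESS:
                return successes > 0;
            case MAJORITY_SUCCESS:
                return successes * 2 > total;
            default:
                throw new AssertionError(this);
        }
    }
}
```

With the default of three probes, a single failed probe marks the database unhealthy under `ALL_SUCCESS`, but not under `MAJORITY_SUCCESS` or `ANY_SUCCESS`.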
244
+
245
+
The sections below explain the available strategies in more detail.
281
246
282
247
### `PingStrategy` (default)
283
248
@@ -287,6 +252,23 @@ and checks that it gives the expected response. Any unexpected response
287
252
or exception indicates an unhealthy server. Although `PingStrategy` is
288
253
very simple, it is a good basic approach for most Redis deployments.
289
254
255
+
Although `PingStrategy` is the default, you can still activate it
256
+
explicitly using the `healthCheckStrategy()` method of the `MultiDbConfig.DatabaseConfig`
257
+
builder. Use this approach if you want to configure the default
258
+
`PingStrategy` with custom options, as shown in the example below.
Although you will typically configure all databases during the initial connection, you can also modify the configuration at runtime. The example below shows how to add and remove database endpoints.
353
+
354
+
```java
355
+
HostAndPort other = new HostAndPort("redis-south.example.com", 14000);
356
+
357
+
// Create the database config as you would for the initial connection.