 The shadow cluster operates in read-only mode while continuously receiving updates from the source cluster. During a disaster, you can failover individual topics or an entire shadow link to make resources fully writable for production traffic. See xref:deploy:redpanda/manual/disaster-recovery/shadowing/failover-runbook.adoc[] for emergency procedures.
 
+=== New rpk shadow commands
+
+This release introduces new xref:reference:rpk/rpk-shadow/rpk-shadow.adoc[`rpk shadow`] commands for managing Redpanda Shadow Links:
+These commands provide complete command-line management of your disaster recovery infrastructure.
+
 == Connected client monitoring
 
 You can view details about Kafka client connections using `rpk` or the Admin API ListKafkaConnections endpoint. This allows you to view detailed information about active client connections on a cluster, and identify and troubleshoot problematic clients. For more information, see the xref:manage:cluster-maintenance/manage-throughput.adoc#view-connected-client-details[connected client details] example in the Manage Throughput guide.
modules/manage/pages/disaster-recovery/shadowing/failover-runbook.adoc
+10 -1 (10 additions & 1 deletion)
@@ -128,7 +128,16 @@ Name: <topic-name>, State: ACTIVE
 1 2345 2579 2568 11
 ----
 
-IMPORTANT: Note the replication lag to estimate potential data loss during failover.
+The partition information shows:
+* **SRC_LSO**: Source partition Last Stable Offset
+* **SRC_HWM**: Source partition High Watermark
+* **DST_HWM**: Shadow (destination) partition High Watermark
+* **Lag**: Message count difference between source and shadow partitions
+
+[IMPORTANT]
+====
+Note the replication lag to estimate potential data loss during failover. The `Tasks` section shows the health of shadow link replication tasks. For details about what each task does, see xref:setup.adoc#shadow-link-tasks[].
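As a rough illustration of how the partition columns relate, the sketch below parses the sample row from the status output in this hunk. It assumes the columns are partition, SRC_LSO, SRC_HWM, DST_HWM, and Lag in that order (the header row is not visible here), and `parse_partition_row` is a hypothetical helper, not part of `rpk`. In this sample the reported Lag happens to equal SRC_HWM minus DST_HWM; treat that as an observation about the sample rather than a definition of how the value is computed.

```python
# Sample row copied from the status output above.
# Assumed column order: partition, SRC_LSO, SRC_HWM, DST_HWM, Lag.
SAMPLE_ROW = "1 2345 2579 2568 11"


def parse_partition_row(line):
    """Split one whitespace-separated status row into named integer fields."""
    partition, src_lso, src_hwm, dst_hwm, lag = (int(x) for x in line.split())
    return {
        "partition": partition,
        "src_lso": src_lso,
        "src_hwm": src_hwm,
        "dst_hwm": dst_hwm,
        "lag": lag,
    }


info = parse_partition_row(SAMPLE_ROW)

# Records present on the source but not yet on the shadow partition.
behind = info["src_hwm"] - info["dst_hwm"]
print(f"partition {info['partition']}: {behind} records behind, reported lag {info['lag']}")
```

A non-zero value here is the number of records you could expect to be missing on the shadow partition if you failed over at that moment.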
modules/manage/pages/disaster-recovery/shadowing/monitor.adoc
+5 -3 (5 additions & 3 deletions)
@@ -52,8 +52,8 @@ The status output includes:
 
 * **Shadow link state**: Overall operational state (`ACTIVE`)
 * **Individual topic states**: Current state of each replicated topic (`ACTIVE`, `FAULTED`, `FAILING_OVER`, `FAILED_OVER`)
-* **Task status**: Health of replication tasks across brokers (`ACTIVE`, `FAULTED`, `NOT_RUNNING`, `LINK_UNAVAILABLE`)
-* **Lag information**: Replication lag per partition showing source vs shadow watermarks
+* **Task status**: Health of replication tasks across brokers (`ACTIVE`, `FAULTED`, `NOT_RUNNING`, `LINK_UNAVAILABLE`). For details about shadow link tasks, see xref:setup.adoc#shadow-link-tasks[].
+* **Lag information**: Replication lag per partition showing source vs shadow high watermarks (HWM)
 
 [[shadow-link-metrics]]
 == Metrics
@@ -66,7 +66,7 @@ Shadowing provides comprehensive metrics to track replication performance and he
 
 |`redpanda_shadow_link_shadow_lag`
 |Gauge
-|The lag of the shadow partition against the source partition, calculated as source partition LSO minus shadow partition HWM. Monitor by `shadow_link_name`, `topic`, and `partition` to understand replication lag for each partition.
+|The lag of the shadow partition against the source partition, calculated as source partition LSO (Last Stable Offset) minus shadow partition HWM (High Watermark). Monitor by `shadow_link_name`, `topic`, and `partition` to understand replication lag for each partition.
modules/manage/pages/disaster-recovery/shadowing/setup.adoc
+64 -5 (64 additions & 5 deletions)
@@ -72,6 +72,63 @@ To set up Shadowing:
 * **Configure filters**: Define which topics, consumer groups, and ACLs should replicate by creating include/exclude patterns that match your disaster recovery requirements. See <<set-filters>>.
 * **Create a shadow link**: Establish the connection between clusters using `rpk`, the Admin API, or Redpanda Console with authentication and network settings. See <<create-a-shadow-link>>.
 
+== Shadow Link Tasks
+
+Shadow linking operates through specialized tasks that handle different aspects of replication. Each task corresponds to a configuration section in your shadow link setup and runs continuously to maintain synchronization with the source cluster.
+
+[#source-topic-sync-task]
+=== Source Topic Sync Task
+
+The **Source Topic Sync task** manages topic discovery and metadata synchronization. This task periodically queries the source cluster to discover available topics, applies your configured topic filters to determine which topics should become shadow topics, and synchronizes topic properties between clusters.
+
+The task is controlled by the `topic_metadata_sync_options` configuration section, which includes:
+
+* **Auto-creation filters**: Determines which source topics automatically become shadow topics
+* **Property synchronization**: Controls which topic properties replicate from source to shadow
+* **Starting offset**: Sets where new shadow topics begin replication (earliest, latest, or timestamp-based)
+* **Sync interval**: How frequently to check for new topics and property changes
+
+When this task discovers a new topic that matches your filters, it creates the corresponding shadow topic and begins replication from your configured starting offset.
+
+[#consumer-group-shadowing-task]
+=== Consumer Group Shadowing Task
+
+The **Consumer Group Shadowing task** replicates consumer group offsets and membership information from the source cluster. This ensures that consumer applications can resume processing from the correct position after failover.
+
+The task is controlled by the `consumer_offset_sync_options` configuration section, which includes:
+
+* **Group filters**: Determines which consumer groups have their offsets replicated
+* **Sync interval**: How frequently to synchronize consumer group offsets
+* **Offset clamping**: Automatically adjusts replicated offsets to valid ranges on the shadow cluster
+
+This task runs on brokers that host the `__consumer_offsets` topic and continuously tracks consumer group coordinators to optimize offset synchronization.
+
+[#security-migrator-task]
+=== Security Migrator Task
+
+The **Security Migrator task** replicates security policies, primarily ACLs (Access Control Lists), from the source cluster to maintain consistent authorization across both environments.
+
+The task is controlled by the `security_sync_options` configuration section, which includes:
+
+* **ACL filters**: Determines which security policies replicate
+* **Sync interval**: How frequently to synchronize security settings
+
+By default, all ACLs replicate to ensure your shadow cluster maintains the same security posture as your source cluster.
+
+=== Task Status and Monitoring
+
+Each task reports its status through the shadow link status API. Task states include:
+
+* **`ACTIVE`**: Task is running normally and performing synchronization
+* **`PAUSED`**: Task has been manually paused through configuration
+* **`FAULTED`**: Task encountered an error and requires attention
+* **`NOT_RUNNING`**: Task is not currently executing
+* **`LINK_UNAVAILABLE`**: Task cannot communicate with the source cluster
+
+You can pause individual tasks by setting the `paused` field to `true` in the corresponding configuration section. This allows you to selectively disable parts of the replication process without affecting the entire shadow link.
+
+For monitoring task health and troubleshooting task issues, see xref:disaster-recovery:shadowing:monitor.adoc[Monitor Shadow Links].
+
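The task states listed above could be modeled in a monitoring script roughly as follows. This is only a sketch: `TaskState`, `NEEDS_ATTENTION`, `tasks_needing_attention`, and the task-name keys in the example are hypothetical names, not part of `rpk` or the Admin API, and treating `LINK_UNAVAILABLE` as needing attention alongside `FAULTED` is an assumption.

```python
from enum import Enum


class TaskState(Enum):
    """Task states as named in the list above."""
    ACTIVE = "ACTIVE"
    PAUSED = "PAUSED"
    FAULTED = "FAULTED"
    NOT_RUNNING = "NOT_RUNNING"
    LINK_UNAVAILABLE = "LINK_UNAVAILABLE"


# Assumption: these are the states an operator should be alerted on.
NEEDS_ATTENTION = {TaskState.FAULTED, TaskState.LINK_UNAVAILABLE}


def tasks_needing_attention(task_states):
    """Return the sorted names of tasks whose state is in NEEDS_ATTENTION.

    task_states maps a task name to its state string. An unknown state
    string raises ValueError, so a new state is never silently ignored.
    """
    return sorted(
        name
        for name, state in task_states.items()
        if TaskState(state) in NEEDS_ATTENTION
    )


# Hypothetical status snapshot; the task names are illustrative only.
snapshot = {
    "source_topic_sync": "ACTIVE",
    "consumer_group_shadowing": "PAUSED",
    "security_migrator": "FAULTED",
}
print(tasks_needing_attention(snapshot))
```

`PAUSED` is deliberately excluded from the alert set because, per the list above, it reflects a manual configuration choice rather than a failure.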
 == What gets replicated
 
 Shadowing replicates your topic data with complete fidelity, preserving all message records with their original offsets, timestamps, headers, and metadata. The partition structure remains identical between source and shadow clusters, ensuring applications can resume processing from the exact same position after failover.
@@ -82,7 +139,7 @@ Partition count is always replicated to ensure the shadow topic matches the sour
 
 === Topic properties replication
 
-For topic properties, Redpanda follows these replication rules:
+The <<source-topic-sync-task,Source Topic Sync task>> handles topic property replication. For topic properties, Redpanda follows these replication rules:
 
 **Never replicated:**
 
@@ -210,7 +267,7 @@ Redpanda system topics have the following specific filtering restrictions:
 
 === ACL filtering
 
-By default all ACLs are replicated. This is recommended in order to ensure that your shadow cluster has the same permissions as your source cluster. ACL filters should be used with care:
+By default, the <<security-migrator-task,Security Migrator task>> replicates all ACLs. This is recommended to ensure that your shadow cluster has the same permissions as your source cluster. To configure ACL filters:
 
 [,yaml]
 ----
@@ -239,7 +296,9 @@ acl_filters:
 
 === Consumer group filtering and behavior
 
-Consumer group filters determine which consumer groups have their offsets replicated to the shadow cluster. By default, all consumer groups are replicated unless you specify filters.
+Consumer group filters determine which consumer groups have their offsets replicated to the shadow cluster by the <<consumer-group-shadowing-task,Consumer Group Shadowing task>>.
+
+Offset replication operates selectively within each consumer group. Only committed offsets for active shadow topics are synchronized, even if the consumer group has offsets for additional topics that aren't being shadowed. For example, if consumer group "app-consumers" has committed offsets for "orders", "payments", and "inventory" topics, but only "orders" is an active shadow topic, then only the "orders" offsets will be replicated to the shadow cluster.
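The "app-consumers" example above can be sketched as a simple filter. This is an illustration of the selection rule only, not Redpanda's implementation: `offsets_to_replicate` is a hypothetical helper, and the offset values are made up for the example.

```python
def offsets_to_replicate(committed_offsets, active_shadow_topics):
    """Keep only committed offsets whose topic is an active shadow topic.

    committed_offsets maps (topic, partition) -> committed offset for one
    consumer group; active_shadow_topics is a set of topic names.
    """
    return {
        (topic, partition): offset
        for (topic, partition), offset in committed_offsets.items()
        if topic in active_shadow_topics
    }


# Committed offsets for the "app-consumers" group (illustrative values).
app_consumers = {
    ("orders", 0): 120,
    ("payments", 0): 75,
    ("inventory", 0): 42,
}

# Only "orders" is an active shadow topic, so only its offsets survive.
print(offsets_to_replicate(app_consumers, {"orders"}))
```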
 **Avoid name conflicts:** If you plan to consume data from the shadow cluster, do not use the same consumer group names as those used on the source cluster. While this won't break shadow linking, it can impact your RPO/RTO because conflicting group names may interfere with offset replication and consumer resumption during disaster recovery.
 
-**Offset clamping:** When Redpanda replicates consumer group offsets from the source cluster, offsets are automatically "clamped" during the commit process. If a replicated offset is above the high watermark (HWM) of the shadow partition, Redpanda clamps the offset to the shadow partition's HWM. This ensures offsets remain valid and prevents consumers from seeking beyond available data on the shadow cluster.
+**Offset clamping:** When Redpanda replicates consumer group offsets from the source cluster, offsets are automatically "clamped" during the commit process on the shadow cluster. If a committed offset from the source cluster is above the high watermark (HWM) of the corresponding shadow partition, Redpanda clamps the offset to the shadow partition's HWM before committing it to the shadow cluster. This ensures offsets remain valid and prevents consumers from seeking beyond available data on the shadow cluster.
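The clamping rule described above amounts to taking the smaller of two numbers. A minimal sketch, assuming offsets and the HWM are plain non-negative integers; `clamp_offset` is a hypothetical name and the values are illustrative:

```python
def clamp_offset(source_committed_offset, shadow_hwm):
    """Return the offset to commit on the shadow cluster.

    An offset above the shadow partition's high watermark is pulled down
    to the HWM; an offset at or below it is committed unchanged.
    """
    return min(source_committed_offset, shadow_hwm)


# The shadow partition has replicated up to offset 2568.
print(clamp_offset(2579, 2568))  # source offset is ahead of the shadow: clamped to 2568
print(clamp_offset(2500, 2568))  # source offset is within range: stays 2500
```

A clamped consumer resumes at the end of the data that actually reached the shadow partition, so it does not seek past available records after failover.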
 
 === Starting offset for new shadow topics
 
-When a shadow topic is created for the first time, you can control where replication begins on the source topic. This setting only applies to empty shadow partitions and is crucial for disaster recovery planning.
+When the <<source-topic-sync-task,Source Topic Sync task>> creates a shadow topic for the first time, you can control where replication begins on the source topic. This setting only applies to empty shadow partitions and is crucial for disaster recovery planning. Changing this configuration only affects new shadow topics; existing shadow topics continue replicating from their current position.