Merged
4 changes: 2 additions & 2 deletions modules/deploy/partials/high-availability.adoc
Original file line number Diff line number Diff line change
@@ -48,7 +48,7 @@ Consider the following Redpanda deployment strategies for the most common types
| Offline backups
|===

-ifndef::env-kubernetes[See also: xref:./production/production-deployment.adoc[Deploy for Production]]
+ifndef::env-kubernetes[See also: xref:deploy:redpanda/manual/production/production-deployment.adoc[Deploy for Production]]

== HA deployment options

@@ -253,7 +253,7 @@ rpk redpanda config set redpanda.rack <rackid>

The modified Ansible playbooks take a per-instance rack variable from the Terraform output and use that to set the relevant cluster and broker configuration. Redpanda deployment automation can provision public cloud infrastructure with discrete failure domains (`-var=ha=true`) and use the resulting inventory to provision rack-aware clusters using Ansible.

-See also: xref:./production/production-deployment-automation.adoc[Automated Deployment]
+See also: xref:deploy:redpanda/manual/production/production-deployment-automation.adoc[Automated Deployment]

=== Single-AZ example

Original file line number Diff line number Diff line change
@@ -129,14 +129,15 @@ Name: <topic-name>, State: ACTIVE
----

The partition information shows:

* **SRC_LSO**: Source partition Last Stable Offset
* **SRC_HWM**: Source partition High Watermark
* **DST_HWM**: Shadow (destination) partition High Watermark
* **Lag**: Message count difference between source and shadow partitions

[IMPORTANT]
====
-Note the replication lag to estimate potential data loss during failover. The `Tasks` section shows the health of shadow link replication tasks. For details about what each task does, see xref:setup.adoc#shadow-link-tasks[].
+Note the replication lag to estimate potential data loss during failover. The `Tasks` section shows the health of shadow link replication tasks. For details about what each task does, see xref:manage:disaster-recovery/shadowing/setup.adoc#shadow-link-tasks[Shadow link tasks].
====
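
The lag column can be turned into a rough data-loss estimate before failover. The following is a minimal Python sketch, assuming lag is the source high watermark minus the shadow high watermark. The `PartitionStatus` class, the `total_lag` helper, and the sample offsets are hypothetical; they are not rpk output or a Redpanda API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionStatus:
    """Offsets for one partition, named after the status columns."""

    partition: int
    src_lso: int  # source Last Stable Offset
    src_hwm: int  # source High Watermark
    dst_hwm: int  # shadow (destination) High Watermark

    @property
    def lag(self) -> int:
        # Assumption: lag = source HWM - shadow HWM, floored at zero.
        return max(self.src_hwm - self.dst_hwm, 0)


def total_lag(partitions: list[PartitionStatus]) -> int:
    """Sum per-partition lag as a rough count of messages at risk."""
    return sum(p.lag for p in partitions)


# Hypothetical offsets for illustration only.
sample = [
    PartitionStatus(partition=0, src_lso=1200, src_hwm=1200, dst_hwm=1185),
    PartitionStatus(partition=1, src_lso=980, src_hwm=980, dst_hwm=980),
]
print(total_lag(sample))
```

With these sample values, partition 0 is 15 messages behind and partition 1 is caught up, so the estimate is 15 messages.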

[[initiate-failover]]
Original file line number Diff line number Diff line change
@@ -97,7 +97,7 @@ The output shows individual topic states and any issues encountered during the f
* **`NOT_RUNNING`**: Task is not currently executing
* **`LINK_UNAVAILABLE`**: Task cannot communicate with the source cluster

-For detailed information about shadow link tasks and their roles, see xref:setup.adoc#shadow-link-tasks[].
+For detailed information about shadow link tasks and their roles, see xref:manage:disaster-recovery/shadowing/setup.adoc#shadow-link-tasks[Shadow link tasks].


== Post-failover cluster behavior
Original file line number Diff line number Diff line change
@@ -52,7 +52,7 @@ The status output includes:

* **Shadow link state**: Overall operational state (`ACTIVE`)
* **Individual topic states**: Current state of each replicated topic (`ACTIVE`, `FAULTED`, `FAILING_OVER`, `FAILED_OVER`)
-* **Task status**: Health of replication tasks across brokers (`ACTIVE`, `FAULTED`, `NOT_RUNNING`, `LINK_UNAVAILABLE`). For details about shadow link tasks, see xref:setup.adoc#shadow-link-tasks[].
+* **Task status**: Health of replication tasks across brokers (`ACTIVE`, `FAULTED`, `NOT_RUNNING`, `LINK_UNAVAILABLE`). For details about shadow link tasks, see xref:manage:disaster-recovery/shadowing/setup.adoc#shadow-link-tasks[Shadow link tasks].
* **Lag information**: Replication lag per partition showing source vs shadow high watermarks (HWM)

[[shadow-link-metrics]]
@@ -118,7 +118,7 @@ Configure monitoring alerts for:
* **Connection errors**: When `redpanda_shadow_link_client_errors` increases rapidly
* **Topic state changes**: When topics move to `FAULTED` state
* **Task failures**: When replication tasks enter `FAULTED` or `NOT_RUNNING` states
-* **Throughput drops**: When bytes/records fetched drops significantly
* **Link unavailability**: When tasks show `LINK_UNAVAILABLE` indicating source cluster connectivity issues
-
-For more information about shadow link tasks and their states, see xref:setup.adoc#shadow-link-tasks[].
+* **Throughput drops**: When bytes/records fetched drops significantly
++
+For more information about shadow link tasks and their states, see xref:manage:disaster-recovery/shadowing/setup.adoc#shadow-link-tasks[Shadow link tasks].
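
The alert conditions in this hunk can be sketched as a small check over status data. This is an illustrative Python sketch, not a Redpanda API: `find_alerts`, `errors_rising`, the threshold, and the topic and task names are hypothetical. Only the state names and the `redpanda_shadow_link_client_errors` metric name come from the documentation above.

```python
# State names are taken from the status output described in the docs.
ALERTING_TOPIC_STATES = {"FAULTED"}
ALERTING_TASK_STATES = {"FAULTED", "NOT_RUNNING", "LINK_UNAVAILABLE"}


def find_alerts(topic_states: dict[str, str], task_states: dict[str, str]) -> list[str]:
    """Return one message per topic or task that is in an alerting state."""
    alerts = []
    for topic, state in sorted(topic_states.items()):
        if state in ALERTING_TOPIC_STATES:
            alerts.append(f"topic {topic} is {state}")
    for task, state in sorted(task_states.items()):
        if state in ALERTING_TASK_STATES:
            alerts.append(f"task {task} is {state}")
    return alerts


def errors_rising(previous: float, current: float, threshold: float) -> bool:
    """True when an error counter, such as redpanda_shadow_link_client_errors,
    grew by more than `threshold` between two scrapes.

    Assumption: the caller picks a threshold that suits its scrape interval.
    """
    return current - previous > threshold


# Hypothetical topic and task names for illustration only.
for message in find_alerts(
    {"orders": "ACTIVE", "payments": "FAULTED"},
    {"task-a": "LINK_UNAVAILABLE", "task-b": "ACTIVE"},
):
    print(message)
```

In practice these rules would more likely live in the monitoring system's own alerting configuration; the sketch only makes the conditions concrete.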
14 changes: 6 additions & 8 deletions modules/manage/pages/disaster-recovery/shadowing/overview.adoc
Original file line number Diff line number Diff line change
@@ -35,15 +35,13 @@ Shadowing complements Redpanda's existing availability and recovery capabilities

== Limitations

-Shadowing is designed for active-passive disaster recovery scenarios. Each shadow cluster can maintain only one shadow link.
+Shadowing for disaster recovery currently has the following limitations:

-Shadowing operates exclusively in asynchronous mode and doesn't support active-active configurations. This means there will always be some replication lag. You cannot write to both clusters simultaneously.
-
-xref:develop:data-transforms/index.adoc[Data transforms] are not supported on shadow clusters while Shadowing is active. Writing to shadow topics is blocked.
-
-During a disaster, xref:manage:audit-logging.adoc[audit log] history from the source cluster is lost, though the shadow cluster begins generating new audit logs immediately after the failover.
-
-After you failover shadow topics, automatic fallback to the original source cluster is not supported.
+* Shadowing is designed for active-passive disaster recovery scenarios. Each shadow cluster can maintain only one shadow link.
+* Shadowing operates exclusively in asynchronous mode and doesn't support active-active configurations. This means there will always be some replication lag. You cannot write to both clusters simultaneously.
+* xref:develop:data-transforms/index.adoc[Data transforms] are not supported on shadow clusters while Shadowing is active. Writing to shadow topics is blocked.
+* During a disaster, xref:manage:audit-logging.adoc[audit log] history from the source cluster is lost, though the shadow cluster begins generating new audit logs immediately after the failover.
+* After you failover shadow topics, automatic fallback to the original source cluster is not supported.

== Best Practices

Original file line number Diff line number Diff line change
@@ -127,7 +127,7 @@ Each task reports its status through the shadow link status API. Task states inc

You can pause individual tasks by setting the `paused` field to `true` in the corresponding configuration section. This allows you to selectively disable parts of the replication process without affecting the entire shadow link.

-For monitoring task health and troubleshooting task issues, see xref:disaster-recovery:shadowing:monitor.adoc[Monitor Shadow Links].
+For monitoring task health and troubleshooting task issues, see xref:manage:disaster-recovery/shadowing/monitor.adoc[Monitor Shadow Links].

== What gets replicated

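
The xref changes in this pull request all move from relative or mis-delimited targets to module-qualified ones. A check along the following lines could flag the old pattern. It is a sketch under assumptions: `KNOWN_MODULES` lists only the modules that appear in this diff, `unqualified_xrefs` is a hypothetical helper, and the first coordinate is treated as the module name, which ignores Antora's optional version and component coordinates.

```python
import re

# Assumption: module names seen in this diff; a real docs site may define more.
KNOWN_MODULES = {"deploy", "manage", "develop"}

# Captures the target of an AsciiDoc xref macro: xref:<target>[<text>]
XREF_RE = re.compile(r"xref:([^\[\s]+)\[")


def unqualified_xrefs(text: str) -> list[str]:
    """Return xref targets whose first coordinate is not a known module."""
    flagged = []
    for target in XREF_RE.findall(text):
        resource = target.split("#", 1)[0]  # drop the #fragment, if any
        first, sep, _rest = resource.partition(":")
        if not sep or first not in KNOWN_MODULES:
            flagged.append(target)
    return flagged


# Lines copied from the hunks above.
old_line = "see xref:disaster-recovery:shadowing:monitor.adoc[Monitor Shadow Links]."
new_line = "see xref:manage:disaster-recovery/shadowing/monitor.adoc[Monitor Shadow Links]."
print(unqualified_xrefs(old_line))
print(unqualified_xrefs(new_line))
```

The old line is flagged because `disaster-recovery` is a directory rather than a module, while the new line passes because its target starts with the `manage` module.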