From 4a08c8a7058d6d1dafe864f6997df66e817caa9e Mon Sep 17 00:00:00 2001 From: fengyubiao Date: Mon, 18 Nov 2024 19:26:47 +0800 Subject: [PATCH 1/6] [improve] [pip] PIP-394 Add two interfaces CursorMetadataSerializerProvider and CursorMetadataDeSerializerProvider to support newer of customized cursor metadata serializations --- pip/pip-394.md | 102 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) create mode 100644 pip/pip-394.md diff --git a/pip/pip-394.md b/pip/pip-394.md new file mode 100644 index 0000000000000..233b90750f1f6 --- /dev/null +++ b/pip/pip-394.md @@ -0,0 +1,102 @@ +# PIP-394: Add two interfaces `CursorMetadataSerializerProvider` and `CursorMetadataDeSerializerProvider` to support newer of customized cursor metadata serializations + +# Background knowledge + +**1. What cursor metadata contains** + +- Cursor properties. +- The entry ID indicating where the cursor metadata was last persisted. +- Information about individually acknowledged messages, which we call `individualDeletedMessages`. +- Information about individually acknowledged batched messages, which we call `batchedEntryDeletionIndexInfo`. + +**2. Improvements already made to cursor metadata persistence** +- https://github.com/apache/pulsar/pull/758: skip persisting information that exceeds the maximum number of ranges to persist. +- https://github.com/apache/pulsar/issues/14529: compress the info when persisting. +- https://github.com/apache/pulsar/pull/9292: add a new compression strategy: change Range Objects to `long[]`. + +# Motivation + +**Issue-1: Compatibility of the improvements** + +- The third improvement shipped with `release:4.0`, which is a new LTS version. + - It changed the default serialization implementation to include https://github.com/apache/pulsar/pull/9292. +- Users cannot roll back to `3.0.x` once they have upgraded to `4.0.x`, because `release:3.0.x` does not contain the deserialization introduced by https://github.com/apache/pulsar/pull/9292. 
+ +**Issue-2: Frequent Young GC related to cursor metadata persistence when a broker has too many active subscriptions, even after the improvements above** + +`individualDeletedMessages` and `batchedEntryDeletionIndexInfo` often is the largest attributes of the metadata. They are serialized to a proto data when being persisted. But we can not recycle the object which typed proto due to it is immutable. + +# Goals + +- Guarantee compatability for rollback from `4.0.x` to `3.0.x`. + - This PIP will be cherry-picked into `branch-3.0` and `branch-3.3`. +- Support customized cursor metadata serializers to address issues that users encounter, such as **Issue-1** in the Motivation. + +# High Level Design + +### Design + +- We call the serialization implemented before `4.0.0` `V1`, and the serialization introduced by https://github.com/apache/pulsar/pull/9292 `V2`. +- Add all versions of the serialization to `branch-3.0`. + - Set the default for `3.0.x` to `V1`, which is the same as the current status. + - Set the default for `4.0.x` to `V2`, which is the same as the current status. +- Add two interfaces `CursorMetadataSerializerProvider` and `CursorMetadataDeSerializerProvider` to support new, customized cursor metadata serializations. 
+ +### Public API + +**CursorMetadataSerializerProvider.java** +```java +CursorMetadataSerializer newProvider(Name, PulsarService); +``` + +**CursorMetadataDeSerializerProvider.java** +```java +CursorMetadataDeserializer newProvider(Name, PulsarService); +``` + +**CursorMetadataSerializer.java** +```java +ManagedCursorInfo serialize(Position markDeletePosition, + Map properties, + RangeSetWrapper individualDeletedMessages, + ConcurrentSkipListMap batchDeletedIndexes); +``` + +**CursorMetadataDeserializer.java** +```java +ManagedCursorInfo deserialize(byte[] data); +``` + +### Public-facing Changes & Binary protocol +- If you use a customized `CursorMetadataSerializer`, it may break tools that read the cursor ZK node, such as `pulsar-managed-ledger-admin`. + +### Configuration + +**broker.conf** +```properties +cursorMetadataSerializerProvider=V2 +cursorMetadataDeserializerProvider=V1,V2 +``` + +### In Scope and Out of Scope + +This PIP only adds the interfaces `CursorMetadataSerializerProvider` and `CursorMetadataDeSerializerProvider`; implementations other than `V1` and `V2` will not be provided. + +# Backward & Forward Compatibility + +## Upgrade + +Nothing to do. + +## Downgrade / Rollback + +- This PIP will be cherry-picked into `branch-3.0` and `branch-3.3`. +- Because https://github.com/apache/pulsar/pull/9292 changed the cursor metadata serialization, once you have upgraded to `4.0.x` from a lower version, you can only downgrade to a version that contains this PIP. 
+ +# Links + + +* Mailing List discussion thread: +* Mailing List voting thread: From 950a23f705c855ca74e53f39f6b76b339a0edfcd Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Mon, 18 Nov 2024 19:30:43 +0800 Subject: [PATCH 2/6] Update pip-394.md --- pip/pip-394.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/pip/pip-394.md b/pip/pip-394.md index 233b90750f1f6..b507efdf269db 100644 --- a/pip/pip-394.md +++ b/pip/pip-394.md @@ -26,6 +26,9 @@ `individualDeletedMessages` and `batchedEntryDeletionIndexInfo` often is the largest attributes of the metadata. They are serialized to a proto data when being persisted. But we can not recycle the object which typed proto due to it is immutable. +![375661781-51d5bd6d-f5a1-48d7-921a-975875fe8bed](https://github.com/user-attachments/assets/dd1eb135-7dee-4dd1-84ba-994618a8198e) + + # Goals - Guarantee compatability for rollback from `4.0.x` to `3.0.x`. From 8527669dccc431e7bf38c93aa74118f8963f3bd2 Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Mon, 18 Nov 2024 20:41:32 +0800 Subject: [PATCH 3/6] Update pip-394.md --- pip/pip-394.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/pip/pip-394.md b/pip/pip-394.md index b507efdf269db..7d97bba9fc544 100644 --- a/pip/pip-394.md +++ b/pip/pip-394.md @@ -61,13 +61,13 @@ CursorMetadataDeserializer newProvider(Name, PulsarService); ```java ManagedCursorInfo serialize(Position markDeletePosition, Map properties, - RangeSetWrapper individualDeletedMessages, - ConcurrentSkipListMap batchDeletedIndexes); + LongPairRangeSet individualDeletedMessages, + ConcurrentMap batchDeletedIndexes); ``` **CursorMetadataDeserializer.java** ```java -ManagedCursorInfo deserialize(byte[] data); +ManagedCursorInfo deserialize(ByteBuf data); ``` ### Public-facing Changes & Binary protocol From 1eeea464f107fb2c6f711b7208ffe18ef4d7c69c Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Mon, 18 Nov 2024 20:43:08 +0800 Subject: [PATCH 
4/6] Update pip-394.md --- pip/pip-394.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-394.md b/pip/pip-394.md index 7d97bba9fc544..9a1f54a624409 100644 --- a/pip/pip-394.md +++ b/pip/pip-394.md @@ -62,7 +62,7 @@ CursorMetadataDeserializer newProvider(Name, PulsarService); ManagedCursorInfo serialize(Position markDeletePosition, Map properties, LongPairRangeSet individualDeletedMessages, - ConcurrentMap batchDeletedIndexes); + Map batchDeletedIndexes); ``` **CursorMetadataDeserializer.java** From 24b944711badde3e8f0380f8bbab17f07fbf4a8c Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Tue, 19 Nov 2024 15:25:01 +0800 Subject: [PATCH 5/6] Update pip-394.md --- pip/pip-394.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-394.md b/pip/pip-394.md index 9a1f54a624409..e477e3302e49d 100644 --- a/pip/pip-394.md +++ b/pip/pip-394.md @@ -101,5 +101,5 @@ Nothing to do. -* Mailing List discussion thread: +* Mailing List discussion thread: https://lists.apache.org/thread/xy1prwcv4wdoobphcgloj7s5gxy05qq3 * Mailing List voting thread: From 68d3c1db2529287298b6d0ecf413bf3a9d10c55e Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Wed, 20 Nov 2024 10:36:18 +0800 Subject: [PATCH 6/6] Update pip-394.md --- pip/pip-394.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-394.md b/pip/pip-394.md index e477e3302e49d..363758607dd4a 100644 --- a/pip/pip-394.md +++ b/pip/pip-394.md @@ -102,4 +102,4 @@ Nothing to do. Updated afterwards --> * Mailing List discussion thread: https://lists.apache.org/thread/xy1prwcv4wdoobphcgloj7s5gxy05qq3 -* Mailing List voting thread: +* Mailing List voting thread: https://lists.apache.org/thread/x8bf9hvk1pvo0dl0q3mcjh08wg90s89k
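
The configuration proposed in the PIP (`cursorMetadataSerializerProvider=V2`, `cursorMetadataDeserializerProvider=V1,V2`) writes with one format while reading several. The following is a minimal sketch of that idea, not the Pulsar implementation. Everything in it is an assumption made for illustration: the class `CursorSerdeSketch`, the simplified `Serializer` and `Deserializer` stand-ins (which use `long[]` ranges instead of `ManagedCursorInfo`), the one-byte version tag in front of each payload, and the toy encodings for `V1` (text ranges) and `V2` (packed `long[]`). The PIP does not say how a stored payload's version is detected.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: simplified stand-ins, not the interfaces proposed in the PIP.
final class CursorSerdeSketch {

    /** Stand-in for CursorMetadataSerializer; ranges are flat [start, end] pairs. */
    interface Serializer { byte[] serialize(long[] ranges); }

    /** Stand-in for CursorMetadataDeserializer. */
    interface Deserializer { long[] deserialize(ByteBuffer body); }

    // Hypothetical one-byte version tags written in front of every payload.
    static final byte V1 = 1;
    static final byte V2 = 2;

    /** Picks the single write format, as cursorMetadataSerializerProvider=V2 would. */
    static Serializer serializerFor(String name) {
        switch (name.trim()) {
            case "V1": return CursorSerdeSketch::encodeV1;
            case "V2": return CursorSerdeSketch::encodeV2;
            default: throw new IllegalArgumentException("Unknown serializer: " + name);
        }
    }

    /** Enables every listed read format, as cursorMetadataDeserializerProvider=V1,V2 would. */
    static Map<Byte, Deserializer> deserializersFor(String csv) {
        Map<Byte, Deserializer> enabled = new HashMap<>();
        for (String name : csv.split(",")) {
            switch (name.trim()) {
                case "V1": enabled.put(V1, CursorSerdeSketch::decodeV1); break;
                case "V2": enabled.put(V2, CursorSerdeSketch::decodeV2); break;
                default: throw new IllegalArgumentException("Unknown deserializer: " + name);
            }
        }
        return enabled;
    }

    /** Reads the tag byte and dispatches to the matching deserializer, if enabled. */
    static long[] read(Map<Byte, Deserializer> enabled, byte[] stored) {
        ByteBuffer buf = ByteBuffer.wrap(stored);
        byte tag = buf.get();
        Deserializer deserializer = enabled.get(tag);
        if (deserializer == null) {
            throw new IllegalStateException("No deserializer enabled for format tag " + tag);
        }
        return deserializer.deserialize(buf);
    }

    // Toy V1 format: ranges as text, e.g. "1:5,7:9".
    static byte[] encodeV1(long[] ranges) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ranges.length; i += 2) {
            if (i > 0) sb.append(',');
            sb.append(ranges[i]).append(':').append(ranges[i + 1]);
        }
        byte[] text = sb.toString().getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + text.length).put(V1).put(text).array();
    }

    static long[] decodeV1(ByteBuffer body) {
        String text = StandardCharsets.UTF_8.decode(body).toString();
        if (text.isEmpty()) return new long[0];
        String[] pairs = text.split(",");
        long[] out = new long[pairs.length * 2];
        for (int i = 0; i < pairs.length; i++) {
            String[] bounds = pairs[i].split(":");
            out[2 * i] = Long.parseLong(bounds[0]);
            out[2 * i + 1] = Long.parseLong(bounds[1]);
        }
        return out;
    }

    // Toy V2 format: ranges as a packed long[], 8 bytes per value.
    static byte[] encodeV2(long[] ranges) {
        ByteBuffer buf = ByteBuffer.allocate(1 + Long.BYTES * ranges.length).put(V2);
        for (long value : ranges) buf.putLong(value);
        return buf.array();
    }

    static long[] decodeV2(ByteBuffer body) {
        long[] out = new long[body.remaining() / Long.BYTES];
        for (int i = 0; i < out.length; i++) out[i] = body.getLong();
        return out;
    }

    public static void main(String[] args) {
        byte[] stored = serializerFor("V2").serialize(new long[] {1, 5, 7, 9});
        // A broker configured with both deserializers can read the V2 payload.
        System.out.println(Arrays.toString(read(deserializersFor("V1,V2"), stored)));
        // A broker that only knows V1 rejects it, which is the rollback problem in Issue-1.
        try {
            read(deserializersFor("V1"), stored);
        } catch (IllegalStateException e) {
            System.out.println("V1-only reader: " + e.getMessage());
        }
    }
}
```

Under these assumptions, keeping the set of enabled deserializers wider than the single serializer is what allows a downgrade: a broker that lists both `V1` and `V2` can still read metadata written by a newer broker, whichever format it goes on to write itself.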