-
Notifications
You must be signed in to change notification settings - Fork 108
Jet Stateful Stage Documentation [CTT-811] #2037
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for hardcore-allen-f5257d ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
| The major use case of stateful mapping is recognizing a pattern in the | ||
| event stream, such as matching start-transaction with end-transaction | ||
| events based on an event correlation ID. More generally, you can | ||
| implement any kind of state machine and detect patterns in an input of | ||
| any complexity. | ||
|
|
||
| As with other stateful operations, you can also use a `groupingKey` to | ||
| have a unique state per key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these paragraphs ("The major use case...", "As with other stateful operations...") should be first - they are high-level description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. I think it would be better to first provide a general description of the parameter, explain the difference between keyed and non-keyed cases, and then describe the general pattern and how it can be solved using this method. Let’s leave it to @Rob-Hazelcast to decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either works. I think the current structure is good because the high level use case description leads into the example code. The alternative would work better if it was broken up into child sections - intro, detailed description, example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer top-to-bottom approach in documentation (first what it is for and later how it works), but that is not a hill I am willing to die on
| The state object is created using `createFn` and is passed to `mapFn` together with each input item. | ||
| The function can update the state and emit a single output item (or null if no output is needed). | ||
|
|
||
| For each grouping key, Jet maintains a dedicated state object created by the supplied `createFn`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd consider merging some of the paragraphs - currently there are a lot of short paragraphs, which makes it harder to scan the text
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or possibly group keyed and non-keyed descrtiption in bulletpoints or subsections
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the short paragraphs are fine. Bullets are a good idea and it might be nice to resolve the repetition ("the function can update the state and..."), but I'm not too worried about it.
NB: Sentences on adjacent lines are combined into paragraphs in the output, so they're not as short as they appear here - not sure if that was your concern or if you think they're too short in the output too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The non-keyed mapStateful" paragraph applies to non-keyed variant, the next 5 paragraphs apply ONLY to keyed variant (TTL, eviction, forced eviction) but this is not visible in the text and a bit confusing and harder to understand
Co-authored-by: Krzysztof Jamróz <79092062+k-jamroz@users.noreply.github.com>
Co-authored-by: Krzysztof Jamróz <79092062+k-jamroz@users.noreply.github.com>
Co-authored-by: Krzysztof Jamróz <79092062+k-jamroz@users.noreply.github.com>
Co-authored-by: Krzysztof Jamróz <79092062+k-jamroz@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks! Good level of explanation / examples, very clear.
| The state object is created using `createFn` and is passed to `mapFn` together with each input item. | ||
| The function can update the state and emit a single output item (or null if no output is needed). | ||
|
|
||
| For each grouping key, Jet maintains a dedicated state object created by the supplied `createFn`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the short paragraphs are fine. Bullets are a good idea and it might be nice to resolve the repetition ("the function can update the state and..."), but I'm not too worried about it.
NB: Sentences on adjacent lines are combined into paragraphs in the output, so they're not as short as they appear here - not sure if that was your concern or if you think they're too short in the output too.
| The major use case of stateful mapping is recognizing a pattern in the | ||
| event stream, such as matching start-transaction with end-transaction | ||
| events based on an event correlation ID. More generally, you can | ||
| implement any kind of state machine and detect patterns in an input of | ||
| any complexity. | ||
|
|
||
| As with other stateful operations, you can also use a `groupingKey` to | ||
| have a unique state per key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either works. I think the current structure is good because the high level use case description leads into the example code. The alternative would work better if it was broken up into child sections - intro, detailed description, example.
| The state object is created using `createFn` and is passed to `mapFn` together with each input item. | ||
| The `mapFn` function can update the state and emit a single output item or return null if no output is needed. | ||
|
|
||
| For each grouping key, Jet maintains a dedicated state object created by the supplied `createFn`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| For each grouping key, Jet maintains a dedicated state object created by the supplied `createFn`. | |
| In the keyed `mapStateful` Jet maintains a dedicated state object created by the supplied `createFn` for each grouping key. |
so it differentiates from previous paragraph
| implement any kind of state machine and detect patterns in an input of | ||
| any complexity. | ||
|
|
||
| As with other stateful operations, you can also use a `groupingKey` to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"As with other stateful operations, you can also use a groupingKey to" - is this part of the use case or rather a general description of the operator? We are introducing keyed variant again, which we did technically and not very explicitly ~7 paragraphs before. In the original version it was very close to the beginning so was much more natural, now it is after a long technical description.
| () -> new TransactionEvent[2], | ||
| (state, id, event) -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| () -> new TransactionEvent[2], | |
| (state, id, event) -> { | |
| () -> new TransactionEvent[2], | |
| (state, id, event) -> { |
| // End of transaction received; this state is no longer needed | ||
| return state[1] != null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot evict after receiving only TRANSACTION_END because the events can come out of order:
| // End of transaction received; this state is no longer needed | |
| return state[1] != null; | |
| // we have both start and end events | |
| return state[0] != null && state[1] != null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, this seems to be a general pattern: deleteStatePredicate will often have the same condition as condition for generating final output from mapFn
| }, | ||
| // delete state immediately after transaction end | ||
| (state, txId, event) -> | ||
| event.getType() == EventType.TRANSACTION_END, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this example is less than ideal if events are allowed out of order and some may come after TRANSACTION_END
k-jamroz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving in advance, the content as a whole is good. Need to fix example. As to the form - I expressed my preferences, but we can also keep it as is.
https://github.com/hazelcast/hazelcast-mono/pull/5716