in_splunk: Implement handling remote addr feature#11398
📝 Walkthrough
The Splunk input plugin gains remote address tracking capabilities, extracting client IP addresses from X-Forwarded-For headers or connection metadata and injecting them into log records via configurable options.
Sequence Diagram
sequenceDiagram
participant Client
participant Handler as Request Handler
participant Parser as Header Parser
participant Extractor as Address Extractor
participant Processor as Payload Processor
participant Record as Log Record
Client->>Handler: HTTP Request + X-Forwarded-For Header
Handler->>Parser: Parse HTTP Headers
Parser->>Extractor: Lookup X-Forwarded-For
alt XFF Header Present
Extractor->>Extractor: Extract IP from XFF
else XFF Not Present
Extractor->>Extractor: Use Peer Address
end
Extractor->>Handler: Return Remote Address
Handler->>Processor: Process Payload + Remote Address
Processor->>Record: Append Remote Address Field
Record->>Record: Emit Enhanced Log Event
Handler->>Handler: Clear Current Remote Address
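The extraction step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `first_xff_entry` and `pick_remote_address` are hypothetical names, and the sketch assumes the left-most X-Forwarded-For entry is the one wanted, which the conversation does not confirm.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the PR): borrow the first entry of an
 * X-Forwarded-For value, trimming surrounding spaces and tabs.
 * Nothing is copied; *out points into xff.
 * Returns 0 on success, -1 if no usable entry is found. */
static int first_xff_entry(const char *xff, size_t xff_len,
                           const char **out, size_t *out_len)
{
    size_t start = 0;
    size_t end;
    const char *comma;

    if (xff == NULL || xff_len == 0) {
        return -1;
    }

    /* The first entry ends at the first comma, or at the end of the value */
    comma = memchr(xff, ',', xff_len);
    end = (comma != NULL) ? (size_t) (comma - xff) : xff_len;

    while (start < end && (xff[start] == ' ' || xff[start] == '\t')) {
        start++;
    }
    while (end > start && (xff[end - 1] == ' ' || xff[end - 1] == '\t')) {
        end--;
    }

    if (start == end) {
        return -1;
    }

    *out = xff + start;
    *out_len = end - start;
    return 0;
}

/* Hypothetical wrapper: prefer the XFF entry, fall back to the peer address */
static int pick_remote_address(const char *xff, size_t xff_len,
                               const char *peer,
                               const char **out, size_t *out_len)
{
    if (first_xff_entry(xff, xff_len, out, out_len) == 0) {
        return 0;
    }
    if (peer == NULL || peer[0] == '\0') {
        return -1;
    }

    *out = peer;
    *out_len = strlen(peer);
    return 0;
}
```

Both functions return borrowed pointers, so the result is only valid while the header buffer or the connection it came from is alive.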
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ba50be0c16
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: In splunk_prot_handle_ng() the per-request fields
context->current_remote_addr and context->current_remote_addr_len are set but
never reset, allowing stale addresses to persist across requests; update the
function to ensure these fields are cleared before every return (or funnel
returns through a single cleanup label), e.g., after using
extract_remote_address(), when falling back to peer
(flb_connection_get_remote_address(parent_session->connection)), and prior to
any early exits: set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0 (or free/reset as appropriate) so the same
cleanup performed in splunk_prot_handle() is applied here.
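One way to funnel every return through a single cleanup label is sketched below. `struct demo_ctx` and `demo_handle` are hypothetical stand-ins for illustration; only the two field names come from the comment above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the plugin context; only the two fields named
 * in the review comment are modeled. */
struct demo_ctx {
    const char *current_remote_addr;
    size_t current_remote_addr_len;
};

/* Hypothetical handler: every exit path goes through the cleanup label,
 * so the per-request address cannot persist into the next request. */
static int demo_handle(struct demo_ctx *ctx,
                       const char *addr, size_t addr_len,
                       int payload_ok)
{
    int ret = -1;

    ctx->current_remote_addr = addr;
    ctx->current_remote_addr_len = addr_len;

    if (!payload_ok) {
        /* Early exit: still routed through cleanup */
        goto cleanup;
    }

    /* ... process the payload, reading ctx->current_remote_addr ... */
    ret = 0;

cleanup:
    ctx->current_remote_addr = NULL;
    ctx->current_remote_addr_len = 0;
    return ret;
}
```

With this shape, adding a new early exit later only requires a `goto cleanup` rather than repeating the two reset statements.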
🧹 Nitpick comments (5)
plugins/in_splunk/splunk.h (1)
76-79: Consider using `const char *` instead of `flb_sds_t` for borrowed pointers.
`current_remote_addr` is assigned non-owned pointers from either the XFF header value or `flb_connection_get_remote_address()` in `splunk_prot.c`. Using `flb_sds_t` is misleading since it implies an owned/allocated string that should be managed with `flb_sds_*` functions.
For clarity and to prevent accidental misuse:
♻️ Suggested change
     /* Remote address */
    -flb_sds_t current_remote_addr;
    +const char *current_remote_addr;
     size_t current_remote_addr_len;
plugins/in_splunk/splunk_prot.c (4)
265-290: Const-correctness issue in output parameter.
The function assigns `const char *` values (from `extract_xff_value` and `flb_connection_get_remote_address`) to `*out`, but `out` is declared as `char **`. This discards the `const` qualifier and may cause compiler warnings.
♻️ Suggested fix
     static int extract_remote_address(const char *xff_value,
                                       size_t xff_value_len,
                                       struct flb_connection *connection,
    -                                  char **out,
    +                                  const char **out,
                                       size_t *out_len)
     {
Also update the corresponding field type in `splunk.h` and call sites in `splunk_prot_handle()` and `splunk_prot_handle_ng()`.
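As a small illustration of the suggested signature, the sketch below returns a borrowed read-only string through a `const char **` output parameter. `demo_peer_address` and `demo_get_address` are hypothetical names, not Fluent Bit functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup returning a read-only string the caller must not free */
static const char *demo_peer_address(void)
{
    return "192.0.2.1";
}

/* The out parameter is const char **, so the const qualifier of the borrowed
 * pointer is preserved and writes through *out are rejected at compile time. */
static int demo_get_address(const char **out, size_t *out_len)
{
    const char *value = demo_peer_address();

    if (value == NULL) {
        return -1;
    }

    *out = value;
    *out_len = strlen(value);
    return 0;
}
```

A caller then declares its local as `const char *hval = NULL;` and passes `&hval`, which is the same caller-side change the review asks for.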
424-428: Unused parameters in function signature.
The `remote_addr` and `remote_addr_len` parameters are added to the signature but never used. The function uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly at lines 478-480.
Either use the passed parameters or remove them from the signature to avoid confusion:
♻️ Option 1: Remove unused parameters
     static void process_flb_log_append(struct flb_splunk *ctx,
                                        msgpack_object *record,
                                        flb_sds_t tag,
                                        flb_sds_t tag_from_record,
    -                                   struct flb_time tm,
    -                                   const char *remote_addr,
    -                                   size_t remote_addr_len)
    +                                   struct flb_time tm)
♻️ Option 2: Use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
    -                             ctx->current_remote_addr,
    -                             ctx->current_remote_addr_len);
    +                             remote_addr,
    +                             remote_addr_len);
     }
775-780: Unused parameters in function signature.
Similar to `process_flb_log_append()`, the `remote_addr` and `remote_addr_len` parameters are not used within this function. The downstream `process_raw_payload_pack()` accesses `ctx->current_remote_addr` directly.
Consider removing these unused parameters for consistency:
♻️ Suggested change
     static int process_hec_raw_payload(struct flb_splunk *ctx,
                                        struct splunk_conn *conn,
                                        flb_sds_t tag,
                                        struct mk_http_session *session,
    -                                   struct mk_http_request *request,
    -                                   const char *remote_addr,
    -                                   size_t remote_addr_len)
    +                                   struct mk_http_request *request)
1115-1118: Missing cleanup on early return paths.
The per-request remote address is cleared at the end of successful processing, but multiple early return paths (lines 861, 928, 974, 1040, 1066, 1088, 1104) skip this cleanup. While the state is re-initialized at the start of each request (lines 1003-1004), for defensive coding it would be cleaner to use a `goto cleanup` pattern to ensure consistent cleanup.
Alternatively, since the state is always re-initialized at the start of `splunk_prot_handle()`, this might be acceptable as-is. Just ensure this initialization always happens before any potential use.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)
424-481: Unused parameters: `remote_addr` and `remote_addr_len` are never referenced.
The function signature was updated to accept `remote_addr` and `remote_addr_len`, but the implementation uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly (lines 479-480). Either use the parameters or remove them from the signature.
♻️ Option 1: Remove unused parameters (simpler)
     static void process_flb_log_append(struct flb_splunk *ctx,
                                        msgpack_object *record,
                                        flb_sds_t tag,
                                        flb_sds_t tag_from_record,
    -                                   struct flb_time tm,
    -                                   const char *remote_addr,
    -                                   size_t remote_addr_len)
    +                                   struct flb_time tm)
     {
And update call sites accordingly.
♻️ Option 2: Use the parameters instead of context fields
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
    -                             ctx->current_remote_addr,
    -                             ctx->current_remote_addr_len);
    +                             remote_addr,
    +                             remote_addr_len);
     }
775-814: Unused parameters: `remote_addr` and `remote_addr_len` are never referenced.
Similar to `process_flb_log_append`, these parameters are added to the signature but never used. The underlying `process_raw_payload_pack` reads from `ctx->current_remote_addr` directly.
♻️ Suggested fix - remove unused parameters
     static int process_hec_raw_payload(struct flb_splunk *ctx,
                                        struct splunk_conn *conn,
                                        flb_sds_t tag,
                                        struct mk_http_session *session,
    -                                   struct mk_http_request *request,
    -                                   const char *remote_addr,
    -                                   size_t remote_addr_len)
    +                                   struct mk_http_request *request)
     {
Update the call site at line 1027 accordingly.
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1455-1459: In splunk_prot_handle_ng() the cleanup lines refer to
an undefined variable ctx; replace those uses with the correct function-local
variable name context (i.e., set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0) so the per-request remote address is
cleared on the correct struct instance.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
265-290: Const-correctness issue: discarding `const` qualifier.
The function accepts `char **out` but assigns `const char *` values to it (from `extract_xff_value` and `flb_connection_get_remote_address`). This silently discards the `const` qualifier. Consider changing the output parameter type to preserve const-correctness.
♻️ Suggested fix
     static int extract_remote_address(const char *xff_value,
                                       size_t xff_value_len,
                                       struct flb_connection *connection,
    -                                  char **out,
    +                                  const char **out,
                                       size_t *out_len)
     {
         const char *value = NULL;
         size_t len = 0;
         extract_xff_value(xff_value, xff_value_len, &value, &len);
         if (value == NULL && connection != NULL) {
             value = flb_connection_get_remote_address(connection);
             if (value != NULL) {
                 len = strlen(value);
             }
         }
         if (value == NULL || len == 0) {
             return -1;
         }
    -    *out = value;
    +    *out = value;
         *out_len = len;
         return 0;
     }
Also update the callers (`splunk_prot_handle` and `splunk_prot_handle_ng`) to declare `hval` as `const char *`:
    -char *hval = NULL;
    +const char *hval = NULL;
And update the context fields if they aren't already `const char *`.
Force-pushed from 7435927 to ae2ad69
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: Add a NULL-check for request->stream->parent before
dereferencing parent_session->connection: after assigning parent_session =
(struct flb_http_server_session *) request->stream->parent, verify
parent_session != NULL and return an error (e.g., -1) or perform appropriate
error handling if it is NULL; then continue with the existing logic that uses
parent_session->connection (used by extract_remote_address and
flb_connection_get_remote_address) to avoid a crash if the parent session is
missing.
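The guard described above might look like the following sketch. The `demo_*` types are simplified, hypothetical stand-ins for the session structures named in the comment; the real Fluent Bit definitions are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the types named in the review */
struct demo_connection { const char *remote_address; };
struct demo_session    { struct demo_connection *connection; };
struct demo_stream     { void *parent; };
struct demo_request    { struct demo_stream *stream; };

/* Return the peer address, or NULL when any link in the chain is missing,
 * instead of dereferencing a NULL parent session. */
static const char *demo_peer_of(struct demo_request *request)
{
    struct demo_session *parent_session;

    if (request == NULL || request->stream == NULL) {
        return NULL;
    }

    parent_session = (struct demo_session *) request->stream->parent;
    if (parent_session == NULL || parent_session->connection == NULL) {
        return NULL;
    }

    return parent_session->connection->remote_address;
}
```

Whether the real handler should return an error or simply skip the remote address when the parent is missing is a decision for the plugin; the sketch only shows where the check sits.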
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
424-481: Use the passed `remote_addr` parameters to avoid shared mutable state.
Right now `process_flb_log_append()` ignores its new parameters and re-reads `ctx->current_remote_addr`. Using the parameters makes the function’s contract explicit and reduces reliance on shared state.
♻️ Suggested change
    -    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
    -        ret = append_remote_addr(ctx,
    -                                 ctx->current_remote_addr,
    -                                 ctx->current_remote_addr_len);
    -    }
    +    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
    +        ret = append_remote_addr(ctx, remote_addr, remote_addr_len);
    +    }
Also applies to: 775-780
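To illustrate the point about explicit contracts, here is a minimal sketch of an append step that depends only on its arguments. `demo_append_remote_addr` and the `remote_addr=` output format are hypothetical and unrelated to the plugin's actual encoder calls.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: format "remote_addr=<addr>" from the arguments only.
 * No context struct is read, so the result depends solely on the inputs.
 * Returns the number of characters written, 0 if there is no address,
 * or -1 if the buffer is too small. */
static int demo_append_remote_addr(char *buf, size_t buf_size,
                                   const char *remote_addr,
                                   size_t remote_addr_len)
{
    int n;

    if (remote_addr == NULL || remote_addr_len == 0) {
        return 0;
    }

    /* %.*s limits the copy to remote_addr_len bytes, so the address
     * does not need to be NUL-terminated at that position. */
    n = snprintf(buf, buf_size, "remote_addr=%.*s",
                 (int) remote_addr_len, remote_addr);
    if (n < 0 || (size_t) n >= buf_size) {
        return -1;
    }

    return n;
}
```

Because the address arrives as a pointer plus length, the same function works for a slice of a header value and for a NUL-terminated peer address.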
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)
424-486: Unused parameters: `remote_addr` and `remote_addr_len` are passed but ignored.
The function signature accepts `remote_addr` and `remote_addr_len` parameters (lines 427-428), but the implementation at lines 482-485 uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` directly instead. This inconsistency makes the API misleading.
🐛 Proposed fix: use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
    -                             ctx->current_remote_addr,
    -                             ctx->current_remote_addr_len);
    +                             remote_addr,
    +                             remote_addr_len);
     }
Alternatively, if the intent is to always use the context's current address, remove the unused parameters from the function signature.
780-819: Unused parameters in `process_hec_raw_payload`.
The function signature was extended to include `remote_addr` and `remote_addr_len` (lines 784-785), but these parameters are never used in the function body. The call to `process_raw_payload_pack` at line 816 doesn't pass them, and `process_raw_payload_pack` reads from `ctx->current_remote_addr` directly.
Either remove the unused parameters from the signature, or if they were intended for future use, add a comment explaining this.
♻️ Proposed fix: remove unused parameters
     static int process_hec_raw_payload(struct flb_splunk *ctx,
                                        struct splunk_conn *conn,
                                        flb_sds_t tag,
                                        struct mk_http_session *session,
    -                                   struct mk_http_request *request,
    -                                   const char *remote_addr,
    -                                   size_t remote_addr_len)
    +                                   struct mk_http_request *request)
Then update the call site at line 1032 accordingly.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)
265-290: Const-correctness issue: output parameter should be `const char **`.
The function assigns a `const char *` (from `extract_xff_value` and `flb_connection_get_remote_address`) to `*out`, but the parameter is declared as `char **`. This casts away const, which could lead to undefined behavior if callers attempt to modify the returned string.
♻️ Proposed fix
    -static int extract_remote_address(const char *xff_value,
    -                                  size_t xff_value_len,
    -                                  struct flb_connection *connection,
    -                                  char **out,
    -                                  size_t *out_len)
    +static int extract_remote_address(const char *xff_value,
    +                                  size_t xff_value_len,
    +                                  struct flb_connection *connection,
    +                                  const char **out,
    +                                  size_t *out_len)
This will require updating the callers to use `const char *` for the corresponding local variables (`hval` in `splunk_prot_handle` and `splunk_prot_handle_ng`).
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 454-456: The call to
flb_log_event_encoder_set_body_from_msgpack_object is passing the wrong argument
(ctx) — replace the first parameter with the log encoder instance by passing
&ctx->log_encoder instead of ctx so the function receives a pointer to the
flb_log_event_encoder (change the call in the else branch where
flb_log_event_encoder_set_body_from_msgpack_object(ctx, record) is used).
🧹 Nitpick comments (3)
plugins/in_splunk/splunk_prot.c (3)
265-290: Consider improving const-correctness.
The function assigns `const char *value` to `char **out`, discarding the const qualifier. Since the returned pointer refers to either header data or the connection's address string (both effectively read-only), the output parameter should be `const char **out` to preserve type safety.
♻️ Suggested fix
     static int extract_remote_address(const char *xff_value,
                                       size_t xff_value_len,
                                       struct flb_connection *connection,
    -                                  char **out,
    +                                  const char **out,
                                       size_t *out_len)
This would require updating the callers to use `const char *` for the address variables as well.
424-428: Unused parameters: `remote_addr` and `remote_addr_len` are never used.
The function signature was updated to accept `remote_addr` and `remote_addr_len` parameters, but the body uses `ctx->current_remote_addr` and `ctx->current_remote_addr_len` instead. Either use the passed parameters or remove them from the signature to avoid confusion.
♻️ Option 1: Use the parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
    -                             ctx->current_remote_addr,
    -                             ctx->current_remote_addr_len);
    +                             remote_addr,
    +                             remote_addr_len);
     }
♻️ Option 2: Remove unused parameters
     static void process_flb_log_append(struct flb_splunk *ctx,
                                        msgpack_object *record,
                                        flb_sds_t tag,
                                        flb_sds_t tag_from_record,
    -                                   struct flb_time tm,
    -                                   const char *remote_addr,
    -                                   size_t remote_addr_len)
    +                                   struct flb_time tm)
And update all call sites to remove the extra arguments.
Also applies to: 482-486
783-785: Unused parameters: `remote_addr` and `remote_addr_len` are never used.
Similar to `process_flb_log_append`, these parameters are added to the signature but never used in the function body. The called function `process_raw_payload_pack` uses `ctx->current_remote_addr` directly. Consider removing these parameters for consistency.
Force-pushed from 0cc7554 to 4c8b53d
patrick-stephens left a comment:
I would probably break the commit linter changes out into a separate PR to allow easier merging.
I sent a separate PR as:
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Force-pushed from 4c8b53d to c012c36
I rebased onto master to build on top of the recent commit linter changes.
Currently, in_splunk does not record the remote address of incoming requests.
This makes it inconvenient to trace which client an event came from.
Enter `[N/A]` in the box if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
If this is a change to packaging of containers or native binaries, then please confirm it works for all targets.
`ok-package-test` label to test for all targets (requires maintainer to do).
Documentation
in_splunk: Add remote_addr related parameters' descriptions fluent-bit-docs#2360
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Tests