in_splunk: Implement handling remote addr feature #11398

Open
cosmo0920 wants to merge 6 commits into master from cosmo0920-extract-remote-addr-on-in_splunk

@cosmo0920 cosmo0920 commented Jan 26, 2026

Currently, in_splunk does not record the remote address of incoming requests.
This makes it hard to trace which client sent a given event.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added configuration options to the Splunk input plugin to include remote address information in events via X-Forwarded-For headers or connection addresses.
  • Tests

    • Added test for remote address extraction functionality.

coderabbitai bot commented Jan 26, 2026

Walkthrough

The Splunk input plugin gains remote address tracking capabilities, extracting client IP addresses from X-Forwarded-For headers or connection metadata and injecting them into log records via configurable options.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & State: plugins/in_splunk/splunk.c, plugins/in_splunk/splunk.h, plugins/in_splunk/splunk_config.c | Added configuration options add_remote_addr and remote_addr_key, along with runtime state fields current_remote_addr and current_remote_addr_len to track the remote address per request. State fields initialized during context creation. |
| Protocol & Header Handling: plugins/in_splunk/splunk_prot.c, plugins/in_splunk/splunk_prot.h | Implements HTTP header parsing and remote address extraction from X-Forwarded-For headers with fallback to peer address. Added helper functions for header lookup, XFF parsing, and remote address resolution. Integrated extraction into request handlers and propagated remote address through payload processing pipelines. Cleanup of state occurs after request completion. |
| Testing: tests/runtime/in_splunk.c | Added new test flb_test_splunk_xff_extract() to validate X-Forwarded-For header extraction and injection into log output. Test appears to be defined and registered twice in the file. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Handler as Request Handler
    participant Parser as Header Parser
    participant Extractor as Address Extractor
    participant Processor as Payload Processor
    participant Record as Log Record

    Client->>Handler: HTTP Request + X-Forwarded-For Header
    Handler->>Parser: Parse HTTP Headers
    Parser->>Extractor: Lookup X-Forwarded-For
    alt XFF Header Present
        Extractor->>Extractor: Extract IP from XFF
    else XFF Not Present
        Extractor->>Extractor: Use Peer Address
    end
    Extractor->>Handler: Return Remote Address
    Handler->>Processor: Process Payload + Remote Address
    Processor->>Record: Append Remote Address Field
    Record->>Record: Emit Enhanced Log Event
    Handler->>Handler: Clear Current Remote Address
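
The "Extract IP from XFF" step in the diagram could look roughly like the sketch below. It is not the PR's code: `xff_first_entry` is a hypothetical name, and taking the leftmost comma-separated entry is an assumption about how multi-hop headers are handled. The result is a borrowed pointer and length into the header value, with no allocation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the PR): return a borrowed view of the
 * leftmost entry in an X-Forwarded-For value. Nothing is copied.
 * Returns 0 on success, -1 if no usable entry is found.
 */
static int xff_first_entry(const char *value, size_t value_len,
                           const char **out, size_t *out_len)
{
    const char *start;
    const char *end;
    const char *comma;

    if (value == NULL || value_len == 0) {
        return -1;
    }

    start = value;
    end = value + value_len;

    /* Assumption: the leftmost entry ends at the first comma, if any */
    comma = memchr(start, ',', value_len);
    if (comma != NULL) {
        end = comma;
    }

    /* Trim optional spaces and tabs on both sides */
    while (start < end && (*start == ' ' || *start == '\t')) {
        start++;
    }
    while (end > start && (end[-1] == ' ' || end[-1] == '\t')) {
        end--;
    }

    if (start == end) {
        return -1;
    }

    *out = start;
    *out_len = (size_t) (end - start);
    return 0;
}
```

When this returns -1, a caller would fall back to the peer address, as in the "XFF Not Present" branch of the diagram.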

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A hop and a skip through headers so bright,
X-Forwarded-For dances into the night,
Remote addresses flow like carrots in streams,
Enriching each log with networked dreams,
The Splunk plugin now tracks from afar!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing remote address handling in the Splunk input plugin. It is specific, concise, and directly reflects the core functionality added across the modified files. |

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba50be0c16

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: In splunk_prot_handle_ng() the per-request fields
context->current_remote_addr and context->current_remote_addr_len are set but
never reset, allowing stale addresses to persist across requests; update the
function to ensure these fields are cleared before every return (or funnel
returns through a single cleanup label), e.g., after using
extract_remote_address(), when falling back to peer
(flb_connection_get_remote_address(parent_session->connection)), and prior to
any early exits: set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0 (or free/reset as appropriate) so the same
cleanup performed in splunk_prot_handle() is applied here.
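
The "single cleanup label" option mentioned in this prompt can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's handler: `struct request_ctx` and `handle_request` are hypothetical stand-ins, and only the two per-request fields named above are modeled.

```c
#include <stddef.h>

/* Hypothetical stand-in for the plugin context (not the real struct) */
struct request_ctx {
    const char *current_remote_addr;
    size_t current_remote_addr_len;
};

/*
 * Hypothetical handler: every exit path, including early ones, goes
 * through the cleanup label, so the per-request address never leaks
 * into the next request.
 */
static int handle_request(struct request_ctx *ctx,
                          const char *peer_addr, size_t peer_addr_len,
                          const char *payload)
{
    int ret = -1;

    ctx->current_remote_addr = peer_addr;
    ctx->current_remote_addr_len = peer_addr_len;

    if (payload == NULL) {
        /* early exit still resets the state */
        goto cleanup;
    }

    if (payload[0] == '\0') {
        goto cleanup;
    }

    /* ... process payload using ctx->current_remote_addr ... */
    ret = 0;

cleanup:
    ctx->current_remote_addr = NULL;
    ctx->current_remote_addr_len = 0;
    return ret;
}
```

The trade-off is one extra label per handler in exchange for not having to audit each `return` when new error paths are added.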
🧹 Nitpick comments (5)
plugins/in_splunk/splunk.h (1)

76-79: Consider using const char * instead of flb_sds_t for borrowed pointers.

current_remote_addr is assigned non-owned pointers from either the XFF header value or flb_connection_get_remote_address() in splunk_prot.c. Using flb_sds_t is misleading since it implies an owned/allocated string that should be managed with flb_sds_* functions.

For clarity and to prevent accidental misuse:

♻️ Suggested change
     /* Remote address */
-    flb_sds_t current_remote_addr;
+    const char *current_remote_addr;
     size_t current_remote_addr_len;
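
To illustrate the borrowed-versus-owned distinction this comment raises, here is a small sketch. The names `remote_addr_view` and `remote_addr_dup` are hypothetical and do not appear in the PR. The view does not own its memory, so code that needs the address beyond the lifetime of the request buffer would have to copy it explicitly.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-request state: a borrowed, non-owning view */
struct remote_addr_view {
    const char *ptr;   /* points into memory owned elsewhere */
    size_t len;        /* the span is not necessarily NUL-terminated */
};

/*
 * Copy the view into a NUL-terminated heap string.
 * The caller owns the result and must free() it.
 */
static char *remote_addr_dup(const struct remote_addr_view *view)
{
    char *copy;

    if (view == NULL || view->ptr == NULL || view->len == 0) {
        return NULL;
    }

    copy = malloc(view->len + 1);
    if (copy == NULL) {
        return NULL;
    }

    memcpy(copy, view->ptr, view->len);
    copy[view->len] = '\0';
    return copy;
}
```

Declaring the field as `const char *` makes it visible at the type level that nothing in the struct should be freed or modified through it.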
plugins/in_splunk/splunk_prot.c (4)

265-290: Const-correctness issue in output parameter.

The function assigns const char * values (from extract_xff_value and flb_connection_get_remote_address) to *out, but out is declared as char **. This discards the const qualifier and may cause compiler warnings.

♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)
 {

Also update the corresponding field type in splunk.h and call sites in splunk_prot_handle() and splunk_prot_handle_ng().


424-428: Unused parameters in function signature.

The remote_addr and remote_addr_len parameters are added to the signature but never used. The function uses ctx->current_remote_addr and ctx->current_remote_addr_len directly at lines 478-480.

Either use the passed parameters or remove them from the signature to avoid confusion:

♻️ Option 1: Remove unused parameters
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)
♻️ Option 2: Use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }
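
Option 2 amounts to making the function depend only on its arguments. A minimal sketch of that style, assuming a simplified fixed-size record in place of the real event encoder (`struct record` and `record_set_remote_addr` are hypothetical names):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record with one slot for the address */
struct record {
    char remote_addr[64];
};

/*
 * The address comes only from the arguments, so the function does not
 * read any per-request state stored on a shared context.
 * Returns 0 on success, -1 on invalid or oversized input.
 */
static int record_set_remote_addr(struct record *rec,
                                  const char *remote_addr,
                                  size_t remote_addr_len)
{
    if (rec == NULL || remote_addr == NULL || remote_addr_len == 0) {
        return -1;
    }
    if (remote_addr_len >= sizeof(rec->remote_addr)) {
        return -1;
    }

    memcpy(rec->remote_addr, remote_addr, remote_addr_len);
    rec->remote_addr[remote_addr_len] = '\0';
    return 0;
}
```

A function written this way can be exercised in isolation, without first setting up context fields.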

775-780: Unused parameters in function signature.

Similar to process_flb_log_append(), the remote_addr and remote_addr_len parameters are not used within this function. The downstream process_raw_payload_pack() accesses ctx->current_remote_addr directly.

Consider removing these unused parameters for consistency:

♻️ Suggested change
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)

1115-1118: Missing cleanup on early return paths.

The per-request remote address is cleared at the end of successful processing, but multiple early return paths (lines 861, 928, 974, 1040, 1066, 1088, 1104) skip this cleanup. While the state is re-initialized at the start of each request (lines 1003-1004), for defensive coding it would be cleaner to use a goto cleanup pattern to ensure consistent cleanup.

Alternatively, since the state is always re-initialized at the start of splunk_prot_handle(), this might be acceptable as-is. Just ensure this initialization always happens before any potential use.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)

424-481: Unused parameters: remote_addr and remote_addr_len are never referenced.

The function signature was updated to accept remote_addr and remote_addr_len, but the implementation uses ctx->current_remote_addr and ctx->current_remote_addr_len directly (lines 479-480). Either use the parameters or remove them from the signature.

♻️ Option 1: Remove unused parameters (simpler)
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)
 {

And update call sites accordingly.

♻️ Option 2: Use the parameters instead of context fields
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }

775-814: Unused parameters: remote_addr and remote_addr_len are never referenced.

Similar to process_flb_log_append, these parameters are added to the signature but never used. The underlying process_raw_payload_pack reads from ctx->current_remote_addr directly.

♻️ Suggested fix - remove unused parameters
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)
 {

Update the call site at line 1027 accordingly.

🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1455-1459: In splunk_prot_handle_ng() the cleanup lines refer to
an undefined variable ctx; replace those uses with the correct function-local
variable name context (i.e., set context->current_remote_addr = NULL and
context->current_remote_addr_len = 0) so the per-request remote address is
cleared on the correct struct instance.
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)

265-290: Const-correctness issue: discarding const qualifier.

The function accepts char **out but assigns const char * values to it (from extract_xff_value and flb_connection_get_remote_address). This silently discards the const qualifier. Consider changing the output parameter type to preserve const-correctness.

♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)
 {
     const char *value = NULL;
     size_t len = 0;

     extract_xff_value(xff_value, xff_value_len, &value, &len);

     if (value == NULL && connection != NULL) {
         value = flb_connection_get_remote_address(connection);
         if (value != NULL) {
             len = strlen(value);
         }
     }

     if (value == NULL || len == 0) {
         return -1;
     }

     *out = value;
     *out_len = len;
     return 0;
 }

Also update the callers (splunk_prot_handle and splunk_prot_handle_ng) to declare hval as const char *:

-    char *hval = NULL;
+    const char *hval = NULL;

And update the context fields if they aren't already const char *.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 1381-1399: Add a NULL-check for request->stream->parent before
dereferencing parent_session->connection: after assigning parent_session =
(struct flb_http_server_session *) request->stream->parent, verify
parent_session != NULL and return an error (e.g., -1) or perform appropriate
error handling if it is NULL; then continue with the existing logic that uses
parent_session->connection (used by extract_remote_address and
flb_connection_get_remote_address) to avoid a crash if the parent session is
missing.
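
A guarded lookup along the lines this prompt describes might look like the sketch below. The struct definitions are hypothetical stand-ins that only mirror the pointer chain named in the comment (request, stream, parent session, connection); they are not the Fluent Bit types.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the request/stream/session chain */
struct connection { const char *remote_address; };
struct session { struct connection *connection; };
struct stream { void *parent; };
struct request { struct stream *stream; };

/*
 * Return the peer address, or NULL if any link in the chain is missing.
 * Each pointer is checked before it is dereferenced.
 */
static const char *request_peer_address(const struct request *request)
{
    struct session *parent_session;

    if (request == NULL || request->stream == NULL) {
        return NULL;
    }

    parent_session = (struct session *) request->stream->parent;
    if (parent_session == NULL || parent_session->connection == NULL) {
        return NULL;
    }

    return parent_session->connection->remote_address;
}
```

Whether a missing parent should abort the request with an error or simply skip the remote address field is a design decision left to the handler.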
🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)

424-481: Use the passed remote_addr parameters to avoid shared mutable state.
Right now process_flb_log_append() ignores its new parameters and re-reads ctx->current_remote_addr. Using the parameters makes the function’s contract explicit and reduces reliance on shared state.

♻️ Suggested change
-    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
-        ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
-    }
+    if (ret == FLB_EVENT_ENCODER_SUCCESS) {
+        ret = append_remote_addr(ctx, remote_addr, remote_addr_len);
+    }

Also applies to: 775-780

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/in_splunk/splunk_prot.c (2)

424-486: Unused parameters: remote_addr and remote_addr_len are passed but ignored.

The function signature accepts remote_addr and remote_addr_len parameters (lines 427-428), but the implementation at lines 482-485 uses ctx->current_remote_addr and ctx->current_remote_addr_len directly instead. This inconsistency makes the API misleading.

🐛 Proposed fix: use the passed parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }

Alternatively, if the intent is to always use the context's current address, remove the unused parameters from the function signature.


780-819: Unused parameters in process_hec_raw_payload.

The function signature was extended to include remote_addr and remote_addr_len (lines 784-785), but these parameters are never used in the function body. The call to process_raw_payload_pack at line 816 doesn't pass them, and process_raw_payload_pack reads from ctx->current_remote_addr directly.

Either remove the unused parameters from the signature, or if they were intended for future use, add a comment explaining this.

♻️ Proposed fix: remove unused parameters
 static int process_hec_raw_payload(struct flb_splunk *ctx, struct splunk_conn *conn,
                                    flb_sds_t tag,
                                    struct mk_http_session *session,
-                                   struct mk_http_request *request,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct mk_http_request *request)

Then update the call site at line 1032 accordingly.

🧹 Nitpick comments (1)
plugins/in_splunk/splunk_prot.c (1)

265-290: Const-correctness issue: output parameter should be const char **.

The function assigns a const char * (from extract_xff_value and flb_connection_get_remote_address) to *out, but the parameter is declared as char **. This discards the const qualifier, which compilers typically warn about, and writing through the resulting pointer is undefined behavior if the underlying object is const.

♻️ Proposed fix
-static int extract_remote_address(const char *xff_value,
-                                  size_t xff_value_len,
-                                  struct flb_connection *connection,
-                                  char **out,
-                                  size_t *out_len)
+static int extract_remote_address(const char *xff_value,
+                                  size_t xff_value_len,
+                                  struct flb_connection *connection,
+                                  const char **out,
+                                  size_t *out_len)

This will require updating the callers to use const char * for the corresponding local variables (hval in splunk_prot_handle and splunk_prot_handle_ng).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@plugins/in_splunk/splunk_prot.c`:
- Around line 454-456: The call to
flb_log_event_encoder_set_body_from_msgpack_object is passing the wrong argument
(ctx) — replace the first parameter with the log encoder instance by passing
&ctx->log_encoder instead of ctx so the function receives a pointer to the
flb_log_event_encoder (change the call in the else branch where
flb_log_event_encoder_set_body_from_msgpack_object(ctx, record) is used).
🧹 Nitpick comments (3)
plugins/in_splunk/splunk_prot.c (3)

265-290: Consider improving const-correctness.

The function assigns const char *value to char **out, discarding the const qualifier. Since the returned pointer refers to either header data or the connection's address string (both effectively read-only), the output parameter should be const char **out to preserve type safety.

♻️ Suggested fix
 static int extract_remote_address(const char *xff_value,
                                   size_t xff_value_len,
                                   struct flb_connection *connection,
-                                  char **out,
+                                  const char **out,
                                   size_t *out_len)

This would require updating the callers to use const char * for the address variables as well.


424-428: Unused parameters: remote_addr and remote_addr_len are never used.

The function signature was updated to accept remote_addr and remote_addr_len parameters, but the body uses ctx->current_remote_addr and ctx->current_remote_addr_len instead. Either use the passed parameters or remove them from the signature to avoid confusion.

♻️ Option 1: Use the parameters
     if (ret == FLB_EVENT_ENCODER_SUCCESS) {
         ret = append_remote_addr(ctx,
-                                 ctx->current_remote_addr,
-                                 ctx->current_remote_addr_len);
+                                 remote_addr,
+                                 remote_addr_len);
     }
♻️ Option 2: Remove unused parameters
 static void process_flb_log_append(struct flb_splunk *ctx, msgpack_object *record,
                                    flb_sds_t tag, flb_sds_t tag_from_record,
-                                   struct flb_time tm,
-                                   const char *remote_addr,
-                                   size_t remote_addr_len)
+                                   struct flb_time tm)

And update all call sites to remove the extra arguments.

Also applies to: 482-486


783-785: Unused parameters: remote_addr and remote_addr_len are never used.

Similar to process_flb_log_append, these parameters are added to the signature but never used in the function body. The called function process_raw_payload_pack uses ctx->current_remote_addr directly. Consider removing these parameters for consistency.

@patrick-stephens patrick-stephens left a comment

I would probably break the commit linter changes out into a separate PR to allow easier merging.

@cosmo0920

> I would probably break the commit linter changes out into a separate PR to allow easier merging.

I sent a separate PR:
#11407

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
@cosmo0920

I rebased onto master to build on top of the recent commit linter changes.

3 participants