Add logging configuration and setup by PGijsbers · Pull Request #306 · openml/server-api

PGijsbers · 2026-04-01T14:43:35Z

Related to #182.

Adds logging configuration and setup.

- By default, everything is logged to develop.log.
- Additional log sinks can be defined in config.toml.
- Every request is automatically assigned a request identifier, which is included in every log call made during that request.
- The request and response are (partially) logged with every request.
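
The request-identifier behaviour can be pictured as a context variable that a middleware sets once per request and that every log call then picks up. The sketch below is a rough illustration using only the standard library (`contextvars` and `logging`) as a stand-in for the Loguru-based middleware in this PR. The names `request_id_var`, `RequestIdFilter`, `build_logger`, and `handle_request` are hypothetical and do not come from the PR.

```python
import contextvars
import io
import logging
import uuid

# Hypothetical context variable holding the identifier of the current request.
request_id_var: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="-"
)


class RequestIdFilter(logging.Filter):
    """Copy the current request identifier onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def build_logger(stream: io.StringIO) -> logging.Logger:
    """Create a logger whose format includes the request identifier."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
    handler.addFilter(RequestIdFilter())
    log = logging.getLogger("request-id-sketch")
    log.handlers.clear()
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.propagate = False
    return log


def handle_request(log: logging.Logger) -> str:
    """Stand-in for a middleware: bind an identifier, then run the handler."""
    request_id = uuid.uuid4().hex
    token = request_id_var.set(request_id)
    try:
        log.info("request received")
        log.info("response sent")
    finally:
        # Restore the previous value so identifiers never leak across requests.
        request_id_var.reset(token)
    return request_id
```

Loguru provides `logger.contextualize(...)` for the same purpose; the context-variable version above only shows the underlying idea.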

The exact interaction for logging response and request content needs to be examined (which should be logged, make use of background tasks, ...).

sourcery-ai

Hey - I've found 2 issues, and left some high level feedback:

In setup_log_sinks, you mutate the configuration dict by doing sink_configuration.pop("sink"), which alters the load_configuration() result globally; consider copying the dict (e.g. cfg = sink_configuration.copy()) before popping to avoid surprising side effects if the config is reused elsewhere.
The request_response_logger middleware logs full headers, cookies, query params, and request bodies, which can easily include sensitive data and large payloads; it would be safer to filter/whitelist fields or truncate payloads before logging.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `setup_log_sinks`, you mutate the configuration dict by doing `sink_configuration.pop("sink")`, which alters the `load_configuration()` result globally; consider copying the dict (e.g. `cfg = sink_configuration.copy()`) before popping to avoid surprising side effects if the config is reused elsewhere.
- The `request_response_logger` middleware logs full headers, cookies, query params, and request bodies, which can easily include sensitive data and large payloads; it would be safer to filter/whitelist fields or truncate payloads before logging.

## Individual Comments

### Comment 1
<location path="src/core/logging.py" line_range="16-18" />
<code_context>
+def setup_log_sinks() -> None:
+    """Configure loguru based on app configuration."""
+    configuration = load_configuration()
+    for nickname, sink_configuration in configuration.get("logging", {}).items():
+        logger.info("Configuring sink", nickname=nickname, **sink_configuration)
+        sink = sink_configuration.pop("sink")
+        logger.add(sink, serialize=True, **sink_configuration)
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Avoid mutating configuration dict when extracting `sink` for logger setup.

`sink_configuration.pop("sink")` mutates the dict from `load_configuration()`. If that config is cached, reused, or if `setup_log_sinks` is called multiple times, later calls may fail or behave unexpectedly once `sink` has been removed. Prefer either copying first (e.g. `cfg = dict(sink_configuration); sink = cfg.pop("sink")`) or accessing without mutation (e.g. `sink = sink_configuration["sink"]`).
</issue_to_address>
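
A minimal sketch of the copy-before-pop approach suggested in Comment 1. The helper name `split_sink_configuration` and the `level` key are assumptions made for illustration; only the `sink` key and the `logging` table come from the code under review.

```python
from typing import Any


def split_sink_configuration(
    sink_configuration: dict[str, Any],
) -> tuple[Any, dict[str, Any]]:
    """Return the sink and the remaining options without touching the input."""
    options = dict(sink_configuration)  # shallow copy; the original stays intact
    sink = options.pop("sink")
    return sink, options


# Illustrative configuration; the "level" key is an assumption.
configuration = {"logging": {"develop": {"sink": "develop.log", "level": "INFO"}}}

for nickname, sink_configuration in configuration["logging"].items():
    sink, options = split_sink_configuration(sink_configuration)
    # In the PR, logger.add(sink, serialize=True, **options) would follow here.
```

Because the input dict is never modified, running the loop a second time still finds the `sink` key.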

### Comment 2
<location path="src/core/logging.py" line_range="38-45" />
<code_context>
-logger.info("Loading configuration file from '%s'", _config_file)
-logger.info("Loading environment variables from '%s'", _dotenv_file)
+
+logger.info(
+    "Determined configuration sources.",
+    configuration_directory=_config_directory,
</code_context>
<issue_to_address>
**🚨 issue (security):** Logging full request bodies, headers, and cookies at INFO level is heavy and may expose sensitive data.

In addition to bodies, this also logs headers, cookies, and query params for every request, which increases cost and risk of leaking secrets (e.g., auth tokens or credentials) to logs. Please either lower this to DEBUG or implement safeguards such as redacting/safelisting sensitive fields, limiting logged body size, or skipping logging for specific paths or content types.
</issue_to_address>
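
The safeguards mentioned in Comment 2 could look roughly like the helpers below. This is a sketch under stated assumptions: the deny-list, the 256-byte limit, and the function names are illustrative choices, not part of the PR, and the helpers work on plain mappings and bytes rather than on a framework request object.

```python
from collections.abc import Mapping

# Assumed deny-list and size limit; tune both for the real service.
SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "set-cookie"})
MAX_BODY_BYTES = 256


def redact_headers(headers: Mapping[str, str]) -> dict[str, str]:
    """Replace the values of sensitive headers with a fixed marker."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def truncate_body(body: bytes, limit: int = MAX_BODY_BYTES) -> str:
    """Decode at most `limit` bytes and note how many bytes were dropped."""
    text = body[:limit].decode("utf-8", errors="replace")
    dropped = len(body) - limit
    return f"{text}... [{dropped} bytes truncated]" if dropped > 0 else text
```

A middleware would pass the results of these helpers to the logger instead of the raw headers and body.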

src/core/logging.py

coderabbitai · 2026-04-01T14:45:14Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Walkthrough

Added Loguru to project dependencies and updated .gitignore to ignore *.log and logs/. Switched module logging to use Loguru and changed load_configuration signature to file: Path | None = None. Added a [logging.develop] section to src/config.toml. Introduced src/core/logging.py with setup_log_sinks, add_request_context_to_log, and request_response_logger. Updated create_api(configuration_file: Path | None = None) to propagate the configuration file and register logging middleware. Added tests/config.test.toml and updated the test fixture to pass that file.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add logging configuration and setup' directly and clearly summarizes the main change: implementing logging infrastructure with configuration support. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the logging configuration objectives, default behavior, request context tracking, and noting areas needing further examination. |

codecov · 2026-04-01T14:47:03Z

Codecov Report

❌ Patch coverage is 97.82609% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 93.12%. Comparing base (3f1a7ac) to head (4e5d3da).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/core/logging.py | 96.00% | 0 Missing and 1 partial ⚠️ |

❌ Your patch status has failed because the patch coverage (97.82%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #306      +/-   ##
==========================================
+ Coverage   93.04%   93.12%   +0.08%     
==========================================
  Files          71       72       +1     
  Lines        2991     3026      +35     
  Branches      221      223       +2     
==========================================
+ Hits         2783     2818      +35     
+ Misses        152      151       -1     
- Partials       56       57       +1

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

.gitignore (1)
2-3: Duplicate *.log pattern.

Line 64 already contains *.log under the Django section. Consider removing one of them to avoid redundancy.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore around lines 2 - 3, Remove the duplicate '*.log' entry in
.gitignore to avoid redundancy: keep one '*.log' pattern (prefer the one under
the Django section) and remove the other occurrence (the standalone '*.log' near
the top that sits alongside 'logs/'), ensuring 'logs/' remains if you still want
to ignore that directory.
src/config.toml (1)
12-20: Consider using an absolute or configured log path.

The relative path develop.log will create the log file in the current working directory, which may vary depending on how the application is launched. Consider:

- Using an absolute path, or
- Making it relative to a well-known directory (e.g., the config directory), or
- Documenting the expected behavior.

This is minor since it's a development configuration, but worth considering for consistency across deployment scenarios.
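
One possible way to make the sink location independent of the working directory is to anchor relative paths at the configuration directory. The helper below is a hypothetical sketch (the name `resolve_sink_path` is not from the PR) and assumes the sink is given as a file path string, as it is in config.toml.

```python
from pathlib import Path


def resolve_sink_path(sink: str, config_directory: Path) -> Path:
    """Anchor a relative sink path at the configuration directory."""
    path = Path(sink)
    return path if path.is_absolute() else config_directory / path
```

Absolute paths pass through unchanged, so existing deployments that already configure a full path would not be affected.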
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.toml` around lines 12 - 20, The logging configuration uses a
relative sink "develop.log" under the [logging.develop] table which can produce
logs in unpredictable working directories; update the sink value to either an
absolute path or construct it from a known config directory (for example derive
path from your app config dir) or add a configurable setting (e.g.,
logging.develop.sinkPath) and use that when initializing the logger so the sink
points to a stable location instead of "develop.log"; ensure references to the
sink key in logger initialization (the code that reads logging.develop.sink) are
updated to use the new path value or config variable.
src/core/logging.py (1)
16-18: Mutating configuration dict with pop() is safe but fragile.

Since load_configuration() returns a fresh dict on each call, the pop("sink") mutation doesn't corrupt shared state. However, calling setup_log_sinks() multiple times would add duplicate sinks to Loguru. Consider adding a guard or documenting this as a single-call function.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/logging.py` around lines 16 - 18, The loop in setup_log_sinks
mutates sink_configuration with pop("sink") which is fragile and can produce
duplicate Loguru sinks if setup_log_sinks() is called more than once; change the
implementation to avoid mutating the original dict (e.g., read sink =
sink_configuration.get("sink") or operate on a shallow copy of
sink_configuration before removing the key) and make setup_log_sinks idempotent
by tracking configured sink names (e.g., a module-level set) or skipping sinks
already added based on nickname; reference the configuration.get("logging", {}),
sink_configuration, logger and setup_log_sinks when locating where to apply the
fix.
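
A guard against duplicate sinks could be sketched as follows. The module-level set `_configured_sinks`, the function name `setup_log_sinks_once`, and the injected `add_sink` callable are assumptions for illustration; in the PR the callable would presumably be Loguru's `logger.add` with `serialize=True` applied.

```python
from collections.abc import Callable
from typing import Any

# Hypothetical module-level record of sinks that were already registered.
_configured_sinks: set[str] = set()


def setup_log_sinks_once(
    logging_configuration: dict[str, dict[str, Any]],
    add_sink: Callable[..., Any],
) -> list[str]:
    """Register each named sink at most once; return the newly added nicknames."""
    added = []
    for nickname, sink_configuration in logging_configuration.items():
        if nickname in _configured_sinks:
            continue  # already registered by an earlier call
        options = dict(sink_configuration)  # copy so the caller's dict is untouched
        sink = options.pop("sink")
        add_sink(sink, **options)
        _configured_sinks.add(nickname)
        added.append(nickname)
    return added
```

Passing the registration function as a parameter keeps the sketch independent of any logging library and makes the guard easy to test.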

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/core/logging.py`:
- Around line 38-46: The current logger.info call logs headers, cookies, and
full body (logger.info ... url=request.url, headers=request.headers,
cookies=request.cookies, body=await request.body()), which can leak sensitive
data; update the logging in the function using logger.info to: redact sensitive
headers (at least Authorization, Cookie, Set-Cookie) from request.headers, avoid
logging raw cookies (replace with a redacted marker or boolean flag), only log
bodies for safe content-types (e.g., text/* or application/json) and truncate
them to a reasonable length (or omit entirely), and move detailed request
payload logging to DEBUG level while keeping non-sensitive metadata (url,
path_params, query_params) at INFO; locate and modify the logger.info call and
any helper that reads request.body() to implement these redaction/truncation
rules.
- Line 45: The middleware currently calls await request.body(), which drains the
request stream and prevents downstream handlers from reading it; instead, read
and store the body into a local variable, then replace the request's receive
callable with a wrapper that replays the cached body (i.e., create a new receive
coroutine that returns the cached bytes and use Request(scope, receive=receive)
or set request._receive) so downstream code (in the same middleware
function/logging middleware where body=await request.body()) can call
request.body()/json()/form() normally; ensure the logged body uses the cached
value rather than consuming the original stream.
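
The body-replay idea from the second inline comment can be illustrated at the plain ASGI level. This is a sketch, not the PR's middleware: `read_body`, `replay_receive`, and `demo` are hypothetical names, and the fake `receive` in `demo` stands in for a real server. Whether such a wrapper is needed at all depends on how the installed framework version handles `request.body()` inside middleware, so that should be checked first.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any

Message = dict[str, Any]
Receive = Callable[[], Awaitable[Message]]


async def read_body(receive: Receive) -> bytes:
    """Drain the ASGI receive channel and return the complete request body."""
    chunks = []
    while True:
        message = await receive()
        if message["type"] != "http.request":
            break  # e.g. "http.disconnect"
        chunks.append(message.get("body", b""))
        if not message.get("more_body", False):
            break
    return b"".join(chunks)


def replay_receive(body: bytes, original: Receive) -> Receive:
    """Build a receive callable that first replays the cached body."""
    replayed = False

    async def receive() -> Message:
        nonlocal replayed
        if not replayed:
            replayed = True
            return {"type": "http.request", "body": body, "more_body": False}
        # After the replay, fall back to the real channel (e.g. for disconnects).
        return await original()

    return receive


async def demo() -> tuple[bytes, bytes]:
    """Read a two-chunk body in 'middleware', then again 'downstream'."""
    pending: list[Message] = [
        {"type": "http.request", "body": b"hel", "more_body": True},
        {"type": "http.request", "body": b"lo", "more_body": False},
    ]

    async def receive() -> Message:
        return pending.pop(0) if pending else {"type": "http.disconnect"}

    logged = await read_body(receive)
    downstream = await read_body(replay_receive(logged, receive))
    return logged, downstream
```

Running `asyncio.run(demo())` reads the same bytes twice: once for logging and once through the replaying wrapper.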


ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 3f1a7ac and 48a73ef.

📒 Files selected for processing (6)

.gitignore
pyproject.toml
src/config.py
src/config.toml
src/core/logging.py
src/main.py

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main.py`:
- Line 9: The logging pipeline is disabled because the log setup import and
calls and the request logging/context middlewares are commented out; restore the
logging behavior by uncommenting the import (loguru/logger), re-enabling the
call to setup_log_sinks(...) with the same args used originally, and
re-registering the request logging and request context middlewares so request
IDs and request/response logging are applied at runtime; make sure
setup_log_sinks and middleware registration occur early during app
initialization (before registering routes/startup) so RequestContext and
request-logging behavior is active for incoming requests.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 2659219 and 38d1d8d.

📒 Files selected for processing (3)

src/config.py
src/core/logging.py
src/main.py

✅ Files skipped from review due to trivial changes (1)

src/core/logging.py

🚧 Files skipped from review as they are similar to previous changes (1)

src/config.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/config.py (1)
28-34: Structured logging kwargs may not appear in output without format configuration.

Loguru places extra keyword arguments in the record's extra dict. To see these values in log output, the sink format must include them (e.g., {extra}). If the default format doesn't include extras, these context values won't be visible.

Consider including placeholders in the message or verifying the log sinks are configured to display extras:
logger.info(
    "Determined configuration sources: config_dir={configuration_directory}, "
    "config_file={configuration_file}, dotenv={dotenv_file}",
    configuration_directory=_config_directory,
    configuration_file=_config_file,
    dotenv_file=_dotenv_file,
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.py` around lines 28 - 34, The structured kwargs passed to
logger.info (the call that supplies configuration_directory, configuration_file,
dotenv_file) won't be visible unless the sink format includes extra; update the
logger call to embed these values into the message string (e.g., include
placeholders for configuration_directory, configuration_file, dotenv_file) or
alternatively ensure the Loguru sink format is configured to render {extra} so
those keys appear in output; change the call in the logger.info invocation in
src/config.py to either include the placeholders in the message or verify logger
configuration emits extras.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config.py`:
- Around line 82-84: load_configuration currently re-reads and re-parses the
file by calling tomllib.loads(file.read_text()) instead of delegating to the
cached helper; change load_configuration to simply delegate to the cached
_load_configuration by calling and returning _load_configuration(file or
_config_file) so it reuses the existing functools.cache behavior and removes the
duplicated parsing logic (refer to load_configuration, _load_configuration, and
_config_file).


ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 38d1d8d and 9f58623.

📒 Files selected for processing (3)

src/config.py
src/core/logging.py
src/main.py

✅ Files skipped from review due to trivial changes (1)

src/core/logging.py

🚧 Files skipped from review as they are similar to previous changes (1)

src/main.py

coderabbitai · 2026-04-02T14:53:02Z

src/config.py

+def load_configuration(file: Path | None = None) -> TomlTable:
+    file = file or _config_file
    return tomllib.loads(file.read_text())


🛠️ Refactor suggestion | 🟠 Major

Missing caching - delegates to uncached file read.

load_configuration duplicates the tomllib.loads(file.read_text()) logic from _load_configuration (lines 49-50) but lacks the @functools.cache decorator, so each call re-reads and re-parses the file.

Delegate to _load_configuration to benefit from caching and reduce duplication:

♻️ Proposed fix

 def load_configuration(file: Path | None = None) -> TomlTable:
     file = file or _config_file
-    return tomllib.loads(file.read_text())
+    return _load_configuration(file)


PGijsbers added 6 commits April 1, 2026 14:01

Logging but request id context is not bound properly

b6ae097

Fix middleware order

346c97f

Fully transition to Loguru and make it configurable

186d196

ignore log files

ee38e33

Add function to setup loguru sinks based on configuration

964e55c

Move logging middleware to logging module

48a73ef

PGijsbers added the dev-tools label (anything related to the developer tooling: CI, pre-commit, etc.) Apr 1, 2026

sourcery-ai bot reviewed Apr 1, 2026

Remove automated logging of client ip address

dd806b1

coderabbitai bot reviewed Apr 1, 2026

PGijsbers and others added 5 commits April 1, 2026 17:02

Disable logging to file with pytest

37e974a

Use a different configuration for the tests

c62218c

Do not log the request body

2659219

Disable loguru

f7c6ff6

[pre-commit.ci] auto fixes from pre-commit.com hooks

38d1d8d

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Apr 2, 2026

PGijsbers and others added 4 commits April 2, 2026 16:36

Add back import statements

1f1fa55

Startup logging

9c81a9a

Add back the context middleware

6b276de

[pre-commit.ci] auto fixes from pre-commit.com hooks

9f58623

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Apr 2, 2026

PGijsbers added 2 commits April 2, 2026 16:57

Add back in logging, but with placeholder messages

8e5c561

Add back in full logging of requests

4e5d3da

PGijsbers merged commit 98a89a2 into main Apr 2, 2026
8 of 9 checks passed

PGijsbers deleted the logging branch April 2, 2026 18:22

PGijsbers merged 18 commits into main from logging
