
Performance testing #8

Merged
AivanF merged 8 commits into main from
aivan/2026-04-01-Performance-Testing
Apr 2, 2026

Conversation

Collaborator

@AivanF AivanF commented Apr 1, 2026

Designed and implemented:

  1. Parametrised, scalable, deterministic pseudo-random e-commerce data seeding with seasonality, driven by a weighted random selection algorithm.
  2. Generalised performance benchmarking with easily configurable sets of data sizes and test cases.
  3. A script to convert output to CSV for building charts.
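The deterministic weighted selection behind point 1 can be sketched as follows. This is a hedged illustration, not the PR's actual seeder: the helper name `weighted_choice` and the December seasonality weights are hypothetical.

```python
import random

def weighted_choice(items, weights, rng):
    """Pick one item with probability proportional to its weight."""
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for item, weight in zip(items, weights):
        acc += weight
        if r < acc:
            return item
    return items[-1]  # guard against floating-point drift

# A fixed seed makes every generated dataset fully reproducible.
rng = random.Random(42)
months = list(range(1, 13))
# Hypothetical seasonality: December is 3x more likely than other months.
weights = [3 if m == 12 else 1 for m in months]
picks = [weighted_choice(months, weights, rng) for _ in range(1000)]
```

Re-running with the same seed reproduces `picks` exactly, which is what makes benchmark data deterministic across machines.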

Summary by CodeRabbit

  • Tests

    • Added a full performance benchmarking suite: deterministic dataset generation and seeding, session-scoped benchmark environments across multiple dataset scales (1k–40k), parameterized query workloads, and a tool to convert pytest-benchmark JSON into pivoted CSV with estimated complexity.
  • Chores

    • Added pytest-benchmark to dev dependencies, updated .gitignore to ignore local benchmark artifacts, and introduced Makefile targets to run, export, and convert benchmark reports.

@AivanF AivanF requested a review from ZmeiGorynych April 1, 2026 17:55
Contributor

coderabbitai bot commented Apr 1, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Adds a performance benchmarking suite: deterministic data generator and seeder, pytest session fixtures for multiple dataset scales, parameterized pytest-benchmark tests, a bench JSON→CSV converter and CLI, Makefile targets for running/reporting benchmarks, and dev tooling/config updates.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Config**<br>`.gitignore`, `pyproject.toml` | Added a `benchmarks/` ignore entry under `examples/postgres/slayer_data/`; added dev dependency `pytest-benchmark = "^5.2.3"`. |
| **Benchmark CSV Tool**<br>`tests/perf/bench_to_csv.py` | New CLI/module to convert pytest-benchmark JSON into a pivoted CSV; infers per-query complexity via a log-log median exponent; supports stdout or `-o` output file. |
| **Fixtures & Params**<br>`tests/perf/conftest.py`, `tests/perf/params.py` | New session-scoped `env_<scale>` fixtures driven by `SCALES`: create a temporary or external DB, seed data, apply `INDEXES`, configure `YAMLStorage`, instantiate `SlayerQueryEngine`, and warm up queries. `params.py` centralizes the DB backend, scales, dates, indexes, and seed. |
| **Data Generation & Seeding**<br>`tests/perf/seed.py` | New deterministic LCG-based dataset generator and SQL seeder with dataclasses (`Region`, `Shop`, `Customer`, `Order`, `Dataset`), deterministic timestamps/fields, weighted sampling, and batched raw-SQL inserts. |
| **Benchmark Tests**<br>`tests/perf/test_bench.py` | New pytest-benchmark test module: many predefined query workloads, plus dynamically generated per-scale benchmark classes that run queries via fixtures and record timings. |
| **Make Targets**<br>`Makefile` | New Makefile with `BENCH_FLAGS` and targets `test`, `lint`, `bench`, `bench-report`, and `bench-csv` to run benchmarks, produce JSON reports, and convert them to CSV. |
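The CSV tool's complexity inference (a log-log median exponent, per the summary above) can be approximated like this; the function name `estimate_exponent` is illustrative, and the merged `bench_to_csv.py` may differ in detail:

```python
import math
from statistics import median

def estimate_exponent(sizes, timings):
    """Estimate k in t ~ n^k from (size, median time) pairs.

    Computes the slope between consecutive points on a log-log scale
    and returns the median slope, which is robust to outliers.
    """
    slopes = []
    for (n1, t1), (n2, t2) in zip(zip(sizes, timings), zip(sizes[1:], timings[1:])):
        slopes.append((math.log(t2) - math.log(t1)) / (math.log(n2) - math.log(n1)))
    return median(slopes)

# Perfectly linear workload: doubling the data doubles the time.
sizes = [1_000, 5_000, 10_000, 40_000]
linear = [0.01, 0.05, 0.10, 0.40]
print(round(estimate_exponent(sizes, linear), 2))  # prints 1.0
```

An exponent near 1 suggests O(n) query behaviour, near 2 suggests O(n²), which is what makes the metric useful for spotting scaling regressions across the 1k-40k dataset scales.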

Sequence Diagram

```mermaid
sequenceDiagram
    participant Runner as Test Runner
    participant Fixture as pytest fixture
    participant Generator as Data Generator
    participant DB as Database
    participant Engine as SlayerQueryEngine

    note over Runner,Engine: benchmark setup and execution

    Runner->>Fixture: request env_<scale>
    Fixture->>Generator: generate_dataset(order_count)
    activate Generator
    Generator-->>Fixture: Dataset
    deactivate Generator

    Fixture->>DB: create/connect DB and seed_database(dataset)
    activate DB
    DB-->>Fixture: DB ready (schema, rows, INDEXES applied)
    deactivate DB

    Fixture->>Engine: init (YAMLStorage + DB)
    Fixture-->>Runner: return (Engine, Dataset)

    Runner->>Engine: execute SlayerQuery (benchmarked)
    activate Engine
    Engine->>DB: run SQL
    DB-->>Engine: rows
    Engine-->>Runner: SlayerResponse (timing)
    deactivate Engine
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • ZmeiGorynych

Poem

🐰 I seeded fields with careful hops and code,
Shops, customers, and orders down the road.
Fixtures hum, engines wake to run—
Benchmarks captured, CSVs begun.
Hop—timings stored and tales of speed are told.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 52.63%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Performance testing' is vague and generic; it does not clearly specify the main change (e.g., adding benchmarking infrastructure, seeder, or CSV conversion tools). | Consider a more specific title such as 'Add performance benchmarking infrastructure with parametrized seeder and CSV conversion'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch aivan/2026-04-01-Performance-Testing

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/perf/seed.py (2)

408-459: Consider keyword-only parameters and moving import to top of file.

Two minor suggestions:

  1. Per coding guidelines, functions with >1 parameter should use keyword arguments
  2. The sqlalchemy import could be moved to the top of the file for consistency
♻️ Optional refactor

Move import to top:

```diff
+import sqlalchemy as sa
 from dataclasses import dataclass
 from datetime import datetime, timedelta
```

Use keyword-only parameters:

```diff
-def seed_database(engine, dataset: Dataset) -> None:
+def seed_database(*, engine, dataset: Dataset) -> None:
```

As per coding guidelines: "Use keyword arguments for functions with more than 1 parameter" and "Place imports at the top of files".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/perf/seed.py` around lines 408 - 459, The function seed_database
currently takes positional parameters (engine, dataset) and imports sqlalchemy
inside the function; change the signature to require keyword-only parameters
(def seed_database(*, engine, dataset): or def seed_database(engine, *,
dataset):) so callers must pass by name, move the "import sqlalchemy as sa" to
the top-level imports, and update any call sites to pass engine=... and
dataset=...; ensure the function still references sa in its body (e.g., sa.text)
after the import move.

140-154: Consider using keyword-only arguments for public API.

Per coding guidelines, functions with more than 1 parameter should use keyword arguments. The generate_dataset function signature could be clearer with keyword-only parameters.

♻️ Optional refactor
```diff
 def generate_dataset(
-    order_count: int,
-    start_date: str = "2023-01-01",
-    end_date: str = "2024-12-31",
-    seed: int = 42,
+    *,
+    order_count: int,
+    start_date: str = "2023-01-01",
+    end_date: str = "2024-12-31",
+    seed: int = 42,
 ) -> Dataset:
```

This would require updating the call site in conftest.py:

```python
dataset = generate_dataset(order_count=order_count)
```

As per coding guidelines: "Use keyword arguments for functions with more than 1 parameter".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/perf/seed.py` around lines 140 - 154, The public function
generate_dataset currently accepts multiple positional parameters; change its
signature to enforce keyword-only arguments (e.g., def
generate_dataset(order_count: int, *, start_date: str = "2023-01-01", end_date:
str = "2024-12-31", seed: int = 42) -> Dataset) so callers must pass start_date,
end_date, and seed by name; update any call sites (such as the test conftest.py
that invokes generate_dataset) to use keyword invocation like
generate_dataset(order_count=order_count) and leave internal usage of _prng
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/perf/bench_to_csv.py`:
- Around line 68-69: The code reads the output path via
sys.argv[sys.argv.index("-o") + 1] which will raise IndexError if "-o" is the
last arg; update the logic around out_path to first find the index =
sys.argv.index("-o") and confirm index + 1 < len(sys.argv) (or use a try/except)
before accessing the next element, and if missing, print a clear error/usage
message or exit; modify the block that sets out_path to validate presence of the
following value rather than directly indexing into sys.argv.

---

Nitpick comments:
In `@tests/perf/seed.py`:
- Around line 408-459: The function seed_database currently takes positional
parameters (engine, dataset) and imports sqlalchemy inside the function; change
the signature to require keyword-only parameters (def seed_database(*, engine,
dataset): or def seed_database(engine, *, dataset):) so callers must pass by
name, move the "import sqlalchemy as sa" to the top-level imports, and update
any call sites to pass engine=... and dataset=...; ensure the function still
references sa in its body (e.g., sa.text) after the import move.
- Around line 140-154: The public function generate_dataset currently accepts
multiple positional parameters; change its signature to enforce keyword-only
arguments (e.g., def generate_dataset(order_count: int, *, start_date: str =
"2023-01-01", end_date: str = "2024-12-31", seed: int = 42) -> Dataset) so
callers must pass start_date, end_date, and seed by name; update any call sites
(such as the test conftest.py that invokes generate_dataset) to use keyword
invocation like generate_dataset(order_count=order_count) and leave internal
usage of _prng unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bc476819-2dae-4fe8-b2bb-8b063f9b8ec3

📥 Commits

Reviewing files that changed from the base of the PR and between 0fc1d72 and 2e94060.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • .gitignore
  • pyproject.toml
  • tests/perf/__init__.py
  • tests/perf/bench_to_csv.py
  • tests/perf/conftest.py
  • tests/perf/seed.py
  • tests/perf/test_bench.py

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (2)
Makefile (1)

5-5: Consider adding all and clean phony targets.

checkmake is already flagging this file for missing standard entrypoints. Thin aliases are enough if you want to keep the Makefile minimal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 5, Add the standard phony targets "all" and "clean" to
satisfy checkmake: declare them alongside the existing .PHONY line and implement
thin aliases so "all" depends on the primary validation targets (e.g., test
lint) and "clean" removes build/test artifacts (or delegates to existing cleanup
commands) using the Makefile's existing targets and conventions (refer to .PHONY
and the test, lint, bench, bench-report, bench-csv target names to locate where
to wire these aliases).
tests/perf/seed.py (1)

92-133: Use Pydantic models for the benchmark entities.

Region, Shop, Customer, Order, and Dataset are the module’s canonical models, but they’re plain dataclasses. That diverges from the repo standard and drops the validation/serialization layer right where bad seed data is hardest to debug.

As per coding guidelines, "Use Python 3.11+ with Pydantic v2 for all models".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/perf/seed.py` around lines 92 - 133, Replace the plain dataclasses with
Pydantic v2 models: convert Region, Shop, Customer, Order, and Dataset to
inherit from pydantic.BaseModel (import from pydantic) and use typing
annotations (e.g., int, str, datetime, list[int], Optional[datetime]) and
pydantic.Field where defaults/metadata are needed; ensure Order fields
completed_at and cancelled_at are Optional[datetime], keep avg_cost as int
(cents) and avg_frequency/size as int, and make Dataset reference lists of the
Pydantic model classes (list[Region], list[Shop], etc.) so
validation/serialization are automatic. Also add any lightweight validators (if
needed) on Shop.avg_cost or Order.cost to enforce non-negative values using
model validators in the corresponding model names (Region, Shop, Customer,
Order, Dataset).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 1-3: The comment and BENCH_FLAGS disagree: the comment says
minimum 10 rounds but BENCH_FLAGS sets --benchmark-min-rounds=5; pick one and
make them consistent by either updating BENCH_FLAGS to use
--benchmark-min-rounds=10 or changing the comment to state 5 rounds; edit the
Makefile, update the BENCH_FLAGS definition (the variable name BENCH_FLAGS and
its --benchmark-min-rounds flag) or the top-line comment to match the chosen
value so documentation and flags align.

In `@tests/perf/conftest.py`:
- Around line 126-132: The current loop in db_engine.connect() that executes
INDEXES swallows all exceptions and can let benchmarks run on the wrong schema;
update the block around conn.execute(sa.text(idx_sql)) to capture failures
(collect failing idx_sql values and their exception messages), log them via the
test logger or process logger including the dialect (db_engine.dialect.name) and
then fail fast by raising an exception after the loop (or re-raising
immediately) so the test run aborts for that dialect instead of proceeding
without the intended indexes; reference the INDEXES constant, the
conn.execute(...) call, and the surrounding with db_engine.connect() /
conn.commit() logic when making the change.
- Around line 112-117: The current DB_BACKEND=="url" branch calls
seed_database(engine=db_engine, dataset=dataset, clean=True) unconditionally
which can destructively reset any external DB; change this to require an
explicit opt-in before running destructive cleanup: add a check for a dedicated
benchmark marker or a boolean env/param like BENCH_DB_RESET (or require the
DB_URL to reference a dedicated schema/database name such as containing
"bench"), and only call seed_database when that opt-in is present; otherwise
raise a clear ValueError requiring BENCH_DB_RESET or a bench-specific DB_URL.
Update the logic around DB_BACKEND, DB_URL, DB_TYPE, seed_database and
DatasourceConfig so the DatasourceConfig is still created for read-only runs but
destructive seed_database(clean=True) is gated behind the explicit opt-in.
- Around line 156-162: The fixtures created inside the loop are registered with
the inner function name "_fixture" so pytest can't find names like "env_1k";
update the decorator call in _make_fixture so the pytest fixture uses the
desired name by passing name=f"env_{_name}" to `@pytest.fixture` (i.e., change the
decorator on _fixture inside _make_fixture to `@pytest.fixture`(scope="session",
name=f"env_{_name}")), ensuring SCALES, _make_fixture and the returned fixtures
match the names requested by request.getfixturevalue(f"env_{scale_name}").

In `@tests/perf/seed.py`:
- Around line 140-145: In generate_dataset, validate inputs up front: ensure
order_count is non-negative (raise ValueError if order_count < 0) and ensure the
date window is valid (parse start_date and end_date and raise ValueError if
start_date > end_date or if parsing fails) so day_cumsum/total_day_weight cannot
be empty/zero; also apply the same precondition checks to the similar generator
at lines 160-162. Use clear ValueError messages referencing the invalid
parameter (order_count or start_date/end_date) so invalid perf params fail fast.
- Around line 313-315: The current remap assigns late-opening orders to an
arbitrary early shop, breaking the customer→shop affinity; change the branch
that handles (shop_id > late_shop_threshold and created_at < mid_date) to
instead re-roll the shop_id by sampling from that customer’s eligible early
shops (i.e., filter the customer’s weighted shop list to ids <=
late_shop_threshold and sample using the same weights) and assign that sampled
id to shop_id; if the customer has no eligible early shops, shift created_at
forward to mid_date (or the customer’s first available shop open date) so the
original late shop assignment remains valid. Ensure you reference and use the
same customer shop list and weighting logic used elsewhere in seed.py when
re-rolling to preserve the affinity distribution.
- Around line 463-475: The insert currently stringifies timestamps with
.isoformat() in the batch comprehension inside conn.execute (the INSERT INTO
orders using sa.text); change the batch payload so the "created_at",
"completed_at", and "cancelled_at" values are the native datetime objects (use
o.created_at, o.completed_at or None, o.cancelled_at or None) instead of calling
.isoformat(), so SQLAlchemy/DBAPI can bind them as TIMESTAMPs with the correct
dialect-specific processing.

---

Nitpick comments:
In `@Makefile`:
- Line 5: Add the standard phony targets "all" and "clean" to satisfy checkmake:
declare them alongside the existing .PHONY line and implement thin aliases so
"all" depends on the primary validation targets (e.g., test lint) and "clean"
removes build/test artifacts (or delegates to existing cleanup commands) using
the Makefile's existing targets and conventions (refer to .PHONY and the test,
lint, bench, bench-report, bench-csv target names to locate where to wire these
aliases).

In `@tests/perf/seed.py`:
- Around line 92-133: Replace the plain dataclasses with Pydantic v2 models:
convert Region, Shop, Customer, Order, and Dataset to inherit from
pydantic.BaseModel (import from pydantic) and use typing annotations (e.g., int,
str, datetime, list[int], Optional[datetime]) and pydantic.Field where
defaults/metadata are needed; ensure Order fields completed_at and cancelled_at
are Optional[datetime], keep avg_cost as int (cents) and avg_frequency/size as
int, and make Dataset reference lists of the Pydantic model classes
(list[Region], list[Shop], etc.) so validation/serialization are automatic. Also
add any lightweight validators (if needed) on Shop.avg_cost or Order.cost to
enforce non-negative values using model validators in the corresponding model
names (Region, Shop, Customer, Order, Dataset).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a69d49c7-bdf6-4112-bcf9-d44f9528cd0d

📥 Commits

Reviewing files that changed from the base of the PR and between 2bf0ee7 and 5538c8c.

📒 Files selected for processing (7)
  • .gitignore
  • Makefile
  • tests/perf/bench_to_csv.py
  • tests/perf/conftest.py
  • tests/perf/params.py
  • tests/perf/seed.py
  • tests/perf/test_bench.py
✅ Files skipped from review due to trivial changes (2)
  • .gitignore
  • tests/perf/params.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/perf/bench_to_csv.py
  • tests/perf/test_bench.py

@AivanF AivanF merged commit 147f9b5 into main Apr 2, 2026
2 of 3 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Apr 7, 2026
@coderabbitai coderabbitai bot mentioned this pull request Apr 14, 2026
