Conversation

@vikonix
Contributor

@vikonix vikonix commented Nov 24, 2025

No description provided.

@vikonix vikonix requested a review from solatis December 9, 2025 08:25
@vikonix vikonix marked this pull request as ready for review December 9, 2025 22:37
Contributor

@solatis solatis left a comment

I haven't dug into the C++ part much yet, but I'm noticing there is a lot of "stuff" in this PR that's not actually part of the task. Perhaps the cleanups / improvements are necessary, but I prefer to keep this PR focused.

Secondly: the test approach is not great and causes a lot of duplication. I think the best approach is to bite the bullet immediately: integrate this directly into pandas/__init__.py, and expose it as a (temporary) flag that lets the user specify whether or not to use Arrow.

This will also verify that we can correctly accept and pull pandas DataFrames without copies, the same way we currently do with numpy.

Comment on lines 1 to 38
import numpy as np
import pytest

import quasardb


def _arrow_reader(timestamps, values):
    pa = pytest.importorskip("pyarrow")

    ts_array = pa.array(timestamps.astype("datetime64[ns]"), type=pa.timestamp("ns"))
    value_array = pa.array(values, type=pa.float64())
    batch = pa.record_batch([ts_array, value_array], names=["$timestamp", "value"])
    return pa.RecordBatchReader.from_batches(batch.schema, [batch])


def _create_arrow_table(connection, entry_name):
    table_name = entry_name + "_arrow"
    table = connection.table(table_name)

    column = quasardb.ColumnInfo(quasardb.ColumnType.Double, "value")
    table.create([column])

    return table


@pytest.mark.usefixtures("qdbd_connection")
def test_batch_push_arrow_with_options(qdbd_connection, entry_name):
    pa = pytest.importorskip("pyarrow")

    table = _create_arrow_table(qdbd_connection, entry_name)

    timestamps = np.array(
        [
            np.datetime64("2024-01-01T00:00:00", "ns"),
            np.datetime64("2024-01-01T00:00:01", "ns"),
        ],
        dtype="datetime64[ns]",
    )
Contributor

I don't like this.

I would approach this very differently:

  • reuse the existing bulk reader type
  • but make the mechanism by which it is read parametrized

right now there's a lot, a lot of duplication of test logic.

alternatively (and preferable): wire this into pandas. make the push mechanism a (temporary?) optional flag, so that we can differentiate between the two modes. then use that as a parameter for parametrized testing.

that way you automagically hook into the hundreds if not thousands of different tests we have for pandas and numpy

this means it needs to be wired into numpy/__init__.py first, and then in pandas/__init__.py.
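
A minimal sketch of the flag-plus-parametrization idea, using only the standard library. Every name here (`write_rows`, `use_arrow`, `_push_numpy`, `_push_arrow`, `check_roundtrip`) is hypothetical and is not the actual quasardb API. The two push functions are stand-ins that append to a plain list so the dispatch logic can be run in isolation.

```python
# Hypothetical sketch: none of these names are the real quasardb API.

def _push_numpy(table, rows):
    # Stand-in for the existing numpy-based push path.
    table.extend(rows)
    return "numpy"


def _push_arrow(table, rows):
    # Stand-in for the new Arrow-based push path.
    table.extend(rows)
    return "arrow"


def write_rows(table, rows, use_arrow=False):
    """Single entry point; `use_arrow` is the (temporary) opt-in flag."""
    push = _push_arrow if use_arrow else _push_numpy
    return push(table, list(rows))


# In a real suite the flag would likely be a pytest parameter, e.g.
#   @pytest.mark.parametrize("use_arrow", [False, True], ids=["numpy", "arrow"])
# so that every existing test body runs once per push mode.
def check_roundtrip(use_arrow):
    table = []
    rows = [("2024-01-01T00:00:00", 1.0), ("2024-01-01T00:00:01", 2.0)]
    backend = write_rows(table, rows, use_arrow=use_arrow)
    assert table == rows  # same observable result regardless of backend
    return backend


for flag in (False, True):
    check_roundtrip(flag)
```

The point of the design is that the test body is written once and the push mode is the only thing that varies, which is what removes the duplicated test logic.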

    keywords="quasardb timeseries database API driver ",
    setup_requires=[],
-   install_requires=["numpy"],
+   install_requires=["numpy", "PyArrow"],
Contributor

Do we really have it as a hard dependency now? Or can it be optional?
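
One way the optional variant could look, as a sketch only: declare the package under setuptools' `extras_require` and import it lazily at the point of use. The extra name `arrow` and the helper `optional_import` are assumptions for illustration and are not taken from this PR.

```python
import importlib

# Hypothetical setup.py fragment: keep numpy as the only hard dependency and
# expose Arrow support as an opt-in extra (`pip install quasardb[arrow]`).
SETUP_KWARGS = {
    "install_requires": ["numpy"],
    "extras_require": {"arrow": ["pyarrow"]},
}


def optional_import(module_name, extra):
    """Import `module_name`, or fail with a hint naming the extra to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature; "
            f"install it with: pip install quasardb[{extra}]"
        ) from exc


# The Arrow code path would call this only when the feature is actually used:
#   pa = optional_import("pyarrow", "arrow")
```

With this shape, users who never touch the Arrow path do not need the package installed, and those who do get an actionable error message instead of a bare `ModuleNotFoundError`.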

):

index = pd.Index(
# pd.date_range(start_date, periods=row_count, freq="s"), name="$timestamp"
Contributor

What's this?

Contributor Author

tests/test_numpy.py: 634 warnings
  D:\work\quasar\qdb-api-python\tests\conftest.py:685: FutureWarning: 'S' is deprecated and will be removed in a future version, please use 's' instead.
    pd.date_range(start_date, periods=row_count, freq="S"), name="$timestamp"

return request.param


# @pytest.fixture(params=["s"], ids=["frequency=s"])
Contributor

What's this?

Contributor Author

same

3 participants