Merged
5 changes: 4 additions & 1 deletion quasardb/cluster.cpp
@@ -191,7 +191,10 @@ void register_cluster(py::module_ & m)
         .def("compact_progress", &qdb::cluster::compact_progress) //
         .def("compact_abort", &qdb::cluster::compact_abort) //
         .def("wait_for_compaction", &qdb::cluster::wait_for_compaction) //
-        .def("endpoints", &qdb::cluster::endpoints); //
+        .def("endpoints", &qdb::cluster::endpoints) //
+        .def("validate_query", &qdb::cluster::validate_query) //
+        .def("split_query_range", &qdb::cluster::split_query_range); //
+
 }

 }; // namespace qdb
31 changes: 31 additions & 0 deletions quasardb/cluster.hpp
@@ -61,6 +61,7 @@
 #include <chrono>
 #include <iostream>

+
 namespace qdb
 {

@@ -496,6 +497,36 @@ class cluster
         return results;
     }

+    py::object validate_query(const std::string & query_string)
+    {
+        check_open();
+
+        std::string query = query_string;
+        const std::string limit_string = "LIMIT 1";
+        query += " " + limit_string;
+
+        // TODO:
+        // should return dict of column names and dtypes
+        // currently returns numpy masked arrays
+        return py::cast(qdb::numpy_query(_handle, query));
+    }
+
+    py::object split_query_range(std::chrono::system_clock::time_point start, std::chrono::system_clock::time_point end, std::chrono::milliseconds delta)
Contributor

Yeah, this is not too useful right now. I assume you do the regex part in the dask connector now, which is probably not ideal.

Let's keep it as-is though.

+    {
+        std::vector<std::pair<std::chrono::system_clock::time_point, std::chrono::system_clock::time_point>> ranges;
+
+        for (auto current_start = start; current_start < end; ) {
+            auto current_end = current_start + delta;
+            if (current_end > end) {
+                current_end = end;
+            }
+            ranges.emplace_back(current_start, current_end);
+            current_start = current_end;
+        }
+        return py::cast(ranges);
+    }
+
+
 private:
     std::string _uri;
     handle_ptr _handle;
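As a rough illustration of what `split_query_range` computes, here is a minimal Python sketch of the same loop, using `datetime` and `timedelta` in place of the `std::chrono` types. The function name mirrors the binding, but this is not the binding itself. The guard against a non-positive `delta` is an addition for illustration only: the loop in the diff has no such check, and with a zero `delta` its `current_start` would never advance.

```python
from datetime import datetime, timedelta


def split_query_range(start, end, delta):
    """Split [start, end) into consecutive (range_start, range_end) pairs.

    Each pair spans at most `delta`; the last pair is clamped to `end`.
    """
    # Assumption added for this sketch: reject steps that cannot make progress.
    if delta <= timedelta(0):
        raise ValueError("delta must be positive")

    ranges = []
    current_start = start
    while current_start < end:
        # Clamp the final range so it never runs past `end`.
        current_end = min(current_start + delta, end)
        ranges.append((current_start, current_end))
        current_start = current_end
    return ranges


if __name__ == "__main__":
    # 25 minutes in 10-minute steps gives three ranges:
    # 00:00-00:10, 00:10-00:20 and 00:20-00:25.
    for lo, hi in split_query_range(
        datetime(2024, 1, 1, 0, 0),
        datetime(2024, 1, 1, 0, 25),
        timedelta(minutes=10),
    ):
        print(lo.time(), hi.time())
```

Adjacent ranges share a boundary timestamp. The diff does not say whether callers treat each range as half-open, so that is left to the consumer.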
9 changes: 7 additions & 2 deletions tests/conftest.py
@@ -764,10 +764,15 @@ def deduplication_mode(request):
     return request.param


+@pytest.fixture(params=["S"], ids=["frequency=S"])
+def frequency(request):
+    yield request.param
+
+
 @pytest.fixture
-def gen_index(start_date, row_count):
+def gen_index(start_date, row_count, frequency):
Contributor

Where is gen_index invoked? Would it not require a default value for this parameter? Did you check the tests pass?

Contributor Author

It's a leftover from test_dask.py.
Above this line I defined a frequency fixture.
Through it you can change the value passed to gen_index if needed.
The default behavior is the same as before, and all tests pass.
It's not useful right now and can be removed.

Contributor Author

For test_dask.py I needed to have data split between multiple shards.
Defining an additional fixture provides a way to override the behavior for selected tests only.

Contributor

Oh, I get it, it's a fixture and we have a separate frequency fixture as well which is automatically injected.

Please keep conftest.py as in-sync as possible with the one in our dask repository.

Contributor Author

OK, I will update conftest in qdb-dask-integration to match.

     return pd.Index(
-        pd.date_range(start_date, periods=row_count, freq="S"), name="$timestamp"
+        pd.date_range(start_date, periods=row_count, freq=frequency), name="$timestamp"
     )


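The injection mechanism discussed in the review thread can be sketched roughly as follows. This is a simplified, hypothetical stand-in for the real conftest.py: it uses `datetime.timedelta` steps and a plain list instead of pandas frequency strings and `pd.Index`, and `make_index`, `TestHourlySpacing`, and the fixture values are made up for illustration. pytest resolves `frequency` from the scope of the requesting test, so a class (or module) that defines its own `frequency` fixture changes what `gen_index` receives without touching other tests.

```python
import datetime as dt

import pytest


def make_index(start_date, row_count, step):
    """Hypothetical helper: `row_count` timestamps spaced `step` apart."""
    return [start_date + i * step for i in range(row_count)]


@pytest.fixture
def start_date():
    return dt.datetime(2024, 1, 1)


@pytest.fixture
def row_count():
    return 3


@pytest.fixture
def frequency():
    # Default spacing, standing in for freq="S" in the real conftest.py.
    return dt.timedelta(seconds=1)


@pytest.fixture
def gen_index(start_date, row_count, frequency):
    # `frequency` is injected by pytest; callers never pass it explicitly.
    return make_index(start_date, row_count, frequency)


def test_default_spacing(gen_index):
    # Uses the module-level `frequency` fixture.
    assert gen_index[1] - gen_index[0] == dt.timedelta(seconds=1)


class TestHourlySpacing:
    @pytest.fixture
    def frequency(self):
        # Overrides the module-level fixture for tests in this class only.
        return dt.timedelta(hours=1)

    def test_spacing(self, gen_index):
        assert gen_index[1] - gen_index[0] == dt.timedelta(hours=1)
```

Run under pytest, tests outside `TestHourlySpacing` keep the one-second default, which matches the author's point that existing tests are unaffected unless they opt in to a different frequency.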