SQL query to show the full minimal timeline of two tables #4631
Replies: 3 comments
-
|
neat case study @Akeboshiwind 👌
This is a data-modelling smell in XTDB, a sign that that valid-time isn't being used correctly - you're effectively trying to re-create XT's entity timeline resolution but in user-space. Instead, try to migrate that table s.t. the history of each entity represents the correct current state of your config-layer at each point in time. Specifically, from your data model (which I'm aware is extracted and cut down so might not be possible IRL), I would model If you also want the valid-time periods for which there is no config layer, I'd create a config_layer record when the device is created, so that you can inner-join - -- Device that exists before the config_layer
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-01');
INSERT INTO device RECORDS {_id: 1};
PATCH INTO config_layer RECORDS {_id: 1};
COMMIT;
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-02');
INSERT INTO config_layer RECORDS {_id: 1, config: 'a'};
COMMIT;
-- Device with no config_layer
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-03');
INSERT INTO device RECORDS {_id: 2};
PATCH INTO config_layer RECORDS {_id: 2};
COMMIT;
-- Device with multiple overlaps
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-04');
INSERT INTO device RECORDS {_id: 3};
PATCH INTO config_layer RECORDS {_id: 3};
COMMIT;
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-05');
INSERT INTO config_layer RECORDS {_id: 3, config: 'b', _valid_to: DATE '2024-01-06'};
COMMIT;
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-07');
INSERT INTO config_layer RECORDS {_id: 3, config: 'c', _valid_to: DATE '2024-01-08'};
COMMIT;
-- config_layer before device
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-09');
INSERT INTO config_layer RECORDS {_id: 4, config: 'd'};
COMMIT;
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-10');
INSERT INTO device RECORDS {_id: 4};
PATCH INTO config_layer RECORDS {_id: 4};
COMMIT;
-- overlapping config_layer
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-11');
INSERT INTO device RECORDS {_id: 5};
PATCH INTO config_layer RECORDS {_id: 5};
COMMIT;
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-12');
INSERT INTO config_layer RECORDS {_id: 5, config: 'e'};
COMMIT;
BEGIN READ WRITE WITH (SYSTEM_TIME = DATE '2024-01-13');
INSERT INTO config_layer RECORDS {_id: 5, config: 'f', _valid_to: DATE '2024-01-14'};
COMMIT;You can then use XT's built in OVERLAPS and period-intersection ( SETTING DEFAULT VALID_TIME ALL
SELECT d._id device_id,
c.config AS config_layer_id,
d._valid_time * c._valid_time AS valid_time
FROM device d
JOIN config_layer c
ON d._id = c._id
AND d._valid_time OVERLAPS c._valid_timewhich I can confirm returns the same results as your original setup 🙂 Again, depending on your real-world scenario, you might be even able to inline the HTH! James |
Beta Was this translation helpful? Give feedback.
-
|
Just spotted that QuestDB has a
|
Beta Was this translation helpful? Give feedback.
-
|
Relevant paper: Extending the kernel of a relational DBMS with comprehensive support for They introduce a few new bits of SQL syntax, crucially including If we added our own |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
What we're building 📋
Requirements:
devicewith its activeconfig_layer(if any) across all time periodsconfig_layers overlap, use the most recently inserted onedeviceexists but noconfig_layerapplies (these should shownull)devicebeforeconfig,configbeforedevice, gaps betweenconfigsInput: Two bitemporal tables
devices- entities that exist over timeconfig_layers- configuration that can override device defaults, with device_id foreign keyExpected output: Complete timeline showing which config (if any) applies to each device at every point in time
Why not just
FULL OUTER JOIN? 🤔You might think "just do a
FULL OUTER JOINbetweendeviceandconfig_layer" - but that misses a crucial piece!A
FULL OUTER JOINonly gives you periods where either adeviceexists or aconfig_layerexists, but it doesn't handle the gaps properly.Consider device 1: it exists from 2024-01-01, but config layer 'a' only starts at 2024-01-02. There's a period [2024-01-01,2024-01-02] where the device exists but has no config layer -
FULL OUTER JOINwould miss this entirely because it's not looking at the temporal segments, just the overlapping records.Same issue with device 3: it has gaps between config layers where the device exists but no config applies. We need those
nullconfig_layer periods in our results!That's why we need to build the time segments first, then join everything onto those segments.
I've setup various scenarios here:
Setup
Here's our query (Play):
It's chunky, but I challenge you to come up with something similar (or do it better without XTDB)!
These are the results it produces:
Step by step we:
devicedevicedevice&config_layersconfig_layers to use (in this case: of the ones that overlay, the last inserted)Beta Was this translation helpful? Give feedback.
All reactions