[PoC]: Database level Data Integration#1420

Draft
BobdenOs wants to merge 1 commit into main from feat/data-integration


Concept

Wouldn't it be nice if your database could connect to other systems and use their data? With SAP HANA Cloud this is possible through a virtual table. It functions like a view on top of a table/view in a remote source. Under the hood, a remote source is backed by a specific ODBC driver. One of the more interesting drivers is the OData driver, which doesn't use a proprietary database connection, but instead relies on HTTP to fetch the data out of the remote system.

With the current INSERT and UPSERT implementations being JSON based, the obvious next step was to check whether Postgres and SQLite could trigger an HTTP request themselves. For SQLite, whose custom functions are plain JavaScript functions, this is clearly possible. Postgres has a large extension ecosystem, so of course pgsql-http exists.

Proof of Concept

The first thing we need is the "other" system. Simply run cds export on the catalog service of the bookshop.

cd test/bookshop
cds export ./srv/cat-service.cds

Next, consume the data product in the test model.

using {CatalogService} from '../../test/bookshop/apis/CatalogService';

service integration {
  entity Genres as projection on CatalogService.Genres;
  entity Books  as projection on CatalogService.ListOfBooks;
}

Using the @data.product annotation it is possible to identify the entities that are located in the "other" system. These entities are additionally annotated with cds.persistence.exists, so the compiler expects them to already be deployed. The deployer then creates a view which uses the HTTP function of the database to download the data.
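As a rough sketch, assuming CSN's convention of storing annotations as @-prefixed properties on each definition, picking out the data-product entities could look like this (the model literal and entity names are illustrative, not the actual bookshop model):

```javascript
// Sketch: collect entities flagged with @data.product from a CSN model.
// The inline model below is a stand-in for a real compiled CSN.
const csn = {
  definitions: {
    'integration.Books':  { kind: 'entity', '@data.product': true, '@cds.persistence.exists': true },
    'integration.Genres': { kind: 'entity', '@data.product': true, '@cds.persistence.exists': true },
    'integration.Local':  { kind: 'entity' },
  },
}

// CSN stores annotations as plain properties prefixed with '@'
const dataProducts = Object.entries(csn.definitions)
  .filter(([, def]) => def.kind === 'entity' && def['@data.product'])
  .map(([name]) => name)

console.log(dataProducts) // → ['integration.Books', 'integration.Genres']
```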

SELECT * FROM http('http://localhost:4004/browse/ListOfBooks')->>'$.value'

If only it were so simple. All the JSON functions produce string values, as types have to be consistent. This is where the INSERT logic comes into play: by looking at the elements of the entity it is possible to use the input converters to create the correct SQL data type out of the JSON string values, allowing the database to process the data as if it came from a native view.

SELECT cast(value->>'$.ID' as Integer) AS ID, ... FROM json_each(http('http://localhost:4004/browse/ListOfBooks')->>'$.value')
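The conversion step can be sketched in plain JavaScript: given the entity's elements, emit one cast per column. The cds-to-SQL type map and helper name below are simplifying assumptions, not the actual @cap-js/cds-dbs input converters:

```javascript
// Sketch: derive the typed SELECT list from an entity's elements.
// sqlType is a simplified assumption of the cds-to-SQL type mapping.
const sqlType = {
  'cds.Integer': 'Integer',
  'cds.String':  'Text',
  'cds.Decimal': 'Decimal',
  'cds.Boolean': 'Boolean',
}

const typedSelect = (elements, source) => {
  const cols = Object.entries(elements)
    .map(([name, { type }]) => `cast(value->>'$.${name}' as ${sqlType[type]}) AS ${name}`)
    .join(', ')
  return `SELECT ${cols} FROM json_each(${source})`
}

const sql = typedSelect(
  { ID: { type: 'cds.Integer' }, title: { type: 'cds.String' } },
  `http('http://localhost:4004/browse/ListOfBooks')->>'$.value'`,
)
console.log(sql)
// SELECT cast(value->>'$.ID' as Integer) AS ID, cast(value->>'$.title' as Text) AS title FROM json_each(http('http://localhost:4004/browse/ListOfBooks')->>'$.value')
```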

You might at this point have started to ask "why?", as there is also service level data integration which does pretty much the same thing. Well, @cap-js/cds-dbs supports a lot of advanced features which are non-trivial to re-implement in the JavaScript layer. Additionally, any JavaScript implementation of these features wouldn't perform well: it would cost a lot of CPU cycles and a lot of memory to achieve the same results. HANA, Postgres and SQLite are all written in C/C++ with a primary focus on optimizing relational data manipulation. So when a query has to do a where, join, group by, order by, path expression or expand, the databases have all the optimized tools available in native implementations.

// postgres / sqlite
await cds.ql`SELECT FROM ${Books} { * }`
// [odata] - GET /browse/ListOfBooks
await cds.ql`SELECT FROM ${Books} { *, genre { * } }`
// [odata] - GET /browse/ListOfBooks
// [odata] - GET /browse/Genres

// SAP HANA Cloud OData remote source behavior
await cds.ql`SELECT FROM ${Books} { * }`
// [odata] - GET /browse/ListOfBooks
await cds.ql`SELECT FROM ${Books} { ID }`
// [odata] - GET /browse/ListOfBooks?$select=ID
await cds.ql`SELECT FROM ${Books} { *, genre { * } }`
// [odata] - GET /browse/ListOfBooks?$expand=genre

So while this is a very simple solution, it provides a lot of power. Additionally, in the case of Postgres a more robust implementation is possible using foreign data wrappers, which come with the same benefits (and drawbacks) as SAP HANA Cloud remote sources.

FYI

In the case of SQLite the http function is defined as deterministic. Depending on the interpretation of the word deterministic the function could be called only a single time, but in reality the function will be called once per query. Therefore it is safe to define it as deterministic, with the additional benefit that it will only be called once for an expand instead of once for each row.
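The practical effect of the deterministic flag can be illustrated by memoizing on the argument, which is roughly what the engine's once-per-query reuse amounts to. The counter and stubbed response here are for illustration only; they stand in for the real HTTP call:

```javascript
// Sketch: a deterministic function may be evaluated once per distinct
// argument within a query, so memoize and count the actual calls.
let calls = 0
const fetchStub = url => { calls++; return '{"value":[]}' } // stands in for the real HTTP request

const cache = new Map()
const http = url => {
  if (!cache.has(url)) cache.set(url, fetchStub(url))
  return cache.get(url)
}

// An expand touching many rows would invoke http() once per row...
for (let row = 0; row < 100; row++) http('http://localhost:4004/browse/Genres')

// ...but the underlying request fires only once
console.log(calls) // → 1
```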
