Commit 2f57fbb

DM-52370: Extended replica allocation services
The new services extend chunk allocations (placement) to include all existing replicas of the chunks in question instead of just one allocation per chunk. The extended services are meant to be used by the ingest workflows to push multiple replicas of a chunk in scenarios where the replication level is greater than 1 and multiple replicas already exist in Qserv. The older services would fail in such scenarios. Minor refactoring of the older code. Fixed a bug in the JSON schema of the chunks allocation service, which was returning the key "location" instead of "locations". Migrated the integration test accordingly. Fixed a bug in the integration test. Migrated the documentation on the Ingest API.
1 parent ebcba99 commit 2f57fbb

6 files changed

+423
-79
lines changed

doc/ingest/api/index.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11

22
.. note::
33

4-
Information in this guide corresponds to the version **51** of the Qserv REST API. Keep in mind
4+
Information in this guide corresponds to the version **52** of the Qserv REST API. Keep in mind
55
that each implementation of the API has a specific version. The version number will change
66
whenever changes that might affect users are made to the implementation or the API.
77
The current document will be kept updated to reflect the latest version of the API.

doc/ingest/api/reference/rest/controller/table-location.rst

Lines changed: 120 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -119,10 +119,66 @@ The service also supports an alternative method accepting a transaction identifi
119119
120120
If a request succeeded, the System would respond with the following JSON object:
121121

122+
.. code-block::
123+
124+
{ "location" : {
125+
"chunk" : <number>,
126+
"worker" : <string>,
127+
"host" : <string>,
128+
"host_name" : <string>,
129+
"port" : <number>,
130+
"http_host" : <string>,
131+
"http_host_name" : <string>,
132+
"http_port" : <number>
133+
},
134+
...
135+
}
136+
137+
Where, the object represents a worker where the Ingest system requests the workflow to forward the chunk contributions.
138+
See an explanation of the attributes in:
139+
140+
- :ref:`table-location-connect-params`
141+
142+
143+
.. _table-location-chunks-one-multi:
144+
145+
Single chunk allocation (all replicas of a chunk)
146+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147+
148+
The following service is meant to be used for allocating/locating (potentially) multiple replicas of a single chunk:
149+
150+
.. list-table::
151+
:widths: 10 90
152+
:header-rows: 0
153+
154+
* - ``POST``
155+
- ``/ingest/chunk-multi``
156+
157+
Where the request object has the following schema, in which a client would have to provide the name of a database:
158+
159+
.. code-block::
160+
161+
{ "database" : <string>,
162+
"chunk" : <number>
163+
}
164+
165+
The service also supports an alternative method accepting a transaction identifier (transactions are always associated with the corresponding databases):
166+
167+
.. code-block::
168+
169+
{ "transaction_id" : <number>,
170+
"chunk" : <number>
171+
}
172+
173+
**Note** the difference in the response schema - unlike the single-chunk allocator, this one returns an array of locations, one for each replica of the chunk.
174+
175+
If a request succeeded, the System would respond with the following JSON object:
176+
122177
.. code-block::
123178
124179
{ "locations" : [
125-
{ "worker" : <string>,
180+
{ "chunk" : <number>,
181+
"worker" : <string>,
126182
"host" : <string>,
127183
"host_name" : <string>,
128184
"port" : <number>,
@@ -134,11 +190,12 @@ If a request succeeded, the System would respond with the following JSON object:
134190
]
135191
}
136192
137-
Where, the object represents a worker where the Ingest system requests the workflow to forward the chunk contributions.
193+
Where, each object in the array represents a particular worker where the corresponding replica of the chunk is located.
138194
See an explanation of the attributes in:
139195

140196
- :ref:`table-location-connect-params`
141197

198+
142199
.. _table-location-chunks-many:
143200

144201
Multiple chunks allocation
@@ -161,7 +218,7 @@ Where the request object has the following schema, in which a client would have
161218
"chunks" : [<number>, <number>, ... <number>]
162219
}
163220
164-
Like the above-explained case of the single chunk allocation service, this one also supports an alternative method accepting
221+
Like the above-explained case of the other chunk allocation services, this one also supports an alternative method accepting
165222
a transaction identifier (transactions are always associated with the corresponding databases):
166223

167224
.. code-block::
@@ -194,6 +251,66 @@ Where, each object in the array represents a particular worker. See an explanati
194251

195252
- :ref:`table-location-connect-params`
196253

254+
255+
256+
.. _table-location-chunks-many-multi:
257+
258+
Multiple chunks allocation (all replicas of each chunk)
259+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
260+
261+
For allocating/locating all available replicas of each chunk in the requested collection, one would have to use the following service:
262+
263+
.. list-table::
264+
:widths: 10 90
265+
:header-rows: 0
266+
267+
* - ``POST``
268+
- ``/ingest/chunks-multi``
269+
270+
Where the request object has the following schema, in which a client would have to provide the name of a database:
271+
272+
.. code-block::
273+
274+
{ "database" : <string>,
275+
"chunks" : [<number>, <number>, ... <number>]
276+
}
277+
278+
Like the above-explained case of other chunk allocation services, this one also supports an alternative method accepting
279+
a transaction identifier (transactions are always associated with the corresponding databases):
280+
281+
.. code-block::
282+
283+
{ "transaction_id" : <number>,
284+
"chunks" : [<number>, <number>, ... <number>]
285+
}
286+
287+
**Note** the difference in the object schema - unlike the single-chunk allocator, this one expects an array of chunk numbers, where
288+
each chunk may have multiple replicas. In the latter case the service will return multiple entries for the same chunk number.
289+
290+
The resulting object has the following schema:
291+
292+
.. code-block::
293+
294+
{ "locations" : [
295+
{ "chunk" : <number>,
296+
"worker" : <string>,
297+
"host" : <string>,
298+
"host_name" : <string>,
299+
"port" : <number>,
300+
"http_host" : <string>,
301+
"http_host_name" : <string>,
302+
"http_port" : <number>
303+
},
304+
...
305+
]
306+
}
307+
308+
Where, each object in the array represents a particular worker. See an explanation of the attributes in:
309+
310+
- :ref:`table-location-connect-params`
311+
312+
313+
197314
.. _table-location-connect-params:
198315

199316
Connection parameters of the workers
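
Because the multi-replica services may return several entries for the same chunk number, a workflow will typically want to group the ``locations`` array by chunk before pushing contributions. The following is a minimal sketch of that grouping, not code from this commit: the function name ``group_locations_by_chunk``, the worker names, and the sample response are hypothetical, and only the ``locations``, ``chunk`` and ``worker`` keys come from the schema documented above.

```python
from collections import defaultdict
from typing import Any


def group_locations_by_chunk(response: dict[str, Any]) -> dict[int, list[dict[str, Any]]]:
    """Group the entries of the "locations" array by their chunk number.

    Hypothetical helper: it assumes the response follows the schema
    documented above for the multi-replica allocation services.
    """
    grouped: dict[int, list[dict[str, Any]]] = defaultdict(list)
    for loc in response["locations"]:
        grouped[loc["chunk"]].append(loc)
    return dict(grouped)


# Made-up sample response: chunk 57 has two replicas, chunk 58 has one.
sample_response = {
    "locations": [
        {"chunk": 57, "worker": "worker-a", "http_host": "10.0.0.1", "http_port": 25004},
        {"chunk": 57, "worker": "worker-b", "http_host": "10.0.0.2", "http_port": 25004},
        {"chunk": 58, "worker": "worker-c", "http_host": "10.0.0.3", "http_port": 25004},
    ]
}

replicas = group_locations_by_chunk(sample_response)
for chunk, locs in sorted(replicas.items()):
    # Each chunk maps to every worker that holds (or will hold) a replica of it.
    print(chunk, [loc["worker"] for loc in locs])
```

A workflow could then iterate over each chunk's list and send the same contribution to every worker in it.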

python/lsst/qserv/admin/replication_interface.py

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -330,8 +330,9 @@ def ingest_chunk_config(self, transaction_id: int, chunk_id: str) -> ChunkLocati
330330
)
331331
),
332332
)
333+
loc = res["location"]
333334
return ChunkLocation(
334-
res["chunk"], res["host"], str(res["port"]), res["http_host"], str(res["http_port"])
335+
loc["chunk"], loc["host"], str(loc["port"]), loc["http_host"], str(loc["http_port"])
335336
)
336337

337338
def ingest_chunk_configs(self, transaction_id: int, chunk_ids: list[int]) -> list[ChunkLocation]:
@@ -363,7 +364,7 @@ def ingest_chunk_configs(self, transaction_id: int, chunk_ids: list[int]) -> lis
363364
ChunkLocation(
364365
loc["chunk"], loc["host"], str(loc["port"]), loc["http_host"], str(loc["http_port"])
365366
)
366-
for loc in res["location"]
367+
for loc in res["locations"]
367368
]
368369

369370
def ingest_regular_table(self, transaction_id: int) -> list[RegularLocation]:
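
The two hunks above reflect the two response shapes: the single-chunk service nests its result under the ``location`` key, while the multi-chunk services return an array under ``locations``. The sketch below illustrates parsing both shapes in isolation. It is an assumption-laden illustration, not the code of ``replication_interface.py``: the ``Location`` dataclass is a hypothetical stand-in for ``ChunkLocation`` (whose real definition is not shown in this diff), and the helper names and sample values are invented.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Location:
    """Hypothetical stand-in for ChunkLocation; field names are assumptions."""

    chunk_id: int
    host: str
    port: str
    http_host: str
    http_port: str


def to_location(loc: dict[str, Any]) -> Location:
    # Ports are converted to strings, mirroring the str(...) calls in the diff.
    return Location(
        loc["chunk"], loc["host"], str(loc["port"]), loc["http_host"], str(loc["http_port"])
    )


def parse_single(res: dict[str, Any]) -> Location:
    # Single-chunk response: one object under the "location" key.
    return to_location(res["location"])


def parse_many(res: dict[str, Any]) -> list[Location]:
    # Multi-chunk response: an array of objects under the "locations" key.
    return [to_location(loc) for loc in res["locations"]]


# Made-up entry used for both response shapes.
entry = {"chunk": 57, "host": "10.0.0.1", "port": 25002, "http_host": "10.0.0.1", "http_port": 25004}
single = parse_single({"location": entry})
many = parse_many({"locations": [entry, {**entry, "chunk": 58}]})
print(single)
print([loc.chunk_id for loc in many])
```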
