Open
Description
Brainstorm time. An Express server that has 2 endpoints:
- /register
  - Registers a callback and gives you an id for later. It evals the payload and keeps the resulting function in memory, so the whole function doesn't have to be sent on every /request. Calling /request with the id received here runs that same evaled function.
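A minimal sketch of the registry behind /register, assuming the payload is the output of `fn.toString()` for an arrow function or a `function` expression. The names `callbacks`, `register` and `invoke` are hypothetical, and the Express wiring is left out so the sketch only needs Node built-ins.

```javascript
const { randomUUID } = require('node:crypto');

// In-memory registry: id -> evaled callback (hypothetical name).
const callbacks = new Map();

// What a POST /register handler could call with the request body.
function register(source) {
  // Wrapping the source in parentheses makes it parse as an expression.
  // Indirect eval runs in the global scope, so the callback cannot see
  // the server's local variables. It still executes arbitrary code, so
  // this is only reasonable for trusted callers.
  const fn = (0, eval)(`(${source})`);
  if (typeof fn !== 'function') {
    throw new TypeError('payload did not evaluate to a function');
  }
  const id = randomUUID();
  callbacks.set(id, fn);
  return id;
}

// What a POST /request handler could call once a request is dequeued.
function invoke(id, requestOptions) {
  const fn = callbacks.get(id);
  if (!fn) {
    throw new Error(`unknown callback id: ${id}`);
  }
  return fn(requestOptions);
}
```

One caveat of this approach: a function serialized from method shorthand (`foo() {}`) will not parse when wrapped in parentheses, so callers would need to send arrow functions or `function` expressions.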
- /request
  - FIFO-queues the request (same RequestOptions format) in memory.
  - Would run on a 32GB instance with worker_threads (a pool of 7-8 threads), in a work-stealing manner.
  - Use a SharedArrayBuffer holding the JSON as a string, and JSON.parse it inside the worker?
  - Two-way communication can work with MessageChannel. Maybe call Apify.pushData from the received message? Or inside the worker? (That might overload the dataset endpoint.)
  - Persists the remaining requests to the RL on migration, using the same id from /register, so we know which request belongs to which callback and they can be loaded back from the KV.
  - How are retries done? Do they go back to the queue, or keep retrying in place until giving up?
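A rough sketch of the /request side, under several assumptions: the request is copied into a SharedArrayBuffer as UTF-8 JSON, each reply comes back over a per-request MessageChannel port, and idle workers pull from one shared FIFO array. That pull model is a simpler stand-in for real work stealing, not an implementation of it. `toShared`, `fromShared`, `workerSource`, `runInWorker` and `drain` are hypothetical names.

```javascript
const { Worker, MessageChannel } = require('node:worker_threads');

// Copy a RequestOptions-like object into shared memory as UTF-8 JSON.
function toShared(requestOptions) {
  const json = JSON.stringify(requestOptions);
  const shared = new SharedArrayBuffer(Buffer.byteLength(json, 'utf8'));
  Buffer.from(shared).write(json, 'utf8');
  return shared;
}

// Inverse of toShared; the worker does the same thing on its side.
function fromShared(shared) {
  return JSON.parse(Buffer.from(shared).toString('utf8'));
}

// Hypothetical worker body, passed as a string so the sketch stays in one file.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', ({ shared, port }) => {
    const request = JSON.parse(Buffer.from(shared).toString('utf8'));
    // A real worker would run the registered callback here.
    port.postMessage({ url: request.url, handled: true });
  });
`;

// Send one request to a worker and resolve with its reply.
function runInWorker(worker, requestOptions) {
  return new Promise((resolve, reject) => {
    const { port1, port2 } = new MessageChannel();
    const onError = (err) => { port1.close(); reject(err); };
    worker.once('error', onError);
    port1.once('message', (result) => {
      worker.off('error', onError);
      port1.close();
      resolve(result);
    });
    // port2 must be in the transfer list; the SharedArrayBuffer is shared, not copied.
    worker.postMessage({ shared: toShared(requestOptions), port: port2 }, [port2]);
  });
}

// Each worker takes the next request from the front of the queue once it is idle.
// Results arrive in completion order, which may differ from queue order.
async function drain(queue, poolSize) {
  const workers = Array.from({ length: poolSize }, () => new Worker(workerSource, { eval: true }));
  const results = [];
  try {
    await Promise.all(workers.map(async (worker) => {
      while (queue.length > 0) {
        results.push(await runInWorker(worker, queue.shift()));
      }
    }));
  } finally {
    await Promise.all(workers.map((worker) => worker.terminate()));
  }
  return results;
}
```

Calling `drain(queue, 8)` with an array of request objects would process them across eight workers and then terminate the pool. Whether the SharedArrayBuffer round trip actually beats plain structured cloning of the object is something this sketch does not measure.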
- Registering the code from the "main" process (i.e. another scraper/task) can serialize the function, as in:
  await fetch(/* container url register */, { method: 'POST', body: myWorkerCode.toString() })
- Since it's a fire-and-forget mechanism (and the data will be pushed to the designated dataset), how could de-duping occur?
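One possible answer to the de-duping question, sketched under the assumption that each request either carries a `uniqueKey` field or can be keyed by method plus URL: the server keeps a per-callback set of seen keys and drops repeats before they reach the queue. `seenKeys` and `shouldEnqueue` are hypothetical names, and the key format is only an illustration.

```javascript
// Callback id -> Set of keys already accepted for that callback.
const seenKeys = new Map();

// Returns true the first time a (callback id, key) pair is seen, false afterwards.
function shouldEnqueue(id, requestOptions) {
  // Assumption: fall back to "METHOD:url" when no uniqueKey is supplied.
  const key = requestOptions.uniqueKey
    ?? `${requestOptions.method ?? 'GET'}:${requestOptions.url}`;
  let keys = seenKeys.get(id);
  if (!keys) {
    keys = new Set();
    seenKeys.set(id, keys);
  }
  if (keys.has(key)) {
    return false;
  }
  keys.add(key);
  return true;
}
```

Since this state lives only in memory, it would have to be persisted together with the remaining requests on migration, otherwise duplicates could slip through after a restart.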