Skip to content

Validate the wire-protocol for the remote agent #313

@gmalecha-at-skylabs

Description

@gmalecha-at-skylabs

Background

Note: more triage required.

Currently:

  • client<->server interaction
    • rocq-pipeline/.../microrpc/tunnel.py:
      • WsMux + WsServer
        • client can only send but not receive; server is the converse
      • re. client<->server communication:
        • Rocq process is run on the client, but server is actually "driving" the interaction
        • client:
          • initiates a connection with the (downstream/private) server via (?) a setup request
          • replays rocq actions based on server actions, forwarding results to the server
        • b) the server drives the interaction the
        • ==> rocq-pipeline/.../microrpc/duplex.py was added to bridge this gap
  • wire proto. <-> deserialization
    • microrpc/ defines custom (de)serializers, but:
      • risk: leakage of private / internal data; no logic (in main or this PR) that controls what gets (de)serialized in different contexts; examples:
        • tracebacks from server that might contain private internals
        • configs from clients/server that might contain env variables
        • ...
      • does not utilize "full" json interfaces (i.e. only adding a slim JSONEncoder, but not matching JSONDecoder interface
      • does not uniformly handle nested types:
        • special casing for certain types at the top-level of serialization (rdm_api.Err, rdm_api.Resp`)
        • instead of forcing clients to request deserialization at a particular type (based on client<->server proto), it uses type-tags in a non-uniform way to support deserialization
      • special casing for exception types (which can be removed)
      • dataclasses vs. BaseModel
        • currently: a lot of special case logic to deal with the few dataclasses we currently have, but which could/should become BaseModels
        • dataclasses should (almost always) be prohibited from use at deserialization time since they don't have validators

Related GH PRs

  • Some investigation about consolidating tunnel and duplex. #219
    • summary:
      • reduce special/casing for client/server in microrpc by moving to a JsonRPC based approach
      • unified (de)serialization logic (that can ideally be reused)
    • background:
      • removes tunnel.py
      • in duplex.py: unifies the client/server sides into a single interface
      • alternatives:
        • use two websocket connections (one for each direction)
        • use JsonRPC instead -- which already handles the client/server details
  • refactor(ocaml-jsonrpc-tp/python): use pydantic BaseModel for Err + Resp types #282 (+ downstream PR)
    • summary:
      • move more types to BaseModel
      • simplify/unify some of the (de)serialization code, but not all
      • tweak client<->server connection to use the above
    • blocker: these changes modify the wire protocol in an incompatible w.r.t released docker images

Technical Notes

  • microrpc/ is meant to be generic (at least with Gregory's PR)
  • ocaml-jsonrpc-tp auto-generates the RDM methods/types, so we could also try to autogenerate schemas for the args/retval of methods
  • rocq-agent-toolkit-utils would be a good place to upstream/centralize many of the common utilities related to:
    • schemas+versioning
    • (de)serialization
    • meta-programming

Open Design Questions

  • (how) can the transport be decoupled from the (de)serialization?
  • re. (de)serialization:
    • who -- between client/server -- should own the (de)serialization logic?
    • how to ensure that server-side modifications to data formats/APIs/etc... only break clients infrequently?
    • reflection: we could employ version negotiation, but for now that might be over-complicating things

Discussion

  • policy of backwards compatibility for constituent types might take us quite far, but this doesn't help with code-level API as much
  • additionally, we could avoid writing the protocols for methods by hand (cf. protocol.py file from Gregory's PR):
    • we could autogenerate BaseModels that codify the protocol
    • ultimately: we could harden/stabilize the

Goal

  • conclude issue triaging:
    • reach consensus answers to the above open questions
    • map these answers to the open PRs
    • clarify the relationship between the two PRs
  • execute on the above PRs that change change the wire proto. via:
    • (de)serialization changes
    • client<->server proto changes
  • re-release a new image to clients in conjunction with a server re-deploy

Technical Details

Metadata

Metadata

Labels

help wantedExtra attention is neededtriageRequires triage

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions