authz: default gflowd starter as admin; restrict privileged CLI/ops to admin #64

@AndPuQing

Problem

  • In multi-user environments, it is easy for any local user to run the CLI binaries (gctl/gcancel/...) against a running gflowd and perform operations that affect the whole scheduler.
  • Today the HTTP API has no authentication/authorization (see existing security note in src/bin/gflowd/server.rs), so permission boundaries are mostly "by convention".

Idea (from discussion)

  • Make the OS user who starts gflowd the default "admin" identity.
  • Restrict certain binaries and/or commands to admin only (ideally enforced server-side, with CLI UX improvements).

Proposed approach

  1. Introduce an admin concept in gflowd
  • Default admin: the UNIX user that started gflowd (or a configured list of admin users).
  • Persist the admin config: e.g. in the config file, or in a file in the data dir created at first boot.
  2. Enforce authorization on privileged operations
  • Examples of privileged operations (to discuss):
    • Cluster/scheduler settings (e.g. allowed GPUs, group max concurrency)
    • Debug endpoints (/debug/*), or any endpoint that exposes sensitive information
    • Potentially "cancel/hold/release" for jobs not owned by the caller
  • Non-admin access should return a clear error response: 401 if the caller cannot be identified, 403 if the caller is identified but is not an admin.
  3. CLI behavior
  • Keep enforcement on the server.
  • Optionally make certain subcommands refuse to run if not admin (best-effort client-side), but treat this as UX only.
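
To make the proposed approach concrete, here is a minimal sketch of a server-side authorization decision. It is not the gflowd implementation: `AdminPolicy`, `Operation`, and `Decision` are hypothetical names, and the sketch assumes the server has already resolved the caller's user name by one of the identity options discussed below. Which operations belong in each class is still open for discussion.

```rust
use std::collections::HashSet;

/// Outcome of an authorization check (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    /// Caller identity is unknown: maps to HTTP 401.
    Unauthorized,
    /// Caller is known but not permitted: maps to HTTP 403.
    Forbidden,
}

impl Decision {
    /// HTTP status for a rejected request, or `None` when allowed.
    pub fn error_status(&self) -> Option<u16> {
        match self {
            Decision::Allow => None,
            Decision::Unauthorized => Some(401),
            Decision::Forbidden => Some(403),
        }
    }
}

/// Assumed classification of privileged operations.
#[derive(Debug, Clone, Copy)]
pub enum Operation<'a> {
    /// e.g. scheduler settings or /debug/* endpoints.
    AdminOnly,
    /// e.g. cancel/hold/release on a job owned by `owner`.
    JobAction { owner: &'a str },
}

/// Admin set: the user that started the daemon plus any configured admins.
pub struct AdminPolicy {
    admins: HashSet<String>,
}

impl AdminPolicy {
    pub fn new(starter: &str, configured: &[&str]) -> Self {
        let mut admins: HashSet<String> =
            configured.iter().map(|s| s.to_string()).collect();
        // The starting user is always an admin by default.
        admins.insert(starter.to_string());
        AdminPolicy { admins }
    }

    pub fn is_admin(&self, user: &str) -> bool {
        self.admins.contains(user)
    }

    /// Decide whether `caller` may perform `op`.
    pub fn authorize(&self, caller: Option<&str>, op: Operation<'_>) -> Decision {
        let user = match caller {
            Some(u) => u,
            None => return Decision::Unauthorized,
        };
        if self.is_admin(user) {
            return Decision::Allow;
        }
        match op {
            // Non-admins may act only on their own jobs.
            Operation::JobAction { owner } if owner == user => Decision::Allow,
            _ => Decision::Forbidden,
        }
    }
}
```

With `AdminPolicy::new("alice", &["root"])`, a request from `bob` for an `AdminOnly` operation yields `Decision::Forbidden`, while `bob` acting on his own job is allowed. A client-side check in the CLI could mirror this logic for UX, but the server-side decision stays authoritative.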

Auth/identity options (open for discussion)

  • Minimal: an admin token stored in a root-owned file (or data-dir file) and sent as a header.
  • Better for local-only: use a Unix domain socket with filesystem permissions.
  • If remote access is needed: proper auth (mTLS / signed tokens / etc.).
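
As an illustration of the "minimal" option, the sketch below checks an admin token sent in a request header. The header format (`Authorization: Bearer <token>`), the function names, and the token file are assumptions for illustration, not an existing gflowd API. The comparison avoids an early exit on the first mismatching byte; this is a best-effort measure, and a real implementation would probably rely on a vetted constant-time comparison.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Load the admin token from a file (hypothetical location), trimming
/// surrounding whitespace such as a trailing newline.
pub fn load_token(path: &Path) -> io::Result<String> {
    Ok(fs::read_to_string(path)?.trim().to_string())
}

/// Compare two byte strings without stopping at the first mismatch, so the
/// loop does not exit earlier when more leading bytes differ.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Extract the token from an assumed `Bearer <token>` header value.
pub fn bearer_token(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// True only if the header carries a non-empty token equal to `expected`.
pub fn is_admin_request(header: Option<&str>, expected: &str) -> bool {
    match header.and_then(bearer_token) {
        Some(token) => constant_time_eq(token.as_bytes(), expected.trim().as_bytes()),
        None => false,
    }
}
```

A request with a missing or malformed header is simply treated as non-admin here; the server would then map that to a 401 or 403 as appropriate. The Unix domain socket option avoids the token entirely by letting filesystem permissions decide who can connect.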

Acceptance criteria

  • There is a clear way to identify "admin" vs "non-admin".
  • Privileged endpoints are protected and tested.
  • Existing single-user/local workflows remain reasonable (good error messages; easy setup).

Questions

  • Which operations should be admin-only vs job-owner-only?
  • Should this be optional behind a config flag (default off) or default on when not bound to localhost?
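
One possible reading of the second question, sketched under assumptions: an explicit config value always wins, and when nothing is configured, enforcement defaults to on unless the server binds to a loopback address. `AuthzMode` and `effective_authz` are hypothetical names, and this is only one of the candidate defaults.

```rust
use std::net::IpAddr;

/// Whether authorization is enforced (hypothetical config value).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthzMode {
    On,
    Off,
}

/// Resolve the effective mode: an explicit setting takes precedence;
/// otherwise enforce unless the bind address is loopback.
pub fn effective_authz(configured: Option<AuthzMode>, bind: IpAddr) -> AuthzMode {
    match configured {
        Some(mode) => mode,
        None if bind.is_loopback() => AuthzMode::Off,
        None => AuthzMode::On,
    }
}
```

Under this sketch, an unconfigured daemon bound to `127.0.0.1` keeps today's behavior, while one bound to `0.0.0.0` enforces authorization. Note that a loopback bind still leaves the multi-user problem described above, since any local user can reach it.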

Labels

  • api (REST API and server)
  • gflowd (Scheduler daemon)
  • needs-discussion (Needs discussion before implementation)
  • priority: medium (Medium priority issue)
  • type: feature (New feature or enhancement request)
