
[PROPOSAL] Experiment with Human-Agent Maintainer collaboration through an experimental plugin - not in default distribution #475

@cwperks

Description


What/Why

At its core, this proposal asks how we can adapt software development to a world of higher-velocity change driven by AI tools. The core repo already delegates to Claude the decision of whether CI checks should run automatically on a PR. I do mean to provoke discussion, since I can imagine many opinions on the subject. I expect this topic will come up for discussion at some point in the future, so I'm opting to open it now.

What are you proposing?

I am proposing an experiment in allowing AI agents to act as maintainers for a narrowly scoped OpenSearch project, including exercising judgment around pull request review, approval, and merge decisions.

To explore this model safely, I propose creating a new OpenSearch Dashboards plugin, potentially paired with a backend plugin, for Searchable Photo Albums (or a collaborative document editor with a searchable repository, a Quip replacement drawn from hackathon ideas for custom plugins). This project would not ship in the default OpenSearch distribution. Instead, it would serve as a sandbox for testing a new software development model in which the community drives direction and AI agents participate as active maintainers on the community’s behalf.

The core capability being proposed is not just the photo album functionality itself, but the governance and development workflow around it: agents would help triage issues, review contributions, propose changes, and eventually approve and merge pull requests according to project policy, observed community feedback, and maintainer guardrails.

What users have asked for this feature?

This proposal is primarily motivated by an emerging need rather than a single direct feature request: OpenSearch and other open source communities are beginning to experience AI not only as a coding assistant, but as a potential participant in the software lifecycle.

There is growing interest in:

  • using AI to triage issues and summarize design tradeoffs,
  • accelerating contribution review and reducing maintainer bottlenecks,
  • exploring whether agents can safely act with limited delegated authority,
  • understanding what governance model is needed when AI begins making project-shaping decisions.

This proposal is intended as an experiment to answer those questions in a practical, observable, and low-risk environment rather than in a core repository.

At the moment, this is better framed as a forward-looking project proposal and community experiment than as a response to a specific backlog item. The purpose is to give the community something concrete to evaluate.

What problems are you trying to solve?

This proposal is trying to solve both a development-process problem and a governance problem.

Open source projects increasingly face maintainer bandwidth constraints, contribution review latency, and difficulty scaling community input into timely technical decisions. AI agents may be able to help, but today they are mostly used informally and without a clear operational model.

This proposal explores whether OpenSearch can safely support a model where agents take on limited maintainer responsibilities in a sandboxed environment.

Core user needs:

  • When a repository receives many contributions and design suggestions, a community maintainer wants to delegate portions of triage and review to an AI agent, so they can reduce review bottlenecks and keep the project moving.
  • When community members disagree on a feature direction, a project participant wants the project to apply a transparent and consistent decision process, so they can understand why one path was chosen over another.
  • When AI is given increased responsibility in an open source project, the community wants to observe that behavior in a low-risk sandbox, so they can evaluate trust, safety, and effectiveness before considering broader adoption.
  • When new development workflows enabled by agentic AI emerge, the OpenSearch community wants to experiment intentionally rather than accidentally, so they can shape the governance model instead of reacting to it later.

What is the developer experience going to be?

The developer experience has two dimensions:

  1. The sample application itself — a new Searchable Photo Albums plugin for OpenSearch Dashboards, and potentially a companion backend plugin.
  2. The agent-maintainer workflow — contributors interact with the repository much like a normal project, but some repository actions are performed by agents according to defined policy.

For the governance experiment, there may also be repository-level automation and policy configuration, for example:

  • agent-readable contribution guidelines,
  • merge policy configuration,
  • labeling and triage rules,
  • escalation paths where the agent must defer to humans,
  • audit trails for why an agent approved or merged a change.
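To make the idea concrete, the repository-level policy configuration above could take a typed, agent-readable form. The sketch below is purely illustrative; the `AgentPolicy` shape and all field names are hypothetical and do not correspond to any existing OpenSearch or GitHub configuration format.

```typescript
// Hypothetical shape of a repository-level agent policy file.
interface AgentPolicy {
  // labels the agent may apply during triage
  triageLabels: string[];
  // path globs the agent must never touch without human sign-off
  escalatePaths: string[];
  // CI checks that must pass before the agent may approve
  requiredChecks: string[];
  // whether the agent may merge at all, or only approve
  canMerge: boolean;
}

// An example policy a sandbox repository might start with.
const examplePolicy: AgentPolicy = {
  triageLabels: ["bug", "enhancement", "needs-design"],
  escalatePaths: ["security/**", ".github/workflows/**"],
  requiredChecks: ["build", "unit-tests", "lint"],
  canMerge: false,
};

export { examplePolicy };
export type { AgentPolicy };
```

Starting with `canMerge: false` would let the experiment begin with approval authority only, widening the agent's autonomy later as trust is established.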

This proposal does not require changes to existing OpenSearch core REST APIs. The initial scope is intended to be additive and isolated to the new plugin repository or repositories.

Are there any security considerations?

Yes. The security considerations are significant, even though this is a sandboxed experiment.

The main concern is not end-user data access in the sample application, but the authority delegated to agents in the repository workflow.

Security considerations include:

  • restricting agent permissions to a dedicated repository or repositories,
  • requiring explicit policy around what agents may approve or merge,
  • maintaining auditability for agent decisions,
  • ensuring escalation to humans for sensitive changes,
  • preventing prompt injection or malicious contribution patterns from manipulating agent behavior,
  • defining whether agents can approve only after tests pass and policy checks succeed,
  • defining rollback and override mechanisms for human maintainers.
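The "approve only after tests pass and policy checks succeed" guardrail above could be sketched as a simple decision function. This is a minimal illustration under assumed types; `Policy`, `decide`, and `matchesGlob` are hypothetical names, not an existing automation API.

```typescript
// Sketch of an approval/merge guardrail: the agent may act only when
// every required check has passed and no sensitive path is touched;
// otherwise it must defer to a human maintainer.

interface Policy {
  requiredChecks: string[]; // CI checks that must pass first
  escalatePaths: string[];  // globs reserved for human review
  canMerge: boolean;        // may the agent merge, or only approve?
}

interface PullRequest {
  checksPassed: string[];
  touchedPaths: string[];
}

type Decision = { action: "merge" | "approve" | "escalate"; reason: string };

// Naive glob support: "dir/**" matches anything under dir/.
function matchesGlob(glob: string, path: string): boolean {
  return glob.endsWith("/**")
    ? path.startsWith(glob.slice(0, -2))
    : path === glob;
}

function decide(policy: Policy, pr: PullRequest): Decision {
  const missing = policy.requiredChecks.filter(
    (c) => !pr.checksPassed.includes(c),
  );
  if (missing.length > 0) {
    return { action: "escalate", reason: `checks not passed: ${missing.join(", ")}` };
  }
  const sensitive = pr.touchedPaths.some((p) =>
    policy.escalatePaths.some((g) => matchesGlob(g, p)),
  );
  if (sensitive) {
    return { action: "escalate", reason: "touches a path reserved for humans" };
  }
  return policy.canMerge
    ? { action: "merge", reason: "all policy conditions met" }
    : { action: "approve", reason: "merge authority reserved for humans" };
}

const policy: Policy = {
  requiredChecks: ["build", "unit-tests"],
  escalatePaths: ["security/**", ".github/workflows/**"],
  canMerge: false,
};
```

Note the asymmetry in the defaults: any ambiguity resolves to `escalate`, never to autonomous action.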

For the sample plugin itself, any APIs should integrate with the OpenSearch security model in the same way as other plugins. If the plugin stores album or photo metadata, access control should be explicit and consistent with existing security expectations in Dashboards and OpenSearch.

What is the contributor experience going to be?

Contributors would open issues and pull requests as usual. The difference is that AI agents would participate in project maintenance responsibilities such as:

  • triaging issues,
  • labeling and categorizing proposals,
  • summarizing tradeoffs in PRs,
  • requesting changes,
  • approving PRs when policy conditions are met,
  • merging PRs when the agent determines the change aligns with project direction and repository rules.

The intent is for the community to continue driving the direction of the project. The agent is not meant to invent project goals independently; it is meant to interpret community input, resolve ambiguity where possible, and act within clearly defined policy boundaries.

Example user stories:

  • As a contributor, I can open a PR to add album search filters and receive actionable feedback from the agent.
  • As a community member, I can participate in issue discussion and expect that the agent will consider that feedback in its decision-making.
  • As a human maintainer, I can set policy boundaries that determine when the agent may act autonomously and when it must escalate.
  • As an observer, I can inspect why a PR was approved or merged by the agent.
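The observer story above, inspecting why a PR was approved or merged, implies an append-only audit trail of agent decisions. A sketch of what one entry might record follows; the `AuditEntry` shape, field names, and the example PR number are all hypothetical.

```typescript
// Sketch of an append-only audit log of agent decisions.
interface AuditEntry {
  prNumber: number;
  action: "approve" | "merge" | "request-changes" | "escalate";
  policyVersion: string; // which policy revision the agent applied
  evidence: string[];    // the inputs the agent relied on
  timestamp: string;     // ISO 8601
}

const auditLog: AuditEntry[] = [];

function record(entry: AuditEntry): void {
  // In practice this would be persisted somewhere tamper-evident,
  // e.g. committed to the repository or written to an external store.
  auditLog.push(entry);
}

// Hypothetical example of the agent recording an approval decision.
record({
  prNumber: 123, // illustrative only
  action: "approve",
  policyVersion: "draft-1",
  evidence: ["all required checks passed", "no escalation paths touched"],
  timestamp: new Date().toISOString(),
});
```

Keeping `evidence` as explicit, human-readable statements is what would let any observer reconstruct a decision after the fact.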

Why should it be built? Any reason not to?

It should be built because the OpenSearch community will likely need an opinion on how agentic AI fits into open source development, and that opinion will be stronger if it is informed by real experience rather than theory.

This experiment would create value by:

  • giving the community a concrete environment to evaluate AI maintainership,
  • generating lessons about trust, governance, review quality, and contributor experience,
  • helping define boundaries for where agent autonomy is useful and where it is unsafe,
  • exploring whether maintainership can scale without overburdening human reviewers,
  • showing how OpenSearch can adapt to new software development models instead of being shaped by them passively.

Reasons not to build it, or risks to acknowledge:

  • community members may reject the legitimacy of AI making maintainer decisions,
  • agent decisions may be inconsistent, overly conservative, or overly aggressive,
  • contributors may attempt to manipulate the agent’s behavior,
  • bad merges or regressions may still occur even in a sandbox,
  • the experiment may consume community attention without producing a reusable model,
  • governance questions may prove harder than the technical implementation.

Those risks are exactly why a standalone, non-default, sandboxed project is the right place to start.

What will it take to execute?

Execution will require both technical implementation and governance design.

On the technical side:

  • create a new Dashboards plugin repository, and possibly a backend companion plugin,
  • define the initial feature scope for Searchable Photo Albums,
  • establish CI, testing, contribution guidelines, and release processes,
  • implement repository automation and agent integration,
  • define audit/logging of agent decisions,
  • create guardrails for approval and merge authority.

On the governance side:

  • define what authority the agent has,
  • define when the agent must escalate to humans,
  • define what signals count as “community direction,”
  • define how conflicts between vocal minority and broader community interest are handled,
  • define rollback procedures when the agent makes a poor decision.

Assumptions and constraints:

  • the agent will not have unrestricted authority across OpenSearch repositories,
  • the experiment is limited to one or two dedicated repositories,
  • the community is willing to tolerate some ambiguity and iteration in the governance model,
  • human maintainers remain ultimately responsible for the repository and can intervene.

Any remaining open questions?

Yes. The most important open questions are governance questions.

Examples include:

  • What exact permissions should an agent maintainer have on day one?
  • Should agents be allowed to merge only low-risk changes initially?
  • Should agent approvals count the same as human approvals?
  • What evidence should an agent use to infer community preference?
  • How should an agent behave when the community is split?
  • What transparency standards should apply to an agent’s reasoning and decisions?
  • How should the community appeal or reverse an agent’s decision?
  • Should there be one general-purpose agent or multiple specialized agents?
  • What repository health metrics would determine whether the experiment is succeeding?

Longer term, if the experiment is successful, the community may want to explore:

  • shared agent-maintainer infrastructure for other sandbox repositories,
  • agent-assisted release management,
  • agent-assisted issue triage across multiple projects,
  • policy-driven autonomy levels depending on repository criticality.

For now, those should remain out of scope. The immediate goal is to validate whether AI agents can function as maintainers in a constrained, observable, community-driven environment.
