
Epic: Implement a Self-Configuring Provider Generation System #24

@bordumb

Description

Labels: epic, research, architecture, agent-autonomy, llm

Opened: 2025-07-14


1. Overview & Motivation

The ARLA platform has successfully implemented a decoupled provider pattern, which separates the core agent-engine from simulation-specific logic. This is a major architectural strength. However, it still requires a human developer to write the concrete provider implementations (e.g., SoulSimRewardCalculator) for each new simulation environment.

This epic proposes the next major evolution of ARLA's architecture: to make the agents themselves responsible for creating their own providers on the fly.

The vision is to move from a "pluggable" system to a "self-configuring" one. When an agent enters a new, unseen environment, it should be able to introspect its own components and the structure of the world, reason about the "physics" of its new reality, and dynamically generate the provider code it needs to function and learn. This represents a significant step towards truly general and autonomous agents, shifting the developer's role from writing world-specific logic to designing the agent's meta-learning and self-configuration capabilities.

2. Architectural Vision: The "Provider Generation" Meta-System

The core of this feature will be a new, high-level cognitive system that runs once for each agent at the beginning of a simulation.

New System: ProviderGenerationSystem (agent-engine/systems/)

  • Purpose: To inspect the agent's components and the environment, and then use the LLM via the CognitiveScaffold to write and load the necessary provider classes (RewardCalculator, StateEncoder, etc.) at runtime.

High-Level Workflow:

  1. Introspection: At the start of a simulation, the ProviderGenerationSystem for a given agent will perform a detailed scan of:

    • Its own components (e.g., PortfolioComponent, HealthComponent).
    • The attributes of those components (e.g., cash_balance: float).
    • The components of other entities it can observe in the environment.
  2. Meta-Prompt Construction: The system will then construct a detailed, structured prompt to be sent to the LLM. This prompt is critical and will contain:

    • The Goal: A high-level objective for the agent (e.g., "My primary objective is to maximize my TimeBudgetComponent.current_time_budget.").
    • The Interfaces: The full source code of the provider interfaces it needs to implement (e.g., RewardCalculatorInterface, StateEncoderInterface). This gives the LLM a clear "contract" to fulfill.
    • The "Physics" (World Schema): A summary of the components and attributes it discovered during introspection. For example: "I have a component called PortfolioComponent with an attribute total_value: float. My goal is to maximize survival, which is tied to my TimeBudgetComponent. Therefore, a good reward signal would likely be a positive change in my PortfolioComponent.total_value."
    • The Task: A clear instruction to the LLM: "You are an expert AI research programmer. Your task is to write the complete Python code for a set of provider classes that correctly implement the given interfaces. The logic in these classes should be designed to help me achieve my primary objective given the world schema. Output only the raw, executable Python code."
  3. Dynamic Code Generation and Injection:

    • The ProviderGenerationSystem sends the prompt through the CognitiveScaffold.
    • It receives a string containing the complete Python code for the new provider classes.
    • It uses a function like exec() to execute this code in a restricted namespace, dynamically defining the new classes (e.g., AutoGeneratedRewardCalculator) in the agent's runtime memory.
    • It then instantiates these newly defined classes and injects them into the other core systems (QLearningSystem, ActionSystem, etc.), which are waiting for their dependencies.
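The introspection and meta-prompt steps above can be sketched as follows. This is a minimal illustration, assuming a simple dataclass-based component model; the component classes and function names here are hypothetical stand-ins, not the existing ARLA API.

```python
from dataclasses import dataclass, fields

# Hypothetical components mirroring the examples in the workflow above.
@dataclass
class PortfolioComponent:
    cash_balance: float = 0.0
    total_value: float = 0.0

@dataclass
class TimeBudgetComponent:
    current_time_budget: float = 100.0

def introspect(components):
    """Build a world-schema summary: component name -> {attribute: type name}."""
    schema = {}
    for comp in components:
        schema[type(comp).__name__] = {
            f.name: f.type if isinstance(f.type, str) else f.type.__name__
            for f in fields(comp)
        }
    return schema

def build_meta_prompt(goal, interface_src, schema):
    """Assemble the structured prompt: goal, interfaces, world schema, task."""
    schema_lines = "\n".join(
        f"- {name}: " + ", ".join(f"{a}: {t}" for a, t in attrs.items())
        for name, attrs in schema.items()
    )
    return (
        f"Goal: {goal}\n\n"
        f"Interfaces to implement:\n{interface_src}\n\n"
        f"World schema:\n{schema_lines}\n\n"
        "Task: You are an expert AI research programmer. Write the complete "
        "Python code for provider classes that implement the interfaces above. "
        "Output only raw, executable Python code."
    )

schema = introspect([PortfolioComponent(), TimeBudgetComponent()])
prompt = build_meta_prompt(
    "Maximize TimeBudgetComponent.current_time_budget",
    "class RewardCalculatorInterface: ...",
    schema,
)
```

The schema summary keeps the prompt compact: the LLM sees attribute names and types, not full component source, which is usually enough context to write reward or encoding logic.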

3. Implementation Plan

This is a major feature and should be rolled out in phases.

Phase 1: Core System and Dynamic Execution

  • Create the ProviderGenerationSystem in the agent-engine.
  • Implement the introspection logic to scan an agent's components.
  • Implement the meta-prompt construction logic.
  • Implement a safe wrapper around exec() to execute the LLM's code and retrieve the newly defined classes.
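A guarded exec() wrapper for Phase 1 might look like the sketch below. The builtin allow-list and function names are illustrative assumptions; a production version would need a stricter sandbox than builtin filtering alone.

```python
# Allow-list of builtins exposed to generated code. __build_class__ and
# __name__ are required for `class` statements to execute at all.
SAFE_BUILTINS = {
    "len": len, "min": min, "max": max, "abs": abs, "sum": sum,
    "float": float, "int": int, "isinstance": isinstance,
    "object": object,
    "__build_class__": __build_class__, "__name__": "generated",
}

def load_generated_providers(source: str) -> dict:
    """Execute LLM-generated code in a restricted namespace and return
    the classes it defined. Raises SyntaxError on malformed code."""
    namespace = {"__builtins__": SAFE_BUILTINS}
    exec(compile(source, "<generated_provider>", "exec"), namespace)
    return {name: obj for name, obj in namespace.items() if isinstance(obj, type)}

# Stand-in for an LLM response.
llm_output = """
class AutoGeneratedRewardCalculator:
    def calculate_reward(self, old_state, new_state):
        return new_state - old_state
"""
classes = load_generated_providers(llm_output)
calc = classes["AutoGeneratedRewardCalculator"]()
```

Because `__import__`, `open`, and the rest of builtins are absent from the namespace, generated code cannot trivially reach the file system or network, though exec()-based sandboxing is not airtight and should be layered with process-level isolation.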

Phase 2: Dependency Injection Mechanism

  • Modify the SystemManager to support late-stage dependency injection. The manager needs to be able to hold off on fully initializing systems like QLearningSystem until after the ProviderGenerationSystem has run and created the necessary providers.
  • The ProviderGenerationSystem will need a way to pass the newly created provider instances to the SystemManager, which will then complete the initialization of the other systems.
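One possible shape for this late-stage injection is sketched below. The SystemManager and method names are assumptions about how the real ARLA classes might be extended, not their current API.

```python
class SystemManager:
    """Holds systems whose provider dependencies are not yet available."""
    def __init__(self):
        self._pending = {}    # system -> set of provider names it still needs
        self._providers = {}

    def register(self, system, needs=()):
        if needs:
            self._pending[system] = set(needs)   # defer initialization
        else:
            system.initialize()

    def inject_providers(self, providers: dict):
        """Called by ProviderGenerationSystem once generation succeeds."""
        self._providers.update(providers)
        for system, needs in list(self._pending.items()):
            if needs <= self._providers.keys():
                system.initialize(**{n: self._providers[n] for n in needs})
                del self._pending[system]

class QLearningSystem:
    def __init__(self):
        self.reward_calculator = None
    def initialize(self, reward_calculator=None):
        self.reward_calculator = reward_calculator

manager = SystemManager()
q_system = QLearningSystem()
manager.register(q_system, needs=("reward_calculator",))
# ...ProviderGenerationSystem runs, then hands over its output:
manager.inject_providers({"reward_calculator": "generated_calculator_stub"})
```

The key design point is that `register` no longer implies full initialization: systems declare what they need, and the manager completes wiring only after the meta-system delivers the generated providers.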

Phase 3: Simplified Simulation Entry Point

  • Refactor the simulations/soul_sim/run.py file to remove the manual instantiation of all the SoulSim... providers.
  • Instead, the run.py file will now only need to register the ProviderGenerationSystem alongside the other core systems.
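The simplified entry point could then reduce to something like the sketch below. All classes here are minimal stand-ins so the example is self-contained; the real system implementations are assumed, not shown.

```python
class System:
    def start(self):
        pass

# Stand-ins for the core systems registered by run.py.
class ProviderGenerationSystem(System): ...
class QLearningSystem(System): ...
class ActionSystem(System): ...

class SystemManager:
    def __init__(self):
        self.systems = []
    def register(self, system):
        self.systems.append(system)
    def run(self):
        for s in self.systems:
            s.start()
        return len(self.systems)

def main():
    manager = SystemManager()
    # Note what is absent: no SoulSimRewardCalculator or other SoulSim...
    # providers are instantiated here. The meta-system generates them at
    # runtime before the learning systems fully initialize.
    manager.register(ProviderGenerationSystem())
    manager.register(QLearningSystem())
    manager.register(ActionSystem())
    return manager.run()

system_count = main()
```
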

Phase 4: Safety, Validation, and Testing

  • Implement guardrails for the exec() call to minimize security risks. The executed code should be in a restricted environment with no file system or network access.
  • Before injecting a generated provider, the ProviderGenerationSystem should validate it, e.g., by instantiating it and checking isinstance() against the required interface, so non-compliant code is rejected before it reaches the learning systems.
  • Write unit tests for the ProviderGenerationSystem itself, likely using a mock LLM that returns a pre-written, valid provider implementation as a string.
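The validation step and the mock-LLM testing approach can be combined in one sketch, assuming an ABC-based interface; the interface and mock output below are illustrative.

```python
import abc

class RewardCalculatorInterface(abc.ABC):
    @abc.abstractmethod
    def calculate_reward(self, old_state, new_state): ...

# What a mock LLM might return in a unit test: a pre-written, valid provider.
MOCK_LLM_OUTPUT = """
class AutoGeneratedRewardCalculator(RewardCalculatorInterface):
    def calculate_reward(self, old_state, new_state):
        return new_state - old_state
"""

def validate_and_instantiate(source: str, interface: type):
    """Exec the generated code, then return an instance of the first class
    that genuinely implements the interface. Instantiation fails with
    TypeError if any abstract method is left unimplemented."""
    namespace = {"RewardCalculatorInterface": RewardCalculatorInterface}
    exec(source, namespace)
    for obj in namespace.values():
        if (isinstance(obj, type) and issubclass(obj, interface)
                and obj is not interface):
            instance = obj()
            assert isinstance(instance, interface)
            return instance
    raise ValueError("LLM output did not define a compliant provider")

provider = validate_and_instantiate(MOCK_LLM_OUTPUT, RewardCalculatorInterface)
```

Using an ABC gives the isinstance() check real teeth: a generated class that subclasses the interface but omits a required method cannot even be instantiated, which converts silent non-compliance into a clear, catchable error.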

4. Definition of Done

  • A new ProviderGenerationSystem exists in the agent-engine.
  • A simulation can be successfully launched by only registering this meta-system (and the other core systems), without any simulation-specific providers being manually instantiated in run.py.
  • An agent, when placed in the existing soul-sim environment, can dynamically generate a functional set of providers that allow it to learn and act in the world.
  • The process is robust, with clear error handling for when the LLM produces invalid or non-compliant code.
  • The new architecture is documented, explaining the workflow and the new, simplified process for creating new simulation environments.
