AutoOps Architect

Python 3.11+ · MIT License · Code style: ruff

Zero/low-code meta-agent that designs and runs autonomous SRE & ops workflows from natural language goals.

AutoOps Architect takes your operations goals (like "investigate elevated 5xx errors") and automatically generates executable workflow graphs that collect data, analyze issues, and recommend actions - all without writing code.

Why AutoOps Architect?

  • Natural Language to Action: Describe what you want to investigate, and the system creates a structured workflow
  • Composable Workflows: Generated workflows are DAGs (directed acyclic graphs) with clear dependencies
  • Pluggable Tools: Integrate with your existing monitoring, ticketing, and automation systems
  • Institutional Memory: Learn from past investigations to improve future workflows
  • Human-in-the-Loop: Dangerous operations require explicit approval
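The human-in-the-loop gate boils down to a predicate over node metadata. A hypothetical sketch (the node types and the `needs_approval` helper are illustrative, not the project's actual API):

```python
def needs_approval(node: dict) -> bool:
    # Illustrative policy: anything that mutates external systems
    # (tickets, restarts, config changes) pauses for a human.
    mutating = {"ticket_create", "restart_service", "config_change"}
    return node.get("type") in mutating

assert needs_approval({"type": "ticket_create"})
assert not needs_approval({"type": "log_collection"})
```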

Quick Start

Installation

Option 1: Docker (Recommended)

# Clone the repository
git clone https://github.com/nik-kale/AutoOPS-Architect.git
cd AutoOPS-Architect

# Set your API key
export OPENAI_API_KEY="sk-..."
# or
export ANTHROPIC_API_KEY="sk-..."

# Start with Docker Compose (includes Redis cache)
docker-compose up -d

# Access web UI at http://localhost:8000
# View logs
docker-compose logs -f autoops

# Stop services
docker-compose down

Option 2: Pip Install

# Clone the repository
git clone https://github.com/nik-kale/AutoOPS-Architect.git
cd AutoOPS-Architect

# Install with pip
pip install -e .

# Or with development dependencies
pip install -e ".[dev]"

Option 3: Docker Run (Standalone)

# Build image
docker build -t autoops-architect .

# Run container
docker run -d \
  -p 8000:8000 \
  -e OPENAI_API_KEY="sk-..." \
  --name autoops \
  autoops-architect

# Access at http://localhost:8000

Basic Usage

# Generate a workflow plan from a goal
autoops plan "Investigate elevated 5xx errors for the checkout service in prod"

# Execute a workflow file
autoops run workflow.json

# Plan and execute in one step
autoops plan-and-run "Check why login API is slow" --service auth-api --env production

Example Output

AutoOps Architect
Planning workflow for:
Investigate elevated 5xx errors for the checkout service in prod

Generated workflow: Investigate 5xx Errors Workflow
ID: wf-5xx-investigation-001
Nodes: 6, Edges: 5

Workflow Steps
β”œβ”€β”€ πŸ”§ [log_collection] Collect service logs
β”œβ”€β”€ πŸ”§ [metric_query] Query error rate metrics
β”œβ”€β”€ πŸ”§ [analysis] Analyze collected data
β”œβ”€β”€ πŸ”§ [rca_call] Run root cause analysis
β”œβ”€β”€ πŸ”§ [summary] Generate investigation summary
└── πŸ”§ [ticket_create] Create tracking ticket

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         User Goal                                β”‚
β”‚        "Investigate elevated 5xx errors for checkout"           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                  β”‚
                                  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        Architect (Planner)                       β”‚
β”‚                                                                  β”‚
β”‚  β€’ Parses goal                                                   β”‚
β”‚  β€’ Retrieves similar past workflows from memory                  β”‚
β”‚  β€’ Uses LLM to generate workflow graph                           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                  β”‚
                                  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                        WorkflowGraph (DAG)                       β”‚
β”‚                                                                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”‚
β”‚  β”‚ Collect  │──▢│ Analyze  │──▢│   RCA    │──▢│ Summary  β”‚     β”‚
β”‚  β”‚  Logs    β”‚   β”‚  Data    β”‚   β”‚  Call    β”‚   β”‚          β”‚     β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                  β”‚
                                  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         Executor                                 β”‚
β”‚                                                                  β”‚
β”‚  β€’ Topologically sorts nodes                                     β”‚
β”‚  β€’ Executes nodes via Tools                                      β”‚
β”‚  β€’ Handles failures and approvals                                β”‚
β”‚  β€’ Collects results                                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                  β”‚
                                  β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                     Memory Backend                               β”‚
β”‚                                                                  β”‚
β”‚  β€’ Stores workflow history                                       β”‚
β”‚  β€’ User preferences                                              β”‚
β”‚  β€’ Successful playbooks for reuse                                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Ecosystem Integration

AutoOps Architect is designed to work with the broader AutoOps ecosystem:

Component            Description                                  Status
AutoRCA-Core         AI-powered root cause analysis               🔜 Integration ready
Secure-MCP-Gateway   Secure tool execution (Jira, Slack, etc.)    🔜 Integration ready
Ops-Agent-Desktop    Browser-based mission automation             🔜 Integration ready
Autonomous Ops Hub   Central orchestration platform               📋 Planned

Configuration

Environment Variables

# LLM Provider (auto-detected if not set)
export OPENAI_API_KEY="sk-..."
# or
export ANTHROPIC_API_KEY="sk-..."

# Optional integrations
export AUTORCA_URL="http://localhost:8080"
export MCP_GATEWAY_URL="http://localhost:3000"
export OPS_AGENT_DESKTOP_URL="http://localhost:9090"
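Provider auto-detection presumably just checks which key is present in the environment. A hypothetical sketch (`detect_provider` is not part of the package):

```python
import os

def detect_provider() -> str:
    """Pick an LLM provider from the environment (illustrative only)."""
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    if os.environ.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    return "mock"  # fall back to the mock provider for offline testing
```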

Using Different LLM Providers

from autoops_architect.llm import LLMConfig, LLMProvider
from autoops_architect.planner import Architect, PlannerConfig

# Use OpenAI
config = PlannerConfig(
    llm_config=LLMConfig(
        provider=LLMProvider.OPENAI,
        model="gpt-4",
    )
)

# Use Anthropic
config = PlannerConfig(
    llm_config=LLMConfig(
        provider=LLMProvider.ANTHROPIC,
        model="claude-3-sonnet-20240229",
    )
)

# Use mock (for testing)
config = PlannerConfig(
    llm_config=LLMConfig(provider=LLMProvider.MOCK)
)

architect = Architect(config=config)

LLM Response Caching

AutoOps Architect supports transparent LLM response caching to reduce costs and improve latency. When enabled, identical planning requests return cached responses instantly.

Benefits:

  • 50%+ cost reduction for repeated goals
  • 70% faster response times for cached queries
  • Supports memory, filesystem, and Redis backends

from autoops_architect.llm.cache import CacheConfig, CachedLLMClient
from autoops_architect.planner import Architect, PlannerConfig

# Enable caching with in-memory backend (default)
config = PlannerConfig(
    cache_config=CacheConfig(
        enabled=True,
        backend="memory",      # Options: memory, filesystem, redis
        ttl_seconds=3600,      # Cache entries valid for 1 hour
        max_size=1000,         # Maximum cached entries
    )
)

# Filesystem cache (persists across restarts)
config = PlannerConfig(
    cache_config=CacheConfig(
        enabled=True,
        backend="filesystem",
        cache_dir="~/.cache/autoops-architect",
        ttl_seconds=7200,      # 2 hours
    )
)

# Redis cache (for distributed deployments)
config = PlannerConfig(
    cache_config=CacheConfig(
        enabled=True,
        backend="redis",
        redis_url="redis://localhost:6379/0",
        ttl_seconds=3600,
    )
)

architect = Architect(config=config)

# Force bypass cache for specific requests
workflow = await architect.llm_client.complete_json(
    messages,
    force_refresh=True  # Ignores cache, makes fresh LLM call
)

# Get cache statistics
if isinstance(architect.llm_client, CachedLLMClient):
    stats = architect.llm_client.get_cache_stats()
    print(f"Cache hit rate: {stats['hit_rate']:.1%}")

Cache backends comparison:

Backend      Persistence  Distributed  Use Case
memory       No           No           Single-process, development
filesystem   Yes          No           Single-server, persists across restarts
redis        Yes          Yes          Multi-server, production deployments

Note: The Redis backend requires the redis package, which must be installed separately (pip install redis).
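Under the hood, "identical planning requests" can only hit the cache if the key is deterministic. A typical scheme (illustrative, not necessarily what AutoOps uses) hashes the model name plus the canonically serialized messages:

```python
import hashlib
import json

def cache_key(model: str, messages: list[dict]) -> str:
    # Canonical JSON (sorted keys) keeps the hash stable across runs,
    # so the same model + messages always map to the same cache entry.
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

msgs = [{"role": "user", "content": "Investigate elevated 5xx errors"}]
assert cache_key("gpt-4", msgs) == cache_key("gpt-4", msgs)
```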

CLI Reference

# Plan a workflow
autoops plan "Your goal here" [OPTIONS]
  --service, -s    Target service(s)
  --env, -e        Environment (prod/staging/dev)
  --priority, -p   Priority level
  --output, -o     Output file path
  --yaml           Output as YAML
  --mock           Use mock LLM
  --mermaid        Show Mermaid diagram

# Run a workflow
autoops run <workflow.json> [OPTIONS]
  --dry-run, -n    Simulate execution
  --auto-approve   Auto-approve all requests

# Plan and run
autoops plan-and-run "Your goal" [OPTIONS]

# View history
autoops history [OPTIONS]
  --limit, -n      Number of entries
  --search, -q     Search keywords

# List available tools
autoops tools [OPTIONS]
  --all, -a        Show disabled tools

# Validate a workflow
autoops validate <workflow.json>

# Version info
autoops version

Project Structure

autoops-architect/
β”œβ”€β”€ src/autoops_architect/
β”‚   β”œβ”€β”€ models/          # Data models (Goal, Workflow, etc.)
β”‚   β”œβ”€β”€ planner/         # Architect/meta-agent logic
β”‚   β”œβ”€β”€ executor/        # Workflow execution engine
β”‚   β”œβ”€β”€ tools/           # Tool interface and implementations
β”‚   β”œβ”€β”€ memory/          # Memory backends
β”‚   β”œβ”€β”€ llm/             # LLM client abstraction
β”‚   └── cli.py           # CLI application
β”œβ”€β”€ tests/               # Test suite
β”œβ”€β”€ examples/            # Example goals and workflows
β”œβ”€β”€ docs/                # Documentation
└── pyproject.toml       # Project configuration

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=autoops_architect

# Type checking
mypy src/autoops_architect

# Linting
ruff check src/

# Format code
ruff format src/

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Good First Issues

  • Add a new tool integration
  • Create example workflows for common scenarios
  • Improve documentation
  • Add more test cases

Roadmap

See docs/roadmap.md for the detailed roadmap including:

  • Phase 2: UI, templates, and real integrations
  • Phase 3: Code quality, performance, and CI/CD
  • Phase 4: Security, safety, and QA
  • Phase 5: Ecosystem and community features

License

MIT License - see LICENSE for details.
