Skip to content

Latest commit

 

History

History
786 lines (610 loc) · 19.9 KB

File metadata and controls

786 lines (610 loc) · 19.9 KB
layout default
title AnythingLLM Tutorial - Chapter 6: Agents
nav_order 6
has_children false
parent AnythingLLM Tutorial

Chapter 6: Agents - Intelligent Capabilities and Automation

Welcome to Chapter 6: Agents - Intelligent Capabilities and Automation. In this part of AnythingLLM Tutorial: Self-Hosted RAG and Agents Platform, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.

Enable AI agents with tool use, function calling, and automated workflows in AnythingLLM.

Overview

AnythingLLM agents extend basic chat functionality with intelligent capabilities like tool use, function calling, and automated task execution. This chapter covers configuring and using agents for enhanced productivity.

Agent Fundamentals

What are Agents?

# Agents in AnythingLLM are AI assistants that can:
- Use tools and functions to perform actions
- Access external APIs and services
- Execute multi-step workflows
- Make decisions and take actions autonomously
- Learn from interactions and improve over time

# Key capabilities:
- Tool integration (web search, calculators, APIs)
- Function calling (structured outputs, data processing)
- Workflow automation (multi-step processes)
- Memory and context awareness
- Error handling and recovery

Agent Types

# Built-in agent types:

# Chat Agent (Default)
# - Conversational responses
# - Document Q&A
# - General assistance

# Tool Agent
# - Can use external tools
# - API integrations
# - Data processing

# Workflow Agent
# - Multi-step automation
# - Complex task execution
# - Business process automation

# Custom Agent
# - User-defined capabilities
# - Specialized for specific domains
# - Custom tool integrations

Enabling Agents

Basic Agent Setup

# Enable agents in workspace settings
# Workspace Settings > Agent Configuration
# - Enable Agent: Yes
# - Agent Type: Tool Agent
# - Model: gpt-4o (recommended for agents)
# - Max Tokens: 4096
# - Temperature: 0.3 (more deterministic)

Agent Configuration

{
  "agent": {
    "enabled": true,
    "type": "tool_agent",
    "model": "gpt-4o",
    "temperature": 0.3,
    "max_tokens": 4096,
    "system_prompt": "You are an intelligent assistant with access to various tools. Use them appropriately to help users accomplish their tasks.",
    "tools": [
      "web_search",
      "calculator",
      "file_operations",
      "api_calls"
    ]
  }
}

Tool Integration

Built-in Tools

Web Search Tool

# Enable web search capability
# Allows agents to search the internet for current information

# Configuration:
# Tools > Web Search
# - Provider: Google, Bing, or DuckDuckGo
# - API Key: (if required)
# - Safe Search: enabled
# - Max Results: 5

# Usage examples:
# "Search for the latest news about AI development"
# "Find tutorials on Docker containerization"
# "Look up the current price of Bitcoin"

Calculator Tool

# Mathematical calculations and data analysis

# Capabilities:
# - Basic arithmetic
# - Complex equations
# - Statistical functions
# - Unit conversions
# - Financial calculations

# Usage examples:
# "Calculate 15% of 1250"
# "Convert 100 USD to EUR"
# "Solve for x: 2x + 5 = 17"
# "Calculate compound interest: $1000 at 5% for 3 years"

File Operations Tool

# File system operations within the container

# Capabilities:
# - Read text files
# - List directories
# - Basic file analysis
# - Search file contents

# Security note: Limited to container filesystem
# Cannot access host system files

# Usage examples:
# "Show me the contents of config.py"
# "List all Python files in the project"
# "Search for 'TODO' comments in the codebase"

Custom Tools

API Integration Tools

# Create custom API tools
# Example: Weather API integration

import requests
import json

def get_weather(location):
    """Get current weather for a location"""
    api_key = "your-weather-api-key"
    url = f"http://api.weatherapi.com/v1/current.json?key={api_key}&q={location}"

    try:
        response = requests.get(url)
        data = response.json()

        return {
            "location": data["location"]["name"],
            "temperature": data["current"]["temp_c"],
            "condition": data["current"]["condition"]["text"],
            "humidity": data["current"]["humidity"]
        }
    except Exception as e:
        return {"error": str(e)}

# Register tool in AnythingLLM
tool_config = {
    "name": "weather",
    "description": "Get current weather information for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name or location"
            }
        },
        "required": ["location"]
    }
}

Database Tools

# Database query tool
import sqlite3

def query_database(sql_query, db_path="data.db"):
    """Execute SQL query on database"""

    try:
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()

        cursor.execute(sql_query)
        results = cursor.fetchall()
        columns = [desc[0] for desc in cursor.description]

        conn.close()

        return {
            "columns": columns,
            "rows": results,
            "row_count": len(results)
        }
    except Exception as e:
        return {"error": str(e)}

# Usage in agent prompts:
# "Show me the top 10 customers by order value"
# "Count how many orders were placed this month"

Code Execution Tools

# Safe code execution (sandboxed)
import subprocess
import tempfile
import os

def execute_python_code(code, timeout=10):
    """Execute Python code in sandbox"""

    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write(code)
        temp_file = f.name

    try:
        result = subprocess.run(
            ['python', temp_file],
            capture_output=True,
            text=True,
            timeout=timeout
        )

        return {
            "stdout": result.stdout,
            "stderr": result.stderr,
            "returncode": result.returncode
        }
    except subprocess.TimeoutExpired:
        return {"error": "Code execution timed out"}
    except Exception as e:
        return {"error": str(e)}
    finally:
        os.unlink(temp_file)

# Usage:
# "Execute this Python code: print('Hello, World!')"
# "Run this data analysis script on the CSV file"

Function Calling

Structured Outputs

# Function calling enables structured responses
{
  "function_call": {
    "name": "search_database",
    "arguments": {
      "query": "SELECT * FROM users WHERE active = 1",
      "limit": 10
    }
  }
}

# Benefits:
# - Consistent data formats
# - Easier parsing and processing
# - Integration with other systems
# - Reduced hallucinations

Multi-Step Workflows

# Complex workflows using function calling

# Example: Customer support workflow
workflow:
  - step: "analyze_problem"
    function: "classify_issue"
    inputs: ["customer_message"]

  - step: "gather_info"
    function: "search_knowledge_base"
    inputs: ["issue_category", "product_version"]

  - step: "generate_response"
    function: "create_support_response"
    inputs: ["analysis", "relevant_docs", "customer_history"]

  - step: "follow_up"
    function: "schedule_followup"
    inputs: ["response_quality", "customer_satisfaction"]

Agent Workflows

Automated Task Execution

# Define automated workflows

# Example: Code review workflow
code_review_workflow:
  name: "Automated Code Review"
  trigger: "pull_request.opened"
  steps:
    - name: "Static Analysis"
      tool: "eslint"
      files: "*.js,*.ts"

    - name: "Security Scan"
      tool: "security_scanner"
      config: "strict"

    - name: "Performance Check"
      tool: "performance_analyzer"
      thresholds:
        complexity: 10
        duplication: 5%

    - name: "Generate Report"
      tool: "report_generator"
      template: "code_review_report"

# Usage:
# Agent automatically runs when PR is opened
# Provides comprehensive code review
# Suggests improvements and catches issues

Business Process Automation

# Business process workflows

# Example: Order processing
order_processing_workflow:
  name: "Order Fulfillment"
  trigger: "order.created"
  steps:
    - name: "Validate Order"
      function: "validate_order"
      error_handling: "retry"

    - name: "Check Inventory"
      function: "check_inventory"
      conditional: "inventory.available"

    - name: "Process Payment"
      function: "process_payment"
      error_handling: "rollback"

    - name: "Ship Order"
      function: "create_shipment"
      success: "notify_customer"

    - name: "Update Records"
      function: "update_database"
      always: true

Conditional Logic

# Advanced workflow with conditions

support_ticket_workflow:
  name: "Support Ticket Processing"
  steps:
    - name: "Classify Ticket"
      function: "classify_ticket"
      output: "category"

    - name: "Route to Expert"
      conditions:
        - if: "category == 'technical'"
          then: "assign_to_engineering"
        - if: "category == 'billing'"
          then: "assign_to_finance"
        - if: "category == 'general'"
          then: "assign_to_support"

    - name: "Escalate if Urgent"
      conditions:
        - if: "priority == 'high' && resolution_time > 4_hours"
          then: "escalate_to_manager"

    - name: "Auto-Resolve Simple Issues"
      conditions:
        - if: "category == 'password_reset' && confidence > 0.9"
          then: "auto_resolve"

Agent Memory and Learning

Conversation Context

# Agents maintain conversation context
# Can reference previous interactions
# Build upon established knowledge

# Example conversation:
# User: "Show me sales data for Q1"
# Agent: [queries database, shows results]
# User: "Now show me the same data as a chart"
# Agent: [remembers Q1 data, creates visualization]

Learning from Feedback

# Agents can learn from user feedback

# Positive reinforcement:
# User: "That was helpful" → Agent learns successful pattern

# Negative feedback:
# User: "That's not what I meant" → Agent adjusts approach

# Explicit corrections:
# User: "I meant sales by region, not by product"
# Agent: Learns correct interpretation for future queries

Knowledge Base Updates

# Agents can suggest knowledge base improvements

# After answering a question:
# Agent: "This question wasn't well covered in the docs.
#          Should I add this to the FAQ?"

# User: "Yes"
# Agent: [adds to knowledge base, improves future answers]

Advanced Agent Features

Multi-Agent Collaboration

# Multiple agents working together

# Example: Research team
research_team:
  - name: "DataCollector"
    role: "Gather information from various sources"
    tools: ["web_search", "api_calls", "database_queries"]

  - name: "Analyzer"
    role: "Analyze and synthesize information"
    tools: ["data_analysis", "summarization"]

  - name: "Writer"
    role: "Create reports and documentation"
    tools: ["text_generation", "formatting"]

# Collaboration workflow:
# DataCollector gathers info → Analyzer processes → Writer creates output

Agent Specialization

# Specialized agents for different domains

# Code Review Agent
code_review_agent:
  model: "claude-3-5-sonnet-20241022"
  system_prompt: "You are an expert code reviewer specializing in security, performance, and best practices."
  tools: ["static_analysis", "security_scan", "performance_profiling"]

# Customer Support Agent
support_agent:
  model: "gpt-4o"
  system_prompt: "You are a patient, helpful customer support agent with access to product documentation."
  tools: ["knowledge_search", "ticket_creation", "escalation"]

# Data Analysis Agent
data_agent:
  model: "gpt-4o"
  system_prompt: "You are a data analysis expert who can query databases and create visualizations."
  tools: ["sql_execution", "data_visualization", "statistical_analysis"]

Agent Monitoring and Analytics

Performance Metrics

# Monitor agent performance
curl http://localhost:3001/api/v1/analytics/agents \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response:
{
  "agent_metrics": {
    "total_interactions": 15420,
    "success_rate": 0.94,
    "average_response_time": 3.2,
    "tool_usage": {
      "web_search": 2340,
      "calculator": 890,
      "file_operations": 456,
      "custom_tools": 1200
    },
    "error_types": {
      "tool_failure": 45,
      "timeout": 23,
      "invalid_input": 67
    }
  }
}

Usage Analytics

# Track how agents are used
curl http://localhost:3001/api/v1/analytics/agent-usage \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response:
{
  "popular_tools": [
    {"name": "web_search", "usage_count": 2340},
    {"name": "calculator", "usage_count": 890}
  ],
  "peak_usage_hours": [9, 10, 14, 15, 16],
  "user_satisfaction": 4.2,
  "feature_requests": [
    "Integration with Slack",
    "More visualization tools",
    "Custom workflow templates"
  ]
}

Security and Safety

Agent Permissions

# Control what agents can do
agent_permissions:
  file_access:
    allowed_paths: ["/app/workspace", "/tmp"]
    blocked_paths: ["/etc", "/home", "/root"]

  network_access:
    allowed_domains: ["api.example.com", "docs.example.com"]
    blocked_domains: ["malicious-site.com"]

  tool_restrictions:
    dangerous_tools: ["system_commands", "file_deletion"]
    require_approval: true

Input Validation

# Validate agent inputs
input_validation:
  sql_injection_protection: true
  path_traversal_protection: true
  command_injection_protection: true
  max_input_length: 10000
  allowed_characters: "alphanumeric + common punctuation"

Audit Logging

# Log all agent actions
audit_logging:
  enabled: true
  log_level: "detailed"
  retention_days: 90
  events_to_log:
    - "tool_execution"
    - "file_access"
    - "network_requests"
    - "error_conditions"

Troubleshooting Agents

Common Issues

# Agent not responding
# - Check model configuration
# - Verify API keys
# - Check rate limits

curl http://localhost:3001/api/v1/system/health \
  -H "Authorization: Bearer YOUR_API_KEY"

# Tools not working
# - Verify tool configuration
# - Check permissions
# - Test tool manually

# Slow responses
# - Check model selection
# - Monitor resource usage
# - Consider caching

# Inaccurate results
# - Review system prompt
# - Check document quality
# - Adjust temperature settings

Debug Mode

# Enable agent debugging
export AGENT_DEBUG=true

# Check agent decision process
curl http://localhost:3001/api/v1/debug/agent-reasoning \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "debug this agent behavior"}'

# View tool execution logs
docker logs anythingllm | grep -i "tool\|agent"

Best Practices

Agent Design Principles

# Clear responsibilities
# - One agent per major function
# - Avoid overlapping capabilities
# - Define clear boundaries

# Reliable error handling
# - Graceful failure recovery
# - Informative error messages
# - Fallback mechanisms

# User-friendly interactions
# - Clear communication
# - Progress indicators
# - Action confirmations

Performance Optimization

# Optimize agent performance:
# - Choose appropriate models for tasks
# - Implement caching for frequent operations
# - Use batch processing where possible
# - Monitor and tune resource usage

# Cost management:
# - Set usage limits
# - Use cost-effective models for routine tasks
# - Implement usage monitoring

Summary

In this chapter, we've covered:

  • Agent Fundamentals: Types and capabilities of AI agents
  • Tool Integration: Built-in and custom tools for extended functionality
  • Function Calling: Structured outputs and multi-step workflows
  • Workflow Automation: Automated task execution and business processes
  • Agent Memory: Context awareness and learning capabilities
  • Multi-Agent Systems: Collaboration and specialization
  • Monitoring: Performance metrics and usage analytics
  • Security: Permissions, validation, and audit logging
  • Troubleshooting: Common issues and debugging techniques

Key Takeaways

  1. Tool Integration: Agents extend capabilities through tools and APIs
  2. Workflow Automation: Multi-step processes for complex tasks
  3. Structured Outputs: Function calling enables reliable automation
  4. Security First: Proper permissions and validation are critical
  5. Monitoring: Track performance and usage for optimization
  6. Iterative Improvement: Learn from interactions and feedback
  7. User Experience: Clear communication and progress indicators

Next Steps

Now that you understand agents, let's explore API integration and how to programmatically access AnythingLLM.


Ready for Chapter 7? API & Integration

Generated for Awesome Code Docs

What Problem Does This Solve?

Most teams struggle here because the hard part is not writing more code, but deciding clear boundaries for name, Agent, agent so behavior stays predictable as complexity grows.

In practical terms, this chapter helps you avoid three common failures:

  • coupling core logic too tightly to one implementation path
  • missing the handoff boundaries between setup, execution, and validation
  • shipping changes without clear rollback or observability strategy

After working through this chapter, you should be able to reason about Chapter 6: Agents - Intelligent Capabilities and Automation as an operating subsystem inside AnythingLLM Tutorial: Self-Hosted RAG and Agents Platform, with explicit contracts for inputs, state transitions, and outputs.

Use the implementation notes around tools, tool, location as your checklist when adapting these patterns to your own repository.

How it Works Under the Hood

Under the hood, Chapter 6: Agents - Intelligent Capabilities and Automation usually follows a repeatable control path:

  1. Context bootstrap: initialize runtime config and prerequisites for name.
  2. Input normalization: shape incoming data so Agent receives stable contracts.
  3. Core execution: run the main logic branch and propagate intermediate state through agent.
  4. Policy and safety checks: enforce limits, auth scopes, and failure boundaries.
  5. Output composition: return canonical result payloads for downstream consumers.
  6. Operational telemetry: emit logs/metrics needed for debugging and performance tuning.

When debugging, walk this sequence in order and confirm each stage has explicit success/failure conditions.

Source Walkthrough

Use the following upstream sources to verify implementation details while reading this chapter:

  • AnythingLLM Repository Why it matters: authoritative reference on AnythingLLM Repository (github.com).
  • AnythingLLM Releases Why it matters: authoritative reference on AnythingLLM Releases (github.com).
  • AnythingLLM Docs Why it matters: authoritative reference on AnythingLLM Docs (docs.anythingllm.com).
  • AnythingLLM Website Why it matters: authoritative reference on AnythingLLM Website (anythingllm.com).

Suggested trace strategy:

  • search upstream code for name and Agent to map concrete implementation paths
  • compare docs claims against actual runtime/config code before reusing patterns in production

Chapter Connections