AI Penetration Testing

PentestAgent is an AI-powered penetration testing framework designed to run in a Dockerized Kali Linux environment. It orchestrates autonomous security assessments using LLMs (Claude, OpenAI, or local models via LM Studio) and standard pentesting tools.

⚡ Quick Start (Docker)

The fastest way to run PentestAgent is with the Kali Docker image, which includes all tools pre-installed. We provide ARM64 images built weekly.

1. Pull the Kali Image

docker pull ghcr.io/ldesignlab/pentestagent:kali-arm64
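Because the published image is ARM64, it may be worth checking the host architecture before pulling. The following is a minimal sketch, not part of the project: `docker_arch` is a hypothetical helper name, and running the image under emulation on non-ARM hosts is an assumption that requires QEMU/binfmt support in your Docker setup.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` value to the architecture name Docker uses.
# Prints "arm64", "amd64", or the raw value if it is not recognized.
docker_arch() {
  case "$1" in
    arm64|aarch64) echo "arm64" ;;
    x86_64|amd64)  echo "amd64" ;;
    *)             echo "$1" ;;
  esac
}

host_arch="$(docker_arch "$(uname -m)")"
if [ "$host_arch" = "arm64" ]; then
  echo "Host is arm64: the kali-arm64 image should run natively."
else
  # Assumption: emulation is available; this path is not covered by the README.
  echo "Host is $host_arch: the kali-arm64 image would need emulation."
  echo "Try: docker pull --platform linux/arm64 ghcr.io/ldesignlab/pentestagent:kali-arm64"
fi
```

The script only prints a suggestion; it does not pull anything itself.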

2. Configure Authentication

Create a .env file in your project directory:

Option A: Claude Max/Pro subscription (OAuth - no separate API billing)

CLAUDE_CODE_OAUTH_TOKEN=sk-ant-oat01-...
PENTESTAGENT_MODEL=claude-sonnet-4-20250514

Option B: Claude API (Paid)

ANTHROPIC_API_KEY=sk-ant-...
PENTESTAGENT_MODEL=claude-sonnet-4-20250514

Option C: LM Studio (Local - Free)

See the Authentication Guide for local setup.
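Before starting the container, it can help to confirm which of the options above the .env file actually configures. This is a minimal sketch under stated assumptions: `has_key` and `auth_mode` are hypothetical helper names, and the check only looks for the two variable names shown in Options A and B (LM Studio setup is not covered here).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if FILE defines KEY with a non-empty value.
# Only lines that start with KEY= count, so commented-out entries are ignored.
has_key() {
  grep -Eq "^$1=.+" "$2"
}

# Hypothetical helper: report which authentication option a .env file configures.
# Prints "oauth", "api-key", or "none".
auth_mode() {
  env_file="$1"
  if [ ! -f "$env_file" ]; then
    echo "none"
  elif has_key CLAUDE_CODE_OAUTH_TOKEN "$env_file"; then
    echo "oauth"
  elif has_key ANTHROPIC_API_KEY "$env_file"; then
    echo "api-key"
  else
    echo "none"
  fi
}

# Usage: restrict the secrets file to the current user, then report the mode.
if [ -f .env ]; then
  chmod 600 .env
  echo "Auth mode: $(auth_mode .env)"
fi
```

If both variables are present, this sketch reports "oauth" first; which one PentestAgent itself prefers is not stated in this README.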

3. Run Interactive Mode

docker run -it --rm \
  --privileged \
  --network host \
  -v "$(pwd)/loot:/app/loot" \
  -v "$(pwd)/output:/app/output" \
  -e CLAUDE_CODE_OAUTH_TOKEN="$(grep '^CLAUDE_CODE_OAUTH_TOKEN=' .env | cut -d= -f2-)" \
  -e PENTESTAGENT_MODEL="claude-sonnet-4-20250514" \
  ghcr.io/ldesignlab/pentestagent:kali-arm64
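Extracting values from .env inline is easy to get subtly wrong, for example when a line is commented out or a value contains "=". A small helper keeps that logic in one place. This is a sketch only: `env_value` is a hypothetical name, and it assumes plain `KEY=value` lines without quoting.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from an env file.
# Matches only lines that start with KEY=, keeps any "=" inside the value,
# and uses the last definition if the key appears more than once.
env_value() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Usage inside the docker run command from step 3:
#   -e CLAUDE_CODE_OAUTH_TOKEN="$(env_value CLAUDE_CODE_OAUTH_TOKEN .env)" \
#   -e PENTESTAGENT_MODEL="$(env_value PENTESTAGENT_MODEL .env)" \
```

Alternatively, Docker's own `--env-file .env` option passes every `KEY=value` line in the file to the container, which replaces the individual `-e` flags. Whether the image reads any variables beyond the two shown here is not covered by this README.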

4. Run with Playbook (Recommended)

For structured penetration testing, use playbooks:

# Save as run_playbook.sh and make executable
./run_playbook.sh thp3_web example.com

📖 Documentation

Detailed guides organized by topic:

🚀 Getting Started

Getting Started Guide - Full walkthrough for first-time users
Quick Reference - Command cheat sheet
Authentication - OAuth, API keys, and LM Studio
Troubleshooting - Common issues and solutions

🛠️ Core Features

Modes - Assist, Agent, and Crew modes explained
Playbooks - Automated multi-phase pentesting
Tools - Built-in and MCP tool reference
Extended Thinking - Deep reasoning mode
Testing Targets - Vulnerable applications for testing

⚙️ Advanced Configuration

Docker Setup - Networking, volumes, and custom builds
Local Setup - Running without Docker
MCP Integration - Connecting to 30+ tool servers
RAG Knowledge - Customizing the knowledge base
Performance Tuning - Optimization strategies

📚 Reference

Configuration - Environment variables reference
CLI Commands - Command line usage
Extension Guide - Developing custom tools and agents
Architecture - System design and data flow

🏛️ Project Structure

pentestagent/
  agents/         # Autonomous reasoning engines
  knowledge/      # RAG system and Shadow Graph
  mcp/            # Model Context Protocol servers
  playbooks/      # Structured attack workflows
  runtime/        # Docker and Local execution
  tools/          # Pentesting tool wrappers

⚖️ Legal & License

Legal Notice: Only use against systems you have explicit authorization to test. Unauthorized access is illegal.

License: MIT

TODO

Feature Requests

CIDR/Domain input - Support for CIDR notation (e.g., 192.168.1.0/24) and domain-based target specification for bulk scanning
XMap integration - Integrate XMap for high-speed IPv4 and IPv6 network scanning
Application scanner (GRep2) - Add application-level scanning capabilities
Full JSON report export - Generate comprehensive JSON reports suitable for storage in vector databases
Gemini OAuth/user login support - Add Google Gemini authentication via OAuth token similar to Claude Max account support (may require Gemini CLI integration)
PDF/HTML report generation - Professional reporting formats for client deliverables
Scheduling/cron support - Automated recurring scans on schedule
Multi-target parallel scanning - Run against multiple targets simultaneously
Slack/Discord/webhook notifications - Real-time alerts on critical findings
Credential manager - Secure storage for discovered and provided credentials
Scope management - Explicit in-scope/out-of-scope target enforcement
Resume/checkpoint - Resume interrupted scans from last saved state
Cloud provider scanning - AWS/Azure/GCP asset discovery and security assessment
API fuzzing - Dedicated API security testing (OpenAPI/GraphQL schema-based)
Container/Kubernetes scanning - Cloud-native and container security testing

Tool Integrations

OSINT & Reconnaissance:

Shodan/Censys integration - Internet-wide scanning and asset discovery
theHarvester - Email, subdomain, and metadata OSINT
Recon-ng - Modular reconnaissance framework
Subfinder - Fast subdomain discovery
Aquatone/EyeWitness - Screenshot-based web reconnaissance
httpx - Fast HTTP probing and technology detection

Vulnerability Scanning:

Nessus/OpenVAS integration - Enterprise vulnerability scanner support
testssl.sh - Comprehensive SSL/TLS testing
SSLScan/Certbot - Certificate and SSL analysis

Post-Exploitation & C2:

Mimikatz guide - Credential dumping knowledge base
Netcat/Socat guide - Reverse shells and port forwarding
pwncat - Advanced post-exploitation framework
Covenant/Havoc/Sliver - C2 framework integration
NetExec (CrackMapExec successor) - Updated lateral movement tool

Forensics & Reverse Engineering:

Volatility - Memory forensics analysis
Ghidra/radare2 - Binary reverse engineering
YARA - Malware detection and classification
Binwalk - Firmware analysis

Network & IDS:

Snort/Suricata - IDS/IPS testing and evasion
DNSRecon - DNS enumeration and zone transfers

Mobile Pentesting:

MobSF - Mobile Security Framework for Android/iOS
Frida/Objection - Dynamic instrumentation and hooking
APKTool - Android APK reverse engineering

IoT Testing:

Firmwalker - Firmware analysis
RouterSploit - Embedded device exploitation

Social Engineering:

Gophish - Phishing campaign framework
SET (Social Engineering Toolkit) - Social engineering attacks

Reporting & Integration:

Dradis integration - Collaborative reporting platform
Faraday integration - Vulnerability management
DefectDojo integration - Security orchestration and vulnerability management

Testing & Validation

Verify RAG end-to-end with a real LLM connection (not just mocks) to ensure retrieval is correctly injected during autonomous runs.
Add an advanced pentest E2E test that exercises real LLM decision-making to validate smart tool selection and replanning under realistic conditions.
Add Metasploit E2E testing to validate exploit execution and post-exploitation workflows.
Add playbook E2E testing to validate automated multi-phase penetration testing workflows against real test targets.
Add testing with LM Studio and a variety of local/hosted models to validate compatibility and performance.
Add advanced SQL injection pentest coverage against E2E targets to ensure the LLM is actively interacting and driving the workflow.
Add an E2E test against a Windows target with a known CVE exploit to verify discovery and interaction.
Ensure E2E tests against vulnerable targets produce valid reports and findings (end-to-end reporting verification).
Test playbooks against E2E targets to validate automated workflows end-to-end.
Verify the agent generates and follows LLM-created plans during runs (plan integrity check).
Add Active Directory E2E testing (Kerberoasting, DCSync, Pass-the-Hash workflows).
Add Nuclei template E2E testing to validate vulnerability scanning workflows.
Add credential dumping E2E testing (Mimikatz, SAM extraction flows).
Add privilege escalation E2E testing (Linux/Windows privesc chains).
Add performance/load testing (token usage, memory consumption, scan duration benchmarks).
Add Crew mode E2E testing (multi-agent orchestration, ShadowGraph integration, agent coordination).

Infrastructure

CI/CD integration - GitHub Actions pipeline for Docker Kali-based tests
Metrics/telemetry - Track scan success rates, tool usage, and performance metrics
Web UI/dashboard - Visual interface alternative to TUI for easier operation
AMD64/x86 Docker image testing - Validate Docker images on Intel/AMD architectures (currently only ARM64 tested)
