PentestAgent

AI Penetration Testing

PentestAgent is an AI-powered penetration testing framework designed to run in a Dockerized Kali Linux environment. It orchestrates autonomous security assessments using LLMs (Claude, OpenAI, or local models via LM Studio) together with standard pentesting tools.

⚡ Quick Start (Docker)

The fastest way to run PentestAgent is with the Kali Docker image, which includes all tools pre-installed. We provide ARM64 images built weekly.

1. Pull the Kali Image

docker pull ghcr.io/ldesignlab/pentestagent:kali-arm64
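
Because the published image is ARM64, it can help to check the host architecture before pulling. The sketch below is an illustration only: `platform_flag` is a hypothetical helper, not something shipped with PentestAgent. It maps the output of `uname -m` to Docker's `--platform` option, on the assumption that a non-ARM host would run the image under emulation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PentestAgent): decide whether docker
# needs an explicit --platform flag to use the ARM64 image on this host.
platform_flag() {
  # Use the first argument if given, otherwise ask the kernel.
  local arch="${1:-$(uname -m)}"
  case "$arch" in
    aarch64|arm64)
      printf ''                                  # native ARM64 host: no flag needed
      ;;
    *)
      printf '%s' '--platform linux/arm64'       # other hosts: request ARM64 explicitly
      ;;
  esac
}

# Example, commented out so the snippet has no side effects when sourced:
# docker pull $(platform_flag) ghcr.io/ldesignlab/pentestagent:kali-arm64
```

On an Intel/AMD machine the function prints `--platform linux/arm64`; whether the image then behaves correctly under emulation is not something this sketch can confirm.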

2. Configure Authentication

Create a .env file in your project directory:

Option A: Claude Max/Pro (OAuth - uses your existing subscription, no per-token API billing)

CLAUDE_CODE_OAUTH_TOKEN=sk-ant-oat01-...
PENTESTAGENT_MODEL=claude-sonnet-4-20250514

Option B: Claude API (Paid)

ANTHROPIC_API_KEY=sk-ant-...
PENTESTAGENT_MODEL=claude-sonnet-4-20250514

Option C: LM Studio (Local - Free)

See the Authentication Guide for local setup.
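
As a quick sanity check before starting the container, the sketch below reports which of the options above a `.env` file configures. `auth_mode` is a hypothetical helper, not part of PentestAgent, and its precedence (OAuth token first, then API key) is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which authentication option a .env file sets.
# Only non-empty, uncommented assignments at the start of a line count.
auth_mode() {
  local file="${1:-.env}"
  if grep -q '^CLAUDE_CODE_OAUTH_TOKEN=.' "$file" 2>/dev/null; then
    echo "oauth"      # Option A
  elif grep -q '^ANTHROPIC_API_KEY=.' "$file" 2>/dev/null; then
    echo "api-key"    # Option B
  else
    echo "none"       # neither variable set; Option C is configured separately
    return 1
  fi
}
```

Calling `auth_mode .env` prints `oauth`, `api-key`, or `none`, and returns a non-zero status in the last case so it can gate a launch script.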

3. Run Interactive Mode

docker run -it --rm \
  --privileged \
  --network host \
  -v "$(pwd)/loot:/app/loot" \
  -v "$(pwd)/output:/app/output" \
  -e CLAUDE_CODE_OAUTH_TOKEN="$(grep '^CLAUDE_CODE_OAUTH_TOKEN=' .env | cut -d= -f2-)" \
  -e PENTESTAGENT_MODEL="claude-sonnet-4-20250514" \
  ghcr.io/ldesignlab/pentestagent:kali-arm64
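
The token lookup in the command above can be factored into a small function that anchors the match at the start of the line and keeps everything after the first `=`, so commented-out entries are ignored and a value containing `=` is not cut short. This is a minimal sketch; `env_value` is a hypothetical name, not part of PentestAgent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY from a .env-style file.
env_value() {
  local key="$1" file="${2:-.env}"
  # Match only lines that begin with KEY=, keep the last such line, and
  # print everything after the first "=" so "=" inside the value survives.
  grep "^${key}=" "$file" | tail -n 1 | cut -d= -f2-
}
```

With this in place, the `-e` line could read `-e CLAUDE_CODE_OAUTH_TOKEN="$(env_value CLAUDE_CODE_OAUTH_TOKEN)"`. Docker's own `--env-file .env` option is another way to pass every variable in the file to the container at once.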

4. Run with Playbook (Recommended)

For structured penetration testing, use playbooks:

# Save the playbook runner as run_playbook.sh, then make it executable
chmod +x run_playbook.sh
# Usage: ./run_playbook.sh <playbook> <target>
./run_playbook.sh thp3_web example.com

📖 Documentation

Detailed guides organized by topic:

🚀 Getting Started

🛠️ Core Features

⚙️ Advanced Configuration

📚 Reference


🏛️ Project Structure

pentestagent/
  agents/         # Autonomous reasoning engines
  knowledge/      # RAG system and Shadow Graph
  mcp/            # Model Context Protocol servers
  playbooks/      # Structured attack workflows
  runtime/        # Docker and local execution
  tools/          # Pentesting tool wrappers

⚖️ Legal & License

Legal Notice: Only use against systems you have explicit authorization to test. Unauthorized access is illegal.

License: MIT

TODO

Feature Requests

  • CIDR/Domain input - Support for CIDR notation (e.g., 192.168.1.0/24) and domain-based target specification for bulk scanning
  • XMap integration - Integrate XMap for high-speed IPv4 and IPv6 network scanning
  • Application scanner (GRep2) - Add application-level scanning capabilities
  • Full JSON report export - Generate comprehensive JSON reports suitable for storage in vector databases
  • Gemini OAuth/user login support - Add Google Gemini authentication via OAuth token similar to Claude Max account support (may require Gemini CLI integration)
  • PDF/HTML report generation - Professional reporting formats for client deliverables
  • Scheduling/cron support - Automated recurring scans on schedule
  • Multi-target parallel scanning - Run against multiple targets simultaneously
  • Slack/Discord/webhook notifications - Real-time alerts on critical findings
  • Credential manager - Secure storage for discovered and provided credentials
  • Scope management - Explicit in-scope/out-of-scope target enforcement
  • Resume/checkpoint - Resume interrupted scans from last saved state
  • Cloud provider scanning - AWS/Azure/GCP asset discovery and security assessment
  • API fuzzing - Dedicated API security testing (OpenAPI/GraphQL schema-based)
  • Container/Kubernetes scanning - Cloud-native and container security testing

Tool Integrations

OSINT & Reconnaissance:

  • Shodan/Censys integration - Internet-wide scanning and asset discovery
  • theHarvester - Email, subdomain, and metadata OSINT
  • Recon-ng - Modular reconnaissance framework
  • Subfinder - Fast subdomain discovery
  • Aquatone/EyeWitness - Screenshot-based web reconnaissance
  • httpx - Fast HTTP probing and technology detection

Vulnerability Scanning:

  • Nessus/OpenVAS integration - Enterprise vulnerability scanner support
  • testssl.sh - Comprehensive SSL/TLS testing
  • SSLScan/Certbot - Certificate and SSL analysis

Post-Exploitation & C2:

  • Mimikatz guide - Credential dumping knowledge base
  • Netcat/Socat guide - Reverse shells and port forwarding
  • pwncat - Advanced post-exploitation framework
  • Covenant/Havoc/Sliver - C2 framework integration
  • NetExec (CrackMapExec successor) - Updated lateral movement tool

Forensics & Reverse Engineering:

  • Volatility - Memory forensics analysis
  • Ghidra/radare2 - Binary reverse engineering
  • YARA - Malware detection and classification
  • Binwalk - Firmware analysis

Network & IDS:

  • Snort/Suricata - IDS/IPS testing and evasion
  • DNSRecon - DNS enumeration and zone transfers

Mobile Pentesting:

  • MobSF - Mobile Security Framework for Android/iOS
  • Frida/Objection - Dynamic instrumentation and hooking
  • APKTool - Android APK reverse engineering

IoT Testing:

  • Firmwalker - Firmware analysis
  • RouterSploit - Embedded device exploitation

Social Engineering:

  • Gophish - Phishing campaign framework
  • SET (Social-Engineer Toolkit) - Social engineering attacks

Reporting & Integration:

  • Dradis integration - Collaborative reporting platform
  • Faraday integration - Vulnerability management
  • DefectDojo integration - Security orchestration and vulnerability management

Testing & Validation

  • Verify RAG end-to-end with a real LLM connection (not just mocks) to ensure retrieval is correctly injected during autonomous runs.
  • Add an advanced pentest E2E test that exercises real LLM decision-making to validate smart tool selection and replanning under realistic conditions.
  • Add Metasploit E2E testing to validate exploit execution and post-exploitation workflows.
  • Add playbook E2E testing to validate automated multi-phase penetration testing workflows against real test targets.
  • Add testing with LM Studio and a variety of local/hosted models to validate compatibility and performance.
  • Add advanced SQL injection pentest coverage against E2E targets to ensure the LLM is actively interacting and driving the workflow.
  • Add an E2E test against a Windows target with a known CVE exploit to verify discovery and interaction.
  • Ensure E2E tests against vulnerable targets produce valid reports and findings (end-to-end reporting verification).
  • Test playbooks against E2E targets to validate automated workflows end-to-end.
  • Verify the agent generates and follows LLM-created plans during runs (plan integrity check).
  • Add Active Directory E2E testing (Kerberoasting, DCSync, Pass-the-Hash workflows).
  • Add Nuclei template E2E testing to validate vulnerability scanning workflows.
  • Add credential dumping E2E testing (Mimikatz, SAM extraction flows).
  • Add privilege escalation E2E testing (Linux/Windows privesc chains).
  • Add performance/load testing (token usage, memory consumption, scan duration benchmarks).
  • Add Crew mode E2E testing (multi-agent orchestration, ShadowGraph integration, agent coordination).

Infrastructure

  • CI/CD integration - GitHub Actions pipeline for Docker Kali-based tests
  • Metrics/telemetry - Track scan success rates, tool usage, and performance metrics
  • Web UI/dashboard - Visual interface alternative to TUI for easier operation
  • AMD64/x86 Docker image testing - Validate Docker images on Intel/AMD architectures (currently only ARM64 tested)