Crawl bot is a scalable FastAPI backend that uses a site's sitemap.xml to combine web crawling with a RAG-powered (retrieval-augmented generation) chatbot. It is open source and gives developers a framework for embedding AI-driven interactions into their applications.
Example URL: https://example.com/sitemap.xml
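As a rough illustration of how a sitemap can seed a crawl, the sketch below pulls page URLs out of a sitemap document using only the Python standard library. The function name `extract_urls` and the sample XML are assumptions made for this example and are not taken from the project's code.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Hypothetical sample; a real sitemap would be downloaded from a URL such as
# https://example.com/sitemap.xml before being parsed.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

urls = extract_urls(SAMPLE)
```

This handles a plain `urlset` only; a sitemap index file (which lists other sitemaps) would need an extra pass.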
- FastAPI Framework: Built on FastAPI for high performance and easy development.
- Database Support: Supports PostgreSQL and MongoDB for data storage.
- Health Check: Implements health checks for database connections on startup.
- CORS Support: Configured to allow cross-origin requests.
- Environment Configuration: Uses environment variables for configuration management.
- Python 3.10
- Virtual environment (recommended)
- Clone the repository:
git clone https://github.com/Ragib01/crawlbot.git
cd crawlbot
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install the required packages:
pip install -r requirements.txt
- Create a .env file in the root directory by copying the example file:
cp .env.example .env
Then configure your environment variables in the .env file.
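How the application reads these values is not shown here; one common pattern is sketched below. The variable names (`DATABASE_URL`, `MONGO_URI`, `ALLOWED_ORIGINS`) and their defaults are hypothetical, so check .env.example for the names the project actually expects. The sketch also assumes the values are already present in the process environment, for example after being loaded from .env by a tool such as python-dotenv.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names are assumptions."""
    database_url: str
    mongo_uri: str
    allowed_origins: list[str]


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    origins = env.get("ALLOWED_ORIGINS", "*")
    return Settings(
        database_url=env.get("DATABASE_URL", ""),
        mongo_uri=env.get("MONGO_URI", ""),
        # Comma-separated list, e.g. "http://a.test, http://b.test"
        allowed_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )


settings = load_settings()
```

Passing a plain dictionary to `load_settings` instead of relying on `os.environ` keeps the function easy to exercise in tests.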
To start the FastAPI application, run:
uvicorn app.main:app --reload
You can access the API documentation at http://127.0.0.1:8000/docs.
- Health Check: GET /v1/health
- Chatbot: POST /v1/chatbot
- Vector Database: GET /v1/vectordb
- Crawler: GET /v1/crawler
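A minimal sketch of calling these endpoints from Python with the standard library follows. It assumes the server is running locally at http://127.0.0.1:8000. The chatbot request body (`{"message": ...}`) is a hypothetical shape chosen for illustration; the actual request schema is whatever the interactive documentation at /docs reports.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8000"  # assumed local development server


def build_request(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request, or a JSON POST request when a payload is given."""
    url = BASE_URL + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request: urllib.request.Request) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


health = build_request("/v1/health")
# Hypothetical payload: the field name "message" is an assumption.
chat = build_request("/v1/chatbot", {"message": "What does this site offer?"})

# With the server running, send the requests like this:
# print(call(health))
# print(call(chat))
```

Building the request separately from sending it makes it possible to inspect the method, URL, and body without a live server.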
Contributions are welcome! Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.