This project is a minimal Retrieval-Augmented Generation (RAG) pipeline for querying content from makonetworks.com using ChromaDB for vector search and Ollama for LLM-powered answers.
The main script (main.py) performs the following steps:
- Scrapes makonetworks.com using Playwright to collect up-to-date website content.
- Splits the scraped text into manageable chunks for processing.
- Embeds the content using an Ollama embedding model (e.g., `nomic-embed-text`).
- Stores and indexes the embeddings in a local ChromaDB database for fast vector search.
- Answers user queries by retrieving relevant context from ChromaDB and generating answers with an Ollama LLM (e.g., `llama3`; `llama3.2:1b` is required for the Kubernetes deployment).
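The chunking step above can be sketched as a simple character-window splitter with overlap. This is a minimal illustration, not the exact splitter `main.py` uses; the size and overlap values are assumptions:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    A minimal stand-in for a LangChain-style text splitter; the
    defaults here are illustrative, not the project's settings.
    """
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from both neighboring chunks.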
- Python 3.9+
- Ollama (for local embeddings/LLM)
- Models: `nomic-embed-text` and `llama3`
- Playwright (for dynamic site scraping)
- Clone the repository:

  ```shell
  git clone https://github.com/eflaatten/cxai-backend.git
  cd cxai-backend
  ```

- Install dependencies:

  ```shell
  pip install -r requirements.txt
  python -m playwright install

  # Or manually:
  pip install langchain langchain-ollama chromadb playwright requests beautifulsoup4
  python -m playwright install
  ```

- [Local Ollama Only] Pull required models:

  ```shell
  ollama pull nomic-embed-text
  ollama pull llama3   # or your preferred model
  ```
The script will:
- Scrape makonetworks.com for content (using a headless browser)
- Chunk, embed, and index content with ChromaDB
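The retrieval step can be illustrated with plain cosine similarity over stored vectors. This is a toy sketch: in the actual pipeline ChromaDB performs the nearest-neighbor search and the vectors come from `nomic-embed-text`, not the hand-written ones below:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunk texts most similar to the query vector.

    `index` is a list of (chunk_text, vector) pairs -- a stand-in for
    what ChromaDB stores and searches.
    """
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]
```

The retrieved chunks are then pasted into the LLM prompt as context before generation.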
If your Ollama model is running remotely (e.g., in a Kubernetes cluster or on a remote server):
- Edit the `rag_query` function in `main.py`:

  ```python
  def rag_query(question, db, llm_model="llama3"):
      ...
      llm = OllamaLLM(
          model=llm_model,
          base_url="https://YOUR-OLLAMA-ENDPOINT",
      )
      return llm.invoke(prompt)
  ```
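Rather than hard-coding the endpoint, the base URL could be read from an environment variable so the same code runs locally and in a cluster. This is a suggested pattern, and `OLLAMA_BASE_URL` is a name chosen here for illustration, not one the project defines:

```python
import os

# Fall back to Ollama's default local endpoint when the variable is unset.
# OLLAMA_BASE_URL is an illustrative variable name, not defined by the project.
base_url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
```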
Testing Locally
- Run the API server:

  ```shell
  uvicorn main:app --host 0.0.0.0 --port 8000
  ```
- In Postman, make a POST request to `http://0.0.0.0:8000/api/rag` with the following JSON body:

  ```json
  {
    "question": "What is Mako Networks?"
  }
  ```

  You should receive a response similar to:
  ```json
  {
    "choices": [
      {
        "message": {
          "content": "Mako Networks is another name for the Mako System."
        }
      }
    ]
  }
  ```
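Outside Postman, the same request can be made from Python. The sketch below uses only the standard library; the URL assumes the `uvicorn` command shown earlier, and `extract_answer` simply mirrors the response shape above:

```python
import json
import urllib.request

def extract_answer(response: dict) -> str:
    """Pull the answer text out of the OpenAI-style response shape."""
    return response["choices"][0]["message"]["content"]

def ask(question: str, url: str = "http://0.0.0.0:8000/api/rag") -> str:
    """POST a question to the locally running RAG endpoint and return the answer."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"question": question}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_answer(json.load(resp))
```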