This project provides a web-based user interface, powered by Streamlit, for uploading your own Electronic Health Record (EHR) data from a CSV file. It uses a locally hosted Large Language Model (LLM), served by Ollama, to create an AI agent that understands questions asked in plain English, converts them into SQL queries, and returns answers based on your data.
Your data never leaves your machine, ensuring 100% privacy.
- Private & Local: Your data is processed entirely on your machine and is never sent to a third-party API.
- Natural Language Queries: Ask complex questions like "How many patients have an average health expense over $5000?" instead of writing SQL code.
- GPU Accelerated: Uses Ollama to run powerful open-source LLMs (like Llama 3, Gemma 2) on your local NVIDIA GPU for fast performance.
- Easy Setup: The entire application stack is managed by Docker Compose and runs with a single command.
- Swappable Models: Easily switch between different LLMs by changing just two lines of code.
The application runs as a multi-container setup managed by Docker Compose (a minimal compose sketch appears below):
```
[User] <--> [Web Browser] <--> [Streamlit Container (medquery-app)] <--> [Ollama Container (ollama)] <--> [NVIDIA GPU]
```
- The Ollama Container downloads and serves the LLM, accessing the host's GPU for acceleration.
- The Streamlit Container runs the Python web application, waits for Ollama to be ready, and automatically pulls the required model before starting the UI.
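To make the wiring concrete, here is a minimal sketch of what such a `docker-compose.yml` can look like. The service names and Ollama's default port (11434) follow the description above; the volume name and environment variable are illustrative assumptions, not the project's exact file:

```yaml
# Illustrative docker-compose.yml sketch; the project's actual file may differ.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama_data:/root/.ollama        # cache downloaded models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia             # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]

  medquery-app:
    build: .                             # built from the project's Dockerfile
    ports:
      - "8501:8501"                      # Streamlit's default port
    environment:
      - OLLAMA_HOST=http://ollama:11434  # Ollama's default API port
    depends_on:
      - ollama

volumes:
  ollama_data:
```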
Before you begin, ensure you have the following installed on your host machine:
- Docker (with Docker Compose)
- An NVIDIA GPU with sufficient VRAM for the chosen model (at least 8GB recommended).
- The latest NVIDIA Drivers for your operating system.
- The NVIDIA Container Toolkit, which allows Docker to access the GPU (a quick verification command follows this list).
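To confirm the toolkit is working before you start, you can ask Docker to run `nvidia-smi` inside a throwaway container; if everything is set up correctly, this prints the same GPU table you would see on the host:

```sh
docker run --rm --gpus all ubuntu nvidia-smi
```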
1. Get the Project Files

   Clone this repository or ensure you have the following files in a single directory:

   - `docker-compose.yml`
   - `Dockerfile`
   - `entrypoint.sh`
   - `app.py`
   - `requirements.txt`

2. Start the Application

   Open a terminal in the project's root directory and run the following command:

   ```sh
   docker-compose up --build
   ```

   The first time you run this, it will:

   - Build the `medquery-app` Docker image.
   - Pull the official `ollama` image.
   - Start both containers.
   - The `medquery-app` will wait for the Ollama server and then automatically run `ollama pull` to download the language model (e.g., `llama3`). This may take several minutes. (A sketch of this startup logic follows these steps.)

3. Access the UI

   Once you see the "Starting Streamlit application..." message in your terminal, open your web browser and navigate to: http://localhost:8501
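For reference, the wait-and-pull behavior of `entrypoint.sh` can be sketched as follows. The `ensure_model_exists()` function name comes from the project itself; the URL assumes the Ollama service is reachable at `http://ollama:11434`, and the pull goes through Ollama's HTTP API (the equivalent of `ollama pull`). The real script may differ in detail:

```sh
#!/bin/sh
# Sketch of entrypoint.sh's startup flow (illustrative, not the exact script).
MODEL_NAME="llama3"
OLLAMA_URL="http://ollama:11434"

ensure_model_exists() {
  # Block until the Ollama server responds.
  until curl -sf "$OLLAMA_URL/api/tags" > /dev/null; do
    echo "Waiting for Ollama server..."
    sleep 2
  done
  # Download the model via the HTTP API (equivalent to `ollama pull`).
  curl -s "$OLLAMA_URL/api/pull" -d "{\"name\": \"$MODEL_NAME\"}" > /dev/null
}

ensure_model_exists
echo "Starting Streamlit application..."
exec streamlit run app.py --server.port=8501 --server.address=0.0.0.0
```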
- In the web UI, use the sidebar to upload a CSV file containing your EHR data.
- Click the "Load Data" button. The application will load your data into a temporary, in-memory SQL database.
- Once the data is loaded, you can ask questions in the chat input box at the bottom of the page and press Enter. The agent will show its work in the terminal and display the final answer in the UI. (A sketch of the underlying pipeline follows these steps.)
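Under the hood, the pipeline described above (CSV → in-memory SQL database → LangChain SQL agent → answer) can be wired up roughly like this. This is a simplified sketch rather than the exact contents of `app.py`; the file name, table name, and specific imports are assumptions based on common LangChain usage:

```python
# Simplified sketch of the app's core logic (illustrative, not the exact code).
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.pool import StaticPool
from langchain_community.utilities import SQLDatabase
from langchain_community.agent_toolkits import create_sql_agent
from langchain_community.chat_models import ChatOllama

# Load the uploaded CSV into a temporary, in-memory SQLite database.
df = pd.read_csv("ehr_data.csv")  # in the app this comes from the upload widget
engine = create_engine(
    "sqlite://",                   # in-memory database
    connect_args={"check_same_thread": False},
    poolclass=StaticPool,          # keep one shared connection so the data persists
)
df.to_sql("ehr", engine, index=False)  # "ehr" is an illustrative table name

# Wrap the database and the Ollama-served model in a SQL agent.
db = SQLDatabase(engine)
llm = ChatOllama(model="llama3", temperature=0)
agent = create_sql_agent(llm=llm, db=db, verbose=True)  # verbose: "shows its work"

# The agent turns a plain-English question into SQL, runs it, and answers.
result = agent.invoke(
    {"input": "How many patients have an average health expense over $5000?"}
)
print(result["output"])
```

For a question like this, the agent would typically generate and execute a `GROUP BY ... HAVING AVG(...) > 5000` query against the loaded table, then phrase the numeric result as a plain-English answer.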
This project is configured to use llama3 by default, but you can easily switch to another model like Google's gemma2:9b.
1. In `entrypoint.sh`, change the `MODEL_NAME` variable to the new model you want to automatically pull:

   ```sh
   # In the ensure_model_exists() function:
   MODEL_NAME="gemma2:9b"  # Or "phi3", "llama3:70b", etc.
   ```

2. In `app.py`, update the `model` parameter in the `get_llm` function to match:

   ```python
   # In the get_llm function:
   def get_llm():
       return ChatOllama(model="gemma2:9b", ...)
   ```

3. Rebuild and restart the application with `docker-compose up --build`. (You can verify the switch with the command below.)
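Once the stack is back up, listing the models inside the `ollama` service confirms that the new model was pulled:

```sh
docker-compose exec ollama ollama list
```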
- `docker-compose.yml`: Defines and orchestrates the `ollama` and `medquery-app` services.
- `Dockerfile`: Instructs Docker on how to build the Python/Streamlit application image.
- `entrypoint.sh`: A startup script that waits for the Ollama server and automatically pulls the required model before launching the app.
- `app.py`: The core Streamlit application code, containing the UI and LangChain agent logic.
- `requirements.txt`: A list of the Python packages required for the application. (A plausible example is sketched below.)
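The exact contents of `requirements.txt` depend on the pinned versions, but based on the components described above it would plausibly list packages along these lines (an inferred sketch, not the authoritative file):

```
streamlit
pandas
sqlalchemy
langchain
langchain-community
```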