From 1dc20b63d185bf1570eef8a6ceac45405b1f682b Mon Sep 17 00:00:00 2001
From: Claude
Date: Fri, 14 Nov 2025 11:58:35 +0000
Subject: [PATCH] Add INTEGRATIONS.md showcasing LangChain4j and Quarkus
 integration examples

---
 INTEGRATIONS.md | 97 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)
 create mode 100644 INTEGRATIONS.md

diff --git a/INTEGRATIONS.md b/INTEGRATIONS.md
new file mode 100644
index 0000000..29af733
--- /dev/null
+++ b/INTEGRATIONS.md
@@ -0,0 +1,97 @@
+# Integrations
+
+This document collects minimal integration examples showing how to use GPULlama3.java with popular Java frameworks and AI orchestration libraries.
+
+## LangChain4j Integration
+
+**Repository**: [gpullama3-langchain4j-demo](https://github.com/beehive-lab/gpullama3-langchain4j-demo)
+
+A pure-Java demonstration of running LLaMA 3 models on the GPU through LangChain4j, using TornadoVM for heterogeneous computing.
+
+### Features
+
+- Basic conversational AI interactions
+- Memory-persistent multi-turn dialogues
+- Real-time token streaming
+- Agentic systems with tool-calling capabilities
+- Comparative gameplay agents (CPU vs GPU)
+
+### Requirements
+
+- Java 21+
+- Maven
+- TornadoVM properly configured with GPU access
+- ~20 GB of dedicated GPU memory for optimal performance
+
+### Quick Start
+
+```bash
+# Build the project
+mvn clean install
+
+# Run with GPU acceleration (TornadoVM)
+tornado --threadInfo --enableProfiler --printKernel -cp @cp.txt io.github.mikepapadim.YourDemo

+# Run on CPU (for comparison)
+java -Xmx20g -cp @cp.txt io.github.mikepapadim.YourDemo
+```
+
+### Performance
+
+Benchmarks show consistent GPU speedups of **3.5× to 5×** over CPU execution across various model sizes on an NVIDIA RTX 5090 GPU.
+
+---
+
+## Quarkus + LangChain4j Integration
+
+**Repository**: [gpullama3-quarkus-langchain4j-demo](https://github.com/beehive-lab/gpullama3-quarkus-langchain4j-demo)
+
+A cloud-native demonstration combining the Quarkus framework with the LangChain4j extension for GPU-accelerated language model execution in containerized Java applications.
+
+### Features
+
+- **Chat Demo**: Interactive chat application with GPU acceleration
+- **Streaming Demo**: Real-time streaming responses for conversational AI
+- Quarkus-optimized deployment for cloud-native environments
+- Hot-reload development mode
+
+### Requirements
+
+- Java 21 (suggested: 21.0.2-open)
+- Maven
+- TornadoVM with environment variables configured
+
+### Quick Start
+
+```bash
+# Build all demo modules
+mvn clean install
+
+# Run the Chat Demo
+java @$TORNADO_SDK/../../../tornado-argfile -jar demos/chat-demo/target/quarkus-app/quarkus-run.jar
+
+# Run the Streaming Demo
+java @$TORNADO_SDK/../../../tornado-argfile -jar demos/streaming-demo/target/quarkus-app/quarkus-run.jar
+
+# Development mode (with hot reload)
+mvn quarkus:dev
+```
+
+### Integration Benefits
+
+- **Lightweight**: Quarkus minimizes memory footprint and startup time
+- **Cloud-Ready**: Native Kubernetes integration and container-first design
+- **Developer Experience**: Fast hot reload and unified configuration
+- **AI-Optimized**: LangChain4j extension for seamless AI service integration
+
+---
+
+## Getting Started
+
+Both integrations require:
+
+1. **TornadoVM Setup**: Follow the [TornadoVM installation guide](https://github.com/beehive-lab/TornadoVM)
+2. **GPU Drivers**: Ensure CUDA/OpenCL drivers are properly installed
+3. **Model Files**: Download LLaMA 3 model files and configure paths
+
+For detailed implementation examples, explore the respective repositories linked above; the sketches below only illustrate the general shape of the code and are not taken verbatim from the demos.
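+
+---
+
+## Appendix A: LangChain4j Wiring Sketch
+
+The snippet below is a minimal, hedged sketch of how a LangChain4j assistant with chat memory is typically wired up, matching the "memory-persistent multi-turn dialogues" feature above. It targets the pre-1.0 LangChain4j API (`ChatLanguageModel`, `AiServices`, `MessageWindowChatMemory`); the GPU-backed model instance itself is created by the gpullama3-langchain4j-demo and is passed in as a parameter here, so no demo-specific class names are assumed. The class name `ChatSketch` and the method names are illustrative only; consult the repository for the actual model construction code.
+
+```java
+import dev.langchain4j.memory.chat.MessageWindowChatMemory;
+import dev.langchain4j.model.chat.ChatLanguageModel;
+import dev.langchain4j.service.AiServices;
+
+public class ChatSketch {
+
+    // Simple AI-service contract; LangChain4j generates the implementation.
+    interface Assistant {
+        String chat(String userMessage);
+    }
+
+    // Wire any ChatLanguageModel (here: the GPU-backed one created by the demo)
+    // into an AI service with a sliding-window chat memory for multi-turn dialogue.
+    static Assistant buildAssistant(ChatLanguageModel gpuBackedModel) {
+        return AiServices.builder(Assistant.class)
+                .chatLanguageModel(gpuBackedModel)
+                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
+                .build();
+    }
+
+    static void runDemo(ChatLanguageModel gpuBackedModel) {
+        Assistant assistant = buildAssistant(gpuBackedModel);
+        System.out.println(assistant.chat("Explain TornadoVM in one sentence."));
+        // The second turn reuses the chat memory, so the model sees the first exchange.
+        System.out.println(assistant.chat("Now give a short code example."));
+    }
+}
+```
+
+Streaming and tool calling follow the same pattern: a `StreamingChatLanguageModel` with `TokenStream` return types for streaming, and `@Tool`-annotated methods registered on the `AiServices` builder for agentic behaviour.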
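+
+---
+
+## Appendix B: Quarkus AI Service Sketch
+
+The sketch below shows the general shape of an AI service and REST resource under the Quarkus LangChain4j extension, assuming the extension and the GPULlama3.java chat-model provider are configured as in the demo (the configuration keys themselves are not shown and should be taken from the repository). The class names, the `/chat` path, and the method names are illustrative assumptions, not the demo's actual code; only the annotations and types (`@RegisterAiService`, `Multi<String>`) are standard extension API.
+
+```java
+// Imports shown once; in a real project each type would live in its own file.
+import io.quarkiverse.langchain4j.RegisterAiService;
+import io.smallrye.mutiny.Multi;
+import jakarta.inject.Inject;
+import jakarta.ws.rs.GET;
+import jakarta.ws.rs.Path;
+import jakarta.ws.rs.QueryParam;
+
+// The extension generates a CDI bean for this interface and routes calls
+// to the configured (GPU-backed) chat model.
+@RegisterAiService
+interface ChatService {
+    String chat(String question);           // blocking, full response
+    Multi<String> stream(String question);  // token-by-token streaming
+}
+
+// A plain JAX-RS resource that exposes the AI service over HTTP.
+@Path("/chat")
+public class ChatResource {
+
+    @Inject
+    ChatService chat;   // the generated AI service bean
+
+    @GET
+    public String ask(@QueryParam("q") String question) {
+        return chat.chat(question);
+    }
+
+    @GET
+    @Path("/stream")
+    public Multi<String> askStreaming(@QueryParam("q") String question) {
+        return chat.stream(question);
+    }
+}
+```
+
+With `mvn quarkus:dev` running, a request such as `curl "http://localhost:8080/chat?q=hello"` would hit the blocking endpoint, while `/chat/stream` emits the response incrementally.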