diff --git a/_data/pages_info.yml b/_data/pages_info.yml index a1b25bd2a9..5afe3efa68 100644 --- a/_data/pages_info.yml +++ b/_data/pages_info.yml @@ -4086,6 +4086,9 @@ "/docs/samples/analytics/mcp-server-ai-insights/": url: "/docs/samples/analytics/mcp-server-ai-insights/" redirect_from: [] +"/docs/samples/analytics/ollama/nginx/": + url: "/docs/samples/analytics/ollama/nginx/" + redirect_from: [] "/docs/samples/analytics/ollama/": url: "/docs/samples/analytics/ollama/" redirect_from: [] diff --git a/_includes/docs/samples/analytics/ollama.md b/_includes/docs/samples/analytics/ollama.md index c8d08b0c62..22adb20ed6 100644 --- a/_includes/docs/samples/analytics/ollama.md +++ b/_includes/docs/samples/analytics/ollama.md @@ -1,386 +1,241 @@ * TOC {:toc} -## Overview +## Introduction to Ollama -When working with AI and Large Language Models (LLMs), you may want to keep your data private or cut costs by utilizing your own hardware instead of relying on cloud-based services. -A fantastic tool for achieving this is [Ollama](https://ollama.com/){:target="_blank"}, which makes it easy to run LLMs locally. - -However, Ollama does not have an authentication mechanism built-in, making it the user's responsibility to secure the deployment when exposing it on a network. - -A common approach is to use a reverse proxy to handle authentication and securely expose the Ollama API. -This proxy acts as a gatekeeper, validating credentials before forwarding requests to Ollama. - -In this guide, we will show how to create a basic solution with Ollama and [Nginx](https://nginx.org/){:target="_blank"} as a [reverse proxy](https://en.wikipedia.org/wiki/Reverse_proxy){:target="_blank"} -and demonstrate how to connect it to the ThingsBoard platform. -We will demonstrate two common authentication methods: -- **HTTP Basic Authentication** (username and password) -- **Bearer Token Authentication** (a secret API key) - -Both services, Ollama and Nginx, will be deployed together as containers using Docker Compose. -The goal is not to provide a production-grade solution, but rather to illustrate the concept and provide a simple, working starting point for further experimentation and implementation. -This guide uses the standard Ollama Docker image without GPU acceleration to keep the setup straightforward - you can add GPU support later to significantly improve inference performance. - -{% capture https_warning %} -After completing this guide, we **strongly recommend** securing your [Nginx proxy with HTTPS](https://nginx.org/en/docs/http/configuring_https_servers.html){:target="_blank"} -to ensure that credentials (passwords or bearer tokens) are always encrypted and not sent in plain text over the network. -{% endcapture %} -{% include templates/warn-banner.md content=https_warning %} - -## Prerequisites - -Before you start, ensure you have Docker and Docker Compose installed. -The easiest way to get both is to install [Docker Desktop](https://docs.docker.com/desktop/){:target="_blank"} and ensure it is running before you proceed. - -## Setup: Project Directory - -First, create a main project directory named `ollama-nginx-auth`. All the files we create throughout this guide will be placed inside this directory. - -Next, inside the `ollama-nginx-auth` directory, create another directory named `nginx`. This is where you will store your Nginx-specific configuration files. 
- -After you are done, your directory structure should look like this: -``` -ollama-nginx-auth/ -└── nginx/ -``` - -Make sure you are working inside the main `ollama-nginx-auth` directory for the next steps. - -## Approach 1: HTTP Basic Authentication - -This method protects your endpoint with a simple username and password. -When a request is made, Nginx checks the provided credentials against an encrypted list of users in a `.htpasswd` file to grant or deny access. - -The `.htpasswd` file is a standard file used for storing usernames and passwords for basic authentication on web servers like Nginx. -Each line in the file represents a single user and contains the username followed by a colon and the encrypted (hashed) password. - -### Step 1: Create the Credential File - -From your project root (`ollama-nginx-auth`), create the `.htpasswd` file inside the `nginx` directory. This command creates a file with the username `myuser` and password `mypassword`. - -{% capture tabspec %}htpasswd-setup -htpasswd-setup-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/htpasswd-setup-linux-macos.sh -htpasswd-setup-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/htpasswd-setup-windows.ps1{% endcapture %} -{% include tabs.html %} - -### Step 2: Create the Nginx Configuration File - -Create a file named `basic_auth.conf` inside the `nginx` directory (`ollama-nginx-auth/nginx/basic_auth.conf`) and paste the following content into it. -``` -events {} - -http { - server { - listen 80; - - location / { - # This section enforces HTTP Basic Authentication - auth_basic "Restricted Access"; - auth_basic_user_file /etc/nginx/.htpasswd; # Path to credentials file inside the container - - # If authentication is successful, forward the request to Ollama - proxy_pass http://ollama:11434; - proxy_set_header Host $host; - proxy_set_header X-Real-IP $remote_addr; - proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; - - # Increase timeouts for slow model responses to prevent 504 Gateway Timeout errors - proxy_connect_timeout 300s; - proxy_send_timeout 300s; - proxy_read_timeout 300s; - } - } -} -``` -{: .copy-code} - -Here's what the configuration does: -- `listen 80;`: Nginx listens on port 80 inside the Docker container. -- `auth_basic "Restricted Access";`: Enables HTTP Basic Authentication. -- `auth_basic_user_file /etc/nginx/.htpasswd;`: Specifies the location of the password file inside the container. We will mount our local file to this path. -- `proxy_pass http://ollama:11434;`: Forwards any authenticated requests to the `ollama` service at its internal address. +[Ollama](https://ollama.com/){:target="_blank"} is an open-source tool that allows you to run Large Language Models (LLMs) locally on your own infrastructure. Think of it as +bringing the power of AI models like Llama, Mistral, or Gemma directly to your servers instead of sending requests to external cloud services. -### Step 3: Create the Docker Compose File +Unlike cloud-based AI providers such as OpenAI, Anthropic, or Google Gemini that require API calls over the internet, Ollama runs entirely within your environment. This fundamental +difference opens up new possibilities for enterprises looking to leverage AI while maintaining control over their data and infrastructure. -Create a file named `docker-compose.basic.yml` in the root of your project (`ollama-nginx-auth/docker-compose.basic.yml`) and paste the following content into it. 
-```yml -services: - ollama: - image: ollama/ollama - container_name: ollama - volumes: - - ollama_data:/root/.ollama - restart: unless-stopped - - nginx: - image: nginx:latest - container_name: nginx_proxy - ports: - - "8880:80" - volumes: - - ./nginx/basic_auth.conf:/etc/nginx/nginx.conf:ro - - ./nginx/.htpasswd:/etc/nginx/.htpasswd:ro - depends_on: - - ollama - restart: unless-stopped - -volumes: - ollama_data: -``` -{: .copy-code} - -### Step 4: Run and Test - -Start the services using the dedicated compose file. The `-f` flag specifies which file to use. This may take a some time. -```shell -docker compose -f docker-compose.basic.yml up -d -``` -{: .copy-code} - -Pull a model by executing the command directly inside the Ollama container. We'll use `gemma3:1b`, a lightweight model suitable for testing. This may take a some time. -```shell -docker exec -it ollama ollama pull gemma3:1b -``` -{: .copy-code} - -Test with your user (`myuser`): - -{% capture tabspec %}http-basic-test -http-basic-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/http-basic-test-linux-macos.sh -http-basic-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/http-basic-test-windows.ps1{% endcapture %} -{% include tabs.html %} - -Test an API call with incorrect credentials to see it fail: - -{% capture tabspec %}http-basic-failed-test -http-basic-failed-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/http-basic-failed-test-linux-macos.sh -http-basic-failed-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/http-basic-failed-test-windows.ps1{% endcapture %} -{% include tabs.html %} - -The output will show `401 Unauthorized` error. - -### Step 5: Connecting to ThingsBoard - -To connect this secured Ollama endpoint to ThingsBoard, follow [these instructions](/docs/{{docsPrefix}}samples/analytics/ai-models/#adding-ai-models-to-thingsboard){:target="_blank"} to open Ollama configuration form. - -When you reach the form, use the following settings: -- **Base URL**: `http://localhost:8880` (If ThingsBoard is running on a different machine, replace `localhost` with the IP address of the machine running Docker). -- **Authentication**: `Basic` -- **Username**: `myuser` (or any other user you created) -- **Password**: `mypassword` (the corresponding password) -- **Model ID**: `gemma3:1b` -- Optionally, configure other available settings. -- Click the **Check connectivity** button to verify the connection. - -### Step 6 (Optional): Manage Users - -You can easily add or remove users from the `.htpasswd` file. Changes to this file take effect immediately without needing to restart Nginx. - -{% capture adding-users-via-htpasswd %} -Always use the `htpasswd` command to add users. This utility correctly encrypts the password and ensures the credentials are stored in the format that Nginx requires. -Manually adding plain-text passwords to the file will not work. -{% endcapture %} -{% include templates/info-banner.md content=adding-users-via-htpasswd %} - -**To add a new user:** - -Run the `htpasswd` command again. This example adds `anotheruser` with password `anotherpassword`. 
- -{% capture tabspec %}http-basic-add-user -http-basic-add-user-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/http-basic-add-user-linux-macos.sh -http-basic-add-user-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/http-basic-add-user-windows.ps1{% endcapture %} -{% include tabs.html %} - -You can repeat this command for as many users as you need. - -**To remove a user:** - -Simply open the file `./nginx/.htpasswd` in a text editor and delete the line corresponding to the user you want to remove. - -## Approach 2: Bearer Token (API Key) Authentication - -This method uses a secret token. You will manage your keys in a simple text file, and Nginx will be configured to read them without needing a service restart. - -### Step 1: Create the API Keys File - -Create a file named `api_keys.txt` inside the `nginx` directory (`ollama-nginx-auth/nginx/api_keys.txt`) and paste your API keys into it, one per line. -``` -my-secret-api-key-1 -admin-key-abcdef -``` -{: .copy-code} - -### Step 2: Create the Nginx Configuration File - -Create a file named `bearer_token.conf` inside the `nginx` directory (`ollama-nginx-auth/nginx/bearer_token.conf`) and paste the following content into it. -This configuration includes a [Lua](https://www.lua.org/) script to read the API keys file dynamically. -``` -events {} - -http { - server { - listen 80; - - location / { - # Lua script to read keys from a file and check against the Authorization header - # This code runs for every request to this location. - access_by_lua_block { - local function trim(s) - return (s:gsub("^%s*(.-)%s*$", "%1")) - end - - -- Function to read keys from the file into a set for quick lookups - local function get_keys_from_file(path) - local keys = {} - local file = io.open(path, "r") - if not file then - ngx.log(ngx.ERR, "cannot open api keys file: ", path) - return keys - end - for line in file:lines() do - line = trim(line) - if line ~= "" then - keys[line] = true - end - end - file:close() - return keys - end - - -- Path to the keys file inside the container - local api_keys_file = "/etc/nginx/api_keys.txt" - local valid_keys = get_keys_from_file(api_keys_file) - - -- Check the Authorization header - local auth_header = ngx.var.http_authorization or "" - local _, _, token = string.find(auth_header, "Bearer%s+(.+)") - - if not token or not valid_keys[token] then - return ngx.exit(ngx.HTTP_UNAUTHORIZED) - end - } - - # If access is granted, forward the request to Ollama - proxy_pass http://ollama:11434; - proxy_set_header Host $host; - proxy_set_header X-Real-IP $remote_addr; - proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; - - # Increase timeouts for slow model responses to prevent 504 Gateway Timeout errors - proxy_connect_timeout 300s; - proxy_send_timeout 300s; - proxy_read_timeout 300s; - } - } -} -``` -{: .copy-code} - -### Step 3: Create the Docker Compose File - -Create a file named `docker-compose.bearer.yml` in the root of your project (`ollama-nginx-auth/docker-compose.bearer.yml`) and paste the following content into it. -This `docker-compose.bearer.yml` uses an Nginx image that includes the required Lua module (`openresty/openresty`). 
-```yml -services: - ollama: - image: ollama/ollama - container_name: ollama - volumes: - - ollama_data:/root/.ollama - restart: unless-stopped - - nginx: - # Use the OpenResty image which includes the Nginx Lua module - image: openresty/openresty:latest - container_name: nginx_proxy - ports: - - "8880:80" - volumes: - # Mount the new Nginx config and the API keys file - - ./nginx/bearer_token.conf:/usr/local/openresty/nginx/conf/nginx.conf:ro - - ./nginx/api_keys.txt:/etc/nginx/api_keys.txt:ro - depends_on: - - ollama - restart: unless-stopped - -volumes: - ollama_data: -``` -{: .copy-code} - -### Step 4: Run and Test - -Start the services using the dedicated compose file. The `-f` flag specifies which file to use. -```shell -docker compose -f docker-compose.bearer.yml up -d -``` -{: .copy-code} - -Pull a model (this will be quick if you did it in Approach 1): -```shell -docker exec -it ollama ollama pull gemma3:1b -``` -{: .copy-code} - -Test a request using a valid API key: - -{% capture tabspec %}bearer-test -bearer-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/bearer-test-linux-macos.sh -bearer-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/bearer-test-windows.ps1{% endcapture %} -{% include tabs.html %} - -Test with an invalid API key to see it fail: - -{% capture tabspec %}bearer-failed-test -bearer-failed-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/bearer-failed-test-linux-macos.sh -bearer-failed-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/bearer-failed-test-windows.ps1{% endcapture %} -{% include tabs.html %} - -### Step 5: Connecting to ThingsBoard - -To connect this secured Ollama endpoint to ThingsBoard, follow [these instructions](/docs/{{docsPrefix}}samples/analytics/ai-models/#adding-ai-models-to-thingsboard){:target="_blank"} to open Ollama configuration form. - -When you reach the form, use the following settings: -- **Base URL**: `http://localhost:8880` (If ThingsBoard is running on a different machine, replace `localhost` with the IP address of the machine running Docker). -- **Authentication**: `Token` -- **Token**: `my-secret-api-key-1` (or any other token you created) -- **Model ID**: `gemma3:1b` -- Optionally, configure other available settings. -- Click the **Check connectivity** button to verify the connection. - -### Step 6 (Optional): Manage API Keys - -Simply open the file `./nginx/api_keys.txt` in a text editor. Add, change, or remove keys (one per line). Save the file. - -The changes take effect immediately on the next API request because the Lua script reads the file every time a request is made. - -For example, you can edit the file, remove the `admin-key-abcdef` key, save it, and then try to use that key in a test request. -The request will now fail with a 401 Unauthorized error. - -## Usage - -To start or stop the services, you will use the `docker compose up` and `docker compose down` commands, -making sure to specify the appropriate file for the authentication approach you want to use (`docker-compose.basic.yml` or `docker-compose.bearer.yml`). 
-- To start the services for either approach, run the following command from your project directory, replacing `` with the correct file name: - ```shell - docker compose -f up -d - ``` - {: .copy-code} - -- When you're finished, stop the containers with the corresponding file name: - ```shell - docker compose -f down - ``` - {: .copy-code} - -## Next steps - -Now that you have Ollama endpoint, here are some recommended next steps: - -- **Enable HTTPS**: Secure your Nginx proxy with HTTPS by following the [official Nginx HTTPS configuration guide](https://nginx.org/en/docs/http/configuring_https_servers.html){:target="_blank"}. - -- **Add GPU Support**: Enable GPU acceleration for Ollama to significantly improve inference speed. - Use the [Ollama Docker GPU setup instructions](https://github.com/ollama/ollama/blob/main/docs/docker.md){:target="_blank"} as a starting point. - -- **Predictive Maintenance with AI**: Explore our guide on [AI-Based Anomaly Detection with ThingsBoard](/docs/{{docsPrefix}}samples/analytics/ai-predictive-maintenance/){:target="_blank"}. - You can adapt that guide to use the local Ollama model you've just configured, allowing you to run the entire solution without relying on external AI services. +## Why Consider Ollama for Your ThingsBoard Deployment? + +If you already have a ThingsBoard deployment and are exploring AI integration, Ollama addresses several key concerns: + +**Cost Reduction**: Ollama eliminates per-token charges common with cloud AI services. Once you have GPU-enabled infrastructure, you pay only for hardware and operations, making +costs predictable regardless of usage volume. + +**Data Privacy**: All data processing happens within your infrastructure. Your telemetry data and AI analysis never leave your network, helping you maintain compliance with +regulations like GDPR, HIPAA, or industry-specific requirements. + +**Network Independence**: Ollama doesn't depend on external internet connectivity or third-party service availability, making it suitable for facilities with limited internet +access or critical infrastructure where reliability is paramount. + +## Understanding Ollama Deployment Options + +The way you integrate Ollama with ThingsBoard depends largely on your current deployment architecture. Let's explore how Ollama fits with the most common ThingsBoard deployment +patterns. + +### Single Server Monolithic Deployment + +If you're running ThingsBoard as a single service on one server (such as an EC2 instance), Ollama can be deployed directly on the same machine as an additional service. This works +well if: + +- Your server has GPU capabilities (recommended for acceptable performance) +- You have sufficient memory and CPU resources to run both services +- Your AI workload is moderate and doesn't require dedicated hardware + +In this scenario, Ollama runs alongside ThingsBoard, and communication happens through localhost connections, keeping everything simple and contained. + +### Single Server Docker Compose Deployment + +For ThingsBoard deployments using Docker Compose in cluster mode (microservices), you have two options for adding Ollama: + +- **As a Docker container**: Add Ollama to your existing docker-compose.yml file, making it part of your container stack. Note that this approach may require additional + configuration to enable GPU support through Docker. +- **As a system service**: Install Ollama directly on the host system, running it as a regular Linux service. 
This approach often configures GPU support automatically during + installation, making it simpler to get hardware acceleration working. + +Both approaches work well for this deployment type, but the system service installation typically provides easier GPU access out of the box. + +### Kubernetes Cluster Deployment + +In Kubernetes environments, it's recommended to run Ollama on a separate node pool with GPU support. This approach offers several benefits: + +- **Scalability**: Kubernetes makes it straightforward to scale your Ollama deployment to meet demand. You can add GPU-enabled nodes to your node pool as your AI workload grows, + and Kubernetes will automatically distribute Ollama pods across available resources. This allows you to scale AI capabilities independently from your ThingsBoard infrastructure. +- **Security**: Kubernetes provides various features to secure your Ollama deployment, including network policies to control traffic between pods, pod security standards to enforce + security best practices, and ingress controllers to manage external access with TLS termination and authentication. +- **Complexity**: Keep in mind that Kubernetes deployments are considerably more complex to set up and maintain. You'll need to configure components like the Nvidia GPU operator + for GPU access, set up proper node selectors or taints/tolerations for pod scheduling, and manage resource quotas and limits. This requires solid Kubernetes expertise. + +### Remote Ollama Deployment + +Perhaps the most flexible approach is running Ollama on completely separate infrastructure from your ThingsBoard deployment. In this model: + +- Ollama runs on dedicated GPU-enabled servers optimized for AI workloads +- ThingsBoard makes HTTP/HTTPS requests to the remote Ollama instance + +This separation of concerns allows you to scale your AI infrastructure independently from your IoT platform and optimize each for its specific workload. + +## Understanding Authentication Requirements + +Here's an important consideration: Ollama itself does not include built-in authentication mechanisms. The software is designed to be fast and simple, leaving security +implementation to the deployment architecture around it. + +This means that without additional security layers, anyone who can reach your Ollama endpoint can use it freely. Depending on your deployment scenario, this presents different +levels of concern: + +**When authentication is critical:** + +- Ollama is exposed to untrusted networks or the internet +- Multiple teams or projects share the same Ollama instance +- Compliance requirements mandate access controls +- You need to track or limit usage + +**When authentication might be less critical:** + +- Ollama runs within a fully trusted, isolated network (e.g., Docker Compose setups or Kubernetes clusters without external access to Ollama) +- Only ThingsBoard has network access to the Ollama endpoint +- Your infrastructure already provides network-level security + +Even in trusted network scenarios, implementing authentication provides defense in depth and enables better access control and auditing. + +## ThingsBoard's Authentication Support for Ollama + +Understanding the need for flexible security options, ThingsBoard provides three authentication methods when connecting to Ollama endpoints: + +### None (No Authentication) + +This option makes unauthenticated requests to the Ollama endpoint. 
While this might seem insecure, it's appropriate for specific scenarios: + +- Ollama runs on the same server as ThingsBoard with localhost-only access +- Network-level security (firewalls, VPNs) already isolates the Ollama endpoint +- You're in a development or testing environment + +When using this option, you're relying on network architecture and infrastructure controls to secure access to Ollama. + +### Basic Authentication (Username and Password) + +HTTP Basic authentication provides a straightforward security layer using username and password credentials. When using this method, ThingsBoard encodes your credentials in Base64 +format and passes them in the Authorization header as `Basic `. + +This method: + +- Is simple to implement and understand +- Works well for smaller teams or single-user scenarios +- Requires credential management (password rotation, secure storage) + +**Important**: Always use HTTPS when using Basic authentication to ensure credentials are encrypted in transit and not sent in plain text over the network. + +### Token Authentication (Bearer Token/API Key) + +Token-based authentication uses API keys (bearer tokens) to authenticate requests. When using this method, ThingsBoard passes your token in the Authorization header as +`Bearer `. + +This approach: + +- Is familiar to anyone who has worked with cloud AI services +- Allows for easy credential rotation without changing passwords +- Supports multiple tokens for different applications or environments +- Enables fine-grained access control when combined with reverse proxy capabilities + +**Important**: Always use HTTPS when using Token authentication to ensure your API keys are encrypted in transit and not sent in plain text over the network. + +## Setting Up Authentication for Ollama + +If you need to implement Basic or Token authentication for your Ollama deployment, we provide +a [guide for setting up Nginx as a reverse proxy with authentication](/docs/samples/analytics/ollama/nginx). This guide serves as a starting point to help you get moving with +authentication implementation. + +The guide covers configuring both authentication methods with Docker Compose, providing a foundation you can adapt and expand for your specific environment. + +## Choosing the Right Authentication Method + +Selecting the appropriate authentication method depends on your specific deployment scenario: + +**Use "None" when:** + +- Ollama and ThingsBoard are on the same server communicating via localhost +- Your network architecture already provides complete isolation + +**Use "Basic Authentication" when:** + +- User management overhead should be minimal +- You have a small team or single-user access +- You have HTTPS properly configured + +**Use "Token Authentication" when:** + +- You want to align with AI industry standards +- Multiple applications or teams will access Ollama +- You need to support credential rotation without disruption +- You require better audit trails for access +- Your team is already familiar with API key management + +For most production deployments, especially remote Ollama scenarios, Token authentication offers the best balance of security and usability. + +## Configuring Ollama in ThingsBoard + +Once you have Ollama deployed and optionally secured with authentication, connecting it to ThingsBoard is straightforward. You'll configure Ollama as an AI model provider through +the ThingsBoard interface. 
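+Before opening the configuration form, it can be useful to confirm that your Ollama endpoint is reachable and that the model you plan to use has already been pulled.
+The following is a quick, minimal sanity check (an illustrative sketch, assuming Ollama is listening on its default `http://localhost:11434` and you have pulled `gemma3:1b`) that queries the Ollama REST API directly:
+```shell
+# List the models available on this Ollama instance
+curl http://localhost:11434/api/tags
+
+# Send a minimal, non-streaming generation request to the model
+curl http://localhost:11434/api/generate -d '{"model": "gemma3:1b", "prompt": "Reply with one word: ready", "stream": false}'
+```
+{: .copy-code}
+
+If your endpoint sits behind an authenticating reverse proxy, point the requests at the proxy address instead (for example, `http://localhost:8880`) and add the corresponding credentials, such as `-u myuser:mypassword` for Basic authentication or `-H "Authorization: Bearer <your-token>"` for Token authentication.
+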
+ +### Accessing the Configuration Form + +Navigate to the AI models configuration section in ThingsBoard by following the instructions +at [Adding AI Models to ThingsBoard](/docs/{{docsPrefix}}samples/analytics/ai-models/#adding-ai-models-to-thingsboard){:target="_blank"}. This will open the AI model configuration +form where you can add your Ollama endpoint. + +### Configuration Parameters + +The Ollama configuration form includes the following key settings: + +**Provider**: Select "Ollama" from the AI provider dropdown. + +**Base URL**: This is the HTTP/HTTPS endpoint where your Ollama instance is accessible. Examples: + +- `http://localhost:11434` - Ollama running locally on the same server +- `http://192.168.1.100:8880` - Ollama on another server in your network +- `https://ollama.yourdomain.com` - Ollama behind a reverse proxy with HTTPS + +**Authentication**: Choose one of three options: + +- **None**: No authentication will be used. ThingsBoard will make direct, unauthenticated requests to the Ollama endpoint. + +- **Basic**: HTTP Basic authentication using username and password credentials. + - **Username**: Your authentication username + - **Password**: Your authentication password + +- **Token**: Bearer token authentication using an API key. + - **Token**: Your API key or bearer token + +**Model ID**: The specific Ollama model you want to use (e.g., `llama3:8b`, `mistral:7b`, `gemma3:1b`). This should match exactly with a model you've pulled into your Ollama +instance. + +**Temperature, Top P, Top K, Maximum Output Tokens**: These parameters control the model's response behavior and are common across AI providers. Configure them according to your +specific use case requirements. + +**Context Length**: This critical setting determines how much context (conversation history, system prompts, input data) the model can process in a single request. + +### Understanding Context Length + +The context length setting deserves special attention when working with Ollama, as it significantly impacts both the model's capabilities and your server's resource utilization. + +Context length (also called context window) is the total number of tokens the model can process in a single request, including your system prompts, input data, and the model's +generated response. You need to set this value high enough to accommodate all your inputs plus the expected output without losing any data. + +**Memory considerations:** +Unlike cloud services where infrastructure scales automatically, with Ollama you're managing fixed hardware resources. Context length significantly affects GPU memory usage - +larger context windows require substantially more memory. The exact relationship varies by model size and architecture, but the impact is considerable. + +**Determining the right value:** +The appropriate context length for your use case is best determined empirically. Start with a reasonable estimate based on your typical input size plus expected output length, then +adjust based on: + +- Whether requests are being truncated +- Actual memory usage on your GPU +- Performance and response times + +If you find data being cut off, increase the context length. If memory usage is too high or performance suffers, consider reducing it or using a smaller model. + +### Testing Your Configuration + +After filling in all required fields, click the **Check connectivity** button at the bottom of the form. 
A successful test will show a green checkbox confirming that ThingsBoard +can communicate with your Ollama endpoint and the specified model is available. + +## Using Ollama in ThingsBoard + +For a practical example of using AI models in ThingsBoard, including Ollama, check out +our [Predictive Maintenance with AI guide](https://thingsboard.io/docs/samples/analytics/ai-predictive-maintenance/){:target="_blank"}. This guide demonstrates how to use AI for +anomaly detection and predictive maintenance scenarios, showcasing real-world applications of AI integration in IoT systems. diff --git a/_includes/docs/samples/analytics/ollama/nginx.md b/_includes/docs/samples/analytics/ollama/nginx.md new file mode 100644 index 0000000000..1161325337 --- /dev/null +++ b/_includes/docs/samples/analytics/ollama/nginx.md @@ -0,0 +1,362 @@ +* TOC +{:toc} + +## Overview {#overview} + +[Ollama](https://ollama.com/){:target="_blank"} is a powerful tool for running Large Language Models (LLMs) locally, but it does not include built-in authentication mechanisms. +When exposing Ollama on a network, securing the API endpoint becomes your responsibility. + +This guide demonstrates how to deploy Ollama with [Nginx](https://nginx.org/){:target="_blank"} as a [reverse proxy](https://en.wikipedia.org/wiki/Reverse_proxy){:target="_blank"} +to add authentication to your Ollama deployment. The Nginx proxy acts as a security gatekeeper, validating credentials before forwarding requests to the Ollama service. + +We will cover two common authentication methods: +- **HTTP Basic Authentication** (username and password) +- **Bearer Token Authentication** (a secret API key) + +Both services, Ollama and Nginx, will be deployed together as containers using Docker Compose. +This guide focuses on demonstrating the concept with a working implementation that you can use as a foundation for further customization. +We use the standard Ollama Docker image without GPU acceleration to keep the setup straightforward, though GPU support can be added later for improved performance. + +{% capture https_warning %} +After completing this guide, we **strongly recommend** securing your [Nginx proxy with HTTPS](https://nginx.org/en/docs/http/configuring_https_servers.html){:target="_blank"} +to ensure that credentials (passwords or bearer tokens) are always encrypted and not sent in plain text over the network. +{% endcapture %} +{% include templates/warn-banner.md content=https_warning %} + +## Prerequisites {#prerequisites} + +Before you start, ensure you have Docker and Docker Compose installed. +The easiest way to get both is to install [Docker Desktop](https://docs.docker.com/desktop/){:target="_blank"} and ensure it is running before you proceed. + +## Setup: Project Directory {#project-setup} + +First, create a main project directory named `ollama-nginx-auth`. All the files we create throughout this guide will be placed inside this directory. + +Next, inside the `ollama-nginx-auth` directory, create another directory named `nginx`. This is where you will store your Nginx-specific configuration files. + +After you are done, your directory structure should look like this: +``` +ollama-nginx-auth/ +└── nginx/ +``` + +Make sure you are working inside the main `ollama-nginx-auth` directory for the next steps. + +## Approach 1: HTTP Basic Authentication {#basic-auth} + +This method protects your endpoint with a simple username and password. 
+When a request is made, Nginx checks the provided credentials against an encrypted list of users in a `.htpasswd` file to grant or deny access. + +The `.htpasswd` file is a standard file used for storing usernames and passwords for basic authentication on web servers like Nginx. +Each line in the file represents a single user and contains the username followed by a colon and the encrypted (hashed) password. + +### Step 1: Create the Credential File {#basic-credentials} + +From your project root (`ollama-nginx-auth`), create the `.htpasswd` file inside the `nginx` directory. This command creates a file with the username `myuser` and password `mypassword`. + +{% capture tabspec %}htpasswd-setup +htpasswd-setup-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/htpasswd-setup-linux-macos.sh +htpasswd-setup-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/htpasswd-setup-windows.ps1{% endcapture %} +{% include tabs.html %} + +### Step 2: Create the Nginx Configuration File {#basic-config} + +Create a file named `basic_auth.conf` inside the `nginx` directory (`ollama-nginx-auth/nginx/basic_auth.conf`) and paste the following content into it. +``` +events {} + +http { + server { + listen 80; + + location / { + # This section enforces HTTP Basic Authentication + auth_basic "Restricted Access"; + auth_basic_user_file /etc/nginx/.htpasswd; # Path to credentials file inside the container + + # If authentication is successful, forward the request to Ollama + proxy_pass http://ollama:11434; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + + # Increase timeouts for slow model responses to prevent 504 Gateway Timeout errors + proxy_connect_timeout 300s; + proxy_send_timeout 300s; + proxy_read_timeout 300s; + } + } +} +``` +{: .copy-code} + +Here's what the configuration does: +- `listen 80;`: Nginx listens on port 80 inside the Docker container. +- `auth_basic "Restricted Access";`: Enables HTTP Basic Authentication. +- `auth_basic_user_file /etc/nginx/.htpasswd;`: Specifies the location of the password file inside the container. We will mount our local file to this path. +- `proxy_pass http://ollama:11434;`: Forwards any authenticated requests to the `ollama` service at its internal address. + +### Step 3: Create the Docker Compose File {#basic-compose} + +Create a file named `docker-compose.basic.yml` in the root of your project (`ollama-nginx-auth/docker-compose.basic.yml`) and paste the following content into it. +```yml +services: + ollama: + image: ollama/ollama + container_name: ollama + volumes: + - ollama_data:/root/.ollama + restart: unless-stopped + + nginx: + image: nginx:latest + container_name: nginx_proxy + ports: + - "8880:80" + volumes: + - ./nginx/basic_auth.conf:/etc/nginx/nginx.conf:ro + - ./nginx/.htpasswd:/etc/nginx/.htpasswd:ro + depends_on: + - ollama + restart: unless-stopped + +volumes: + ollama_data: +``` +{: .copy-code} + +### Step 4: Run and Test {#basic-test} + +Start the services using the dedicated compose file. The `-f` flag specifies which file to use. This may take a some time. +```shell +docker compose -f docker-compose.basic.yml up -d +``` +{: .copy-code} + +Pull a model by executing the command directly inside the Ollama container. We'll use `gemma3:1b`, a lightweight model suitable for testing. This may take a some time. 
+```shell +docker exec -it ollama ollama pull gemma3:1b +``` +{: .copy-code} + +Test with your user (`myuser`): + +{% capture tabspec %}http-basic-test +http-basic-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/http-basic-test-linux-macos.sh +http-basic-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/http-basic-test-windows.ps1{% endcapture %} +{% include tabs.html %} + +Test an API call with incorrect credentials to see it fail: + +{% capture tabspec %}http-basic-failed-test +http-basic-failed-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/http-basic-failed-test-linux-macos.sh +http-basic-failed-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/http-basic-failed-test-windows.ps1{% endcapture %} +{% include tabs.html %} + +The output will show `401 Unauthorized` error. + +### Step 5 (Optional): Manage Users {#basic-manage-users} + +You can easily add or remove users from the `.htpasswd` file. Changes to this file take effect immediately without needing to restart Nginx. + +{% capture adding-users-via-htpasswd %} +Always use the `htpasswd` command to add users. This utility correctly encrypts the password and ensures the credentials are stored in the format that Nginx requires. +Manually adding plain-text passwords to the file will not work. +{% endcapture %} +{% include templates/info-banner.md content=adding-users-via-htpasswd %} + +**To add a new user:** + +Run the `htpasswd` command again. This example adds `anotheruser` with password `anotherpassword`. + +{% capture tabspec %}http-basic-add-user +http-basic-add-user-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/http-basic-add-user-linux-macos.sh +http-basic-add-user-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/http-basic-add-user-windows.ps1{% endcapture %} +{% include tabs.html %} + +You can repeat this command for as many users as you need. + +**To remove a user:** + +Simply open the file `./nginx/.htpasswd` in a text editor and delete the line corresponding to the user you want to remove. + +## Approach 2: Bearer Token (API Key) Authentication {#bearer-token} + +This method uses a secret token. You will manage your keys in a simple text file, and Nginx will be configured to read them without needing a service restart. + +### Step 1: Create the API Keys File {#bearer-keys} + +Create a file named `api_keys.txt` inside the `nginx` directory (`ollama-nginx-auth/nginx/api_keys.txt`) and paste your API keys into it, one per line. +``` +my-secret-api-key-1 +admin-key-abcdef +``` +{: .copy-code} + +### Step 2: Create the Nginx Configuration File {#bearer-config} + +Create a file named `bearer_token.conf` inside the `nginx` directory (`ollama-nginx-auth/nginx/bearer_token.conf`) and paste the following content into it. +This configuration includes a [Lua](https://www.lua.org/) script to read the API keys file dynamically. +``` +events {} + +http { + server { + listen 80; + + location / { + # Lua script to read keys from a file and check against the Authorization header + # This code runs for every request to this location. 
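+            # Because the script re-reads api_keys.txt on every request, key changes
+            # take effect immediately without reloading Nginx, at the cost of a small
+            # file read per request.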
+ access_by_lua_block { + local function trim(s) + return (s:gsub("^%s*(.-)%s*$", "%1")) + end + + -- Function to read keys from the file into a set for quick lookups + local function get_keys_from_file(path) + local keys = {} + local file = io.open(path, "r") + if not file then + ngx.log(ngx.ERR, "cannot open api keys file: ", path) + return keys + end + for line in file:lines() do + line = trim(line) + if line ~= "" then + keys[line] = true + end + end + file:close() + return keys + end + + -- Path to the keys file inside the container + local api_keys_file = "/etc/nginx/api_keys.txt" + local valid_keys = get_keys_from_file(api_keys_file) + + -- Check the Authorization header + local auth_header = ngx.var.http_authorization or "" + local _, _, token = string.find(auth_header, "Bearer%s+(.+)") + + if not token or not valid_keys[token] then + return ngx.exit(ngx.HTTP_UNAUTHORIZED) + end + } + + # If access is granted, forward the request to Ollama + proxy_pass http://ollama:11434; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + + # Increase timeouts for slow model responses to prevent 504 Gateway Timeout errors + proxy_connect_timeout 300s; + proxy_send_timeout 300s; + proxy_read_timeout 300s; + } + } +} +``` +{: .copy-code} + +Here's what the configuration does: +- `listen 80;`: Nginx listens on port 80 inside the Docker container. +- `access_by_lua_block`: Executes a Lua script for each request to validate the Bearer token. + - The script reads valid API keys from `/etc/nginx/api_keys.txt` on every request. + - It extracts the token from the `Authorization: Bearer ` header. + - If the token is missing or not found in the valid keys list, it returns a 401 Unauthorized response. +- `proxy_pass http://ollama:11434;`: Forwards any authenticated requests to the `ollama` service at its internal address. + +### Step 3: Create the Docker Compose File {#bearer-compose} + +Create a file named `docker-compose.bearer.yml` in the root of your project (`ollama-nginx-auth/docker-compose.bearer.yml`) and paste the following content into it. +This `docker-compose.bearer.yml` uses an Nginx image that includes the required Lua module (`openresty/openresty`). +```yml +services: + ollama: + image: ollama/ollama + container_name: ollama + volumes: + - ollama_data:/root/.ollama + restart: unless-stopped + + nginx: + # Use the OpenResty image which includes the Nginx Lua module + image: openresty/openresty:latest + container_name: nginx_proxy + ports: + - "8880:80" + volumes: + # Mount the new Nginx config and the API keys file + - ./nginx/bearer_token.conf:/usr/local/openresty/nginx/conf/nginx.conf:ro + - ./nginx/api_keys.txt:/etc/nginx/api_keys.txt:ro + depends_on: + - ollama + restart: unless-stopped + +volumes: + ollama_data: +``` +{: .copy-code} + +### Step 4: Run and Test {#bearer-test} + +Start the services using the dedicated compose file. The `-f` flag specifies which file to use. 
+```shell +docker compose -f docker-compose.bearer.yml up -d +``` +{: .copy-code} + +Pull a model (this will be quick if you did it in Approach 1): +```shell +docker exec -it ollama ollama pull gemma3:1b +``` +{: .copy-code} + +Test a request using a valid API key: + +{% capture tabspec %}bearer-test +bearer-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/bearer-test-linux-macos.sh +bearer-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/bearer-test-windows.ps1{% endcapture %} +{% include tabs.html %} + +Test with an invalid API key to see it fail: + +{% capture tabspec %}bearer-failed-test +bearer-failed-test-linux-macos,Linux/macOS,shell,/docs/samples/analytics/resources/bearer-failed-test-linux-macos.sh +bearer-failed-test-windows,Windows (PowerShell),text,/docs/samples/analytics/resources/bearer-failed-test-windows.ps1{% endcapture %} +{% include tabs.html %} + +### Step 5 (Optional): Manage API Keys {#bearer-manage-keys} + +Simply open the file `./nginx/api_keys.txt` in a text editor. Add, change, or remove keys (one per line). Save the file. + +The changes take effect immediately on the next API request because the Lua script reads the file every time a request is made. + +For example, you can edit the file, remove the `admin-key-abcdef` key, save it, and then try to use that key in a test request. +The request will now fail with a 401 Unauthorized error. + +## Usage {#usage} + +To start or stop the services, you will use the `docker compose up` and `docker compose down` commands, +making sure to specify the appropriate file for the authentication approach you want to use (`docker-compose.basic.yml` or `docker-compose.bearer.yml`). +- To start the services for either approach, run the following command from your project directory, replacing `` with the correct file name: + ```shell + docker compose -f up -d + ``` + {: .copy-code} + +- When you're finished, stop the containers with the corresponding file name: + ```shell + docker compose -f down + ``` + {: .copy-code} + +## Next steps {#next-steps} + +Now that you have Ollama endpoint, here are some recommended next steps: + +- **Enable HTTPS**: Secure your Nginx proxy with HTTPS by following the [official Nginx HTTPS configuration guide](https://nginx.org/en/docs/http/configuring_https_servers.html){:target="_blank"}. + +- **Add GPU Support**: Enable GPU acceleration for Ollama to significantly improve inference speed. + Use the [Ollama Docker GPU setup instructions](https://github.com/ollama/ollama/blob/main/docs/docker.md){:target="_blank"} as a starting point. diff --git a/docs/paas/eu/samples/analytics/ollama.md b/docs/paas/eu/samples/analytics/ollama.md index e9f49c8180..ada19e985f 100644 --- a/docs/paas/eu/samples/analytics/ollama.md +++ b/docs/paas/eu/samples/analytics/ollama.md @@ -1,7 +1,7 @@ --- layout: docwithnav-paas-eu -title: Running AI on Your Own Hardware - Securing Ollama with an Nginx Reverse Proxy -description: Secure your local Ollama LLM deployment with Nginx and Docker Compose. This step-by-step guide provides copy-paste commands to easily set up username/password or API key authentication, including how to connect securely from ThingsBoard. +title: Local AI with Ollama - Integrating Ollama with ThingsBoard +description: Learn how to integrate Ollama self-hosted AI models with ThingsBoard to reduce costs, maintain data privacy, and run AI entirely on your infrastructure. 
--- {% assign docsPrefix = "paas/eu/" %} diff --git a/docs/paas/samples/analytics/ollama.md b/docs/paas/samples/analytics/ollama.md index 5395defe92..083b65d619 100644 --- a/docs/paas/samples/analytics/ollama.md +++ b/docs/paas/samples/analytics/ollama.md @@ -1,7 +1,7 @@ --- layout: docwithnav-paas -title: Running AI on Your Own Hardware - Securing Ollama with an Nginx Reverse Proxy -description: Secure your local Ollama LLM deployment with Nginx and Docker Compose. This step-by-step guide provides copy-paste commands to easily set up username/password or API key authentication, including how to connect securely from ThingsBoard. +title: Local AI with Ollama - Integrating Ollama with ThingsBoard +description: Learn how to integrate Ollama self-hosted AI models with ThingsBoard to reduce costs, maintain data privacy, and run AI entirely on your infrastructure. --- {% assign docsPrefix = "paas/" %} diff --git a/docs/pe/samples/analytics/ollama.md b/docs/pe/samples/analytics/ollama.md index 45722338ef..5e0dc13131 100644 --- a/docs/pe/samples/analytics/ollama.md +++ b/docs/pe/samples/analytics/ollama.md @@ -1,7 +1,7 @@ --- layout: docwithnav-pe -title: Running AI on Your Own Hardware - Securing Ollama with an Nginx Reverse Proxy -description: Secure your local Ollama LLM deployment with Nginx and Docker Compose. This step-by-step guide provides copy-paste commands to easily set up username/password or API key authentication, including how to connect securely from ThingsBoard. +title: Local AI with Ollama - Integrating Ollama with ThingsBoard +description: Learn how to integrate Ollama self-hosted AI models with ThingsBoard to reduce costs, maintain data privacy, and run AI entirely on your infrastructure. --- {% assign docsPrefix = "pe/" %} diff --git a/docs/samples/analytics/ollama.md b/docs/samples/analytics/ollama.md index 48093a47a6..a6b5a21875 100644 --- a/docs/samples/analytics/ollama.md +++ b/docs/samples/analytics/ollama.md @@ -1,7 +1,7 @@ --- layout: docwithnav -title: Running AI on Your Own Hardware - Securing Ollama with an Nginx Reverse Proxy -description: Secure your local Ollama LLM deployment with Nginx and Docker Compose. This step-by-step guide provides copy-paste commands to easily set up username/password or API key authentication, including how to connect securely from ThingsBoard. +title: Local AI with Ollama - Integrating Ollama with ThingsBoard +description: Learn how to integrate Ollama self-hosted AI models with ThingsBoard to reduce costs, maintain data privacy, and run AI entirely on your infrastructure. --- {% include get-hosts-name.html docsPrefix=docsPrefix %} diff --git a/docs/samples/analytics/ollama/nginx.md b/docs/samples/analytics/ollama/nginx.md new file mode 100644 index 0000000000..0cf2b79c84 --- /dev/null +++ b/docs/samples/analytics/ollama/nginx.md @@ -0,0 +1,8 @@ +--- +layout: docwithnav +title: Securing Ollama with an Nginx Reverse Proxy +description: Secure your local Ollama LLM deployment with Nginx and Docker Compose. This step-by-step guide provides copy-paste commands to easily set up username/password or API key authentication. +--- + +{% include get-hosts-name.html docsPrefix=docsPrefix %} +{% include /docs/samples/analytics/ollama/nginx.md %}