@@ -62,7 +62,7 @@ services:
command:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- --collector.filesystem.ignored-mount-points
- --collector.filesystem.mount-points-exclude
- "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)"
ports:
- 9100:9100
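
This hunk renames the node-exporter flag from the deprecated `--collector.filesystem.ignored-mount-points` to its current spelling, `--collector.filesystem.mount-points-exclude`. A quick way to confirm which form the pulled image accepts (a sketch; assumes Docker can pull the same `prom/node-exporter` image used by the compose file):

```bash
# Newer node-exporter releases list --collector.filesystem.mount-points-exclude in their help
# output; older ones only know the deprecated ignored-mount-points form.
docker run --rm prom/node-exporter --help 2>&1 | grep -i 'mount-points'
```
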
60 changes: 47 additions & 13 deletions DocSum/docker_compose/intel/cpu/xeon/README.md
@@ -13,13 +13,26 @@ This example includes the following sections:

This section describes how to quickly deploy and test the DocSum service manually on an Intel Xeon platform. The basic steps are:

1. [Access the Code](#access-the-code)
2. [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
3. [Configure the Deployment Environment](#configure-the-deployment-environment)
4. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
5. [Check the Deployment Status](#check-the-deployment-status)
6. [Test the Pipeline](#test-the-pipeline)
7. [Cleanup the Deployment](#cleanup-the-deployment)
- [Example DocSum deployments on Intel Xeon Processor](#example-docsum-deployments-on-intel-xeon-processor)
- [DocSum Quick Start Deployment](#docsum-quick-start-deployment)
- [Access the Code and Set Up Environment](#access-the-code-and-set-up-environment)
- [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
- [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
- [Option #1](#option-1)
- [Option #2](#option-2)
- [Check the Deployment Status](#check-the-deployment-status)
- [Test the Pipeline](#test-the-pipeline)
- [Cleanup the Deployment](#cleanup-the-deployment)
- [DocSum Docker Compose Files](#docsum-docker-compose-files)
- [Running LLM models with remote endpoints](#running-llm-models-with-remote-endpoints)
- [DocSum Detailed Usage](#docsum-detailed-usage)
- [Query with text](#query-with-text)
- [Query with audio and video](#query-with-audio-and-video)
- [Query with long context](#query-with-long-context)
- [Launch the UI](#launch-the-ui)
- [Gradio UI](#gradio-ui)
- [Launch the Svelte UI](#launch-the-svelte-ui)
- [Launch the React UI (Optional)](#launch-the-react-ui-optional)

### Access the Code and Set Up Environment

@@ -28,7 +41,7 @@ Clone the GenAIExample repository and access the ChatQnA Intel Xeon platform Doc
```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/DocSum/docker_compose
source intel/set_env.sh
source intel/cpu/xeon/set_env.sh
```

> NOTE: By default, vLLM performs a "warmup" at startup to optimize its performance for the specified model and the underlying platform, which can take a long time. For development (and, e.g., autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`.
@@ -47,13 +60,26 @@ Some HuggingFace resources, such as some models, are only accessible if you have
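
Once a token is generated, export it in the shell used for deployment so that Docker Compose can pass it to the services as `HF_TOKEN` (a minimal sketch; the value shown is a hypothetical placeholder, not a real token):

```bash
# Placeholder value -- replace with your own HuggingFace access token
export HF_TOKEN="hf_xxxxxxxxxxxxxxxxxxxxxxxx"
```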

### Deploy the Services Using Docker Compose

#### Option #1

To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
cd intel/cpu/xeon/
docker compose up -d
```
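
A quick sanity check that the containers came up (the detailed checks are described in [Check the Deployment Status](#check-the-deployment-status)):

```bash
docker compose ps
```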

#### Option #2

> NOTE: To enable monitoring, the `compose.monitoring.yaml` file needs to be merged with the default `compose.yaml` file.

To deploy with monitoring:

```bash
cd intel/cpu/xeon/
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
```
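
After the monitoring stack starts, one way to confirm that Prometheus is scraping its targets and that Grafana is reachable is to query their HTTP APIs (a sketch; assumes the default 9090/3000 port mappings from `compose.monitoring.yaml` and the `host_ip` exported by `set_env.sh`):

```bash
# Health of every Prometheus scrape target defined in prometheus.yaml
curl -s http://${host_ip}:9090/api/v1/targets | grep -o '"health":"[^"]*"'

# Grafana liveness endpoint (port 3000 per compose.monitoring.yaml)
curl -s http://${host_ip}:3000/api/health
```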

**Note**: Developers should build the Docker image from source when:

- Developing off the git main branch (as the container's ports in the repo may be different from the published docker image).
@@ -109,17 +135,25 @@ To stop the containers associated with the deployment, execute the following com
docker compose -f compose.yaml down
```

If monitoring is enabled, execute the following command instead:

```bash
cd intel/cpu/xeon/
docker compose -f compose.yaml -f compose.monitoring.yaml down
```

All the DocSum containers will be stopped and then removed on completion of the "down" command.

## DocSum Docker Compose Files

In the context of deploying a DocSum pipeline on an Intel® Xeon® platform, we can pick and choose different large language model serving frameworks. The table below outlines the various configurations that are available as part of the application.

| File | Description |
| -------------------------------------------- | -------------------------------------------------------------------------------------- |
| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework |
| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as default |
| [compose_remote.yaml](./compose_remote.yaml) | Uses remote inference endpoints for LLMs. All other configurations are same as default |
| File | Description |
| ---------------------------------------------------- | -------------------------------------------------------------------------------------- |
| [compose.yaml](./compose.yaml)                       | Default compose file using vLLM as the serving framework                                       |
| [compose_tgi.yaml](./compose_tgi.yaml)               | The LLM serving framework is TGI. All other configurations remain the same as the default      |
| [compose_remote.yaml](./compose_remote.yaml)         | Uses remote inference endpoints for LLMs. All other configurations are the same as the default |
| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. Can be used along with any of the compose files           |
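
Because `compose.monitoring.yaml` only adds the monitoring services, it can be layered on top of any of the serving variants. For example, a sketch combining it with the TGI variant:

```bash
docker compose -f compose_tgi.yaml -f compose.monitoring.yaml up -d
```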

### Running LLM models with remote endpoints

59 changes: 59 additions & 0 deletions DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml
@@ -0,0 +1,59 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
prometheus:
image: prom/prometheus:v2.52.0
container_name: opea_prometheus
user: root
volumes:
- ./prometheus.yaml:/etc/prometheus/prometheus.yaml
- ./prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yaml'
ports:
- '9090:9090'
ipc: host
restart: unless-stopped

grafana:
image: grafana/grafana:11.0.0
container_name: grafana
volumes:
- ./grafana_data:/var/lib/grafana
- ./grafana/dashboards:/var/lib/grafana/dashboards
- ./grafana/provisioning:/etc/grafana/provisioning
user: root
environment:
GF_SECURITY_ADMIN_PASSWORD: admin
GF_RENDERING_CALLBACK_URL: http://grafana:3000/
GF_LOG_FILTERS: rendering:debug
no_proxy: ${no_proxy}
host_ip: ${host_ip}
depends_on:
- prometheus
ports:
- '3000:3000'
ipc: host
restart: unless-stopped

node-exporter:
image: prom/node-exporter
container_name: node-exporter
volumes:
- /proc:/host/proc:ro
- /sys:/host/sys:ro
- /:/rootfs:ro
command:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- --collector.filesystem.ignored-mount-points
- "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)"
environment:
no_proxy: ${no_proxy}
ports:
- 9100:9100
ipc: host
restart: always
deploy:
mode: global
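
The prometheus and grafana services bind-mount `./prometheus_data` and `./grafana_data` from the compose directory; if those directories do not exist, Docker creates them owned by root. A minimal sketch for creating them explicitly beforehand (a hypothetical preparatory step, not something the compose file requires):

```bash
cd intel/cpu/xeon/
mkdir -p prometheus_data grafana_data
```
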
11 changes: 11 additions & 0 deletions DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh
@@ -0,0 +1,11 @@
#!/bin/bash
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# Remove any dashboards left over from a previous download
if ls *.json 1> /dev/null 2>&1; then
  rm *.json
fi

wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/docsum_megaservice_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json
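
This helper is sourced by the updated `set_env.sh` later in this diff. It can also be run on its own; a sketch, assuming the script lives under `intel/cpu/xeon/grafana/dashboards/` as the `set_env.sh` changes suggest:

```bash
cd intel/cpu/xeon/grafana/dashboards
bash download_opea_dashboard.sh
ls *.json   # the four OPEA dashboards should now be present
```
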
@@ -0,0 +1,14 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: 1

providers:
- name: 'default'
orgId: 1
folder: ''
type: file
disableDeletion: false
updateIntervalSeconds: 10 #how often Grafana will scan for changed dashboards
options:
path: /var/lib/grafana/dashboards
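
This provider makes Grafana load every dashboard JSON found under `/var/lib/grafana/dashboards`, which `compose.monitoring.yaml` bind-mounts from `./grafana/dashboards` on the host, i.e. the directory that `download_opea_dashboard.sh` populates. A quick host-side check that the dashboards are in place before startup (a sketch, assuming that layout):

```bash
ls intel/cpu/xeon/grafana/dashboards/*_grafana.json
```
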
@@ -0,0 +1,54 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# config file version
apiVersion: 1

# list of datasources that should be deleted from the database
deleteDatasources:
- name: Prometheus
orgId: 1

# list of datasources to insert/update depending
# what's available in the database
datasources:
# <string, required> name of the datasource. Required
- name: Prometheus
# <string, required> datasource type. Required
type: prometheus
# <string, required> access mode. direct or proxy. Required
access: proxy
# <int> org id. will default to orgId 1 if not specified
orgId: 1
# <string> url
url: http://$host_ip:9090
# <string> database password, if used
password:
# <string> database user, if used
user:
# <string> database name, if used
database:
# <bool> enable/disable basic auth
basicAuth: false
# <string> basic auth username, if used
basicAuthUser:
# <string> basic auth password, if used
basicAuthPassword:
# <bool> enable/disable with credentials headers
withCredentials:
# <bool> mark as default datasource. Max one per org
isDefault: true
# <map> fields that will be converted to json and stored in json_data
jsonData:
httpMethod: GET
graphiteVersion: "1.1"
tlsAuth: false
tlsAuthWithCACert: false
# <string> json object of data that will be encrypted.
secureJsonData:
tlsCACert: "..."
tlsClientCert: "..."
tlsClientKey: "..."
version: 1
# <bool> allow users to edit datasources from the UI.
editable: true
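
Grafana expands `$host_ip` in provisioning files from environment variables visible to the Grafana process, which is why `compose.monitoring.yaml` passes `host_ip: ${host_ip}` into the grafana container. One way to confirm the datasource resolved to the expected URL after startup (a sketch, assuming the default admin credentials set in the compose file):

```bash
curl -s -u admin:admin http://${host_ip}:3000/api/datasources
```
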
27 changes: 27 additions & 0 deletions DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml
@@ -0,0 +1,27 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# [IP_ADDR]:{PORT_OUTSIDE_CONTAINER} -> {PORT_INSIDE_CONTAINER} / {PROTOCOL}
global:
scrape_interval: 5s
external_labels:
monitor: "my-monitor"
scrape_configs:
- job_name: "prometheus"
static_configs:
- targets: ["opea_prometheus:9090"]
- job_name: "vllm"
metrics_path: /metrics
static_configs:
- targets: ["docsum-xeon-vllm-service:80"]
- job_name: "tgi"
metrics_path: /metrics
static_configs:
- targets: ["docsum-xeon-tgi-server:80"]
- job_name: "docsum-backend-server"
metrics_path: /metrics
static_configs:
- targets: ["docsum-xeon-backend-server:8888"]
- job_name: "prometheus-node-exporter"
metrics_path: /metrics
static_configs:
- targets: ["node-exporter:9100"]
DocSum/docker_compose/intel/cpu/xeon/set_env.sh
Expand Up @@ -2,15 +2,14 @@

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
pushd "${SCRIPT_DIR}/../../.." > /dev/null

SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &> /dev/null && pwd)

pushd "$SCRIPT_DIR/../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null

export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1"
export no_proxy="${no_proxy},${host_ip}" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export http_proxy=$http_proxy
export https_proxy=$https_proxy
export HF_TOKEN=${HF_TOKEN}

export LLM_ENDPOINT_PORT=8008
@@ -41,3 +40,13 @@ export NUM_CARDS=1
export BLOCK_SIZE=128
export MAX_NUM_SEQS=256
export MAX_SEQ_LEN_TO_CAPTURE=2048

# Download Grafana configurations
pushd "${SCRIPT_DIR}/grafana/dashboards" > /dev/null
source download_opea_dashboard.sh
popd > /dev/null

# Set network proxy settings
export no_proxy="${no_proxy},${host_ip},docsum-xeon-vllm-service,docsum-xeon-tgi-server,docsum-xeon-backend-server,opea_prometheus,grafana,node-exporter,$JAEGER_IP" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export http_proxy=$http_proxy
export https_proxy=$https_proxy
50 changes: 38 additions & 12 deletions DocSum/docker_compose/intel/hpu/gaudi/README.md
@@ -15,13 +15,25 @@ This example includes the following sections:

This section describes how to quickly deploy and test the DocSum service manually on an Intel® Gaudi® platform. The basic steps are:

1. [Access the Code](#access-the-code)
2. [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
3. [Configure the Deployment Environment](#configure-the-deployment-environment)
4. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
5. [Check the Deployment Status](#check-the-deployment-status)
6. [Test the Pipeline](#test-the-pipeline)
7. [Cleanup the Deployment](#cleanup-the-deployment)
- [Example DocSum deployments on Intel® Gaudi® Platform](#example-docsum-deployments-on-intel-gaudi-platform)
- [DocSum Quick Start Deployment](#docsum-quick-start-deployment)
- [Access the Code and Set Up Environment](#access-the-code-and-set-up-environment)
- [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
- [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
- [Option #1](#option-1)
- [Option #2](#option-2)
- [Check the Deployment Status](#check-the-deployment-status)
- [Test the Pipeline](#test-the-pipeline)
- [Cleanup the Deployment](#cleanup-the-deployment)
- [DocSum Docker Compose Files](#docsum-docker-compose-files)
- [DocSum Detailed Usage](#docsum-detailed-usage)
- [Query with text](#query-with-text)
- [Query with audio and video](#query-with-audio-and-video)
- [Query with long context](#query-with-long-context)
- [Launch the UI](#launch-the-ui)
- [Gradio UI](#gradio-ui)
- [Launch the Svelte UI](#launch-the-svelte-ui)
- [Launch the React UI (Optional)](#launch-the-react-ui-optional)

### Access the Code and Set Up Environment

@@ -30,7 +42,7 @@ Clone the GenAIExample repository and access the DocSum Intel® Gaudi® platform
```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/DocSum/docker_compose
source intel/set_env.sh
source intel/hpu/gaudi/set_env.sh
```

> NOTE: By default, vLLM performs a "warmup" at startup to optimize its performance for the specified model and the underlying platform, which can take a long time. For development (and, e.g., autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`.
@@ -49,13 +61,26 @@ Some HuggingFace resources, such as some models, are only accessible if you have

### Deploy the Services Using Docker Compose

#### Option #1

To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
cd intel/hpu/gaudi/
docker compose up -d
```

#### Option #2

> NOTE: To enable monitoring, the `compose.monitoring.yaml` file needs to be merged with the default `compose.yaml` file.

To deploy with monitoring:

```bash
cd intel/hpu/gaudi/
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
```

**Note**: Developers should build the Docker image from source when:

- Developing off the git main branch (as the container's ports in the repo may be different from the published docker image).
@@ -117,10 +142,11 @@ All the DocSum containers will be stopped and then removed on completion of the

In the context of deploying a DocSum pipeline on an Intel® Gaudi® platform, the allocation and utilization of Gaudi devices across different services are important considerations for optimizing performance and resource efficiency. Each of the example deployments, defined by the example Docker compose yaml files, demonstrates a unique approach to leveraging Gaudi hardware, reflecting different priorities and operational strategies.

| File | Description |
| -------------------------------------- | ----------------------------------------------------------------------------------------- |
| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework |
| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default |
| File | Description |
| ---------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| [compose.yaml](./compose.yaml)                       | Default compose file using vLLM as the serving framework                                   |
| [compose_tgi.yaml](./compose_tgi.yaml)               | The LLM serving framework is TGI. All other configurations remain the same as the default  |
| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. Can be used along with any of the compose files       |

## DocSum Detailed Usage
