From 98d859e25303b19e01b0d3b5ae79076d85e2fbaa Mon Sep 17 00:00:00 2001 From: Yi Yao Date: Mon, 27 Oct 2025 16:19:51 +0800 Subject: [PATCH 1/7] Add monitoring for DocSum on Xeon Signed-off-by: Yi Yao --- .../docker_compose/intel/cpu/xeon/README.md | 61 +++++++++++++++---- .../intel/cpu/xeon/compose.monitoring.yaml | 59 ++++++++++++++++++ .../dashboards/download_opea_dashboard.sh | 11 ++++ .../provisioning/dashboards/local.yaml | 14 +++++ .../provisioning/datasources/datasource.yml | 54 ++++++++++++++++ .../intel/cpu/xeon/prometheus.yaml | 27 ++++++++ .../intel/{ => cpu/xeon}/set_env.sh | 13 +++- .../docker_compose/intel/hpu/gaudi/README.md | 26 +++++--- DocSum/tests/test_compose_on_xeon.sh | 4 +- 9 files changed, 243 insertions(+), 26 deletions(-) create mode 100644 DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml create mode 100644 DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh create mode 100644 DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/dashboards/local.yaml create mode 100644 DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/datasources/datasource.yml create mode 100644 DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml rename DocSum/docker_compose/intel/{ => cpu/xeon}/set_env.sh (76%) diff --git a/DocSum/docker_compose/intel/cpu/xeon/README.md b/DocSum/docker_compose/intel/cpu/xeon/README.md index e0b2ab26c0..f21f02bbc7 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/README.md +++ b/DocSum/docker_compose/intel/cpu/xeon/README.md @@ -13,13 +13,26 @@ This example includes the following sections: This section describes how to quickly deploy and test the DocSum service manually on an Intel Xeon platform. The basic steps are: -1. [Access the Code](#access-the-code) -2. [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token) -3. [Configure the Deployment Environment](#configure-the-deployment-environment) -4. 
[Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose) -5. [Check the Deployment Status](#check-the-deployment-status) -6. [Test the Pipeline](#test-the-pipeline) -7. [Cleanup the Deployment](#cleanup-the-deployment) +- [Example DocSum deployments on Intel Xeon Processor](#example-docsum-deployments-on-intel-xeon-processor) + - [DocSum Quick Start Deployment](#docsum-quick-start-deployment) + - [Access the Code and Set Up Environment](#access-the-code-and-set-up-environment) + - [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token) + - [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose) + - [Option #1:](#option-1) + - [Option #2:](#option-2) + - [Check the Deployment Status](#check-the-deployment-status) + - [Test the Pipeline](#test-the-pipeline) + - [Cleanup the Deployment](#cleanup-the-deployment) + - [DocSum Docker Compose Files](#docsum-docker-compose-files) + - [Running LLM models with remote endpoints](#running-llm-models-with-remote-endpoints) + - [DocSum Detailed Usage](#docsum-detailed-usage) + - [Query with text](#query-with-text) + - [Query with audio and video](#query-with-audio-and-video) + - [Query with long context](#query-with-long-context) + - [Launch the UI](#launch-the-ui) + - [Gradio UI](#gradio-ui) + - [Launch the Svelte UI](#launch-the-svelte-ui) + - [Launch the React UI (Optional)](#launch-the-react-ui-optional) ### Access the Code and Set Up Environment @@ -28,7 +41,7 @@ Clone the GenAIExample repository and access the ChatQnA Intel Xeon platform Doc ```bash git clone https://github.com/opea-project/GenAIExamples.git cd GenAIExamples/DocSum/docker_compose -source intel/set_env.sh +source intel/cpu/xeon/set_env.sh ``` > NOTE: by default vLLM does "warmup" at start, to optimize its performance for the specified model and the underlying platform, which can take long time. For development (and e.g. 
autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`.

@@ -47,6 +60,8 @@ Some HuggingFace resources, such as some models, are only accessible if you have

### Deploy the Services Using Docker Compose

+#### Option #1:
+
To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
@@ -54,6 +69,16 @@ cd intel/cpu/xeon/
docker compose up -d
```

+#### Option #2:
+> NOTE: To enable monitoring, the `compose.telemetry.yaml` file needs to be merged with the default `compose.yaml` file.
+
+To deploy with monitoring:
+
+```bash
+cd intel/cpu/xeon/
+docker compose -f compose.yaml -f compose.telemetry.yaml up -d
+```
+
**Note**: developers should build docker image from source when:

- Developing off the git main branch (as the container's ports in the repo may be different from the published docker image).
@@ -109,17 +134,27 @@ To stop the containers associated with the deployment, execute the following com
docker compose -f compose.yaml down
```

+If monitoring is enabled, execute the following command:
+
+```bash
+cd intel/cpu/xeon/
+docker compose -f compose.yaml -f compose.telemetry.yaml down
+```
+
+
All the DocSum containers will be stopped and then removed on completion of the "down" command.

## DocSum Docker Compose Files

In the context of deploying a DocSum pipeline on an Intel® Xeon® platform, we can pick and choose different large language model serving frameworks. The table below outlines the various configurations that are available as part of the application.

-| File | Description |
-| -------------------------------------------- | -------------------------------------------------------------------------------------- |
-| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework |
-| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. 
All other configurations remain the same as default | -| [compose_remote.yaml](./compose_remote.yaml) | Uses remote inference endpoints for LLMs. All other configurations are same as default | +| File | Description | +| ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------ | +| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework | +| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as default | +| [compose_remote.yaml](./compose_remote.yaml) | Uses remote inference endpoints for LLMs. All other configurations are same as default | +| [compose.telemetry.yaml](./compose.telemetry.yaml) | Helper file for telemetry features for vllm. Can be used along with any compose files that serves vllm | +| [compose_tgi.telemetry.yaml](./compose_tgi.telemetry.yaml) | Helper file for telemetry features for tgi. 
Can be used along with any compose files that serves tgi | ### Running LLM models with remote endpoints diff --git a/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml b/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml new file mode 100644 index 0000000000..495174ebe5 --- /dev/null +++ b/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml @@ -0,0 +1,59 @@ +# Copyright (C) 2024 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +services: + prometheus: + image: prom/prometheus:v2.52.0 + container_name: opea_prometheus + user: root + volumes: + - ./prometheus.yaml:/etc/prometheus/prometheus.yaml + - ./prometheus_data:/prometheus + command: + - '--config.file=/etc/prometheus/prometheus.yaml' + ports: + - '9090:9090' + ipc: host + restart: unless-stopped + + grafana: + image: grafana/grafana:11.0.0 + container_name: grafana + volumes: + - ./grafana_data:/var/lib/grafana + - ./grafana/dashboards:/var/lib/grafana/dashboards + - ./grafana/provisioning:/etc/grafana/provisioning + user: root + environment: + GF_SECURITY_ADMIN_PASSWORD: admin + GF_RENDERING_CALLBACK_URL: http://grafana:3000/ + GF_LOG_FILTERS: rendering:debug + no_proxy: ${no_proxy} + host_ip: ${host_ip} + depends_on: + - prometheus + ports: + - '3000:3000' + ipc: host + restart: unless-stopped + + node-exporter: + image: prom/node-exporter + container_name: node-exporter + volumes: + - /proc:/host/proc:ro + - /sys:/host/sys:ro + - /:/rootfs:ro + command: + - '--path.procfs=/host/proc' + - '--path.sysfs=/host/sys' + - --collector.filesystem.ignored-mount-points + - "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)" + environment: + no_proxy: ${no_proxy} + ports: + - 9100:9100 + restart: always + deploy: + mode: global + diff --git a/DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh 
b/DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh new file mode 100644 index 0000000000..be1b36dc64 --- /dev/null +++ b/DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh @@ -0,0 +1,11 @@ +#!/bin/bash +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 +if ls *.json 1> /dev/null 2>&1; then + rm *.json +fi + +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/docsum_megaservice_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json \ No newline at end of file diff --git a/DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/dashboards/local.yaml b/DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/dashboards/local.yaml new file mode 100644 index 0000000000..13922a769b --- /dev/null +++ b/DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/dashboards/local.yaml @@ -0,0 +1,14 @@ +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +apiVersion: 1 + +providers: +- name: 'default' + orgId: 1 + folder: '' + type: file + disableDeletion: false + updateIntervalSeconds: 10 #how often Grafana will scan for changed dashboards + options: + path: /var/lib/grafana/dashboards diff --git a/DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/datasources/datasource.yml b/DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/datasources/datasource.yml new file mode 100644 index 0000000000..a206521d67 --- /dev/null +++ b/DocSum/docker_compose/intel/cpu/xeon/grafana/provisioning/datasources/datasource.yml @@ -0,0 +1,54 @@ +# Copyright (C) 2025 Intel Corporation +# 
SPDX-License-Identifier: Apache-2.0 + +# config file version +apiVersion: 1 + +# list of datasources that should be deleted from the database +deleteDatasources: + - name: Prometheus + orgId: 1 + +# list of datasources to insert/update depending +# what's available in the database +datasources: + # name of the datasource. Required +- name: Prometheus + # datasource type. Required + type: prometheus + # access mode. direct or proxy. Required + access: proxy + # org id. will default to orgId 1 if not specified + orgId: 1 + # url + url: http://$host_ip:9090 + # database password, if used + password: + # database user, if used + user: + # database name, if used + database: + # enable/disable basic auth + basicAuth: false + # basic auth username, if used + basicAuthUser: + # basic auth password, if used + basicAuthPassword: + # enable/disable with credentials headers + withCredentials: + # mark as default datasource. Max one per org + isDefault: true + # fields that will be converted to json and stored in json_data + jsonData: + httpMethod: GET + graphiteVersion: "1.1" + tlsAuth: false + tlsAuthWithCACert: false + # json object of data that will be encrypted. + secureJsonData: + tlsCACert: "..." + tlsClientCert: "..." + tlsClientKey: "..." + version: 1 + # allow users to edit datasources from the UI. 
+ editable: true diff --git a/DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml b/DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml new file mode 100644 index 0000000000..837496e047 --- /dev/null +++ b/DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml @@ -0,0 +1,27 @@ +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 +# [IP_ADDR]:{PORT_OUTSIDE_CONTAINER} -> {PORT_INSIDE_CONTAINER} / {PROTOCOL} +global: + scrape_interval: 5s + external_labels: + monitor: "my-monitor" +scrape_configs: + - job_name: "prometheus" + static_configs: + - targets: ["opea_prometheus:9090"] + - job_name: "vllm" + metrics_path: /metrics + static_configs: + - targets: ["docsum-xeon-vllm-service:80"] + - job_name: "tgi" + metrics_path: /metrics + static_configs: + - targets: ["docsum-xeon-tgi-server:80"] + - job_name: "docsum-backend-server" + metrics_path: /metrics + static_configs: + - targets: ["docsum-xeon-backend-server:8888"] + - job_name: "prometheus-node-exporter" + metrics_path: /metrics + static_configs: + - targets: ["node-exporter:9100"] \ No newline at end of file diff --git a/DocSum/docker_compose/intel/set_env.sh b/DocSum/docker_compose/intel/cpu/xeon/set_env.sh similarity index 76% rename from DocSum/docker_compose/intel/set_env.sh rename to DocSum/docker_compose/intel/cpu/xeon/set_env.sh index 0411335847..b2140e7565 100644 --- a/DocSum/docker_compose/intel/set_env.sh +++ b/DocSum/docker_compose/intel/cpu/xeon/set_env.sh @@ -8,9 +8,6 @@ source .set_env.sh popd > /dev/null export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1" -export no_proxy="${no_proxy},${host_ip}" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1" -export http_proxy=$http_proxy -export https_proxy=$https_proxy export HF_TOKEN=${HF_TOKEN} export LLM_ENDPOINT_PORT=8008 @@ -41,3 +38,13 @@ export NUM_CARDS=1 export BLOCK_SIZE=128 export MAX_NUM_SEQS=256 export MAX_SEQ_LEN_TO_CAPTURE=2048 + +# Download Grafana configurations +pushd 
"grafana/dashboards" > /dev/null +source download_opea_dashboard.sh +popd > /dev/null + +# Set network proxy settings +export no_proxy="${no_proxy},${host_ip},docsum-xeon-vllm-service,docsum-xeon-tgi-server,docsum-xeon-backend-server,opea_prometheus,grafana,node-exporter,$JAEGER_IP" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1" +export http_proxy=$http_proxy +export https_proxy=$https_proxy diff --git a/DocSum/docker_compose/intel/hpu/gaudi/README.md b/DocSum/docker_compose/intel/hpu/gaudi/README.md index 03e53101e1..fa6b224909 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/README.md +++ b/DocSum/docker_compose/intel/hpu/gaudi/README.md @@ -15,13 +15,23 @@ This example includes the following sections: This section describes how to quickly deploy and test the DocSum service manually on an Intel® Gaudi® platform. The basic steps are: -1. [Access the Code](#access-the-code) -2. [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token) -3. [Configure the Deployment Environment](#configure-the-deployment-environment) -4. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose) -5. [Check the Deployment Status](#check-the-deployment-status) -6. [Test the Pipeline](#test-the-pipeline) -7. 
[Cleanup the Deployment](#cleanup-the-deployment) +- [Example DocSum deployments on Intel® Gaudi® Platform](#example-docsum-deployments-on-intel-gaudi-platform) + - [DocSum Quick Start Deployment](#docsum-quick-start-deployment) + - [Access the Code and Set Up Environment](#access-the-code-and-set-up-environment) + - [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token) + - [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose) + - [Check the Deployment Status](#check-the-deployment-status) + - [Test the Pipeline](#test-the-pipeline) + - [Cleanup the Deployment](#cleanup-the-deployment) + - [DocSum Docker Compose Files](#docsum-docker-compose-files) + - [DocSum Detailed Usage](#docsum-detailed-usage) + - [Query with text](#query-with-text) + - [Query with audio and video](#query-with-audio-and-video) + - [Query with long context](#query-with-long-context) + - [Launch the UI](#launch-the-ui) + - [Gradio UI](#gradio-ui) + - [Launch the Svelte UI](#launch-the-svelte-ui) + - [Launch the React UI (Optional)](#launch-the-react-ui-optional) ### Access the Code and Set Up Environment @@ -30,7 +40,7 @@ Clone the GenAIExample repository and access the DocSum Intel® Gaudi® platform ```bash git clone https://github.com/opea-project/GenAIExamples.git cd GenAIExamples/DocSum/docker_compose -source intel/set_env.sh +source intel/hpu/gaudi/set_env.sh ``` > NOTE: by default vLLM does "warmup" at start, to optimize its performance for the specified model and the underlying platform, which can take long time. For development (and e.g. autoscaling) it can be skipped with `export VLLM_SKIP_WARMUP=true`. 
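Once a monitoring-enabled deployment is up, it helps to confirm the exposed endpoints are actually reachable before opening Grafana. The sketch below is a hypothetical smoke check, not part of this patch; it assumes the default ports published by `compose.monitoring.yaml` (Prometheus 9090, Grafana 3000, node-exporter 9100), a `host_ip` exported by `set_env.sh`, and the standard health paths of those tools:

```shell
#!/usr/bin/env bash
# Hypothetical smoke check for the monitoring stack (not part of this patch).
# Assumes the default ports from compose.monitoring.yaml; falls back to
# localhost when host_ip has not been exported by set_env.sh.
host_ip="${host_ip:-localhost}"

endpoints=(
  "http://${host_ip}:9090/-/ready"     # Prometheus readiness probe
  "http://${host_ip}:3000/api/health"  # Grafana health endpoint
  "http://${host_ip}:9100/metrics"     # node-exporter metrics
)

for url in "${endpoints[@]}"; do
  # -m 5 fails fast while containers are still starting up
  if curl -sf -m 5 -o /dev/null "$url" 2>/dev/null; then
    echo "OK   $url"
  else
    echo "DOWN $url"
  fi
done
```

If a target still reports down shortly after `docker compose up -d`, give the containers a minute to start; Prometheus's own target page at port 9090 (`/targets`) then shows per-job scrape status.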
diff --git a/DocSum/tests/test_compose_on_xeon.sh b/DocSum/tests/test_compose_on_xeon.sh index 6e252ff9c6..26033e2a8c 100644 --- a/DocSum/tests/test_compose_on_xeon.sh +++ b/DocSum/tests/test_compose_on_xeon.sh @@ -46,7 +46,7 @@ function build_docker_images() { function start_services() { cd $WORKPATH/docker_compose/intel/cpu/xeon/ export no_proxy="localhost,127.0.0.1,$ip_address" - docker compose -f compose.yaml up -d > ${LOG_PATH}/start_services_with_compose.log + docker compose -f compose.yaml -f compose.monitoring.yaml up -d > ${LOG_PATH}/start_services_with_compose.log sleep 1m } @@ -346,7 +346,7 @@ function validate_megaservice_long_text() { function stop_docker() { cd $WORKPATH/docker_compose/intel/cpu/xeon/ - docker compose stop && docker compose rm -f + docker compose -f compose.yaml -f compose.monitoring.yaml down && docker compose -f compose.yaml -f compose.monitoring.yaml rm -f } function main() { From b5a5679eed5216d39aaa4e3f26d8b2e42fff98f7 Mon Sep 17 00:00:00 2001 From: Yi Yao Date: Tue, 28 Oct 2025 11:43:17 +0800 Subject: [PATCH 2/7] Add monitoring for DocSum on Gaudi Signed-off-by: Yi Yao --- .../docker_compose/intel/cpu/xeon/README.md | 17 +++--- .../docker_compose/intel/cpu/xeon/set_env.sh | 8 ++- .../docker_compose/intel/hpu/gaudi/README.md | 15 +++++ .../intel/hpu/gaudi/compose.monitoring.yaml | 59 +++++++++++++++++++ .../dashboards/download_opea_dashboard.sh | 12 ++++ .../provisioning/dashboards/local.yaml | 14 +++++ .../provisioning/datasources/datasource.yml | 54 +++++++++++++++++ .../docker_compose/intel/hpu/gaudi/set_env.sh | 52 ++++++++++++++++ DocSum/tests/test_compose_on_gaudi.sh | 6 +- DocSum/tests/test_compose_on_xeon.sh | 2 +- DocSum/tests/test_compose_tgi_on_gaudi.sh | 6 +- DocSum/tests/test_compose_tgi_on_xeon.sh | 6 +- 12 files changed, 229 insertions(+), 22 deletions(-) create mode 100644 DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml create mode 100644 
DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh
 create mode 100644 DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/dashboards/local.yaml
 create mode 100644 DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/datasources/datasource.yml
 create mode 100644 DocSum/docker_compose/intel/hpu/gaudi/set_env.sh

diff --git a/DocSum/docker_compose/intel/cpu/xeon/README.md b/DocSum/docker_compose/intel/cpu/xeon/README.md
index f21f02bbc7..122f1e78f7 100644
--- a/DocSum/docker_compose/intel/cpu/xeon/README.md
+++ b/DocSum/docker_compose/intel/cpu/xeon/README.md
@@ -18,8 +18,8 @@ This section describes how to quickly deploy and test the DocSum service manuall
 - [Access the Code and Set Up Environment](#access-the-code-and-set-up-environment)
 - [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
 - [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
- - [Option #1:](#option-1)
- - [Option #2:](#option-2)
+ - [Option #1](#option-1)
+ - [Option #2](#option-2)
 - [Check the Deployment Status](#check-the-deployment-status)
 - [Test the Pipeline](#test-the-pipeline)
 - [Cleanup the Deployment](#cleanup-the-deployment)
@@ -60,7 +60,7 @@ Some HuggingFace resources, such as some models, are only accessible if you have

### Deploy the Services Using Docker Compose

-#### Option #1:
+#### Option #1

To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

@@ -69,14 +69,14 @@ cd intel/cpu/xeon/
docker compose up -d
```

-#### Option #2:
-> NOTE: To enable monitoring, the `compose.telemetry.yaml` file needs to be merged with the default `compose.yaml` file.
+#### Option #2
+> NOTE: To enable monitoring, the `compose.monitoring.yaml` file needs to be merged with the default `compose.yaml` file.
To deploy with monitoring:

```bash
cd intel/cpu/xeon/
-docker compose -f compose.yaml -f compose.telemetry.yaml up -d
+docker compose -f compose.yaml -f compose.monitoring.yaml up -d
```

**Note**: developers should build docker image from source when:
@@ -138,7 +138,7 @@ If monitoring is enabled, execute the following command:

```bash
cd intel/cpu/xeon/
-docker compose -f compose.yaml -f compose.telemetry.yaml down
+docker compose -f compose.yaml -f compose.monitoring.yaml down
```
@@ -153,8 +153,7 @@ In the context of deploying a DocSum pipeline on an Intel® Xeon® platform, we
| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework |
| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as default |
| [compose_remote.yaml](./compose_remote.yaml) | Uses remote inference endpoints for LLMs. All other configurations are same as default |
-| [compose.telemetry.yaml](./compose.telemetry.yaml) | Helper file for telemetry features for vllm. Can be used along with any compose files that serves vllm |
-| [compose_tgi.telemetry.yaml](./compose_tgi.telemetry.yaml) | Helper file for telemetry features for tgi. Can be used along with any compose files that serves tgi |
+| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. Can be used along with any compose files |

### Running LLM models with remote endpoints

diff --git a/DocSum/docker_compose/intel/cpu/xeon/set_env.sh b/DocSum/docker_compose/intel/cpu/xeon/set_env.sh
index b2140e7565..07f734f36e 100644
--- a/DocSum/docker_compose/intel/cpu/xeon/set_env.sh
+++ b/DocSum/docker_compose/intel/cpu/xeon/set_env.sh
@@ -2,8 +2,10 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

-SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
-pushd "${SCRIPT_DIR}/../../.."
> /dev/null
+
+SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &> /dev/null && pwd)
+
+pushd "$SCRIPT_DIR/../../../../../" > /dev/null
source .set_env.sh
popd > /dev/null

@@ -40,7 +42,7 @@ export MAX_NUM_SEQS=256
export MAX_SEQ_LEN_TO_CAPTURE=2048

# Download Grafana configurations
-pushd "grafana/dashboards" > /dev/null
+pushd "${SCRIPT_DIR}/grafana/dashboards" > /dev/null
source download_opea_dashboard.sh
popd > /dev/null

diff --git a/DocSum/docker_compose/intel/hpu/gaudi/README.md b/DocSum/docker_compose/intel/hpu/gaudi/README.md
index fa6b224909..197176ff48 100644
--- a/DocSum/docker_compose/intel/hpu/gaudi/README.md
+++ b/DocSum/docker_compose/intel/hpu/gaudi/README.md
@@ -20,6 +20,8 @@ This section describes how to quickly deploy and test the DocSum service manuall
 - [Access the Code and Set Up Environment](#access-the-code-and-set-up-environment)
 - [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
 - [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
+ - [Option #1](#option-1)
+ - [Option #2](#option-2)
 - [Check the Deployment Status](#check-the-deployment-status)
 - [Test the Pipeline](#test-the-pipeline)
 - [Cleanup the Deployment](#cleanup-the-deployment)
@@ -59,6 +61,8 @@ Some HuggingFace resources, such as some models, are only accessible if you have

### Deploy the Services Using Docker Compose

+#### Option #1
+
To deploy the DocSum services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
@@ -66,6 +70,16 @@ cd intel/hpu/gaudi/
docker compose up -d
```

+#### Option #2
+> NOTE: To enable monitoring, the `compose.monitoring.yaml` file needs to be merged with the default `compose.yaml` file.
+
+To deploy with monitoring:
+
+```bash
+cd intel/hpu/gaudi/
+docker compose -f compose.yaml -f compose.monitoring.yaml up -d
+```
+
**Note**: developers should build docker image from source when:

- Developing off the git main branch (as the container's ports in the repo may be different from the published docker image).
@@ -131,6 +145,7 @@ In the context of deploying a DocSum pipeline on an Intel® Gaudi® platform, th
| -------------------------------------- | ----------------------------------------------------------------------------------------- |
| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework |
| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default |
+| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. Can be used along with any compose files |

## DocSum Detailed Usage

diff --git a/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml b/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml
new file mode 100644
index 0000000000..495174ebe5
--- /dev/null
+++ b/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml
@@ -0,0 +1,59 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+services:
+  prometheus:
+    image: prom/prometheus:v2.52.0
+    container_name: opea_prometheus
+    user: root
+    volumes:
+      - ./prometheus.yaml:/etc/prometheus/prometheus.yaml
+      - ./prometheus_data:/prometheus
+    command:
+      - '--config.file=/etc/prometheus/prometheus.yaml'
+    ports:
+      - '9090:9090'
+    ipc: host
+    restart: unless-stopped
+
+  grafana:
+    image: grafana/grafana:11.0.0
+    container_name: grafana
+    volumes:
+      - ./grafana_data:/var/lib/grafana
+      - ./grafana/dashboards:/var/lib/grafana/dashboards
+      - ./grafana/provisioning:/etc/grafana/provisioning
+    user: root
+    environment:
+      GF_SECURITY_ADMIN_PASSWORD: admin
+      GF_RENDERING_CALLBACK_URL: http://grafana:3000/
+      GF_LOG_FILTERS: 
rendering:debug + no_proxy: ${no_proxy} + host_ip: ${host_ip} + depends_on: + - prometheus + ports: + - '3000:3000' + ipc: host + restart: unless-stopped + + node-exporter: + image: prom/node-exporter + container_name: node-exporter + volumes: + - /proc:/host/proc:ro + - /sys:/host/sys:ro + - /:/rootfs:ro + command: + - '--path.procfs=/host/proc' + - '--path.sysfs=/host/sys' + - --collector.filesystem.ignored-mount-points + - "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)" + environment: + no_proxy: ${no_proxy} + ports: + - 9100:9100 + restart: always + deploy: + mode: global + diff --git a/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh b/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh new file mode 100644 index 0000000000..bdab5892bc --- /dev/null +++ b/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh @@ -0,0 +1,12 @@ +#!/bin/bash +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 +if ls *.json 1> /dev/null 2>&1; then + rm *.json +fi + +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/gaudi_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/docsum_megaservice_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json \ No newline at end of file diff --git a/DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/dashboards/local.yaml 
b/DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/dashboards/local.yaml new file mode 100644 index 0000000000..13922a769b --- /dev/null +++ b/DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/dashboards/local.yaml @@ -0,0 +1,14 @@ +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +apiVersion: 1 + +providers: +- name: 'default' + orgId: 1 + folder: '' + type: file + disableDeletion: false + updateIntervalSeconds: 10 #how often Grafana will scan for changed dashboards + options: + path: /var/lib/grafana/dashboards diff --git a/DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/datasources/datasource.yml b/DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/datasources/datasource.yml new file mode 100644 index 0000000000..a206521d67 --- /dev/null +++ b/DocSum/docker_compose/intel/hpu/gaudi/grafana/provisioning/datasources/datasource.yml @@ -0,0 +1,54 @@ +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +# config file version +apiVersion: 1 + +# list of datasources that should be deleted from the database +deleteDatasources: + - name: Prometheus + orgId: 1 + +# list of datasources to insert/update depending +# what's available in the database +datasources: + # name of the datasource. Required +- name: Prometheus + # datasource type. Required + type: prometheus + # access mode. direct or proxy. Required + access: proxy + # org id. will default to orgId 1 if not specified + orgId: 1 + # url + url: http://$host_ip:9090 + # database password, if used + password: + # database user, if used + user: + # database name, if used + database: + # enable/disable basic auth + basicAuth: false + # basic auth username, if used + basicAuthUser: + # basic auth password, if used + basicAuthPassword: + # enable/disable with credentials headers + withCredentials: + # mark as default datasource. 
Max one per org
+  isDefault: true
+  # fields that will be converted to json and stored in json_data
+  jsonData:
+    httpMethod: GET
+    graphiteVersion: "1.1"
+    tlsAuth: false
+    tlsAuthWithCACert: false
+  # json object of data that will be encrypted.
+  secureJsonData:
+    tlsCACert: "..."
+    tlsClientCert: "..."
+    tlsClientKey: "..."
+  version: 1
+  # allow users to edit datasources from the UI.
+  editable: true
diff --git a/DocSum/docker_compose/intel/hpu/gaudi/set_env.sh b/DocSum/docker_compose/intel/hpu/gaudi/set_env.sh
new file mode 100644
index 0000000000..15d3dd626e
--- /dev/null
+++ b/DocSum/docker_compose/intel/hpu/gaudi/set_env.sh
@@ -0,0 +1,52 @@
+#!/usr/bin/env bash
+
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &> /dev/null && pwd)
+
+pushd "$SCRIPT_DIR/../../../../../" > /dev/null
+source .set_env.sh
+popd > /dev/null
+
+export host_ip=$(hostname -I | awk '{print $1}') # Example: host_ip="192.168.1.1"
+export HF_TOKEN=${HF_TOKEN}
+
+export LLM_ENDPOINT_PORT=8008
+export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
+
+export BLOCK_SIZE=128
+export MAX_NUM_SEQS=256
+export MAX_SEQ_LEN_TO_CAPTURE=2048
+export NUM_CARDS=1
+export MAX_INPUT_TOKENS=1024
+export MAX_TOTAL_TOKENS=2048
+
+export LLM_PORT=9000
+export LLM_ENDPOINT="http://${host_ip}:${LLM_ENDPOINT_PORT}"
+export ASR_SERVICE_PORT=7066
+export DocSum_COMPONENT_NAME="OpeaDocSumvLLM" # OpeaDocSumTgi
+export FRONTEND_SERVICE_PORT=5173
+export MEGA_SERVICE_HOST_IP=${host_ip}
+export LLM_SERVICE_HOST_IP=${host_ip}
+export ASR_SERVICE_HOST_IP=${host_ip}
+
+export BACKEND_SERVICE_PORT=8888
+export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:${BACKEND_SERVICE_PORT}/v1/docsum"
+
+export LOGFLAG=True
+
+export NUM_CARDS=1
+export BLOCK_SIZE=128
+export MAX_NUM_SEQS=256
+export MAX_SEQ_LEN_TO_CAPTURE=2048
+
+# Download Grafana configurations
+pushd "${SCRIPT_DIR}/grafana/dashboards" > /dev/null
+source download_opea_dashboard.sh
+popd > /dev/null + +# Set network proxy settings +export no_proxy="${no_proxy},${host_ip},docsum-gaudi-vllm-service,docsum-gaudi-tgi-server,docsum-gaudi-backend-server,opea_prometheus,grafana,node-exporter,$JAEGER_IP" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1" +export http_proxy=$http_proxy +export https_proxy=$https_proxy diff --git a/DocSum/tests/test_compose_on_gaudi.sh b/DocSum/tests/test_compose_on_gaudi.sh index 598e373524..529cebcd56 100644 --- a/DocSum/tests/test_compose_on_gaudi.sh +++ b/DocSum/tests/test_compose_on_gaudi.sh @@ -16,7 +16,7 @@ echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}" echo "TAG=IMAGE_TAG=${IMAGE_TAG}" export REGISTRY=${IMAGE_REPO} export TAG=${IMAGE_TAG} -source $WORKPATH/docker_compose/intel/set_env.sh +source $WORKPATH/docker_compose/intel/hpu/gaudi/set_env.sh export MODEL_CACHE=${model_cache:-"./data"} @@ -56,7 +56,7 @@ function build_docker_images() { function start_services() { cd $WORKPATH/docker_compose/intel/hpu/gaudi export no_proxy="localhost,127.0.0.1,$ip_address" - docker compose -f compose.yaml up -d > ${LOG_PATH}/start_services_with_compose.log + docker compose -f compose.yaml -f compose.monitoring.yaml up -d > ${LOG_PATH}/start_services_with_compose.log sleep 2m } @@ -356,7 +356,7 @@ function validate_megaservice_long_text() { function stop_docker() { cd $WORKPATH/docker_compose/intel/hpu/gaudi - docker compose -f compose.yaml stop && docker compose rm -f + docker compose -f compose.yaml -f compose.monitoring.yaml stop && docker compose -f compose.yaml -f compose.monitoring.yaml rm -f } function main() { diff --git a/DocSum/tests/test_compose_on_xeon.sh b/DocSum/tests/test_compose_on_xeon.sh index 26033e2a8c..19e0754da2 100644 --- a/DocSum/tests/test_compose_on_xeon.sh +++ b/DocSum/tests/test_compose_on_xeon.sh @@ -17,7 +17,7 @@ echo "TAG=IMAGE_TAG=${IMAGE_TAG}" export REGISTRY=${IMAGE_REPO} export TAG=${IMAGE_TAG} -source $WORKPATH/docker_compose/intel/set_env.sh +source 
$WORKPATH/docker_compose/intel/cpu/xeon/set_env.sh export MODEL_CACHE=${model_cache:-"./data"} export MAX_INPUT_TOKENS=2048 diff --git a/DocSum/tests/test_compose_tgi_on_gaudi.sh b/DocSum/tests/test_compose_tgi_on_gaudi.sh index 3ef9a92cff..7eb154cbbb 100644 --- a/DocSum/tests/test_compose_tgi_on_gaudi.sh +++ b/DocSum/tests/test_compose_tgi_on_gaudi.sh @@ -16,7 +16,7 @@ echo "TAG=IMAGE_TAG=${IMAGE_TAG}" export REGISTRY=${IMAGE_REPO} export TAG=${IMAGE_TAG} -source $WORKPATH/docker_compose/intel/set_env.sh +source $WORKPATH/docker_compose/intel/hpu/gaudi/set_env.sh export MODEL_CACHE=${model_cache:-"./data"} export MAX_INPUT_TOKENS=2048 @@ -46,7 +46,7 @@ function build_docker_images() { function start_services() { cd $WORKPATH/docker_compose/intel/hpu/gaudi export no_proxy="localhost,127.0.0.1,$ip_address" - docker compose -f compose_tgi.yaml up -d > ${LOG_PATH}/start_services_with_compose.log + docker compose -f compose_tgi.yaml -f compose.monitoring.yaml up -d > ${LOG_PATH}/start_services_with_compose.log sleep 1m } @@ -355,7 +355,7 @@ function validate_megaservice_long_text() { function stop_docker() { cd $WORKPATH/docker_compose/intel/hpu/gaudi - docker compose -f compose_tgi.yaml stop && docker compose rm -f + docker compose -f compose_tgi.yaml -f compose.monitoring.yaml stop && docker compose -f compose_tgi.yaml -f compose.monitoring.yaml rm -f } function main() { diff --git a/DocSum/tests/test_compose_tgi_on_xeon.sh b/DocSum/tests/test_compose_tgi_on_xeon.sh index 04cad66c37..1fbaa4d357 100644 --- a/DocSum/tests/test_compose_tgi_on_xeon.sh +++ b/DocSum/tests/test_compose_tgi_on_xeon.sh @@ -16,7 +16,7 @@ echo "TAG=IMAGE_TAG=${IMAGE_TAG}" export REGISTRY=${IMAGE_REPO} export TAG=${IMAGE_TAG} -source $WORKPATH/docker_compose/intel/set_env.sh +source $WORKPATH/docker_compose/intel/cpu/xeon/set_env.sh export MODEL_CACHE=${model_cache:-"./data"} export MAX_INPUT_TOKENS=2048 @@ -46,7 +46,7 @@ function build_docker_images() { function start_services() { cd 
$WORKPATH/docker_compose/intel/cpu/xeon/ export no_proxy="localhost,127.0.0.1,$ip_address" - docker compose -f compose_tgi.yaml up -d > ${LOG_PATH}/start_services_with_compose.log + docker compose -f compose_tgi.yaml -f compose.monitoring.yaml up -d > ${LOG_PATH}/start_services_with_compose.log sleep 1m } @@ -355,7 +355,7 @@ function validate_megaservice_long_text() { function stop_docker() { cd $WORKPATH/docker_compose/intel/cpu/xeon/ - docker compose -f compose_tgi.yaml stop && docker compose rm -f + docker compose -f compose_tgi.yaml -f compose.monitoring.yaml stop && docker compose -f compose_tgi.yaml -f compose.monitoring.yaml rm -f } function main() { From cd5c8372764a7724a29d0c22dfb283f7c75776dc Mon Sep 17 00:00:00 2001 From: Joshua Yao Date: Wed, 29 Oct 2025 01:38:41 +0000 Subject: [PATCH 3/7] Enhance monitoring for DocSum on Gaudi Signed-off-by: Joshua Yao --- .../intel/hpu/gaudi/compose.telemetry.yaml | 2 +- .../intel/cpu/xeon/compose.monitoring.yaml | 1 + .../intel/hpu/gaudi/compose.monitoring.yaml | 17 ++++++++++ .../dashboards/download_opea_dashboard.sh | 3 +- .../intel/hpu/gaudi/prometheus.yaml | 34 +++++++++++++++++++ .../docker_compose/intel/hpu/gaudi/set_env.sh | 4 +-- DocSum/tests/test_compose_on_gaudi.sh | 7 ---- DocSum/tests/test_compose_tgi_on_gaudi.sh | 1 - 8 files changed, 57 insertions(+), 12 deletions(-) create mode 100644 DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml diff --git a/ChatQnA/docker_compose/intel/hpu/gaudi/compose.telemetry.yaml b/ChatQnA/docker_compose/intel/hpu/gaudi/compose.telemetry.yaml index 00ace1e451..428271991c 100644 --- a/ChatQnA/docker_compose/intel/hpu/gaudi/compose.telemetry.yaml +++ b/ChatQnA/docker_compose/intel/hpu/gaudi/compose.telemetry.yaml @@ -62,7 +62,7 @@ services: command: - '--path.procfs=/host/proc' - '--path.sysfs=/host/sys' - - --collector.filesystem.ignored-mount-points + - --collector.filesystem.mount-points-exclude - 
"^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)" ports: - 9100:9100 diff --git a/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml b/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml index 495174ebe5..88deb1063c 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml +++ b/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml @@ -53,6 +53,7 @@ services: no_proxy: ${no_proxy} ports: - 9100:9100 + ipc: host restart: always deploy: mode: global diff --git a/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml b/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml index 495174ebe5..664de1f9ba 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml +++ b/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml @@ -46,7 +46,9 @@ services: - /:/rootfs:ro command: - '--path.procfs=/host/proc' + - '--path.rootfs=/rootfs' - '--path.sysfs=/host/sys' + - '--path.udev.data=/rootfs/run/udev/data' - --collector.filesystem.ignored-mount-points - "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)" environment: @@ -57,3 +59,18 @@ services: deploy: mode: global + gaudi-metrics-exporter: + image: vault.habana.ai/gaudi-metric-exporter/metric-exporter:latest + privileged: true + container_name: gaudi-metrics-exporter + volumes: + - /proc:/host/proc:ro + - /sys:/host/sys:ro + - /:/rootfs:ro + - /dev:/dev + deploy: + mode: global + ports: + - 41611:41611 + restart: unless-stopped + diff --git a/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh b/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh index bdab5892bc..fbebe4e6f8 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh +++ 
b/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh @@ -1,12 +1,13 @@ #!/bin/bash # Copyright (C) 2025 Intel Corporation # SPDX-License-Identifier: Apache-2.0 + if ls *.json 1> /dev/null 2>&1; then rm *.json fi wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json -wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/gaudi_grafana.json +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/gaudi_grafana_v2.json wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/docsum_megaservice_grafana.json wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json \ No newline at end of file diff --git a/DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml b/DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml new file mode 100644 index 0000000000..e52d5fd316 --- /dev/null +++ b/DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml @@ -0,0 +1,34 @@ +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 +# [IP_ADDR]:{PORT_OUTSIDE_CONTAINER} -> {PORT_INSIDE_CONTAINER} / {PROTOCOL} +global: + scrape_interval: 5s + external_labels: + monitor: "my-monitor" +scrape_configs: + - job_name: "prometheus" + static_configs: + - targets: ["opea_prometheus:9090"] + - job_name: "vllm" + metrics_path: /metrics + static_configs: + - targets: ["docsum-gaudi-vllm-service:80"] + - job_name: "tgi" + metrics_path: /metrics + static_configs: + - targets: ["docsum-gaudi-tgi-server:80"] + - job_name: "docsum-backend-server" + metrics_path: /metrics + static_configs: + - targets: ["docsum-gaudi-backend-server:8888"] + - job_name: "prometheus-node-exporter" + 
scrape_interval: 30s + scrape_timeout: 25s + metrics_path: /metrics + static_configs: + - targets: ["node-exporter:9100"] + - job_name: "gaudi-metrics-exporter" + scrape_interval: 30s + metrics_path: /metrics + static_configs: + - targets: ["gaudi-metrics-exporter:41611"] \ No newline at end of file diff --git a/DocSum/docker_compose/intel/hpu/gaudi/set_env.sh b/DocSum/docker_compose/intel/hpu/gaudi/set_env.sh index 15d3dd626e..e571ad82ab 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/set_env.sh +++ b/DocSum/docker_compose/intel/hpu/gaudi/set_env.sh @@ -42,11 +42,11 @@ export MAX_NUM_SEQS=256 export MAX_SEQ_LEN_TO_CAPTURE=2048 # Download Grafana configurations -pushd "grafana/dashboards" > /dev/null +pushd "${SCRIPT_DIR}/grafana/dashboards" > /dev/null source download_opea_dashboard.sh popd > /dev/null # Set network proxy settings -export no_proxy="${no_proxy},${host_ip},docsum-gaudi-vllm-service,docsum-gaudi-tgi-server,docsum-gaudi-backend-server,opea_prometheus,grafana,node-exporter,$JAEGER_IP" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1" +export no_proxy="${no_proxy},${host_ip},docsum-gaudi-vllm-service,docsum-gaudi-tgi-server,docsum-gaudi-backend-server,gaudi-metrics-exporter,opea_prometheus,grafana,node-exporter,$JAEGER_IP" # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1" export http_proxy=$http_proxy export https_proxy=$https_proxy diff --git a/DocSum/tests/test_compose_on_gaudi.sh b/DocSum/tests/test_compose_on_gaudi.sh index 529cebcd56..654ad01282 100644 --- a/DocSum/tests/test_compose_on_gaudi.sh +++ b/DocSum/tests/test_compose_on_gaudi.sh @@ -27,12 +27,6 @@ export MAX_SEQ_LEN_TO_CAPTURE=2048 export MAX_INPUT_TOKENS=2048 export MAX_TOTAL_TOKENS=4096 -# set service host and no_proxy -export LLM_ENDPOINT="http://vllm-service:80" -export LLM_SERVICE_HOST_IP="llm-docsum-vllm" -export ASR_SERVICE_HOST_IP="whisper" -export no_proxy=$no_proxy,$LLM_SERVICE_HOST_IP,$ASR_SERVICE_HOST_IP,"vllm-service" - # Get the root folder of the current 
script ROOT_FOLDER=$(dirname "$(readlink -f "$0")") @@ -55,7 +49,6 @@ function build_docker_images() { function start_services() { cd $WORKPATH/docker_compose/intel/hpu/gaudi - export no_proxy="localhost,127.0.0.1,$ip_address" docker compose -f compose.yaml -f compose.monitoring.yaml up -d > ${LOG_PATH}/start_services_with_compose.log sleep 2m } diff --git a/DocSum/tests/test_compose_tgi_on_gaudi.sh b/DocSum/tests/test_compose_tgi_on_gaudi.sh index 7eb154cbbb..1dff40864b 100644 --- a/DocSum/tests/test_compose_tgi_on_gaudi.sh +++ b/DocSum/tests/test_compose_tgi_on_gaudi.sh @@ -45,7 +45,6 @@ function build_docker_images() { function start_services() { cd $WORKPATH/docker_compose/intel/hpu/gaudi - export no_proxy="localhost,127.0.0.1,$ip_address" docker compose -f compose_tgi.yaml -f compose.monitoring.yaml up -d > ${LOG_PATH}/start_services_with_compose.log sleep 1m } From 4812f664c757cd67582b78a342882c5ccde878ae Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed, 29 Oct 2025 01:42:12 +0000 Subject: [PATCH 4/7] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- .../docker_compose/intel/cpu/xeon/README.md | 20 +++++++++---------- .../intel/cpu/xeon/compose.monitoring.yaml | 1 - .../dashboards/download_opea_dashboard.sh | 2 +- .../intel/cpu/xeon/prometheus.yaml | 2 +- .../docker_compose/intel/hpu/gaudi/README.md | 15 +++++++------- .../intel/hpu/gaudi/compose.monitoring.yaml | 1 - .../dashboards/download_opea_dashboard.sh | 2 +- .../intel/hpu/gaudi/prometheus.yaml | 2 +- 8 files changed, 22 insertions(+), 23 deletions(-) diff --git a/DocSum/docker_compose/intel/cpu/xeon/README.md b/DocSum/docker_compose/intel/cpu/xeon/README.md index 122f1e78f7..2237ca92d4 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/README.md +++ b/DocSum/docker_compose/intel/cpu/xeon/README.md @@ -70,14 +70,15 @@ docker compose up -d ``` #### Option #2 -> NOTE : To enable 
mornitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. + +> NOTE : To enable mornitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. To deploy with mornitoring: ```bash cd intel/cpu/xeon/ docker compose -f compose.yaml -f compose.monitoring.yaml up -d -``` +``` **Note**: developers should build docker image from source when: @@ -139,8 +140,7 @@ If mornitoring is enabled, execute the following command: ```bash cd intel/cpu/xeon/ docker compose -f compose.yaml -f compose.monitoring.yaml down -``` - +``` All the DocSum containers will be stopped and then removed on completion of the "down" command. @@ -148,12 +148,12 @@ All the DocSum containers will be stopped and then removed on completion of the In the context of deploying a DocSum pipeline on an Intel® Xeon® platform, we can pick and choose different large language model serving frameworks. The table below outlines the various configurations that are available as part of the application. -| File | Description | -| ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------ | -| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework | -| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as default | -| [compose_remote.yaml](./compose_remote.yaml) | Uses remote inference endpoints for LLMs. All other configurations are same as default | -| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. 
Can be used along with any compose files | +| File | Description | +| ---------------------------------------------------- | -------------------------------------------------------------------------------------- | +| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework | +| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as default | +| [compose_remote.yaml](./compose_remote.yaml) | Uses remote inference endpoints for LLMs. All other configurations are same as default | +| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. Can be used along with any compose files | ### Running LLM models with remote endpoints diff --git a/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml b/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml index 88deb1063c..187427d348 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml +++ b/DocSum/docker_compose/intel/cpu/xeon/compose.monitoring.yaml @@ -57,4 +57,3 @@ services: restart: always deploy: mode: global - diff --git a/DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh b/DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh index be1b36dc64..5b59b3cd34 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh +++ b/DocSum/docker_compose/intel/cpu/xeon/grafana/dashboards/download_opea_dashboard.sh @@ -8,4 +8,4 @@ fi wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/docsum_megaservice_grafana.json -wget 
https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json \ No newline at end of file +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json diff --git a/DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml b/DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml index 837496e047..758627c077 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml +++ b/DocSum/docker_compose/intel/cpu/xeon/prometheus.yaml @@ -24,4 +24,4 @@ scrape_configs: - job_name: "prometheus-node-exporter" metrics_path: /metrics static_configs: - - targets: ["node-exporter:9100"] \ No newline at end of file + - targets: ["node-exporter:9100"] diff --git a/DocSum/docker_compose/intel/hpu/gaudi/README.md b/DocSum/docker_compose/intel/hpu/gaudi/README.md index 197176ff48..5d8d5f9b77 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/README.md +++ b/DocSum/docker_compose/intel/hpu/gaudi/README.md @@ -71,14 +71,15 @@ docker compose up -d ``` #### Option #2 -> NOTE : To enable mornitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. + +> NOTE : To enable mornitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. To deploy with mornitoring: ```bash cd intel/cpu/xeon/ docker compose -f compose.yaml -f compose.monitoring.yaml up -d -``` +``` **Note**: developers should build docker image from source when: @@ -141,11 +142,11 @@ All the DocSum containers will be stopped and then removed on completion of the In the context of deploying a DocSum pipeline on an Intel® Gaudi® platform, the allocation and utilization of Gaudi devices across different services are important considerations for optimizing performance and resource efficiency. 
Each of the example deployments, defined by the example Docker compose yaml files, demonstrates a unique approach to leveraging Gaudi hardware, reflecting different priorities and operational strategies. -| File | Description | -| -------------------------------------- | ----------------------------------------------------------------------------------------- | -| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework | -| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default | -| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. Can be used along with any compose files +| File | Description | +| ---------------------------------------------------- | ----------------------------------------------------------------------------------------- | +| [compose.yaml](./compose.yaml) | Default compose file using vllm as serving framework | +| [compose_tgi.yaml](./compose_tgi.yaml) | The LLM serving framework is TGI. All other configurations remain the same as the default | +| [compose.monitoring.yaml](./compose.monitoring.yaml) | Helper file for monitoring features. 
Can be used along with any compose files | ## DocSum Detailed Usage diff --git a/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml b/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml index 664de1f9ba..691671e656 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml +++ b/DocSum/docker_compose/intel/hpu/gaudi/compose.monitoring.yaml @@ -73,4 +73,3 @@ services: ports: - 41611:41611 restart: unless-stopped - diff --git a/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh b/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh index fbebe4e6f8..b02827a300 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh +++ b/DocSum/docker_compose/intel/hpu/gaudi/grafana/dashboards/download_opea_dashboard.sh @@ -10,4 +10,4 @@ wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/ev wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/gaudi_grafana_v2.json wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/docsum_megaservice_grafana.json -wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json \ No newline at end of file +wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json diff --git a/DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml b/DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml index e52d5fd316..16693ae112 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml +++ b/DocSum/docker_compose/intel/hpu/gaudi/prometheus.yaml @@ -31,4 +31,4 @@ scrape_configs: scrape_interval: 30s metrics_path: /metrics static_configs: - - targets: 
["gaudi-metrics-exporter:41611"] \ No newline at end of file + - targets: ["gaudi-metrics-exporter:41611"] From f234f3c6bbc7921846fd9257a3f44fbf0ceb892e Mon Sep 17 00:00:00 2001 From: Yi Yao Date: Wed, 29 Oct 2025 09:47:53 +0800 Subject: [PATCH 5/7] Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --- DocSum/docker_compose/intel/cpu/xeon/README.md | 4 ++-- DocSum/docker_compose/intel/hpu/gaudi/README.md | 4 ++-- DocSum/tests/test_compose_on_xeon.sh | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/DocSum/docker_compose/intel/cpu/xeon/README.md b/DocSum/docker_compose/intel/cpu/xeon/README.md index 2237ca92d4..59d4e12a98 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/README.md +++ b/DocSum/docker_compose/intel/cpu/xeon/README.md @@ -70,9 +70,9 @@ docker compose up -d ``` #### Option #2 +> NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. -> NOTE : To enable mornitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. - +To deploy with monitoring: To deploy with mornitoring: ```bash diff --git a/DocSum/docker_compose/intel/hpu/gaudi/README.md b/DocSum/docker_compose/intel/hpu/gaudi/README.md index 5d8d5f9b77..34819b7e3d 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/README.md +++ b/DocSum/docker_compose/intel/hpu/gaudi/README.md @@ -71,9 +71,9 @@ docker compose up -d ``` #### Option #2 +> NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. -> NOTE : To enable mornitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. 
- +To deploy with monitoring: To deploy with mornitoring: ```bash diff --git a/DocSum/tests/test_compose_on_xeon.sh b/DocSum/tests/test_compose_on_xeon.sh index 19e0754da2..0b7d678db2 100644 --- a/DocSum/tests/test_compose_on_xeon.sh +++ b/DocSum/tests/test_compose_on_xeon.sh @@ -346,7 +346,7 @@ function validate_megaservice_long_text() { function stop_docker() { cd $WORKPATH/docker_compose/intel/cpu/xeon/ - docker compose -f compose.yaml -f compose.monitoring.yaml down && docker compose -f compose.yaml -f compose.monitoring.yaml rm -f + docker compose -f compose.yaml -f compose.monitoring.yaml down } function main() { From 87cbafc6e9949f3dee5fd95eb43c6a444d4044eb Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed, 29 Oct 2025 01:48:39 +0000 Subject: [PATCH 6/7] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- DocSum/docker_compose/intel/cpu/xeon/README.md | 3 ++- DocSum/docker_compose/intel/hpu/gaudi/README.md | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/DocSum/docker_compose/intel/cpu/xeon/README.md b/DocSum/docker_compose/intel/cpu/xeon/README.md index 59d4e12a98..36eb6fa8ec 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/README.md +++ b/DocSum/docker_compose/intel/cpu/xeon/README.md @@ -70,7 +70,8 @@ docker compose up -d ``` #### Option #2 -> NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. + +> NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. 
To deploy with monitoring: To deploy with mornitoring: diff --git a/DocSum/docker_compose/intel/hpu/gaudi/README.md b/DocSum/docker_compose/intel/hpu/gaudi/README.md index 34819b7e3d..65e5ae759c 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/README.md +++ b/DocSum/docker_compose/intel/hpu/gaudi/README.md @@ -71,7 +71,8 @@ docker compose up -d ``` #### Option #2 -> NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. + +> NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. To deploy with monitoring: To deploy with mornitoring: From 6e344d44ef6a93347358455320fa311aeec021f8 Mon Sep 17 00:00:00 2001 From: Joshua Yao Date: Wed, 29 Oct 2025 06:10:02 +0000 Subject: [PATCH 7/7] Fix typo in DocSum README Signed-off-by: Joshua Yao --- DocSum/docker_compose/intel/cpu/xeon/README.md | 1 - DocSum/docker_compose/intel/hpu/gaudi/README.md | 1 - 2 files changed, 2 deletions(-) diff --git a/DocSum/docker_compose/intel/cpu/xeon/README.md b/DocSum/docker_compose/intel/cpu/xeon/README.md index 36eb6fa8ec..acd64b9eca 100644 --- a/DocSum/docker_compose/intel/cpu/xeon/README.md +++ b/DocSum/docker_compose/intel/cpu/xeon/README.md @@ -74,7 +74,6 @@ docker compose up -d > NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. To deploy with monitoring: -To deploy with mornitoring: ```bash cd intel/cpu/xeon/ diff --git a/DocSum/docker_compose/intel/hpu/gaudi/README.md b/DocSum/docker_compose/intel/hpu/gaudi/README.md index 65e5ae759c..70e251e869 100644 --- a/DocSum/docker_compose/intel/hpu/gaudi/README.md +++ b/DocSum/docker_compose/intel/hpu/gaudi/README.md @@ -75,7 +75,6 @@ docker compose up -d > NOTE : To enable monitoring, `compose.monitoring.yaml` file need to be merged along with default `compose.yaml` file. To deploy with monitoring: -To deploy with mornitoring: ```bash cd intel/cpu/xeon/