
Commit 9a76890 — Merge branch 'main' into ze-fix/sec

2 parents: 5e4b769 + 197678c


43 files changed: +2465 −8 lines

ArbPostHearingAssistant/Dockerfile

Lines changed: 17 additions & 0 deletions
```dockerfile
# Copyright (C) 2025 Zensar Technologies Private Ltd.
# SPDX-License-Identifier: Apache-2.0

ARG IMAGE_REPO=opea
ARG BASE_TAG=latest
# Use the build args above (they were previously declared but unused)
FROM $IMAGE_REPO/comps-base:$BASE_TAG

USER root
# FFmpeg needed for media processing
RUN apt-get update && \
    apt-get install -y --no-install-recommends ffmpeg && \
    apt-get clean && rm -rf /var/lib/apt/lists/*
USER user

COPY ./arb_post_hearing_assistant.py $HOME/arb_post_hearing_assistant.py

ENTRYPOINT ["python", "arb_post_hearing_assistant.py"]
```

ArbPostHearingAssistant/README.md

Lines changed: 32 additions & 0 deletions
# Arbitration Post-Hearing Assistant

The Arbitration Post-Hearing Assistant is a GenAI-based module that processes and summarizes post-hearing transcripts and other arbitration-related documents. It extracts key entities and insights to help arbitrators, legal teams, and case managers handle case follow-ups efficiently.

## Table of Contents

1. [Architecture](#architecture)
2. [Deployment Options](#deployment-options)
3. [Validated Configurations](#validated-configurations)

## Architecture

The architecture of the ArbPostHearingAssistant application is illustrated below:

![Architecture](./assets/img/arbitration_architecture.png)

The ArbPostHearingAssistant example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps).

## Deployment Options

The table below lists the currently available deployment options. Each option describes in detail how to deploy this example on the selected hardware.

| Category               | Deployment Option      | Description                                                                      |
| ---------------------- | ---------------------- | -------------------------------------------------------------------------------- |
| On-premise Deployments | Docker Compose (Xeon)  | [ArbPostHearingAssistant deployment on Xeon](./docker_compose/intel/cpu/xeon)    |
|                        | Docker Compose (Gaudi) | [ArbPostHearingAssistant deployment on Gaudi](./docker_compose/intel/hpu/gaudi)  |

## Validated Configurations

| **Deploy Method** | **LLM Engine** | **LLM Model**                      | **Hardware** |
| ----------------- | -------------- | ---------------------------------- | ------------ |
| Docker Compose    | vLLM, TGI      | mistralai/Mistral-7B-Instruct-v0.2 | Intel Gaudi  |
| Docker Compose    | vLLM, TGI      | mistralai/Mistral-7B-Instruct-v0.2 | Intel Xeon   |
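To launch one of the options above, the usual OPEA Docker Compose flow should apply. A minimal sketch for the Xeon option, assuming the standard `compose.yaml` in that directory and that `HF_TOKEN` and `host_ip` are set as the deployment README requires:

```bash
cd GenAIExamples/ArbPostHearingAssistant/docker_compose/intel/cpu/xeon
export host_ip=$(hostname -I | awk '{print $1}')
export HF_TOKEN="<your HuggingFace token>"
docker compose up -d
```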
Lines changed: 45 additions & 0 deletions
# Table of Contents

- [Table of Contents](#table-of-contents)
- [Build MegaService Docker Image](#build-megaservice-docker-image)
- [Build UI Docker Image](#build-ui-docker-image)
- [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
- [Troubleshooting](#troubleshooting)

## Build MegaService Docker Image

To build the MegaService of ArbPostHearingAssistant, clone the [GenAIExamples](https://github.com/opea-project/GenAIExamples.git) repository and build the MegaService Docker image with the commands below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ArbPostHearingAssistant
docker build --no-cache -t opea/arb-post-hearing-assistant:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

## Build UI Docker Image

Build the frontend Docker image with the commands below:

```bash
cd GenAIExamples/ArbPostHearingAssistant/ui
docker build -t opea/arb-post-hearing-assistant-gradio-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```

## Generate a HuggingFace Access Token

Some HuggingFace resources, such as certain models, are accessible only with an access token. If you don't have a HuggingFace access token, you can create one by registering at [HuggingFace](https://huggingface.co/) and following [these steps](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
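Once created, the token is usually exported in the shell that launches the services so the containers can pull gated models. A minimal sketch, assuming the deployment reads the `HF_TOKEN` environment variable (the name used in the Gaudi configuration later in this commit):

```bash
# Hypothetical token value; substitute your own
export HF_TOKEN="hf_xxxxxxxxxxxxxxxxxxxx"
```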
## Troubleshooting

1. If you get errors like "Access Denied", [validate the microservices](./README.md#validate-microservices) first. A simple example:

   ```bash
   http_proxy=""
   curl http://${host_ip}:8008/generate \
     -X POST \
     -d '{"inputs":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:","parameters":{"max_tokens":17, "do_sample": true}}' \
     -H 'Content-Type: application/json'
   ```

2. (Docker only) If all microservices work well, check port 7777 on ${host_ip}: it may already be allocated by another user. If so, modify the port mapping in `compose.yaml` (see the port-check sketch after this list).
3. (Docker only) If you get errors like "The container name is in use", change the container name in `compose.yaml`.
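A quick way to see whether port 7777 is already taken before editing `compose.yaml`; this is a generic sketch using standard tools, not a script shipped with this example:

```bash
# Show any process or container currently bound to port 7777
ss -tlnp | grep ':7777' || echo "port 7777 is free"
docker ps --filter "publish=7777"
```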
Lines changed: 148 additions & 0 deletions
```python
# Copyright (C) 2025 Zensar Technologies Private Ltd.
# SPDX-License-Identifier: Apache-2.0

import asyncio
import base64
import json
import os
import subprocess
import uuid
from typing import List

from comps import MegaServiceEndpoint, MicroService, ServiceOrchestrator, ServiceRoleType, ServiceType
from comps.cores.mega.utils import handle_message
from comps.cores.proto.api_protocol import (
    ArbPostHearingAssistantChatCompletionRequest,
    ChatCompletionRequest,
    ChatCompletionResponse,
    ChatCompletionResponseChoice,
    ChatMessage,
    UsageInfo,
)
from fastapi import Request
from fastapi.responses import StreamingResponse

MEGA_SERVICE_PORT = int(os.getenv("MEGA_SERVICE_PORT", 8888))

LLM_SERVICE_HOST_IP = os.getenv("LLM_SERVICE_HOST_IP", "0.0.0.0")
LLM_SERVICE_PORT = int(os.getenv("LLM_SERVICE_PORT", 9000))


def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **kwargs):
    if self.services[cur_node].service_type == ServiceType.ARB_POST_HEARING_ASSISTANT:
        # Rename the incoming payload key ("text" or "asr_result") to the
        # "messages" field expected by the downstream service.
        for key_to_replace in ["text", "asr_result"]:
            if key_to_replace in inputs:
                inputs["messages"] = inputs[key_to_replace]
                del inputs[key_to_replace]

        arbPostHearingAssistant_parameters = kwargs.get("arbPostHearingAssistant_parameters", None)
        if arbPostHearingAssistant_parameters:
            arbPostHearingAssistant_parameters = arbPostHearingAssistant_parameters.model_dump()
            del arbPostHearingAssistant_parameters["messages"]
            inputs.update(arbPostHearingAssistant_parameters)
        if "id" in inputs:
            del inputs["id"]
        if "max_new_tokens" in inputs:
            del inputs["max_new_tokens"]
        if "input" in inputs:
            del inputs["input"]
    return inputs


def align_outputs(self, data, *args, **kwargs):
    return data


class OpeaArbPostHearingAssistantService:
    def __init__(self, host="0.0.0.0", port=8000):
        self.host = host
        self.port = port
        # Patch the orchestrator's alignment hooks with the service-specific
        # versions defined above before instantiating it.
        ServiceOrchestrator.align_inputs = align_inputs
        ServiceOrchestrator.align_outputs = align_outputs
        self.megaservice = ServiceOrchestrator()
        self.endpoint = "/v1/arb-post-hearing"

    def add_remote_service(self):
        arb_post_hearing_assistant = MicroService(
            name="opea_service@arb_post_hearing_assistant",
            host=LLM_SERVICE_HOST_IP,
            port=LLM_SERVICE_PORT,
            endpoint="/v1/arb-post-hearing",
            use_remote_service=True,
            service_type=ServiceType.ARB_POST_HEARING_ASSISTANT,
        )
        self.megaservice.add(arb_post_hearing_assistant)

    async def handle_request(self, request: Request):
        """Accept pure text."""
        # Default to "" so a missing content-type header cannot raise TypeError.
        if "application/json" in request.headers.get("content-type", ""):
            data = await request.json()
            chunk_size = data.get("chunk_size", -1)
            chunk_overlap = data.get("chunk_overlap", -1)
            chat_request = ArbPostHearingAssistantChatCompletionRequest.model_validate(data)
            prompt = handle_message(chat_request.messages)
            print(f"messages:{chat_request.messages}")
            print(f"prompt: {prompt}")
            initial_inputs_data = {data["type"]: prompt}
        else:
            raise ValueError(f"Unknown request type: {request.headers.get('content-type')}")

        arbPostHearingAssistant_parameters = ArbPostHearingAssistantChatCompletionRequest(
            messages=chat_request.messages,
            max_tokens=chat_request.max_tokens if chat_request.max_tokens else 1024,
            top_k=chat_request.top_k if chat_request.top_k else 10,
            top_p=chat_request.top_p if chat_request.top_p else 0.95,
            temperature=chat_request.temperature if chat_request.temperature else 0.01,
            frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
            presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
            repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
            model=chat_request.model if chat_request.model else None,
            language=chat_request.language if chat_request.language else "en",
            chunk_overlap=chunk_overlap,
            chunk_size=chunk_size,
        )
        result_dict, runtime_graph = await self.megaservice.schedule(
            initial_inputs=initial_inputs_data, arbPostHearingAssistant_parameters=arbPostHearingAssistant_parameters
        )

        for node, response in result_dict.items():
            # Assumes the last microservice in the megaservice is the LLM.
            if (
                isinstance(response, StreamingResponse)
                and node == list(self.megaservice.services.keys())[-1]
                and self.megaservice.services[node].service_type == ServiceType.ARB_POST_HEARING_ASSISTANT
            ):
                return response

        last_node = runtime_graph.all_leaves()[-1]
        response = result_dict[last_node]["text"]
        choices = []
        usage = UsageInfo()
        choices.append(
            ChatCompletionResponseChoice(
                index=0,
                message=ChatMessage(role="assistant", content=response),
                finish_reason="stop",
            )
        )
        return ChatCompletionResponse(model="arbPostHearingAssistant", choices=choices, usage=usage)

    def start(self):
        self.service = MicroService(
            self.__class__.__name__,
            service_role=ServiceRoleType.MEGASERVICE,
            host=self.host,
            port=self.port,
            endpoint=self.endpoint,
            input_datatype=ArbPostHearingAssistantChatCompletionRequest,
            output_datatype=ChatCompletionResponse,
        )
        self.service.add_route(self.endpoint, self.handle_request, methods=["POST"])
        self.service.start()


if __name__ == "__main__":
    arbPostHearingAssistant = OpeaArbPostHearingAssistantService(port=MEGA_SERVICE_PORT)
    arbPostHearingAssistant.add_remote_service()
    arbPostHearingAssistant.start()
```
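Once the megaservice is up (default port 8888), it can be exercised with a plain JSON request. This sketch infers the payload shape from `handle_request` above — a `type` field naming the input key (e.g. `"text"`) plus `messages` — and the transcript text is an invented example:

```bash
curl http://${host_ip}:8888/v1/arb-post-hearing \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"type": "text", "messages": "Hearing adjourned. Claimant to file its rejoinder by 15 July; tribunal to fix costs thereafter.", "max_tokens": 512}'
```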
Two binary image files added (846 KB and 582 KB); previews not rendered.
Lines changed: 77 additions & 0 deletions
```yaml
# Copyright (C) 2025 Zensar Technologies Private Ltd.
# SPDX-License-Identifier: Apache-2.0

deploy:
  device: gaudi
  version: 1.3.0
  modelUseHostPath: /mnt/models
  HF_TOKEN: "" # mandatory
  node: [1]
  namespace: ""
  node_name: []
  timeout: 1000 # timeout in seconds for services to be ready, default 30 minutes
  interval: 5 # interval in seconds between service ready checks, default 5 seconds

  services:
    backend:
      resources:
        enabled: False
        cores_per_instance: "16"
        memory_capacity: "8000Mi"
      replicaCount: [1]

    llm:
      engine: vllm # or tgi
      model_id: "mistralai/Mistral-7B-Instruct-v0.2" # mandatory
      replicaCount: [1]
      resources:
        enabled: False
        cards_per_instance: 1
      model_params:
        vllm: # vLLM-specific parameters
          batch_params:
            enabled: True
            max_num_seqs: "8" # each value triggers an LLM service upgrade
          token_params:
            enabled: True
            max_input_length: ""
            max_total_tokens: ""
            max_batch_total_tokens: ""
            max_batch_prefill_tokens: ""
        tgi: # TGI-specific parameters
          batch_params:
            enabled: True
            max_batch_size: [1] # each value triggers an LLM service upgrade
          token_params:
            enabled: False
            max_input_length: "1280"
            max_total_tokens: "2048"
            max_batch_total_tokens: "65536"
            max_batch_prefill_tokens: "4096"

    arbPostHearingAssistant-ui:
      replicaCount: [1]

    llm-uservice:
      model_id: "mistralai/Mistral-7B-Instruct-v0.2" # mandatory
      replicaCount: [1]

    nginx:
      replicaCount: [1]

benchmark:
  # http request behavior related fields
  user_queries: [16]
  concurrency: [4]
  load_shape_type: "constant" # "constant" or "poisson"
  poisson_arrival_rate: 1.0 # only used when load_shape_type is "poisson"
  warmup_iterations: 10
  seed: 1024
  collect_service_metric: True

  # workload; all of the test cases will run for benchmark
  bench_target: ["arbPostHearingAssistantfixed"] # specify the bench_target for benchmark
  dataset: "/home/sdp/pubmed_10.txt" # specify the absolute path to the dataset file
  llm:
    # specify the llm output token size
    max_token_size: [1024]
```
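This file follows the layout of the GenAIExamples one-click deploy-and-benchmark configs. Assuming this example wires into the same tooling (an assumption — the entry point and the config's filename are not shown in this commit), a run might look like:

```bash
# Hypothetical invocation, mirroring other GenAIExamples benchmark configs;
# substitute the actual script and config path for this example
python deploy_and_benchmark.py ./ArbPostHearingAssistant/benchmark_arb_post_hearing_assistant.yaml --target-node 1
```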
