
Commit 9a76890 — Merge branch 'main' into ze-fix/sec

2 parents: 5e4b769 + 197678c


43 files changed: +2465 −8 lines

ArbPostHearingAssistant/Dockerfile

Lines changed: 17 additions & 0 deletions
```dockerfile
# Copyright (C) 2025 Zensar Technologies Private Ltd.
# SPDX-License-Identifier: Apache-2.0

ARG IMAGE_REPO=opea
ARG BASE_TAG=latest
# Use the build args above (they were previously declared but unused)
FROM $IMAGE_REPO/comps-base:$BASE_TAG

USER root
# FFmpeg needed for media processing
RUN apt-get update && \
    apt-get install -y --no-install-recommends ffmpeg && \
    apt-get clean && rm -rf /var/lib/apt/lists/*
USER user

COPY ./arb_post_hearing_assistant.py $HOME/arb_post_hearing_assistant.py

ENTRYPOINT ["python", "arb_post_hearing_assistant.py"]
```

ArbPostHearingAssistant/README.md

Lines changed: 32 additions & 0 deletions
# Arbitration Post-Hearing Assistant

The Arbitration Post-Hearing Assistant is a GenAI-based module that processes and summarizes post-hearing transcripts and other arbitration-related documents. It extracts key entities and insights to help arbitrators, legal teams, and case managers handle case follow-ups efficiently.

## Table of Contents

1. [Architecture](#architecture)
2. [Deployment Options](#deployment-options)
3. [Validated Configurations](#validated-configurations)

## Architecture

The architecture of the ArbPostHearingAssistant application is illustrated below:

![Architecture](./assets/img/arbitration_architecture.png)

The ArbPostHearingAssistant example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps).

## Deployment Options

The table below lists the currently available deployment options. Each option describes in detail how to deploy this example on the selected hardware.

| Category               | Deployment Option      | Description                                                                      |
| ---------------------- | ---------------------- | -------------------------------------------------------------------------------- |
| On-premise Deployments | Docker Compose (Xeon)  | [ArbPostHearingAssistant deployment on Xeon](./docker_compose/intel/cpu/xeon)    |
|                        | Docker Compose (Gaudi) | [ArbPostHearingAssistant deployment on Gaudi](./docker_compose/intel/hpu/gaudi)  |

## Validated Configurations

| **Deploy Method** | **LLM Engine** | **LLM Model**                      | **Hardware** |
| ----------------- | -------------- | ---------------------------------- | ------------ |
| Docker Compose    | vLLM, TGI      | mistralai/Mistral-7B-Instruct-v0.2 | Intel Gaudi  |
| Docker Compose    | vLLM, TGI      | mistralai/Mistral-7B-Instruct-v0.2 | Intel Xeon   |
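To launch one of the options above, the usual OPEA Docker Compose flow should apply. A minimal sketch for the Xeon option, assuming the standard `compose.yaml` in that directory and that `HF_TOKEN` and `host_ip` are set as the deployment README requires:

```bash
cd GenAIExamples/ArbPostHearingAssistant/docker_compose/intel/cpu/xeon
export host_ip=$(hostname -I | awk '{print $1}')
export HF_TOKEN="<your HuggingFace token>"
docker compose up -d
```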
Lines changed: 45 additions & 0 deletions
# Table of Contents

- [Table of Contents](#table-of-contents)
- [Build MegaService Docker Image](#build-megaservice-docker-image)
- [Build UI Docker Image](#build-ui-docker-image)
- [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token)
- [Troubleshooting](#troubleshooting)

## Build MegaService Docker Image

To build the MegaService of ArbPostHearingAssistant, clone the [GenAIExamples](https://github.com/opea-project/GenAIExamples.git) repository and build the MegaService Docker image with the commands below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ArbPostHearingAssistant
docker build --no-cache -t opea/arb-post-hearing-assistant:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

## Build UI Docker Image

Build the frontend Docker image with the commands below:

```bash
cd GenAIExamples/ArbPostHearingAssistant/ui
docker build -t opea/arb-post-hearing-assistant-gradio-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```

## Generate a HuggingFace Access Token

Some HuggingFace resources, such as certain models, are accessible only with an access token. If you don't have a HuggingFace access token, you can create one by registering at [HuggingFace](https://huggingface.co/) and following [these steps](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
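Once created, the token is usually exported in the shell that launches the services so the containers can pull gated models. A minimal sketch, assuming the deployment reads the `HF_TOKEN` environment variable (the name used in the Gaudi configuration later in this commit):

```bash
# Hypothetical token value; substitute your own
export HF_TOKEN="hf_xxxxxxxxxxxxxxxxxxxx"
```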
## Troubleshooting

1. If you get errors like "Access Denied", [validate the microservices](./README.md#validate-microservices) first. A simple example:

   ```bash
   http_proxy=""
   curl http://${host_ip}:8008/generate \
     -X POST \
     -d '{"inputs":" ### System: Please translate the following Golang codes into Python codes. ### Original codes: '\'''\'''\''Golang \npackage main\n\nimport \"fmt\"\nfunc main() {\n fmt.Println(\"Hello, World!\");\n '\'''\'''\'' ### Translated codes:","parameters":{"max_tokens":17, "do_sample": true}}' \
     -H 'Content-Type: application/json'
   ```

2. (Docker only) If all microservices work well, check port 7777 on ${host_ip}: it may already be allocated by another user. If so, modify the port mapping in `compose.yaml` (see the port-check sketch after this list).
3. (Docker only) If you get errors like "The container name is in use", change the container name in `compose.yaml`.
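A quick way to see whether port 7777 is already taken before editing `compose.yaml`; this is a generic sketch using standard tools, not a script shipped with this example:

```bash
# Show any process or container currently bound to port 7777
ss -tlnp | grep ':7777' || echo "port 7777 is free"
docker ps --filter "publish=7777"
```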
Lines changed: 148 additions & 0 deletions
```python
# Copyright (C) 2025 Zensar Technologies Private Ltd.
# SPDX-License-Identifier: Apache-2.0

import asyncio
import base64
import json
import os
import subprocess
import uuid
from typing import List

from comps import MegaServiceEndpoint, MicroService, ServiceOrchestrator, ServiceRoleType, ServiceType
from comps.cores.mega.utils import handle_message
from comps.cores.proto.api_protocol import (
    ArbPostHearingAssistantChatCompletionRequest,
    ChatCompletionRequest,
    ChatCompletionResponse,
    ChatCompletionResponseChoice,
    ChatMessage,
    UsageInfo,
)
from fastapi import Request
from fastapi.responses import StreamingResponse

MEGA_SERVICE_PORT = int(os.getenv("MEGA_SERVICE_PORT", 8888))

LLM_SERVICE_HOST_IP = os.getenv("LLM_SERVICE_HOST_IP", "0.0.0.0")
LLM_SERVICE_PORT = int(os.getenv("LLM_SERVICE_PORT", 9000))


def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **kwargs):
    if self.services[cur_node].service_type == ServiceType.ARB_POST_HEARING_ASSISTANT:
        # Rename the incoming payload key ("text" or "asr_result") to the
        # "messages" field expected by the downstream service.
        for key_to_replace in ["text", "asr_result"]:
            if key_to_replace in inputs:
                inputs["messages"] = inputs[key_to_replace]
                del inputs[key_to_replace]

        arbPostHearingAssistant_parameters = kwargs.get("arbPostHearingAssistant_parameters", None)
        if arbPostHearingAssistant_parameters:
            arbPostHearingAssistant_parameters = arbPostHearingAssistant_parameters.model_dump()
            del arbPostHearingAssistant_parameters["messages"]
            inputs.update(arbPostHearingAssistant_parameters)
        if "id" in inputs:
            del inputs["id"]
        if "max_new_tokens" in inputs:
            del inputs["max_new_tokens"]
        if "input" in inputs:
            del inputs["input"]
    return inputs


def align_outputs(self, data, *args, **kwargs):
    return data


class OpeaArbPostHearingAssistantService:
    def __init__(self, host="0.0.0.0", port=8000):
        self.host = host
        self.port = port
        # Patch the orchestrator's alignment hooks with the service-specific
        # versions defined above before instantiating it.
        ServiceOrchestrator.align_inputs = align_inputs
        ServiceOrchestrator.align_outputs = align_outputs
        self.megaservice = ServiceOrchestrator()
        self.endpoint = "/v1/arb-post-hearing"

    def add_remote_service(self):
        arb_post_hearing_assistant = MicroService(
            name="opea_service@arb_post_hearing_assistant",
            host=LLM_SERVICE_HOST_IP,
            port=LLM_SERVICE_PORT,
            endpoint="/v1/arb-post-hearing",
            use_remote_service=True,
            service_type=ServiceType.ARB_POST_HEARING_ASSISTANT,
        )
        self.megaservice.add(arb_post_hearing_assistant)

    async def handle_request(self, request: Request):
        """Accept pure text."""
        # Default to "" so a missing content-type header cannot raise TypeError.
        if "application/json" in request.headers.get("content-type", ""):
            data = await request.json()
            chunk_size = data.get("chunk_size", -1)
            chunk_overlap = data.get("chunk_overlap", -1)
            chat_request = ArbPostHearingAssistantChatCompletionRequest.model_validate(data)
            prompt = handle_message(chat_request.messages)
            print(f"messages:{chat_request.messages}")
            print(f"prompt: {prompt}")
            initial_inputs_data = {data["type"]: prompt}
        else:
            raise ValueError(f"Unknown request type: {request.headers.get('content-type')}")

        arbPostHearingAssistant_parameters = ArbPostHearingAssistantChatCompletionRequest(
            messages=chat_request.messages,
            max_tokens=chat_request.max_tokens if chat_request.max_tokens else 1024,
            top_k=chat_request.top_k if chat_request.top_k else 10,
            top_p=chat_request.top_p if chat_request.top_p else 0.95,
            temperature=chat_request.temperature if chat_request.temperature else 0.01,
            frequency_penalty=chat_request.frequency_penalty if chat_request.frequency_penalty else 0.0,
            presence_penalty=chat_request.presence_penalty if chat_request.presence_penalty else 0.0,
            repetition_penalty=chat_request.repetition_penalty if chat_request.repetition_penalty else 1.03,
            model=chat_request.model if chat_request.model else None,
            language=chat_request.language if chat_request.language else "en",
            chunk_overlap=chunk_overlap,
            chunk_size=chunk_size,
        )
        result_dict, runtime_graph = await self.megaservice.schedule(
            initial_inputs=initial_inputs_data, arbPostHearingAssistant_parameters=arbPostHearingAssistant_parameters
        )

        for node, response in result_dict.items():
            # Assumes the last microservice in the megaservice is the LLM.
            if (
                isinstance(response, StreamingResponse)
                and node == list(self.megaservice.services.keys())[-1]
                and self.megaservice.services[node].service_type == ServiceType.ARB_POST_HEARING_ASSISTANT
            ):
                return response

        last_node = runtime_graph.all_leaves()[-1]
        response = result_dict[last_node]["text"]
        choices = []
        usage = UsageInfo()
        choices.append(
            ChatCompletionResponseChoice(
                index=0,
                message=ChatMessage(role="assistant", content=response),
                finish_reason="stop",
            )
        )
        return ChatCompletionResponse(model="arbPostHearingAssistant", choices=choices, usage=usage)

    def start(self):
        self.service = MicroService(
            self.__class__.__name__,
            service_role=ServiceRoleType.MEGASERVICE,
            host=self.host,
            port=self.port,
            endpoint=self.endpoint,
            input_datatype=ArbPostHearingAssistantChatCompletionRequest,
            output_datatype=ChatCompletionResponse,
        )
        self.service.add_route(self.endpoint, self.handle_request, methods=["POST"])
        self.service.start()


if __name__ == "__main__":
    arbPostHearingAssistant = OpeaArbPostHearingAssistantService(port=MEGA_SERVICE_PORT)
    arbPostHearingAssistant.add_remote_service()
    arbPostHearingAssistant.start()
```
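Once the megaservice is up (default port 8888), it can be exercised with a plain JSON request. This sketch infers the payload shape from `handle_request` above — a `type` field naming the input key (e.g. `"text"`) plus `messages` — and the transcript text is an invented example:

```bash
curl http://${host_ip}:8888/v1/arb-post-hearing \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"type": "text", "messages": "Hearing adjourned. Claimant to file its rejoinder by 15 July; tribunal to fix costs thereafter.", "max_tokens": 512}'
```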
Two binary image files added (846 KB and 582 KB); previews not rendered.
Lines changed: 77 additions & 0 deletions
```yaml
# Copyright (C) 2025 Zensar Technologies Private Ltd.
# SPDX-License-Identifier: Apache-2.0

deploy:
  device: gaudi
  version: 1.3.0
  modelUseHostPath: /mnt/models
  HF_TOKEN: "" # mandatory
  node: [1]
  namespace: ""
  node_name: []
  timeout: 1000 # timeout in seconds for services to be ready, default 30 minutes
  interval: 5 # interval in seconds between service ready checks, default 5 seconds

  services:
    backend:
      resources:
        enabled: False
        cores_per_instance: "16"
        memory_capacity: "8000Mi"
      replicaCount: [1]

    llm:
      engine: vllm # or tgi
      model_id: "mistralai/Mistral-7B-Instruct-v0.2" # mandatory
      replicaCount: [1]
      resources:
        enabled: False
        cards_per_instance: 1
      model_params:
        vllm: # vLLM-specific parameters
          batch_params:
            enabled: True
            max_num_seqs: "8" # each value triggers an LLM service upgrade
          token_params:
            enabled: True
            max_input_length: ""
            max_total_tokens: ""
            max_batch_total_tokens: ""
            max_batch_prefill_tokens: ""
        tgi: # TGI-specific parameters
          batch_params:
            enabled: True
            max_batch_size: [1] # each value triggers an LLM service upgrade
          token_params:
            enabled: False
            max_input_length: "1280"
            max_total_tokens: "2048"
            max_batch_total_tokens: "65536"
            max_batch_prefill_tokens: "4096"

    arbPostHearingAssistant-ui:
      replicaCount: [1]

    llm-uservice:
      model_id: "mistralai/Mistral-7B-Instruct-v0.2" # mandatory
      replicaCount: [1]

    nginx:
      replicaCount: [1]

benchmark:
  # http request behavior related fields
  user_queries: [16]
  concurrency: [4]
  load_shape_type: "constant" # "constant" or "poisson"
  poisson_arrival_rate: 1.0 # only used when load_shape_type is "poisson"
  warmup_iterations: 10
  seed: 1024
  collect_service_metric: True

  # workload; all of the test cases will run for benchmark
  bench_target: ["arbPostHearingAssistantfixed"] # specify the bench_target for benchmark
  dataset: "/home/sdp/pubmed_10.txt" # specify the absolute path to the dataset file
  llm:
    # specify the llm output token size
    max_token_size: [1024]
```
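This file follows the layout of the GenAIExamples one-click deploy-and-benchmark configs. Assuming this example wires into the same tooling (an assumption — the entry point and the config's filename are not shown in this commit), a run might look like:

```bash
# Hypothetical invocation, mirroring other GenAIExamples benchmark configs;
# substitute the actual script and config path for this example
python deploy_and_benchmark.py ./ArbPostHearingAssistant/benchmark_arb_post_hearing_assistant.yaml --target-node 1
```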
