Add Dataflow GPU inference benchmark #6541

Open
piyush123 wants to merge 1 commit into GoogleCloudPlatform:master from piyush123:add-dataflow-gpu-inference-benchmark

Conversation

@piyush123
Member

Adds dpb_dataflow_gpu_inference_benchmark, a streaming benchmark that measures BERT text classification latency and throughput on Dataflow GPU workers. Compares two inference approaches across a configurable rate sweep:

  • local_gpu: model runs directly on the worker GPU via RunInference
  • vertex_ai: workers send HTTP requests to a Vertex AI endpoint

Reports p50/p95/p99 latency, throughput, loss rate, and estimated cost per hour. Includes 27 unit tests.
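The headline metrics above can be aggregated from per-request latency samples collected during each step of the rate sweep. A minimal sketch of that aggregation is shown below; the function names (`percentile`, `summarize`) and the nearest-rank percentile method are assumptions for illustration, not the benchmark's actual implementation:

```python
def percentile(samples, pct):
    """Nearest-rank percentile over a sorted copy of the samples."""
    ordered = sorted(samples)
    # Clamp the rank so pct=0 and pct=100 stay within bounds.
    k = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[k]

def summarize(latencies_ms, sent, received, window_s):
    """Collapse one rate-sweep step into the benchmark's reported metrics.

    latencies_ms: per-request end-to-end latencies, in milliseconds
    sent/received: element counts for the measurement window (loss rate)
    window_s: measurement window length, in seconds (throughput)
    """
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
        "throughput_qps": received / window_s,
        "loss_rate": (sent - received) / sent,
    }
```

In the streaming setting, the same summary would be computed per publish rate, so the sweep yields a latency/throughput curve for each approach (local_gpu vs. vertex_ai) rather than a single point.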

