Add Dataflow GPU inference benchmark by piyush123 · Pull Request #6541 · GoogleCloudPlatform/PerfKitBenchmarker

piyush123 · 2026-03-17T06:29:00Z

Adds dpb_dataflow_gpu_inference_benchmark, a streaming benchmark that measures BERT text classification latency and throughput on Dataflow GPU workers. Compares two inference approaches across a configurable rate sweep:

- local_gpu: model runs directly on the worker GPU via RunInference
- vertex_ai: workers send HTTP requests to a Vertex AI endpoint

Reports p50/p95/p99 latency, throughput, loss rate, and estimated cost per hour. Includes 27 unit tests.
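
As a rough illustration of the reported metrics, the sketch below summarizes one step of a rate sweep from raw per-message latencies. It is not the code in this PR: the names `RateStepSummary`, `percentile`, and `summarize_step` are hypothetical, the nearest-rank percentile method is an assumption, and loss rate is assumed to mean the fraction of sent messages that never produced a result.

```python
import math
from dataclasses import dataclass


@dataclass
class RateStepSummary:
    """Hypothetical container for the metrics of one rate-sweep step."""
    p50_ms: float
    p95_ms: float
    p99_ms: float
    throughput_per_sec: float
    loss_rate: float


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (assumed method)."""
    if not sorted_values:
        raise ValueError("no latency samples")
    # Nearest rank: smallest 1-based rank covering pct percent of samples.
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize_step(latencies_ms, messages_sent, duration_sec):
    """Summarize one sweep step from the latencies of received results."""
    ordered = sorted(latencies_ms)
    received = len(ordered)
    return RateStepSummary(
        p50_ms=percentile(ordered, 50),
        p95_ms=percentile(ordered, 95),
        p99_ms=percentile(ordered, 99),
        # Throughput counts only messages that produced a result.
        throughput_per_sec=received / duration_sec,
        # Assumed definition: share of sent messages with no result.
        loss_rate=1.0 - received / messages_sent,
    )


if __name__ == "__main__":
    # Synthetic data: 100 results received out of 125 sent over 10 seconds.
    summary = summarize_step(list(range(1, 101)), messages_sent=125, duration_sec=10)
    print(summary)
```

Running the same summary for each configured rate, once per inference approach, would give the side-by-side comparison described above.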

piyush123 wants to merge 1 commit into GoogleCloudPlatform:master from piyush123:add-dataflow-gpu-inference-benchmark
