Add postStart hook for engine #113

brianrudolf · 2025-11-13T15:03:17Z

Proposed changes

This change adds an optional configuration setting to the engine container spec that sets a postStart lifecycle command hook.

The motivation for this change is difficulty running the Flux model on Google Kubernetes Engine (GKE). The GKE guide for running GPUs does not use Nvidia's GPU Operator to configure nodes for CUDA applications. Instead, it follows a similar but slightly different approach that relies on the LD_LIBRARY_PATH environment variable to locate the Nvidia drivers and CUDA libraries.

Due to the technical complexities of operating Flux, its startup process clears this environment variable early (but not immediately) in the startup sequence. A simple solution is for the engine container to run ldconfig early in the container's lifecycle to create the necessary runtime bindings, which Kubernetes facilitates with this postStart hook.

Using the toJson function ensures that the command value is rendered as a properly formatted JSON array:

          lifecycle:
            postStart:
              exec:
                command: ["/sbin/ldconfig"]
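For illustration, a Helm template could produce the output above roughly as follows. This is a minimal sketch, not the chart's actual template: the values key `engine.lifecycle.postStart.command` is a hypothetical name chosen for this example, and the real indentation depends on where the block sits in the container spec. `with` and `toJson` are standard Helm template features.

```yaml
# values.yaml (hypothetical key names, for illustration only)
# engine:
#   lifecycle:
#     postStart:
#       command: ["/sbin/ldconfig"]

# Template fragment: emitted only when the hypothetical
# engine.lifecycle.postStart value is set, so the hook stays optional.
{{- with .Values.engine.lifecycle }}
{{- if .postStart }}
lifecycle:
  postStart:
    exec:
      # toJson serializes the list as a JSON array, which is also valid YAML
      command: {{ .postStart.command | toJson }}
{{- end }}
{{- end }}
```

Because a JSON array is valid YAML flow syntax, the rendered `command` stays well-formed even when an argument contains spaces or quotes.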

Relevant information from Google's documentation:

About the NVIDIA CUDA-X libraries

To use CUDA applications, the image that you use must have the libraries. To add the NVIDIA CUDA-X libraries, you can build and use your own image by including the following values in the LD_LIBRARY_PATH environment variable in your container specification:

/usr/local/nvidia/lib64: the location of the NVIDIA device drivers.
/usr/local/cuda-CUDA_VERSION/lib64: the location of the NVIDIA CUDA-X libraries on the node.
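As a sketch of what the quoted guidance looks like in a container spec, the fragment below sets LD_LIBRARY_PATH to both directories. The container name and image are hypothetical placeholders, and CUDA_VERSION is left as the placeholder used in Google's documentation.

```yaml
# Fragment of a Pod spec; name and image are hypothetical placeholders.
containers:
  - name: cuda-app
    image: example.com/my-cuda-image:latest
    env:
      - name: LD_LIBRARY_PATH
        # Driver libraries first, then the CUDA-X libraries.
        # Replace CUDA_VERSION with the version installed on the node.
        value: /usr/local/nvidia/lib64:/usr/local/cuda-CUDA_VERSION/lib64
```

This approach only works while the variable remains set in the process environment, which is why a process that clears it needs the ldconfig bindings instead.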

Types of changes

What types of changes does your code introduce to the Deepgram self-hosted resources?
Put an x in the boxes that apply

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update or tests (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

I have read the CONTRIBUTING doc
I have tested my changes in my local self-hosted environment
- I have deployed this chart with Flux enabled and see the engine container start successfully using the GPU
I have added necessary documentation (if appropriate)

Further comments

pcgeek86

LGTM

Add postStart hook for engine

ca48e7d

brianrudolf marked this pull request as ready for review November 13, 2025 16:20

brianrudolf requested review from a team and therealevanhenry as code owners November 13, 2025 16:20

Correct formatting of postStart

b6a0ec5

pcgeek86 approved these changes Nov 20, 2025
