I'm deploying the TTS container on-premises on a Kubernetes cluster using the following Helm values:
```yaml
textToSpeech:
  enabled: true
  numberOfConcurrentRequest: 2
  optimizeForTurboMode: true
  image:
    registry: mcr.microsoft.com
    repository: azure-cognitive-services/speechservices/neural-text-to-speech
    tag: 2.8.0-amd64-de-de-conradneural
    pullSecrets:
      - mcr # Or an existing secret
    args:
      eula: accept
      billing: <my-endpoint>
      apikey: <my-api-key>
  service:
    autoScaler:
      maxAvailablePods: 4
    type: NodePort
verification:
  enabled: false
speechToText:
  enabled: false
```
Shortly after the TTS pods are created, they start crash-looping with OOM errors. The memory limit the chart calculates for them (3 GB) is not sufficient, as their average memory usage appears to be around 3 to 4 GB.
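For context, this is roughly how I'm confirming the OOM kills and the memory usage (the pod name is a placeholder, and `kubectl top` requires metrics-server to be installed):

```bash
# Termination reason of the TTS container (placeholder pod name)
kubectl get pod <tts-pod> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# -> OOMKilled

# Current working-set memory of the pod (requires metrics-server)
kubectl top pod <tts-pod>
```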
I can prevent the crashes by manually removing the limit from the deployment after installing the Helm chart, but I assume this is not the intended behavior.
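Concretely, the manual workaround looks roughly like this (the deployment name is a placeholder; it depends on the release name the chart generates). Raising the limit instead of dropping it also works:

```bash
# Remove the memory limit from the TTS deployment (placeholder name)
kubectl patch deployment <release>-text-to-speech --type=json \
  -p '[{"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits"}]'

# ...or raise the limit instead of removing it entirely
kubectl set resources deployment <release>-text-to-speech --limits=memory=6Gi
```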
Is there something I am missing, or does the chart need an update?