Description
Batch transcription works when the same code is run in the Python console.
When the app is sandboxed, batch transcription fails because an underlying library tries to read "/private/etc/apache2/mime.types".
To avoid this permission error, either the file has to be readable from inside the app environment, or the lookup has to be avoided altogether, if that is possible.
Real-time transcription works in the sandboxed context.
My first idea was to store a local copy of the mime.types file, track down where Speechmatics accesses it, and reroute the library to the local version. The access point is elusive, though, and I suspect there is a better solution.
If there isn't a straightforward solution using the Speechmatics Python client, I plan to test a lower-abstraction approach in Python.
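One candidate for the elusive lookup is Python's standard `mimetypes` module: its `knownfiles` list includes `/etc/apache2/mime.types`, and on macOS `/etc` is a symlink to `/private/etc`. Whether the Speechmatics client actually reaches it (for example while guessing the content type of the uploaded audio) is an assumption and has not been confirmed here. Under that assumption, a minimal sketch of a workaround is to empty `knownfiles` before anything initializes the module, so only the built-in type table is used. The helper name `use_builtin_mime_types_only` is hypothetical.

```python
import mimetypes


def use_builtin_mime_types_only() -> None:
    """Stop the stdlib mimetypes module from reading system mime.types files.

    Assumption: the sandbox violation comes from mimetypes scanning the paths
    in mimetypes.knownfiles. Passing files=[] to init() is not enough on the
    first call, because init() still prepends knownfiles, so the list itself
    is emptied here.
    """
    mimetypes.knownfiles.clear()
    # Rebuild the module-level database from the built-in defaults only.
    mimetypes.init()


use_builtin_mime_types_only()

# Lookups still work, served from the table compiled into the module.
content_type, _ = mimetypes.guess_type("transcript.json")
print(content_type)
```

This would need to run before the batch client is imported or used. If the built-in table turns out to lack a type the upload needs, `mimetypes.add_type` can register it without touching the file system.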
Batch transcription test:
import ssl

import certifi
import speechmatics
from speechmatics.batch_client import BatchClient

# Verify TLS against certifi's CA bundle.
ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations(certifi.where())

# LANGUAGE, englishLocale, operatingPoint, speechmaticsAPIkey and audio_file
# are defined elsewhere in the app.
conf = speechmatics.models.BatchTranscriptionConfig(
    language=LANGUAGE,
    output_locale=englishLocale if LANGUAGE == "en" else None,
    operating_point=operatingPoint,
)
settings = speechmatics.models.ConnectionSettings(
    url="https://asr.api.speechmatics.com/v2",
    auth_token=speechmaticsAPIkey,
    ssl_context=ssl_context,
)
try:
    with BatchClient(settings) as client:
        # Upload the audio, then block until the transcript is ready.
        job_id = client.submit_job(audio=audio_file, transcription_config=conf)
        transcript = client.wait_for_completion(job_id, transcription_format='json-v2')