This can easily apply to other unsupported languages
Hebrew segmentation with AI
Using Google AI to segment/align Hebrew audio/text.
This article documents my journey, as it happened, toward automating the word-by-word segmentation and alignment of the Hebrew Bible’s audio with its text. It is not a tutorial.
In a previous article, I explored using the Montreal Forced Aligner (MFA) for this task and discussed the limitations of current cloud services, which did not support word-level segmentation. This time, I take on a new challenge: leveraging Google Cloud Platform (GCP) Speech-to-Text (STT) to achieve precise alignment.
For those following this series, I may also explore AWS and Azure in future articles (if there’s interest) and conclude with a comparison of the platforms, weighing their pros and cons.
My journey
I began by testing the waters: I uploaded a single chapter directly through the GCP console. The results were far from perfect, with plenty of errors and variations in the recognized words, but they were encouraging enough to motivate me to…