Documenting my journey:
Montreal Forced Alignment — Hebrew
Step-by-step guide for word-level alignment using MFA + other approaches
7 min read · Dec 23, 2024
Introduction
I’m curious and like to experiment with new things. Recently, I’ve ventured into learning Biblical Hebrew (starting with the Torah in the Tanakh). I wanted to segment a large collection of Torah audio recordings and align them with the text. After manually aligning a few chapters, I decided to pursue an AI/ML approach. This post documents my attempts with the Montreal Forced Aligner (MFA) and Hebrew audio.
Background & Other Attempts:
- Cloud Speech Services (AWS & GCP): These services support Hebrew out of the box, but do not provide word-level segmentation or much control over alignment. They can be great for splitting long recordings into smaller chunks that can then be fine-aligned manually or passed down a pipeline to tools such as MFA, but I needed something more precise and straightforward.
- Wav2Vec2-Hebrew: Next, I tried Wav2Vec2-Hebrew. This model natively supports Hebrew and performs well on short segments. However, a single mistake can cause the alignment to drift irrecoverably…
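To make the chunking idea from the cloud-services bullet concrete, here is a minimal sketch of how coarse, timestamped segments could be grouped into shorter chunks before handing them to an aligner. Everything in it is an assumption for illustration: the `Segment` type, the `group_segments` helper, and the `max_chunk_seconds` limit are hypothetical names, not part of any AWS, GCP, or MFA API, and the input is assumed to already be a list of (start, end, text) segments from some earlier transcription step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical coarse segment: times in seconds plus its transcript."""
    start: float
    end: float
    text: str


def group_segments(segments: List[Segment],
                   max_chunk_seconds: float = 30.0) -> List[Segment]:
    """Greedily merge consecutive segments into chunks.

    A chunk is closed as soon as adding the next segment would make it
    longer than max_chunk_seconds. A single segment that is already
    longer than the limit is kept as its own chunk.
    """
    groups: List[List[Segment]] = []
    current: List[Segment] = []
    for seg in segments:
        # Would this segment push the current chunk past the limit?
        if current and seg.end - current[0].start > max_chunk_seconds:
            groups.append(current)
            current = []
        current.append(seg)
    if current:
        groups.append(current)

    # Collapse each group into one chunk spanning its first and last segment.
    return [
        Segment(g[0].start, g[-1].end, " ".join(s.text for s in g))
        for g in groups
    ]
```

Each resulting chunk carries a start time, an end time, and the joined text, which is the kind of audio-plus-transcript pair a forced aligner expects. Cutting the audio file itself at those times is left out of this sketch.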