At a church I attend, we ran into a new problem one day: a deaf person began attending our services. That was great, except that at the time we had no way of making our services accessible to him. Fortunately, he was patient with us and got by with lip-reading until we figured out a good way to caption our weekly services. Some options we had:
- Pay for remote captioning
- Hire an interpreter
- Use speech-to-text software like Dragon NaturallySpeaking or the speech recognition built into Windows
Paying for remote captioning would be costly, and to keep it running every week we would need to set up and maintain equipment and train someone to operate it. And as a very small church with only one deaf attendee at the moment, we wanted to avoid the cost of hiring a sign language interpreter if we could.
We tried using Dragon NaturallySpeaking, a popular speech recognition package, for a while. We would pass the audio from the main speaker of the day through to a computer and have the software type into a text editor window that was open, focused, and full screen.
It worked, with mixed results. Many times the transcription would lag by minutes, or stall until we toggled the microphone off and on. The software needed to be “trained” on each speaker's voice before it would begin recognizing it. And sometimes it would “forget” an audio source and we would need to either restart the program or reboot the computer. We also had to set the font size of the text editor manually so it was readable from a distance, or open a template with the font ready to go. And you had to be careful not to click outside of the text editor or do anything else that made it lose focus, because then Dragon would no longer be typing into it.
All things we don’t want to deal with in the middle of an event. 👎
Browser-Based Web Speech API
It was around this time that I discovered the Web Speech API. The Web Speech API lets you do both speech synthesis and speech recognition directly in the browser. As of July 2017, Google Chrome is the only browser to support the speech recognition part of the Web Speech API. Firefox’s implementation of the API is unfinished. Microsoft Edge currently supports speech synthesis but not speech recognition. So for now, support for the API is quite varied. W3C’s Web Speech API Specification has been complete since 2012, so at this point it’s just up to browser vendors to implement it.
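Because support varies so much, a page can check what the browser offers before trying to use it. Here is a minimal sketch of that check. The function names `getSpeechRecognition` and `describeSupport` are made up for illustration; the only real API names are `SpeechRecognition`, Chrome's prefixed `webkitSpeechRecognition`, and `speechSynthesis`.

```javascript
// Return the speech recognition constructor if the given global object has one.
// Chrome exposes it under the vendor-prefixed name webkitSpeechRecognition.
function getSpeechRecognition(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null
  );
}

// Report the two halves of the Web Speech API separately, since a browser
// may support synthesis without supporting recognition.
function describeSupport(globalObject) {
  return {
    recognition: getSpeechRecognition(globalObject) !== null,
    synthesis: "speechSynthesis" in globalObject,
  };
}

// In a browser, pass window. Outside a browser there is nothing to detect,
// so an empty object stands in and both flags come back false.
const support = describeSupport(typeof window !== "undefined" ? window : {});
console.log(support);
```

A page could use the `recognition` flag to show a "please use Chrome" message instead of a microphone button that does nothing.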
Google’s Web Speech API Demonstration shows off the API beautifully. You just click the microphone button, give it access to your microphone, and start talking.
Some quick tests on the demo page using recordings from a church service showed great results. The demo page itself was about 75% of the way to being exactly what we needed. I tweaked the fonts and colors, and added some quick logic for making text editable and automatically scrolling to the bottom of the page when new text gets transcribed.
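To give a rough idea of what that logic can look like, here is a sketch of handling recognition results and scrolling to the newest text. It is an illustration under assumptions, not the actual Web Captioner code: `collectTranscript`, `scrollToBottom`, and the `captions` element id are names I picked for the example, and it leaves out the editable-text part entirely. The `resultIndex`, `results`, `isFinal`, and `transcript` properties come from the Web Speech API itself.

```javascript
// Split the results in a recognition "result" event into final text
// (which will not change again) and interim text (still being revised).
function collectTranscript(event, finalSoFar) {
  let finalText = finalSoFar;
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const text = result[0].transcript; // first (best) alternative
    if (result.isFinal) {
      finalText += text;
    } else {
      interimText += text;
    }
  }
  return { finalText, interimText };
}

// Keep the newest captions in view.
function scrollToBottom(element) {
  element.scrollTop = element.scrollHeight;
}

// Browser-only wiring; skipped when there is no window or document.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  const Recognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  const output = document.getElementById("captions"); // hypothetical element id
  if (Recognition && output) {
    const recognition = new Recognition();
    recognition.continuous = true; // keep listening between phrases
    recognition.interimResults = true; // show words as they are recognized
    let finalText = "";
    recognition.onresult = (event) => {
      const transcript = collectTranscript(event, finalText);
      finalText = transcript.finalText;
      output.textContent = transcript.finalText + transcript.interimText;
      scrollToBottom(output);
    };
    recognition.start();
  }
}
```

Showing interim text right away is what makes the captions feel live; the final text simply replaces it once the recognizer settles on a phrase.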
One problem I also addressed (and continue to deal with) is that the Web Speech API will stop returning text at random times. Because the whole API is somewhat of a black box — it only exposes a few switches and buttons that I can play with — it’s hard to see exactly what causes the problem. Others have reported this for a while now but it continues to be an issue. The way I deal with this right now is to simply check if audio is coming in but no text is coming out. If that’s the case, then I restart the speech recognition service. It works pretty well for now but I’m continuing to look for ways to improve reliability.
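The "audio in, no text out" check can be sketched roughly like this. Treat it as one possible shape for the idea rather than the exact code I run: `StallWatchdog`, `restartIfStalled`, and the five-second threshold are assumptions made for the example. How you detect incoming audio is also left open here; something has to call `noteAudio()` whenever the microphone level is above silence.

```javascript
// Tracks when audio was last heard and when text was last returned.
// The clock is injectable so the logic can be exercised without waiting.
class StallWatchdog {
  constructor({ stallMs = 5000, now = () => Date.now() } = {}) {
    this.stallMs = stallMs; // assumed threshold, tune to taste
    this.now = now;
    this.lastAudioAt = null; // no audio heard yet
    this.lastTextAt = now(); // start with a full grace period
  }

  noteAudio() {
    this.lastAudioAt = this.now();
  }

  noteText() {
    this.lastTextAt = this.now();
  }

  // Give a freshly restarted recognizer a new grace period.
  reset() {
    this.lastTextAt = this.now();
  }

  // Stalled means: audio was heard recently, but no text has come back
  // for at least stallMs. Silence alone never counts as a stall.
  isStalled() {
    if (this.lastAudioAt === null) {
      return false;
    }
    const t = this.now();
    const audioRecent = t - this.lastAudioAt < this.stallMs;
    const textStale = t - this.lastTextAt >= this.stallMs;
    return audioRecent && textStale;
  }
}

// Abort the current session if it looks stalled. This assumes the caller
// has set recognition.onend to call recognition.start() again, so that
// ending the session leads to a fresh one.
function restartIfStalled(recognition, watchdog) {
  if (!watchdog.isStalled()) {
    return false;
  }
  recognition.abort();
  watchdog.reset();
  return true;
}
```

In a page, this would be wired up by calling `noteText()` from the recognizer's `onresult` handler and running `restartIfStalled` on a timer, say once a second with `setInterval`. Restarting from `onend` rather than calling `start()` right after `abort()` avoids starting a recognizer that has not finished shutting down yet.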
The transition from Dragon NaturallySpeaking (our previous “best” speech-to-text solution) to Web Captioner was painless. We were able to use existing equipment and only minimal training was needed to get a volunteer on board with maintaining it weekly. We now use Web Captioner during every Sunday morning service.
Web Captioner in action at my small church.
Check out posts from visitors to the Web Captioner Facebook page to see some very cool uses of Web Captioner, including this one from Constance Free Church in Andover, Minnesota. They have a monitor visible to the front row of the congregation displaying text from Web Captioner.
“Yesterday was our second day running Web Captioner in our church services and loving it. This is my computer at FOH controlling the MacBook running the captioner down by the stage on a 32in TV.”
Posted by Nate Kirby on Monday, July 10, 2017
Questions, Support, and Comments
The Web Captioner Users Group on Facebook is the best place to get help with Web Captioner and see how others are using it. You can also like Web Captioner on Facebook to be notified of new updates and upcoming features. If you’ve got an idea for something you’d like to see Web Captioner do, let’s hear about it!