Text-2-Speech Integration

HelpXplain is the exciting new animated infographics and screencast tool that integrates with Help+Manual.

Moderators: Alexander Halser, Tim Green

CarstenB
Posts: 2
Joined: Wed Nov 11, 2020 12:36 pm


Here's a feature request concerning slide audio:
I often have to create narration from the side or bottom text of a slide. Synthetic, high-quality neural voices have proved to be a very practical way to do this.
It would be a big time saver if HelpXplain could integrate with a subscription to MS Azure Cognitive Services, AWS Polly or Google Cloud Text-to-Speech.
A push of a button would then create the voice-over, and the resulting MP3 file could be linked to the current slide automatically.
Does that sound doable? MS Azure would be my favorite because it has the most neural voices available (even multiple Chinese ones).
:D
Tim Green
Site Admin
Posts: 23155
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Re: Text-2-Speech Integration


Hi Carsten,

We've considered this and decided against it. The problem is that all these text-to-speech services have different APIs that are complex to integrate and that also change over time. An integration would therefore repeatedly stop working for a while until software maintenance caught up with the changes. HelpXplain only needs MP3 files for this, and inserting them takes just a couple of clicks. So all you really need is a quick and easy way to create the MP3 files locally for your Xplains, and tools for that are already available.

For example, Amazon Polly has a web console that saves the spoken text directly as MP3 files. I'm not sure whether that is possible with their free version of 5M characters per month, but even the paid version is very reasonably priced. Alternatively, you can use a tool like Replay Music to record the spoken text from any audio played in the browser and save it as an MP3. Amazon Polly is more convenient, and depending on the service you're recording, you would have to check the copyright situation before integrating those recordings in a commercial Xplain. As a quick ad-hoc solution for internal use, though, recording is also a good and inexpensive way to go.
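
For anyone who would rather script this than use the web console, here is a minimal sketch of creating the MP3 files locally with Amazon Polly. It assumes Python with the boto3 AWS SDK installed and AWS credentials already configured. The function names, the "Joanna" voice and the slide_NN.mp3 file naming are illustrative choices only and have nothing to do with HelpXplain itself. The resulting files would still be inserted into the slides by hand.

```python
from pathlib import Path


def synthesize_to_mp3(polly_client, text, out_path, voice_id="Joanna", engine="neural"):
    """Send one slide's narration text to Polly and save the returned MP3 locally.

    polly_client is expected to behave like boto3.client("polly").
    The voice and engine defaults are assumptions; pick whatever your account offers.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine=engine,
    )
    out_path = Path(out_path)
    # The audio comes back as a stream; read it fully and write it to disk.
    out_path.write_bytes(response["AudioStream"].read())
    return out_path


def narrate_slides(polly_client, slide_texts, out_dir, **voice_options):
    """Create one numbered MP3 per slide text: slide_01.mp3, slide_02.mp3, ..."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    return [
        synthesize_to_mp3(polly_client, text, out_dir / f"slide_{i:02d}.mp3", **voice_options)
        for i, text in enumerate(slide_texts, start=1)
    ]


# Example usage (not run here; needs boto3 and valid AWS credentials):
#   import boto3
#   narrate_slides(boto3.client("polly"),
#                  ["Welcome to this Xplain.", "Click Next to continue."],
#                  "narration")
```

Passing the client in as a parameter keeps the sketch independent of any one account setup and makes it easy to try out with a stand-in object before spending any of the monthly character allowance.
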
Regards,
Tim (EC Software Documentation & User Support)

Private support:
Please do not email or PM me with private support requests -- post to the forum directly.