Text-to-Speech (TTS)

Introduction

The text_to_speech service supports voice-based interaction by making the Google Text-to-Speech platform available within your application. It synthesises human-like speech from text. It can also be used with social robots to replace the robot’s own ('onboard') speech synthesis.

Docker name: `text_to_speech`

Input

sensors: none
actuators: none
parameters:
- text:
  - value: text to synthesise
  - type: string

Initial (One-Time) Setup

Download the keyfile JSON from the TTS Google Agent
- Note: the JSON file can be changed in the application at runtime using the TTSConnector's set_tts_key(<path to keyfile>) method
Choose a voice ID for the speech synthesis from here
- The language is automatically inferred from the voice ID
- Note: the voice can be changed in the application at runtime using the TTSConnector's set_tts_voice(<voice ID>) method
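
How the service infers the language is not documented here. The following is only a minimal sketch of one way it could work, assuming voice IDs follow the `<language>-<REGION>-<type>-<variant>` naming pattern that Google uses for voices such as `en-US-Standard-A`. The function name `language_from_voice_id` is hypothetical and not part of the TTSConnector.

```python
def language_from_voice_id(voice_id: str) -> str:
    """Return the language code embedded at the start of a voice ID.

    Assumes the '<language>-<REGION>-<type>-<variant>' naming pattern;
    this is an illustration, not the service's actual implementation.
    """
    parts = voice_id.split("-")
    if len(parts) < 3 or not parts[0] or not parts[1]:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    # The first two components form the language code, e.g. 'en' + 'US'.
    return f"{parts[0]}-{parts[1]}"


print(language_from_voice_id("en-US-Standard-A"))
print(language_from_voice_id("nl-NL-Wavenet-B"))
```

A voice ID that does not match the assumed pattern raises a `ValueError` rather than returning a guessed language.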

Output

sensors: none
actuators: speakers
data:
- value: synthesised text as speech
- type: bytestream

Service Configuration

To use the service, you first need to create a TTS Agent. To do so, visit this page to set up a Google service account and download the JSON keyfile.

Create a service account
- Log in to the (Google) account you want to use for the TTS agent
Select your project
- If a Dialogflow agent exists on the same Google Account with a key, select it from your already available projects
- Otherwise, if you plan on using Dialogflow, you can create a Dialogflow agent first
- If there are no pre-existing projects, create one using the “Create Project” button
Follow the rest of the instructions here
Search for “text to speech” in the search bar and select “Cloud Text-to-Speech API”
- Enable the Cloud Text-to-Speech API by clicking the “Enable” button
  - You will be asked to set up billing information, which you can do by pressing “Enable Billing” > “Create Billing Account” > fill in your information > “Start My Free Trial”.
    - If you’re using the TTS service for the first time, you will be granted $300 in credit, which should be more than enough for your application.

You should now be able to use the TTS Google Agent.
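
Before handing the downloaded keyfile to the service, it can help to confirm that it looks like a service-account key. The sketch below assumes the usual layout of a Google service-account JSON keyfile (fields such as `type`, `project_id`, `private_key` and `client_email`). The helper `check_keyfile` is hypothetical and not part of this service. It checks the file's structure only, not whether Google accepts the credentials.

```python
import json
from pathlib import Path

# Fields normally present in a Google service-account keyfile (assumed layout).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_keyfile(path):
    """Return a list of problems found in the keyfile (empty if none)."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # Covers missing/unreadable files and invalid JSON.
        return [f"cannot read keyfile: {exc}"]
    if not isinstance(data, dict):
        return ["keyfile is not a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if not data.get(name)]
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems
```

An empty list means the file has the expected structure; any other result lists what to fix before calling set_tts_key.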

Usage

Regular Usage

Create a TTSConnector instance.
Set the keyfile and the voice ID in the TTSConnector.
Call the connector’s say_text_to_speech([text]) with the text to be synthesised.
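
The steps above can be sketched as follows. Because the real TTSConnector needs the framework and a Google account, this example uses a hypothetical stand-in class, `StubTTSConnector`, that only records calls. The method names come from this page, but the constructor arguments, the keyfile path, the voice ID, and passing the text as a single string are all assumptions.

```python
class StubTTSConnector:
    """Hypothetical stand-in for the framework's TTSConnector.

    It only records calls so the call order can be shown offline;
    the real connector forwards the text to the text_to_speech service.
    """

    def __init__(self):
        self.key_path = None
        self.voice_id = None
        self.spoken = []

    def set_tts_key(self, key_path):
        self.key_path = key_path

    def set_tts_voice(self, voice_id):
        self.voice_id = voice_id

    def say_text_to_speech(self, text):
        # Refuse to "speak" until both settings have been provided.
        if self.key_path is None or self.voice_id is None:
            raise RuntimeError("set the keyfile and the voice ID first")
        self.spoken.append(text)


connector = StubTTSConnector()                 # 1. create the connector
connector.set_tts_key("tts-keyfile.json")      # 2. placeholder keyfile path
connector.set_tts_voice("en-US-Standard-A")    #    placeholder voice ID
connector.say_text_to_speech("Hello, world!")  # 3. request synthesis
```

In a real application, replace the stub with the framework's TTSConnector and the placeholders with your own keyfile path and voice ID.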

Example

See this for a usage example.

Events

The PlayAudioDone event is raised when the speech synthesis is done. Your application can listen for this event.
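
How listeners are registered depends on the framework and is not specified here. As an illustration only, the sketch below shows a common pattern for waiting on such an event: a callback sets a `threading.Event`, and the application blocks until it fires. The class `StubAudioEvents` and its `add_listener` and `raise_event` methods are hypothetical stand-ins for the framework's event mechanism. The timer simulates the service finishing.

```python
import threading


class StubAudioEvents:
    """Hypothetical event source; the real framework raises PlayAudioDone itself."""

    def __init__(self):
        self._listeners = {}

    def add_listener(self, event_name, callback):
        self._listeners.setdefault(event_name, []).append(callback)

    def raise_event(self, event_name):
        for callback in self._listeners.get(event_name, []):
            callback()


audio_done = threading.Event()
events = StubAudioEvents()
events.add_listener("PlayAudioDone", audio_done.set)

# Simulate the service finishing shortly afterwards, on another thread.
threading.Timer(0.1, events.raise_event, args=("PlayAudioDone",)).start()

# Block until the event arrives, or give up after 5 seconds.
finished = audio_done.wait(timeout=5)
print("playback finished:", finished)
```

Using a timeout keeps the application from hanging forever if the event never arrives.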

Known Issues

Google’s TTS places no hard cap on the total number of characters you can synthesise. However, depending on your subscription plan, there is a limit on the monthly number of spoken characters.

The free subscription allows up to 4 million characters per month with standard (non-WaveNet) voices.
WaveNet voices are limited to 1 million characters per month.
Once the monthly quota is used up, charges apply per character.
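
To keep an eye on the quota, an application can track how many characters it has sent for synthesis. The sketch below uses the limits stated above. The helper `remaining_free_chars` is hypothetical, and `len()` is only an approximation of how Google counts billable characters.

```python
# Free monthly character allowances as stated above (not fetched from Google).
FREE_CHARS_PER_MONTH = {"standard": 4_000_000, "wavenet": 1_000_000}


def remaining_free_chars(voice_type, texts_spoken_this_month):
    """Return how many free characters are left; negative means billable overage."""
    used = sum(len(text) for text in texts_spoken_this_month)
    return FREE_CHARS_PER_MONTH[voice_type] - used


history = ["Hello, world!", "How can I help you today?"]
print(remaining_free_chars("wavenet", history))
```

A negative result indicates how many characters would be charged beyond the free allowance under these assumptions.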