Introduction

The dialogflow service makes the Google Dialogflow platform available within your application. Dialogflow is used to translate human speech into intents (intent recognition). In other words, not only does it (try to) convert an audio stream into readable text, it also processes this text into an intent (possibly with some additional parameters). For example, an audio stream can translate to the string "I am 15 years old", which is, in turn, converted to the intent 'answer_age' with the parameter 'age = 15'.
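In terms of the output format described under Output below, that example would come back roughly as follows (the confidence value is made up for illustration):

Code Block
languagepy
# Illustrative detection result for the 'answer_age' example above.
{'intent': 'answer_age',
 'parameters': {'age': '15'},
 'confidence': 87,
 'text': 'I am 15 years old',
 'source': 'audio'}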

The service is available in multiple languages.

Docker name: dialogflow

Input

...

  • sensors: Microphone

    • Audio input can also be provided in the form of an audio file (audio length + audio type, as a bytestream)

  • actuators: None

  • services: stream_audio

  • parameters:

    • audio:

      • value: the audio to be interpreted

      • type: bytestream

No external services from the local infrastructure need to be running for this service. However, when testing locally, the following devices must be running: computer-robot, computer-speaker, and computer-microphone.



Service Configuration

Setting up Dialogflow

The following steps will help you get a keyfile and project ID and select a language:

  1. Create a Dialogflow agent by clicking the following link: https://dialogflow.cloud.google.com/

  2. Use the ‘Create Agent' button at the top left to start your first project. Press the settings icon next to your agent's name at the top left to see the Project ID.

  3. Follow the steps here to retrieve your private key file in JSON format.

  4. If you don’t have a service account, you can create one by pressing the “Create Service Account” button in the upper part of the screen.

    • Note: the JSON key file can be changed in the application at runtime using the BasicSICConnector's set_dialogflow_key(<path to keyfile>) method

  5. Choose a language for the agent from https://cloud.google.com/dialogflow/es/docs/reference/language

    • Note: the language can be changed in the application at runtime using the BasicSICConnector's set_dialogflow_voice(<voice ID>) method
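For example, assuming sic is an existing BasicSICConnector instance (the variable name, file path, and voice ID below are illustrative):

Code Block
languagepy
# Changing the Dialogflow key file and language at runtime, using the
# setter methods mentioned in the notes above.
sic.set_dialogflow_key('path/to/new-keyfile.json')
sic.set_dialogflow_voice('en-US')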

...

In order to run this service, the following parameters are needed:

  1. dialogflow_agent_id

  2. dialogflow_key_file

If you do not possess these parameters, please refer to the section Setting up Dialogflow above.



Output

  • sensors: none

  • actuators: none

The output depends on your project and the set-up of the intents and entities of your Dialogflow agent. It consists of a collection of information of the type dict:

Code Block
languagepy
{'intent': '[YOUR_INTENT]', 
'parameters': 
    {'[YOUR_PARAMETER]': '[PARAMETER_RESPONSE]'}, 
'confidence': [CONFIDENCE_VALUE], 
'text': '[RESPONSE_TEXT]', 
'source': 'audio'}
  • 'intent': str

    • the intent recognised from the audio, corresponding to an intent defined on the agent

  • ‘parameters’: dict

    • the parameters defined in the agent

    • each parameter is a str key, paired with its response as a str value

  • 'confidence': int

    • number ranging from 0 to 100 that defines how confident the API is with the intent and text detection

  • ‘text’: str

    • speech-to-text response from the API

  • 'source': str

    • for the SIC framework Dialogflow usage, the source is always ‘audio’
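As an illustration, a callback along the following lines could unpack such a detection result (the function and variable names are our own, not part of the framework):

Code Block
languagepy
# Hypothetical callback that unpacks a Dialogflow detection result in the
# format documented above.
def on_intent_detected(detection_result):
    if detection_result['intent'] == 'answer_name':
        name = detection_result['parameters'].get('name')
        if name is not None and detection_result['confidence'] >= 50:
            print('Recognised name:', name)
            print('Full transcript:', detection_result['text'])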

Initialisation

Using the service

In order to use our service for your purposes, an instance of the BasicSICConnector class has to be created. You can find the details of this class here. You may also need a class to manage speech recognition attempts and a callback function for retrieving a recognized entity from the detection result.

In order to run this service, the following steps must be taken:

  1. Have the relevant services and drivers running.

  2. Pass your local IP address, Dialogflow key file path, and Dialogflow agent ID when creating an instance of BasicSICConnector (see the sketch below).

  3. Set up a partial function for retrieving a recognized entity from the detection result.
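A minimal sketch of steps 2 and 3, assuming the social_interaction_cloud package layout and the keyword argument names shown (check the BasicSICConnector documentation for the exact constructor signature):

Code Block
languagepy
from functools import partial

# Assumed import path; adjust to your installation of the framework.
from social_interaction_cloud.basic_connector import BasicSICConnector


def on_detection(entity_name, detection_result):
    # Retrieve a recognized entity (e.g. 'name') from the detection result.
    value = detection_result.get('parameters', {}).get(entity_name)
    print(entity_name, '->', value)


# Step 2: pass your local IP address, key file path, and agent ID.
sic = BasicSICConnector('127.0.0.1',
                        dialogflow_key_file='path/to/keyfile.json',
                        dialogflow_agent_id='your-agent-id')

# Step 3: a partial function bound to the entity we want to retrieve.
name_callback = partial(on_detection, 'name')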

Example

The following file, https://bitbucket.org/socialroboticshub/examples/src/main/python/3_speech_recognition_example.py, is available for the purpose of demonstration.

There are two classes worth paying attention to.

Recognition Manager:

The recognition manager manages speech recognition attempts and is used by the Example class.

Example:

Two types of questions are dealt with in this example. The first is an entity question, where we are interested in a specific entity in the answer; in this case the point of interest is the name of the person that is interacting with the robot. The second is a yes/no question, where the answer can be yes, no, or don't know (or any synonyms). The Example class interacts with two external classes, BasicSICConnector and ActionRunner: an instance of BasicSICConnector is needed to interact with the Social Interaction Cloud, and ActionRunner executes the desired actions appropriately.

In order to run this example, make sure to pass your agent name and keyfile path as parameters when creating an instance of the Example class.
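For instance (the argument names and values are purely illustrative; the actual entry point is defined in the linked file):

Code Block
languagepy
# Hypothetical instantiation of the Example class with your own
# Dialogflow agent name and keyfile path.
example = Example('your-dialogflow-agent-name', 'path/to/keyfile.json')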


Setting up the agent

In order to deal with the first question, an intent needs to be set up. An intent is a value recognised from an end-user; in our example, the name of the person. The following steps will set up an intent on your Dialogflow agent:

  1. Navigate to the agent’s page to set the intent, training phrases and parameters.

  2. Create an agent intent.

    • It is recommended that the name suggests the kind of answer you are looking for in the audio. In our example, the name of the user (‘answer_name’).

    • the intent defined in the agent should correspond to the intent used in the code

  3. Create a context.

    • the number next to the context is its lifespan: the number of user responses for which the context remains active. In our example, that number is set to 0

  4. Create training phrases for the intent

    • the training phrases should be input examples that contain the intent. In our example, 'my name is name'

    • Dialogflow learns from these phrases and matches future user inputs based on them

  5. Create parameters for the intent

    • select words from the training phrases as parameters by double-clicking on them, then match them with their corresponding entity. They automatically appear in the ‘Action and parameters’ section. In our example, we are only interested in the ‘name’ of the user. Dialogflow has many built-in entities (like sys.given-name), but you can also define your own entities (by importing CSV files)

Our complete intent example thus looks like this (note: using sys.given-name is usually preferred):

...

Events

Events that the service creates and which can have listeners:

  • onAudioIntent

    • a new intent is detected

  • IntentDetectionDone

    • a new intent has finished being detected

  • onAudioLanguage

    • the audio language has been changed

  • LoadAudioDone

    • if an audio file is used, the event is raised when the file has finished being loaded

Known Issues

  • There is a rare bug where sometimes Dialogflow will suddenly only respond with ‘UNAUTHENTICATED’ errors. Restarting Docker and/or your entire machine seems to be the only way to resolve this.