...
sensors: Camera (stereo or mono)
actuators: None
services: X
parameters (note that the following parameters are hard-coded at the top of the file object_detection_service.py):
Threshold: float, sets the confidence level threshold. Default: 0.7
DPI: int, sets the number of Detections Per Image. Default: 100
MODEL: str, path to the model file (.pkl). Default: model_final_f10217.pkl
MODEL_PATH: str, path to the model configuration file (.yaml). Default: 'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml'
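As a rough illustration of how the threshold and DPI settings interact, the sketch below filters a list of detection scores. The upper-case constant names and the helper `keep_confident` are assumptions made for this example and are not taken from object_detection_service.py; only the default values come from the list above.

```python
# Assumed layout of the hard-coded parameters; values are the defaults listed above.
THRESHOLD = 0.7   # confidence level threshold
DPI = 100         # maximum number of Detections Per Image
MODEL = "model_final_f10217.pkl"
MODEL_PATH = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"


def keep_confident(scores, threshold=THRESHOLD, limit=DPI):
    """Hypothetical helper: return the indices of the detections to keep.

    Keeps detections whose score is at least `threshold`, ordered by
    descending score, and returns at most `limit` of them.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked if scores[i] >= threshold][:limit]
```

Raising the threshold yields fewer but more reliable detections, while DPI only caps how many are returned per image.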
...
Protobuf output
The Protobuf output is used to output the segmentation masks. The Protobuf is built as follows:
'source': str
...
{'intent': '[YOUR_INTENT]',
 'parameters':
    {'[YOUR_PARAMETER]': '[PARAMETER_RESPONSE]'},
 'confidence': [CONFIDENCE_VALUE],
 'text': '[RESPONSE_TEXT]',
 'source': 'audio'}
...
'intent': str
the intent on which the audio was recognised, corresponding to the intent set on the agent
...
'parameters': dict
the parameters defined in the agent
each parameter is a str key paired with its response as a str value
...
'confidence': int
number ranging from 0 to 100 that defines how confident the API is with the intent and text detection
...
'text': str
speech-to-text response from the API
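To show how the fields described above fit together, here is a minimal sketch that reads one parameter from a result dictionary. The function `extract_parameter`, the `min_confidence` cut-off, and the example intent and parameter names are all hypothetical; only the dictionary layout follows the description above.

```python
def extract_parameter(result, intent, parameter, min_confidence=50):
    """Return the value of `parameter` if `result` matches `intent`, else None.

    `min_confidence` is a hypothetical cut-off on the 0-100 confidence scale.
    """
    if result.get("intent") != intent:
        return None
    if result.get("confidence", 0) < min_confidence:
        return None
    value = result.get("parameters", {}).get(parameter)
    return value or None  # treat an empty string as "not recognised"


example = {
    "intent": "answer_name",            # made-up intent name
    "parameters": {"name": "Alice"},    # made-up parameter and response
    "confidence": 87,
    "text": "my name is Alice",
    "source": "audio",
}
```

Calling `extract_parameter(example, "answer_name", "name")` yields the recognised name, and a stricter `min_confidence` makes the same call return `None`.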
image_masks = ImageMasks()
image_masks.timestamp_ms # timestamp of image in milliseconds
image_masks.mask_width # width in pixels of mask
image_masks.mask_height # height in pixels of mask
image_masks.mask_count # number of detected objects
image_masks.masks # Python array (list) of booleans
Such a Protobuf object can be 'unpacked' to obtain the original masks again:
from numpy import array
original_masks = array(image_masks.masks).reshape((image_masks.mask_count, image_masks.mask_height, image_masks.mask_width))
original_masks = original_masks.astype(bool)
As you can see, the shape of original_masks is (N, H, W), where N is the number of masks, H the height in pixels, and W the width in pixels.
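The flatten-and-reshape round trip can be checked without a running service. In the sketch below a `SimpleNamespace` stands in for the real ImageMasks message, and the mask values and timestamp are made up; it assumes the masks are flattened in row-major order before being stored.

```python
from types import SimpleNamespace

import numpy as np

# Three made-up 2x4 boolean masks standing in for real segmentation output.
masks = np.zeros((3, 2, 4), dtype=bool)
masks[0, 0, 1] = True
masks[2, 1, 3] = True

# SimpleNamespace is a stand-in for the ImageMasks Protobuf message.
image_masks = SimpleNamespace(
    timestamp_ms=1_600_000_000_000,
    mask_count=masks.shape[0],
    mask_height=masks.shape[1],
    mask_width=masks.shape[2],
    masks=masks.flatten().tolist(),  # flat list of booleans, row-major
)

# Unpack: rebuild the (N, H, W) array from the flat list and the stored sizes.
restored = np.array(image_masks.masks).reshape(
    (image_masks.mask_count, image_masks.mask_height, image_masks.mask_width)
).astype(bool)
```

Because the three size fields travel with the flat list, the receiver needs no prior knowledge of the image resolution.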
This Protobuf output is added to the zrange of the segmentation_stream as a serialized Protobuf object. A zrange is the Redis implementation of a Python dictionary: the timestamp_ms is used as the key and the serialized Protobuf is the value.
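The key/value storage described above might be sketched as follows. This is an illustration under assumptions, not the service's actual code: it assumes the timestamp is stored as the sorted-set score, the helpers `store_masks` and `load_masks` are hypothetical, and `FakeSortedSetClient` is an in-memory stand-in that mimics only the redis-py `zadd` and `zrangebyscore` calls so the example runs without a Redis server.

```python
class FakeSortedSetClient:
    """In-memory stand-in exposing the two redis-py calls used below."""

    def __init__(self):
        self._sets = {}

    def zadd(self, name, mapping):
        # redis-py takes a {member: score} mapping.
        self._sets.setdefault(name, {}).update(mapping)

    def zrangebyscore(self, name, min, max):
        items = self._sets.get(name, {}).items()
        hits = [(score, member) for member, score in items if min <= score <= max]
        return [member for score, member in sorted(hits)]


def store_masks(client, timestamp_ms, payload, stream="segmentation_stream"):
    """Store serialized bytes under the given timestamp (assumed to be the score)."""
    client.zadd(stream, {payload: timestamp_ms})


def load_masks(client, timestamp_ms, stream="segmentation_stream"):
    """Return the payload stored for exactly this timestamp, or None."""
    found = client.zrangebyscore(stream, timestamp_ms, timestamp_ms)
    return found[0] if found else None
```

With a real client, `payload` would typically be the bytes returned by the message's `SerializeToString()`, and the loaded bytes would be parsed back with `ParseFromString()`.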
Initialisation
Using the service
To use the service, create an instance of the BasicSICConnector class. You can find the details of this class here. You may also want a class that manages speech_recognition attempts and a callback function that retrieves a recognized entity object from the detection result.
...
You have the relevant services and drivers running.
You pass your local IP address, Dialogflow key file path, and Dialogflow agent ID when creating an instance of BasicSICConnector, and you pass that BasicSICConnector instance when creating an instance of ActionRunner.
A partial callback function is set up for retrieving a recognized entity object from the detection result.
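A partial callback of this kind could look roughly like the sketch below. The class `RecognitionManager` and its methods are hypothetical and do not belong to the framework; the result dictionary is assumed to have the layout shown earlier on this page, and the actual callback signature expected by the framework may differ.

```python
from functools import partial


class RecognitionManager:
    """Hypothetical helper tracking speech recognition attempts."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.attempts = 0
        self.entities = {}

    def on_result(self, parameter, result):
        """Store result['parameters'][parameter] if present; count the attempt."""
        self.attempts += 1
        value = (result or {}).get("parameters", {}).get(parameter)
        if value:
            self.entities[parameter] = value

    def should_retry(self, parameter):
        """True while the entity is missing and attempts remain."""
        return parameter not in self.entities and self.attempts < self.max_attempts


manager = RecognitionManager()
# partial() fixes the first argument, leaving a one-argument callback.
name_callback = partial(manager.on_result, "name")
```

Binding the parameter name with `partial` lets one handler serve several questions, each with its own entity to retrieve.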
...
onAudioIntent
a new intent is detected
IntentDetectionDone
a new intent has finished being detected
onAudioLanguage
the audio language has been changed
LoadAudioDone
if an audio file is used, the event is raised when the file has finished being loaded
Known Issues
There is a rare bug where Dialogflow suddenly responds only with ‘UNAUTHENTICATED’ errors. Restarting Docker and/or your entire machine seems to be the only way to resolve this.