Available services

The Social Interaction Cloud provides many ready-made components that speed up creating social interactions with robots.

Service	Command	Source	Demo files	Install	Note

Dialogflow, for creating conversational agents using Google’s framework. It provides flowchart-like dialog management and speech recognition.	`run-dialogflow`	dialogflow	demo_desktop_microphone_dialogflow.py or demo_nao_dialogflow.py	`pip install social-interaction-cloud[dialogflow]`
Face detection using OpenCV’s cascade classifier, which is very fast and can run on a laptop CPU.	`run-face-detection`	face_detection	demo_desktop_camera_facedetection.py	None, no extra dependencies are needed
DNN face detection using a YOLOv7 neural network for accurate detection, including detection of small faces.	`run-face-detection-dnn --model yolo7-face.pt`	face_detection_dnn	demo_desktop_camera_facedetection_dnn.py	`pip install social-interaction-cloud[face-detection-dnn]`	The model file used in this example can be found here: https://drive.google.com/file/d/1oIaGXFd4goyBvB1mYDK24GLof53H9ZYo/view
DNN face recognition using a ResNet50 network to extract face embeddings and assign an ID based on automatic clustering.	`run-face-recognition --model xxx.pt --cascadefile xxx.xml`	face_recognition_dnn	demo_desktop_camera_facerecognition.py	`pip install social-interaction-cloud[face-recognition]`	The cascade classifier file used in this example is haarcascade_frontalface_default.xml; the model file is resnet50_ft_weight.pt
OpenAI ChatGPT, a text-based large language model that provides a very capable dialog agent. Requires a credit card.	`run-gpt`	gpt	demo_openai_gpt.py	`pip install social-interaction-cloud[openai-gpt]`	An OpenAI API key can be created here: https://platform.openai.com/api-keys
OpenAI Whisper, a powerful speech-to-text model that can run both locally and in the cloud. Cloud usage requires a credit card. Detection of the start and end of speech is performed using Python’s SpeechRecognition library.	`run-whisper`	whisper_speech_to_text	demo_desktop_microphone_whisper.py	`pip install social-interaction-cloud[whisper-speech-to-text]`	An OpenAI API key can be created here: https://platform.openai.com/api-keys
Google Text-to-Speech using the Google Cloud API. Requires a credit card.	`run-google-tts`	text2speech	demo_desktop_google_tts.py	`pip install social-interaction-cloud[google-tts]`	A credential keyfile has to be configured; see https://console.cloud.google.com/apis/api/texttospeech.googleapis.com/
Natural language understanding (NLU), a joint learning model for intent and slot classification with BERT.		nlu	A simple demo of an ASR+NLU pipeline: demo_desktop_asr_nlu.py	git clone and check out the development branch nlu_component, then `pip install ."[whisper-speech-to-text,nlu]"`
LLM, a CLI utility and Python library (llm) for interacting with large language models, both via remote APIs and via models that can be installed and run on your own local machine.		llm	demo_desktop_llm.py	git clone and check out the development branch nlu_component, then `pip install ."[llm]"`	You can use both free local LLMs and remote LLMs with your own API keys.
Templates for creating your own components		templates
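
The Install and Command columns above can be combined into a short setup routine per service. The following is only a minimal sketch: the `SERVICES` dictionary, its keys, and the `setup_steps` helper are hypothetical names introduced for illustration and are not part of the Social Interaction Cloud package. The commands and pip extras are copied from the table, and rows without a listed command are left out. For services that need no extra dependencies, the sketch assumes the base package is already installed.

```python
# Hypothetical lookup table built from the rows above.
# "extra" is the pip extra from the Install column (None if no extra is needed),
# "command" is the value from the Command column.
SERVICES = {
    "dialogflow": {"extra": "dialogflow", "command": "run-dialogflow"},
    "face-detection": {"extra": None, "command": "run-face-detection"},
    "face-detection-dnn": {
        "extra": "face-detection-dnn",
        "command": "run-face-detection-dnn --model yolo7-face.pt",
    },
    "gpt": {"extra": "openai-gpt", "command": "run-gpt"},
    "whisper": {"extra": "whisper-speech-to-text", "command": "run-whisper"},
    "google-tts": {"extra": "google-tts", "command": "run-google-tts"},
}


def setup_steps(name):
    """Return the shell commands to install and then start one service."""
    service = SERVICES[name]
    steps = []
    extra = service["extra"]
    if extra is not None:
        # Quote the requirement so the shell does not interpret the brackets.
        steps.append(f'pip install "social-interaction-cloud[{extra}]"')
    steps.append(service["command"])
    return steps


if __name__ == "__main__":
    # Print the commands for one service; run them in a terminal yourself.
    for step in setup_steps("whisper"):
        print(step)
```

The helper only prints the commands rather than executing them, so it is safe to run before deciding which services to install.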