Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) have matured to a level where it is possible to translate a user's utterances into text (ASR) and to classify text into intents (NLU) to make sense of what a user says. In addition, Text-To-Speech (TTS) can be used for speech synthesis to produce well-pronounced spoken utterances from written text. Yet, conversational agents have not become mainstream, and anyone who has used a voice assistant (such as Google Home or Apple Siri) has experienced being misunderstood. These assistants typically perform well on basic Question-Answering (QA) interactions, which most of the time consist of just two conversational turns: a question and an answer. Conducting longer conversations, however, tends to be more challenging. This is because a longer conversation can take (too) many directions, and the chance that a user says something unexpected increases significantly. We will investigate this challenge in this project.
In this project, you will work in a team of 6 people to develop a conversational recipe recommendation agent that uses speech to interact and is able to conduct a conversation for selecting a recipe to cook. Your agent should be able to assist a user in selecting a recipe using a variety of filters. The agent does not need to assist the user with the instruction steps of the recipe itself, which is outside the scope of the Project MAS course. We chose to focus on the recipe selection activity since it already poses several challenges for building an effective and robust conversational agent. First, there are many different ways in which this conversation may be conducted and many different ways in which a user can phrase what they want from the agent. A user can specify different requirements that the recipe they finally select should satisfy (e.g., type of ingredients, cooking duration, type of course). Second, the recipe recommendation domain is already a broad knowledge space that the agent needs to handle to understand what the user is looking for. The agent will have to reason over its database of recipes to filter for recipes that fit the user’s preferences.
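To make the idea of filtering concrete, here is a minimal sketch in Python. It is only an illustration: the actual agent reasons over a Prolog knowledge base, and the `Recipe` class, the `filter_recipes` function, and the three toy recipes below are all hypothetical and not part of the provided material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical recipe record; the real knowledge base is in Prolog."""
    name: str
    ingredients: frozenset
    minutes: int
    course: str


# Made-up toy data, for illustration only.
RECIPES = [
    Recipe("tomato soup", frozenset({"tomato", "onion", "stock"}), 30, "starter"),
    Recipe("mushroom risotto", frozenset({"rice", "mushroom", "onion", "stock"}), 45, "main"),
    Recipe("fruit salad", frozenset({"apple", "banana", "orange"}), 10, "dessert"),
]


def filter_recipes(recipes, ingredient=None, max_minutes=None, course=None):
    """Return the names of recipes that satisfy every filter that is set.

    A filter left as None is ignored, so calling the function with no
    filters returns all recipes.
    """
    matches = []
    for recipe in recipes:
        if ingredient is not None and ingredient not in recipe.ingredients:
            continue
        if max_minutes is not None and recipe.minutes > max_minutes:
            continue
        if course is not None and recipe.course != course:
            continue
        matches.append(recipe.name)
    return matches


if __name__ == "__main__":
    # Each conversational turn can add a filter and narrow the candidates.
    print(filter_recipes(RECIPES, ingredient="onion"))
    print(filter_recipes(RECIPES, ingredient="onion", max_minutes=30))
```

In a conversation, each user utterance would typically add or remove one such filter, and the agent would report how many recipes remain. How filters are represented and combined in your own agent is a design decision for your team.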
You are provided with a Prolog knowledge base of close to 1,000 recipes and their components, which still requires an effort from your group to make it usable by the recipe recommendation agent. In the instructions, we walk you through the procedure by pinpointing aspects of the agent that need to be altered or filled in. If you fill in these blanks, the agent should work, but it will still be pretty basic…
Therefore, the project does not stop there, and we challenge you to extend the basic bot with your own flair and ideas (with our assistance) to earn a higher grade.
Before you get started, make sure to check out the main Project Deliverables. Apart from delivering a working agent (a MARBEL and Dialogflow agent) that you have evaluated, you will conduct weekly presentation-based check-ins and write a Final Report in which you describe, among other things, the agent's main features and its performance based on the testing you did. You can also look ahead at how your work will be evaluated by checking out the Assessment Rubric.
At the end of this project, you and your team should have a fully functioning conversational agent that is able to assist users with selecting a recipe they would like to cook!
The PMAS Project Outline will be your definitive guide to this project. You should go through it section by section to prepare for the course, get course information, build your agent, and then write your report.