Your report should briefly 1. explain your agent's basic conversational features (its functionality and how a user can interact with it), the progress you were able to make, and how you tested it, 2. motivate your design choices, 3. present the testing data you collected, 4. interpret the results from testing your agent, and 5. present the overall findings and lessons learned of your project.
Note |
---|
Your report should not exceed 10 pages. You are allowed to add additional materials in appendices to your report (following the 10 main pages that we will grade). Your main report, however, should be self-contained (a reader should be able to understand it without having to access the appendices). |
Info |
---|
We recommend writing your report in LaTeX. |
Your report needs to have the following structure, and should comply with the maximum page limit indications for each section:
Title
Add a title. Also add your group number, student names, emails, and student numbers right below the title.
Introduction (max 1 page)
Briefly introduce your conversational recipe recommendation agent project, and describe the main goals you as a team have set yourselves to achieve.
How does your conversational agent work? (max 3 pages)
Think of this section as providing a kind of quick-start manual to a user who knows little about conversational agents (a friend or relative without any expertise in AI, for example). After reading this section, anyone should have a clear idea of what they can do with your agent. To this end, provide a functional specification of your agent, describing its main capabilities and the variety of conversational interactions a user can have with your agent. Mention all features or skills of your agent, so that we can test them based on your description of how they work.
Design choices and rationale (max 3 pages)
In this section, you should describe which design choices you have made while implementing your agent. These design choices are twofold. First, the tasks that are open to interpretation in the instructions. This should include, but not be limited to, the training phrases used (that were not provided), visuals (essentially everything), conversational patterns, and agent responses. Cover the more important choices that you made for developing your Dialogflow agent (NLU), for your MARBEL Dialog manager (patterns, responses, recipe database logic), and for what your agent displays while talking (visuals, webpages). Second, the extensions that you chose to implement, for example to the agent functionalities, persona, or conversational directions. Make sure to not only describe these design choices, but also to motivate why you made them.
Test results and discussion (max 4 pages)
In this section, you should describe how you continuously tested your own agent and what you found. Read more about this in Agent Testing.
User Study (2-3 pages):
In this section, you should explain the testing you carried out and the data you gathered as described in User Study. Very briefly describe the test setup and your goals, then explain and analyze the data, and discuss the lessons learned from your small user study. In the analysis, discuss the performance of some of your extensions separately. To what extent did they improve the interaction with your agent?
For the analysis, looking at quantitative measures found in the logs can provide valuable insights. Examples of information that can be extracted from the logs and used for analysis include (but are not limited to):
Number of utterances/intents per interaction
Variety in intents and entities (are there unused intents?)
Confidence values for the intent classification (are there any patterns of specific intents having low/high confidence values?)
Interaction length (in time)
...
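The measures listed above could be computed along the following lines. This is only a minimal sketch: the record layout (`session`, `time`, `intent`, `confidence`), the intent names, and the sample values are all hypothetical and made up for illustration. The logs produced by your own Dialogflow agent and Dialog manager will look different, so the parsing step has to be adapted to them.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical log records; the field names and values are illustrative
# assumptions, not the actual format of any Dialogflow or MARBEL log.
LOG = [
    {"session": "s1", "time": "2024-01-10T10:00:00", "intent": "greeting", "confidence": 0.98},
    {"session": "s1", "time": "2024-01-10T10:00:20", "intent": "requestRecommendation", "confidence": 0.71},
    {"session": "s1", "time": "2024-01-10T10:01:30", "intent": "addFilter", "confidence": 0.45},
    {"session": "s2", "time": "2024-01-10T11:00:00", "intent": "greeting", "confidence": 0.94},
    {"session": "s2", "time": "2024-01-10T11:00:40", "intent": "addFilter", "confidence": 0.55},
]

# Hypothetical set of intents defined in the agent.
DEFINED_INTENTS = {"greeting", "requestRecommendation", "addFilter", "farewell"}


def summarize(log, defined_intents):
    """Compute simple per-session and per-intent measures from log records."""
    times_by_session = defaultdict(list)
    confidences_by_intent = defaultdict(list)
    for record in log:
        times_by_session[record["session"]].append(datetime.fromisoformat(record["time"]))
        confidences_by_intent[record["intent"]].append(record["confidence"])

    return {
        # Number of utterances (recognized intents) per interaction.
        "utterances_per_session": {s: len(t) for s, t in times_by_session.items()},
        # Interaction length: time between first and last utterance, in seconds.
        "duration_seconds": {
            s: (max(t) - min(t)).total_seconds() for s, t in times_by_session.items()
        },
        # Average classification confidence per intent.
        "mean_confidence": {i: mean(c) for i, c in confidences_by_intent.items()},
        # Intents that are defined but never occurred in the logs.
        "unused_intents": sorted(defined_intents - set(confidences_by_intent)),
    }


if __name__ == "__main__":
    for name, value in summarize(LOG, DEFINED_INTENTS).items():
        print(name, value)
```

Summaries like these are a starting point for the analysis rather than the analysis itself: a low average confidence for one intent, for instance, still needs to be interpreted against the training phrases you used.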
There are two sets of data that you should discuss: 1. The data your team collected while developing your conversational agent during Agent Testing; 2. The data your team collected in the Pilot User Study (where fellow students tested your agent).
You should briefly present and analyze the results from both datasets. What were the main findings? What does the data show about how well your agent performed? What issues did it uncover? What does the data tell you about your Dialogflow agent, about your Dialog manager, and about the pages you developed? What kind of results did you obtain from your pilot user study? Present both quantitative (numbers, figures, surveys) and qualitative (observations, feedback) data.
You should next try to explain, interpret, and discuss the data. What insights and lessons learned can you draw from the data you collected yourselves and from the pilot user study? What does the data say about your design choices and the extensions that you implemented? Can you say anything about how these impacted the performance of your agent? What does the data tell you about the effectiveness, efficiency, and robustness of your agent, and about what users (from your pilot user study) think about your agent (satisfaction)?
Conclusion (max 1 page)
Write a brief conclusion about the project itself, as well as a reflection on the process (what went well and what could have been done better?). If, based on the test outcomes, you have ideas on how you would improve the agent if you had more time, these could also be described here.