Your report should not exceed 10 pages.
You may include additional materials in appendices (following the 10 main pages that we will grade). Your main report, however, should be self-contained: a reader should be able to understand it without consulting the appendices.
Your completed report should be in the main branch of your repository on GitHub.
Your report must follow the structure below and comply with the maximum page limit indicated for each section:
Title
Intro
How does your conversational agent work?
Intent and Slot Classifier - performance, extensions to model.
Exclusion - how it works, testing, accuracy comparison with and without exclusion
Extensions to the bot (not including hyperparameter optimization and model updates)
Pilot User Study
Conclusion
1. Title (0.5 page)
Content:
Add the report title.
Include group number, student names, emails, and student numbers right below title.
2. Introduction (1 page)
Content:
Briefly introduce the project: What is a conversational recipe recommendation agent? What are Task-Oriented Spoken Dialogue Systems?
Summarize Preliminaries: Describe what definitions and knowledge you utilized to complete this project.
State the goals your team aimed to achieve.
Tips:
Ensure clarity and conciseness. This section sets the stage for the reader.
Provide some background on the significance of conversational agents in recipe recommendations.
3. Your Pipeline: How Does Your Conversational Agent Work? (2 pages)
Content:
Describe your pipeline, and give an overview.
Explain the overall functionality of your agent.
What problems does it solve?
What can users achieve by interacting with the agent?
Describe the conversational flow. How does it work?
4. Intent and Slot Classifier (2 pages)
Content:
Explain the role of the intent and slot classifier in the agent.
Summarize the training and testing procedure.
Performance analysis:
Metrics: Accuracy, precision, recall, F1 score, confusion matrix (a minimal computation sketch follows at the end of this section).
Any improvements made, such as additional training data, custom models, or changes to the model or training procedure.
Discuss challenges.
Extensions to the model:
Use of other pre-trained models, hyperparameter tuning, additional or custom layers, or domain-specific tuning.
Tips:
Include visual aids such as tables or charts to present performance data effectively.
Highlight innovative solutions your team implemented.
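As an illustration of how the metrics listed above could be computed (a minimal sketch assuming scikit-learn is installed and that gold labels and predictions are available as Python lists; the variable names and label values are placeholders, not your pipeline's actual identifiers):

    from sklearn.metrics import (accuracy_score, confusion_matrix,
                                 precision_recall_fscore_support)

    # Placeholder data: swap in the gold intents and model predictions
    # from your own test set.
    gold_intents = ["recommend", "exclude", "confirm", "recommend"]
    pred_intents = ["recommend", "exclude", "recommend", "recommend"]

    accuracy = accuracy_score(gold_intents, pred_intents)
    precision, recall, f1, _ = precision_recall_fscore_support(
        gold_intents, pred_intents, average="macro", zero_division=0)

    print(f"accuracy={accuracy:.2f}  precision={precision:.2f}  "
          f"recall={recall:.2f}  f1={f1:.2f}")
    print(confusion_matrix(gold_intents, pred_intents))

The same calls work for slot labels if you flatten them into a single list of per-token tags.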
5. Exclusion (2 pages)
Content:
Implementation:
How exclusion works (e.g., filtering ingredients, cuisines, and tags); a minimal filtering sketch follows at the end of this section.
Approaches used: intent-based, slot-based, rule-based, classifier-based.
Tools/technologies: Integration of MARBEL, Prolog, Python, and ontology updates.
Testing:
Comparison of inclusion-only vs. exclusion models.
Examples of exclusion in action (e.g., excluding dairy, gluten, or specific cuisines).
Performance Analysis:
Accuracy with and without exclusion.
Trade-offs and impact on user satisfaction.
Pros and Cons:
Discuss strengths and limitations of the exclusion mechanism.
Tips:
Use examples and data to illustrate the effectiveness of the exclusion approach.
Be critical and discuss what could be improved.
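To make the exclusion discussion concrete, a minimal slot-based filtering sketch in Python is shown below; the recipe dictionaries, field names, and example data are hypothetical, and an actual implementation might instead encode this logic as Prolog rules queried by the MARBEL dialog manager:

    def exclude_recipes(recipes, excluded_ingredients=(), excluded_tags=()):
        """Keep only recipes with none of the excluded ingredients or tags."""
        excluded_ingredients = {i.lower() for i in excluded_ingredients}
        excluded_tags = {t.lower() for t in excluded_tags}
        kept = []
        for recipe in recipes:
            ingredients = {i.lower() for i in recipe.get("ingredients", [])}
            tags = {t.lower() for t in recipe.get("tags", [])}
            # Drop the recipe as soon as it matches any exclusion constraint.
            if ingredients & excluded_ingredients or tags & excluded_tags:
                continue
            kept.append(recipe)
        return kept

    # Hypothetical usage with made-up data:
    recipes = [
        {"name": "Paneer curry", "ingredients": ["paneer", "cream"], "tags": ["indian"]},
        {"name": "Tofu stir fry", "ingredients": ["tofu", "broccoli"], "tags": ["asian", "dairy-free"]},
    ]
    print(exclude_recipes(recipes, excluded_ingredients=["cream"]))  # only the stir fry remains

Comparing the recommendations returned with and without such a filter is one straightforward way to produce the inclusion-only vs. exclusion comparison asked for above.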
6. Extensions to the Bot (1 page)
Content:
Describe additional enhancements:
New functionalities added.
Improvements in user interaction and experience (e.g., better response generation, conversational adaptability).
Explain the motivation and impact of these extensions.
Tips:
Highlight how these extensions make the bot stand out beyond baseline requirements.
7. Pilot User Study (1.5 pages)
Content:
Setup:
Who were the users?
What tasks were they asked to perform?
Methodology for data collection (e.g., surveys, observation, interaction logs).
Results:
Quantitative data: Success rates, error rates, average response time, etc. (a sketch for deriving these from logs follows at the end of this section).
Qualitative data: User feedback, observations of user behavior.
Analysis:
Key findings: What worked well? What needs improvement?
Lessons learned and implications for future work.
Tips:
Use charts or graphs to present quantitative findings.
Include excerpts from user feedback to illustrate qualitative insights.
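Success rates and response times can be derived directly from interaction logs; the sketch below assumes a hypothetical log format with per-task "success" flags and per-turn response times in seconds:

    from statistics import mean

    # Hypothetical per-task records collected during the pilot study.
    logs = [
        {"task": "find a vegan recipe", "success": True,  "response_times": [0.8, 1.1]},
        {"task": "exclude dairy",       "success": False, "response_times": [1.4]},
    ]

    success_rate = mean(1.0 if rec["success"] else 0.0 for rec in logs)
    avg_response_time = mean(t for rec in logs for t in rec["response_times"])

    print(f"success rate: {success_rate:.0%}")
    print(f"average response time: {avg_response_time:.2f} s")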
8. Conclusion (1 page)
Content:
Summarize the main outcomes of the project.
Reflect on the process:
What went well (e.g., teamwork, innovative solutions)?
What could be improved (e.g., time management, data quality)?
Suggest future improvements or extensions.
Tips:
Keep it concise but reflective.
Focus on high-level takeaways and actionable insights for future work.
Title
Add a title. Add your group number, student names, emails, and student numbers right below the title.
Introduction (max 1 page)
Briefly introduce your conversational recipe recommendation agent project, and describe the main goals your team set out to achieve.
How does your conversational agent work? (max 3 pages)
Think of this section as a quick-start manual for a user who knows little about conversational agents (for example, a friend or relative without any expertise in AI). After reading this section, anyone should have a clear idea of what they can do with your agent. To this end, provide a functional specification of your agent, describing its main capabilities and the variety of conversational interactions a user can have with it. Mention all of your agent's features and skills, so that we can test them based on your description of how they work.
Design choices and rationale (max 3 pages)
In this section, describe the design choices you made while implementing your agent. Cover the most important choices you made for your Dialogflow agent (NLU), for your MARBEL Dialog manager (patterns, responses, recipe database logic), and for what your agent displays while talking (visuals, webpages). Also describe which extensions you chose to implement. Make sure to not only describe these design choices, but also to motivate why you made them.
Test results and discussion (max 4 pages)
In this section, discuss the findings from testing your own agent. What did you find? There are two sets of data you should discuss: 1. the data your team collected while developing your conversational agent during [TBU] Pipeline Testing; 2. the data your team collected in the [TBU] Pilot User Study (where fellow students tested your agent).
You should briefly present and analyze the results from both datasets. What were the main findings? What does the data show about how well your agent performed? What issues did the tests uncover? What does the data tell you about your Dialogflow agent, your Dialog manager, and the pages you developed? What kind of results did you obtain from your pilot user study? Present both quantitative (numbers, figures, surveys) and qualitative (observations, feedback) data.
You should then explain, interpret, and discuss the data. What insights and lessons learned can you draw from the data you collected yourselves and from the pilot user study? What does the data say about your design choices and the extensions you implemented? Can you say anything about how these impacted the performance of your agent? What does the data tell you about the effectiveness, efficiency, and robustness of your agent, and about how satisfied the users from your pilot user study were with it?
Conclusion (max 1 page)
Write a brief conclusion about the project itself, as well as a reflection on the process (what went well, and what could have been done better?). If, based on the test outcomes, you have ideas about how you would improve the agent given more time, you can also describe them here.