Your report should not exceed 10 pages.

You may add additional material in appendices, placed after the 10 main pages that we will grade. Your main report, however, should be self-contained: a reader should be able to understand it without consulting the appendices.

Your completed report should be in the main branch of your repository on GitHub.

Your report needs to have the following structure, and should comply with the maximum page limit indicated for each section:

Title

Introduction

How does your conversational agent work?

Intent and Slot Classifier - performance, extensions to the model

Exclusion - how it works, testing, comparison of model accuracy with and without exclusion

Extensions to the bot (not including hyperparameter optimization (HPO) and model updates)

Pilot User Study

Conclusion

1. Title (0.25 page)

Content:

Write a concise and descriptive title for your report.
Below the title, list your:
- Group number
- Student names
- Emails
- Student numbers

Notes:
Make the title eye-catching and informative so that it immediately communicates the essence of your project.

2. Introduction (0.75 page)

Content:

Introduction to the Project:
- Define a conversational recipe recommendation agent and its purpose.
- Introduce Task-Oriented Spoken Dialogue Systems (TOSDS), emphasizing their role in handling structured tasks like recipe recommendations.
Preliminaries:
- Explain any key definitions, methodologies, or prior knowledge (e.g., intents, slots, NLU pipelines, or ontology design) that you used as foundational elements.
Goals:
- State your team’s objectives clearly. Examples:
  - Build an agent capable of personalized recipe recommendations.
  - Etc.

Tips:
Set a positive tone for the report and provide context for why conversational agents are valuable in recipe recommendations. Mention briefly the importance of personalization (e.g., excluding allergens, adapting to dietary preferences).

3. Your Pipeline: How Does Your Conversational Agent Work? (2 pages)

Content:

Pipeline Overview:
- Provide a high-level description of the architecture, from user input to recipe output.
- Mention the key components seen in the diagram above as described in [TBC]Preliminaries and Quiz Materials.
Functionality:
- Highlight the agent’s primary use cases.
Conversational Flow:
- Walk through a typical user interaction.
- Explain how user queries are processed through intent recognition, slot filling, and database queries.
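
To illustrate the kind of walk-through meant here, the flow from utterance to recipe list could be sketched roughly as follows. This is a toy example only, not the course pipeline: the keyword-based `classify_intent`, the `extract_slots` helper, the intent name `requestRecipe`, and the in-memory `RECIPES` list are all hypothetical stand-ins for your trained classifier and recipe database.

```python
# Toy illustration only: every name and data item below is a hypothetical
# stand-in for a trained intent/slot classifier and a real recipe database.
RECIPES = [
    {"name": "Margherita pizza", "cuisine": "italian",
     "ingredients": {"tomato", "mozzarella", "basil"}},
    {"name": "Pad thai", "cuisine": "thai",
     "ingredients": {"noodles", "peanut", "egg"}},
    {"name": "Pasta al pomodoro", "cuisine": "italian",
     "ingredients": {"pasta", "tomato", "garlic"}},
]
KNOWN_CUISINES = {"italian", "thai"}
KNOWN_INGREDIENTS = {"tomato", "mozzarella", "basil", "noodles",
                     "peanut", "egg", "pasta", "garlic"}


def classify_intent(utterance):
    """Crude keyword rule standing in for an intent classifier."""
    text = utterance.lower()
    if any(word in text for word in ("recipe", "cook", "make")):
        return "requestRecipe"
    return "unknown"


def extract_slots(utterance):
    """Dictionary lookup standing in for a slot classifier."""
    tokens = set(utterance.lower().replace(",", " ").split())
    return {
        "cuisine": sorted(tokens & KNOWN_CUISINES),
        "ingredient": sorted(tokens & KNOWN_INGREDIENTS),
    }


def query_recipes(slots):
    """Return names of recipes that match every filled slot."""
    results = []
    for recipe in RECIPES:
        if slots["cuisine"] and recipe["cuisine"] not in slots["cuisine"]:
            continue
        if not set(slots["ingredient"]) <= recipe["ingredients"]:
            continue
        results.append(recipe["name"])
    return results


def handle(utterance):
    """Run one user turn: intent recognition, slot filling, database query."""
    intent = classify_intent(utterance)
    if intent != "requestRecipe":
        return intent, {}, []
    slots = extract_slots(utterance)
    return intent, slots, query_recipes(slots)
```

In your report, a trace of one real utterance through your own components (the recognized intent, the filled slots, and the resulting query) serves the same purpose as calling `handle` on a sample sentence here.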

Tips:
Use diagrams or flowcharts to visually illustrate the pipeline. Focus on making it understandable to both technical and non-technical audiences.

4. Intent and Slot Classifier (2 pages)

Content:

Role of Intent and Slot Classifier:
- Explain the importance of these components in identifying user intentions and extracting relevant information (e.g., cuisine type, dietary restrictions).
Training and Testing:
- Describe the datasets used for training/testing.
- Highlight any preprocessing techniques or augmentation strategies employed.
Performance Analysis:
- Present metrics:
  - Accuracy, precision, recall, F1 score.
  - Use tables or confusion matrices to compare results across iterations.
- Discuss challenges faced (e.g., ambiguous intents, overlapping slots).
Extensions to the Model:
- Mention improvements like:
  - Use of pre-trained models (e.g., BERT or GPT-based embeddings).
  - Hyperparameter tuning and architectural modifications.
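
If you are unsure how to derive the metrics listed under Performance Analysis, the sketch below computes accuracy and per-intent precision, recall, and F1 from gold and predicted labels using only the Python standard library. The function names and the intent labels in the usage note are hypothetical; you may equally well use an existing metrics library.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of examples whose predicted label equals the gold label."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal in length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def per_class_metrics(gold, predicted):
    """Precision, recall and F1 for every label seen in gold or predicted."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be equal in length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong: false positive for p
            fn[g] += 1  # gold label was missed: false negative for g
    metrics = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```

For example, with gold labels `["greeting", "requestRecipe", "requestRecipe", "addFilter"]` and predictions `["greeting", "requestRecipe", "addFilter", "addFilter"]`, accuracy is 0.75, and `requestRecipe` has precision 1.0 and recall 0.5. Reporting per-intent values like these, rather than accuracy alone, makes confusions between similar intents visible.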

5. Exclusion (2 pages)

Content:

Implementation:
- How exclusion works (e.g., excluding ingredients, cuisines, and mealTypes).
- Approach used: Describe how your team implemented exclusion in your model.
- Tools/technologies: Integration of MARBEL, Prolog, Python, and ontology updates.
Pros and Cons:
- Discuss the strengths and limitations of the exclusion mechanism you chose. What can your exclusion mechanism do, and what can it not do?
Testing:
- Comparison of inclusion-only vs. exclusion models.
Performance Analysis:
- Accuracy with and without exclusion.
- Trade-offs and impact on user satisfaction.
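
As a point of reference for the inclusion-only versus exclusion comparison, the logic of an exclusion filter can be sketched in a few lines of Python. This is a simplified illustration under assumed data structures (`SAMPLE_RECIPES` and `filter_recipes` are hypothetical), not the MARBEL or Prolog implementation you are expected to describe.

```python
# Hypothetical sample data; your recipe database will look different.
SAMPLE_RECIPES = [
    {"name": "Margherita pizza", "cuisine": "italian",
     "ingredients": ["tomato", "mozzarella", "basil"]},
    {"name": "Pad thai", "cuisine": "thai",
     "ingredients": ["noodles", "peanut", "egg"]},
    {"name": "Tomato soup", "cuisine": "american",
     "ingredients": ["tomato", "cream", "onion"]},
]


def filter_recipes(recipes, include_ingredients=(), exclude_ingredients=(),
                   exclude_cuisines=()):
    """Keep recipes that contain all included ingredients and none of the
    excluded ingredients or cuisines."""
    include = set(include_ingredients)
    excluded_ingredients = set(exclude_ingredients)
    excluded_cuisines = set(exclude_cuisines)
    kept = []
    for recipe in recipes:
        ingredients = set(recipe["ingredients"])
        if not include <= ingredients:
            continue  # missing a requested ingredient
        if ingredients & excluded_ingredients:
            continue  # contains an excluded ingredient
        if recipe["cuisine"] in excluded_cuisines:
            continue  # belongs to an excluded cuisine
        kept.append(recipe["name"])
    return kept
```

Calling `filter_recipes(SAMPLE_RECIPES, include_ingredients=["tomato"])` keeps both tomato recipes, while adding `exclude_ingredients=["mozzarella"]` leaves only the soup. A side-by-side example of this kind, using your own agent and data, is a clear way to show what exclusion changes.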

Tips:

Use examples and data to illustrate the effectiveness of the exclusion approach.
Be critical and discuss what could be improved.

6. Extensions to the Bot (1 page)

Content:

Describe additional enhancements:
- New functionalities, filters, capabilities added.
- Improvements in user interaction and experience (e.g., better response generation, conversational adaptability).
Explain the motivation and impact of these extensions.

Tips:

Highlight how these extensions make the bot stand out beyond baseline requirements.

7. Pilot User Study (1 page)

Content:

Setup:
- Who were the users?
- What tasks were they asked to perform?
- Methodology for data collection (e.g., surveys, observation, interaction logs).
Results:
- Quantitative data: Success rates, error rates, average response time, etc.
- Qualitative data: User feedback, observations of user behavior.
Analysis:
- Key findings: What worked well? What needs improvement?
- Lessons learned and implications for future work.
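
The quantitative measures mentioned under Results can be computed from interaction logs in many ways. The sketch below shows one possibility, assuming a hypothetical log format in which each session records whether the task was completed, the number of user turns, the number of misunderstood turns, and the agent's response times in seconds. Define and report whichever measures fit your own study.

```python
from statistics import mean


def summarize_sessions(sessions):
    """Aggregate per-session logs into study-level summary measures."""
    if not sessions:
        raise ValueError("at least one session is required")
    completed = sum(1 for s in sessions if s["task_completed"])
    total_turns = sum(s["turns"] for s in sessions)
    misunderstood = sum(s["misunderstood_turns"] for s in sessions)
    response_times = [t for s in sessions for t in s["response_times_s"]]
    return {
        # share of sessions in which the participant finished the task
        "success_rate": completed / len(sessions),
        # share of user turns the agent failed to understand
        "error_rate": misunderstood / total_turns,
        # mean agent response time over all logged responses
        "avg_response_time_s": mean(response_times),
    }
```

State in your report how each measure is defined, since "success" and "error" can be counted per session or per turn and the choice affects the numbers.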

Tips:

Use charts or graphs to present quantitative findings.
Include excerpts from user feedback to illustrate qualitative insights.

8. Conclusion (1 page)

Content:

Summarize the main outcomes of the project.
Reflect on the process:
- What went well (e.g., teamwork, innovative solutions)?
- What could be improved (e.g., time management, data quality)?
Suggest future improvements or extensions.

Tips:

Keep it concise but reflective.
Focus on high-level takeaways and actionable insights for future work.

Title

Add a title. Add your group number, student names, emails, and student numbers right below the title.

Introduction (max 1 page)

Briefly introduce your conversational recipe recommendation agent project, and describe the main goals you as a team have set yourselves to achieve.

How does your conversational agent work? (max 3 pages)

Think of this section as a quick-start manual for a user who knows little about conversational agents (a friend or relative without any expertise in AI, for example). After reading this section, anyone should have a clear idea of what they can do with your agent. To this end, provide a functional specification of your agent, describing its main capabilities and the variety of conversational interactions a user can have with it. Mention all features and skills of your agent, so that we can test them based on your description of how they work.

Design choices and rationale (max 3 pages)

In this section, you should describe the design choices you made while implementing your agent. These should include the more important choices that you made for developing your Dialogflow agent (NLU), for your MARBEL Dialog manager (patterns, responses, recipe database logic), and for what your agent displays while talking (visuals, webpages). You should also describe which extensions you chose to implement. Make sure to not only describe these design choices, but also explain why you made them.

Test results and discussion (max 4 pages)

In this section, you should talk about the findings of testing your own agent. What did you find? There are two sets of data that you should discuss: 1. The data your team collected while developing your conversational agent during [TBU]Pipeline Testing; 2. The data your team collected in the [TBU]Pilot User Study (where fellow students tested your agent).

You should briefly present and analyze the results from both datasets. What were the main findings? What does the data show about how well your agent performed? What issues did it uncover? What does the data tell you about your Dialogflow agent, about your Dialog manager, and about the pages you developed? What kind of results did you obtain from your pilot user study? Present both quantitative (numbers, figures, surveys) and qualitative (observations, feedback) data.

Next, you should explain, interpret, and discuss the data. What insights and lessons learned can you draw from the data you collected yourselves and from the pilot user study? What does the data say about your design choices and the extensions that you implemented? Can you say anything about how these impacted the performance of your agent? What does the data tell you about the effectiveness, efficiency, and robustness of your agent, and about what users (from your pilot user study) think of your agent (satisfaction)?

Conclusion (max 1 page)

Write a brief conclusion about the project itself, as well as a reflection on the process (what went well and what could have been done better?). If, based on the test outcomes, you have ideas on how to improve on the agent if you had more time, that could also be described here.
