Project MAS Rubric | |||||
Agent functionality | |||||
Criteria | Poor | Average | Good | Ratings | Max Pts |
Opening and recipe selection | The agent can only open with a greeting and does not assist the user in selecting a recipe. (0) | The agent can do a basic filtering of recipes based on features indicated by the user. (5) | The agent can effectively assist the user in selecting a recipe in a conversational (rather than website-like) way, and apply slot-filling to come to a subset of recipes for the user to choose from. (9) |
| 9 |
Recipe instruction and closing | The agent can only instruct the recipe step-by-step as directed by the user, and properly end the recipe instruction and conversation. (0) | The agent can cope with some deviations from the recipe instruction, such as clarification, appraisal, switching recipes, or a capability check. (5) | The agent can cope with many deviations from the recipe instruction, and apply slot-filling where needed. (9) |
| 9 |
Navigation | The agent can hardly accommodate navigation other than proceeding in the happy flow. (0) | The agent can do some backward navigation (such as undoing a chosen feature during recipe selection, or going back one recipe step) and/or go to a specific step in the recipe, but not at every relevant point in the conversation. (4) | The agent can accommodate navigation at all relevant points in the conversation. (7) |
| 7 |
Repair | The agent can only perform the basic fallback repair when a user intent is not understood. (0) | The agent can successfully initiate fallback repair and out-of-context repair, and respond properly to user-initiated repair at different positions in the conversation. (4) | In addition to initiating and responding to repair, where possible the agent can assist the user with information (in speech and visual presentation) on what part of its utterance needs to be addressed in order to repair. The agent will not end up in an endless repair cycle. (7) |
| 7 |
Extensions | No extensions to the original assignment have been implemented. (0) | Some functionality has been added to the agent’s capacities, albeit not very challenging or creative. (4) | The agent has been extended with one or more creative and challenging components that make a considerable difference to the utility of the cooking assistant. (8) |
| 8 |
Total points |
| 40 | |||
Visual support | |||||
Criteria | Poor | Average | Good | Ratings | Max Pts |
Orientation | The visuals merely display what the agent says at any point. (0) | The visuals indicate the status of the conversation (e.g., filtering step, chosen recipe, current step) as well as the current conversation options of the user, but not at every point in the conversation or not insightfully enough. (3) | The visuals give an insightful account of the conversation status and directions to take at any point in the conversation. (5) |
| 5 |
Overview | It is difficult to interpret the displayed information at many points in the conversation, or too little information is presented. (0) | The information presented on the display is too messy or unintuitive at some points in the conversation. (3) | Overall the visuals strike the right balance between the amount of information presented and overview of the information. (5) |
| 5 |
Appearance | No effort is spent to display information in an attractive way to the user. (0) | The appearance is somewhat appealing, with images to support certain information, and a design structure that fits the information given at different points. (3) | The display is highly appealing, with images to support information where relevant, and a design structure that fits the information given at any point. (5) |
| 5 |
Total points |
| 15 | |||
Agent quality and design | |||||
Criteria | Poor | Average | Good | Ratings | Max Pts |
Robustness | The assistant easily breaks down during a conversation. (0) | The assistant can cope with some deviations by the user from the happy conversation flow. (3) | The assistant is rather robust and can cope with several deviations by the user from the happy conversation flow. (5) |
| 5 |
Conversation design | Agent utterances are not very enticing, and conversation patterns do not cater for much flexibility in the conversation. (0) | The user is to some extent considered in the way that agent utterances and conversation patterns were implemented. (3) | The group has written agent utterances in a style that engages the user, and has implemented many conversation patterns to cope with the different conversation paths that are expected to occur in agent-user interactions. (5) |
| 5 |
Quality of coding and documentation | Many parts of the code are not reusable, and comments are scarce. (0) | The code is well documented, but not all comments make clear the idea behind a code snippet. Some parts of the code are needlessly hard-coded. (2) | The code is of high quality and well documented. (5) |
| 5 |
Total points |
| 15 | |||
Written report | |||||
Criteria | Poor | Average | Good | Ratings | Max Pts |
Introduction and Dialog engine | The project is poorly introduced, and the group shows little understanding of the dialog manager architecture and functionality. (0) | The group properly introduces the project and shows some understanding of the dialog manager architecture and functionality. (2) | The group properly introduces the project and shows in-depth understanding of the dialog manager architecture and functionality. (4) |
| 4 |
Design rationale | The design choices are difficult to understand from how they are presented. (0) | The design choices are written down in a mostly understandable way, but some of the choices are not well motivated. (5) | The design choices are written down in an understandable and convincing way. (9) |
| 9 |
Testing | It is unclear from the testing section how the assistant is performing. (0) | The testing section gives a reasonable account of how the assistant was tested and what the main findings were. (3) | The capacities and points of improvement of the assistant are evaluated in a structured way and clearly presented in the testing section. (5) |
| 5 |
Improvements | The identified points of improvement are unclear or not addressable in any way. (0) | The identified points of improvement are clear, but the way in which they may be implemented is not properly described, or is unrealistic or infeasible. (2) | The identified points of improvement are clear and the way in which they may be implemented is convincingly presented. (4) |
| 4 |
Clarity and presentation | The writing style, structure and layout are messy and unclear. (0) | The writing style and presentation are up to standard, but some parts are unclear. (4) | The report is very clear in terms of writing style, structure and layout. (8) |
| 8 |
Total points |
| 30 |