...
Automatic Speech Recognition (ASR) - Converts spoken language into a text transcript.
Natural Language Understanding (NLU) - Interprets and extracts meaning from the transcript.
Dialogue Management (DM) - Manages the flow of conversation and determines the system’s response.
Natural Language Generation (NLG) - Constructs responses in natural language.
Text-to-Speech (TTS) - Converts the generated text into spoken output.
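
To make the data flow between these five components concrete, here is a minimal sketch in which each component is a function that fills in one field of a shared turn record. Everything in it is an assumption made for illustration: the `Turn` fields, the intent and action names, and the stub logic inside each stage (for example, the ASR stub simply decodes bytes as UTF-8 instead of recognizing speech). A real system would replace each stub with a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    """State carried through one user turn (hypothetical field names)."""
    audio: bytes
    transcript: str = ""
    intent: str = ""
    action: str = ""
    response_text: str = ""
    response_audio: bytes = b""


Stage = Callable[[Turn], Turn]


def asr(turn: Turn) -> Turn:
    # Stub: a real ASR model would decode a waveform here.
    turn.transcript = turn.audio.decode("utf-8")
    return turn


def nlu(turn: Turn) -> Turn:
    # Stub: a keyword rule standing in for a trained intent classifier.
    has_lights = "lights" in turn.transcript.lower()
    turn.intent = "lights_on" if has_lights else "unknown"
    return turn


def dm(turn: Turn) -> Turn:
    # Stub: choose the next system action from the recognized intent.
    turn.action = "confirm_lights_on" if turn.intent == "lights_on" else "ask_repeat"
    return turn


def nlg(turn: Turn) -> Turn:
    # Stub: template-based generation, one template per action.
    templates = {
        "confirm_lights_on": "Okay, turning the lights on.",
        "ask_repeat": "Sorry, could you repeat that?",
    }
    turn.response_text = templates[turn.action]
    return turn


def tts(turn: Turn) -> Turn:
    # Stub: a real TTS model would synthesize audio from the text.
    turn.response_audio = turn.response_text.encode("utf-8")
    return turn


STAGES: List[Stage] = [asr, nlu, dm, nlg, tts]


def run_pipeline(turn: Turn, stages: List[Stage] = STAGES) -> Turn:
    """Apply the stages in order: ASR -> NLU -> DM -> NLG -> TTS."""
    for stage in stages:
        turn = stage(turn)
    return turn
```

Calling `run_pipeline(Turn(audio=b"Turn on the lights"))` passes the record through all five stages in order. Keeping each stage behind the same signature means any single stub can be swapped for a real model without touching the others.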
...
https://link.springer.com/article/10.1007/s10462-022-10248-8
In this project, we will build a simple pipeline in which an ASR component is followed by an NLU component. We will use an existing ASR model (e.g., Whisper) for inference only, with no training, and improve the performance of the NLU model (e.g., BERT) by training it on the conversational data collected in the previous course.
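
One possible shape for this ASR-then-NLU pipeline is sketched below, assuming the Hugging Face `transformers` library is used for both models. Several things here are assumptions rather than project requirements: the `openai/whisper-small` checkpoint, treating NLU as intent classification, the function names, and `model_dir`, a hypothetical path to a BERT classifier already trained on the course data. The model loaders import `transformers` lazily so the glue function can be tried with stand-in callables first.

```python
from typing import Callable, Dict

AsrFn = Callable[[str], str]                # audio file path -> transcript
NluFn = Callable[[str], Dict[str, object]]  # transcript -> {"label", "score"}


def load_whisper_asr(model_name: str = "openai/whisper-small") -> AsrFn:
    """Wrap a pretrained Whisper checkpoint for inference only (assumed choice)."""
    from transformers import pipeline  # lazy import: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=model_name)

    def transcribe(audio_path: str) -> str:
        # The ASR pipeline returns a dict whose "text" key holds the transcript.
        return asr(audio_path)["text"].strip()

    return transcribe


def load_bert_nlu(model_dir: str) -> NluFn:
    """Wrap a trained BERT classifier; model_dir is a hypothetical path."""
    from transformers import pipeline  # lazy import: heavy dependency

    clf = pipeline("text-classification", model=model_dir)

    def classify(text: str) -> Dict[str, object]:
        # For a single string the pipeline returns a one-element list.
        return clf(text)[0]

    return classify


def transcribe_and_understand(audio_path: str, asr: AsrFn, nlu: NluFn) -> Dict[str, object]:
    """Run ASR first, then feed the transcript to the NLU component."""
    transcript = asr(audio_path)
    prediction = nlu(transcript)
    return {
        "transcript": transcript,
        "intent": prediction["label"],
        "score": prediction["score"],
    }
```

With real models this would be called as `transcribe_and_understand("utterance.wav", load_whisper_asr(), load_bert_nlu("path/to/model"))`, where both file paths are placeholders. Decoding audio files through the `transformers` ASR pipeline typically also requires ffmpeg to be installed.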
...