This course aims to give the student an introduction to human speech processing, speech technology, and its applications and research challenges. In particular, the course focusses on the following learning aims:
- Human speech production: The student is able to demonstrate a basic understanding of human speech production and its important concepts, including vocal tract properties, phonology, the source-filter model, and speaker variation.
- Speech technology: The student is able to
  - Demonstrate a basic understanding of how speech can be represented in features (e.g., formants, spectrograms, Mel-Frequency Cepstral Coefficients)
  - Demonstrate a basic understanding of how automatic speech recognition works and its important concepts, including Hidden Markov Models, acoustic and language models, and the Word Error Rate
  - Demonstrate a basic understanding of how paralinguistic information (e.g., socio-affective, mental, physical user states) can be detected in speech
- Speech technology applications: The student is able to demonstrate a basic understanding of certain speech technology-related applications and their challenges, and can argue for the relevance and impact of speech technology in the corresponding application area. Examples of speech technology-related applications and application areas include spoken dialogue systems, speech retrieval, oral history, health care, and human-robot interaction.
- Speech technology research: The student is able to
  - Design a small, well-scoped, relevant speech technology-related study
  - Find and evaluate relevant related work in speech technology literature
  - Select and apply appropriate tools to a speech-related research question (e.g., recording devices, Praat, automatic speech recognition)
  - Reflect on and argue for the results of a speech technology-related study
This course offers an in-depth introduction to the automatic processing of spoken language and its application in the domain of Human-Computer Interaction. Several methods of speech processing will be discussed, together with the feasibility of applying them in areas such as man-machine interaction and information retrieval. Several examples of running systems will be presented in more detail, partly via practical assignments. In the lectures, the following subjects will be discussed:
- Human Speech Processing
- Signal Processing
- Speech Applications
- Speech Generation
- Speech Recognition
- Dialogue and Conversational Agents
- Social Signal Processing and Speech
- Speech Retrieval