After completing this course, the student can:
- Apply different AI methods and techniques for natural language processing to new problems and data.
- Make use of appropriate (statistical) tools and methods for linguistic data analysis to investigate aspects of language use in social communication and assess the reliability of the results.
- Demonstrate understanding of the main linguistic concepts that play a role in natural language processing.
- Indicate the strengths and weaknesses of, and the differences between, the main techniques for modelling and parsing natural languages.
This course looks at a specific area of artificial intelligence (AI): how machines can analyse linguistic data, focusing on written language. We look at AI models such as finite-state, grammatical, and statistical models (hidden Markov models, bag of words), and how they are used in natural language processing tools such as part-of-speech taggers, word sense disambiguation, sentiment analysis and parsing tools. We also look at how machines can learn to analyse linguistic data.
We discuss how these models and techniques can be applied to investigate language use in (mediated) social communicative situations. Analysing the language that people use in various situations and media can give us interesting information about different aspects of the communicative situation. For example: what does the language that someone uses on social media tell us about their personality, cultural background, mood, or stance towards the topic they are writing about?
In the lectures we will present computational models of language as well as tools and techniques that can be used to build “natural language processing machines” and to address interesting questions concerning language use. Students will do practical lab and homework assignments in which they use the models and natural language processing tools to answer research questions regarding linguistic data, and carry out a small project in which linguistic data analysis gives information about interesting properties of the speaker or the communicative situation. Students are also expected to present and discuss some papers on the current state of the art in natural language processing.
Assessment takes the form of a written exam on the theoretical parts of the course, and homework/lab assignments done in pairs. The final grade is based on the marks for the assignments and the exam. The course includes one or two mandatory presentations, which are not graded.
Besides dealing with technology for natural (= human) language processing, the course also deals with understanding humans and context, as students learn about the structure and use of human language.
Prerequisites
Basic programming skills (to write scripts for data analysis) and statistical analysis skills (use of SPSS or other tools). Given the use of formal, mathematical and probabilistic methods in this course, as well as algorithmic specifications based on these models, the course requires some practical experience with, and feeling for, mathematical formulas and formal specifications.
Prerequisite for
This course is a desired prerequisite for the courses Speech Processing (201600075) and Conversational Agents (201600077).
Assumed previous knowledge
Master Interaction Technology
Required materials
D. Jurafsky & J.H. Martin. Speech and Language Processing: Third Edition. (Online draft chapters).
Recommended materials
-
Instructional modes
Tests
Exam and Assignments