MSc in Data Science with Specialization in Computational Linguistics
Offered in collaboration with FAH
Compulsory courses:
AHGC7315 Language and Linguistics
3 credits
This course introduces key concepts and approaches to the study of language and linguistics. It provides an overview, with basic terminology, of the major subfields of linguistics, investigating the nature, history, and structure of language and how language relates to the mind, society, and education. It provides the basis of investigation in subsequent courses in the Computational Linguistics program. Students are encouraged to reflect on their own language experience and apply the theories covered in the course to their own linguistic context.
AHGC7398 Project Report
6 credits
A “Project Report” addresses a particular issue in a specific context and may incorporate elements of data collection, data processing, data analysis and data solution. Students demonstrate their ability to reflectively examine a particular context, identify and define an issue in that context, access, summarize, and synthesize current literature relevant to the issue, and develop a justified approach to address that issue. A range of topics in linguistics and applied linguistics is acceptable.
Choose 3 required elective courses from the following:
CISC7021 Applied Natural Language Processing
3 credits
This course covers both fundamental and advanced topics in Natural Language Processing (NLP), which deals with the application of computational models to text data. The course examines core tasks and techniques in natural language processing, including minimum edit distance, language modelling, Naïve Bayes, Maximum Entropy, text classification, sequence labelling, POS tagging, syntax parsing and computational lexical semantics. Modern NLP applications, such as information retrieval and statistical machine translation, will also be explored. Students will learn how to formulate and investigate research questions on related topics.
ENGL7022 Corpus Linguistics
3 credits
This course will introduce students to the methods and issues involved in corpus linguistics. It aims to demonstrate the practical use of online corpora to explore questions asked by scholars of language, and to address some of the central issues raised by corpus design and analysis. Students will familiarize themselves with the research use of online corpora and the construction and analysis of original corpora using freeware concordance tools. At the same time, the classes will consider various aspects of language analysis concerning lexis and grammar, and questions of language variation and change.
AHGC7039 Computer-aided Translation
3 credits
This course reviews state-of-the-art technology in translation. In addition to Translation Memory, which allows users to leverage existing data, deep learning technologies have allowed Machine Translation to achieve fully automatic translations for some genres of writing. The course includes an extensive practical component using these technologies, and explores the assessment of Machine Translation quality as well as emerging possibilities and issues. The use of the World Wide Web, Web-based translation aids and other applications for producing rapid, localized and high-quality translation is also discussed.
AHGC7303 Evaluation and Assessment of Language Use
3 credits
This course is an introduction to the field of language assessment. Topics covered in the course include the purpose and context of testing, assessing skills and components of language ability, content analysis of test instruments, statistical analysis of test performance data, and fairness and justice in language assessment.
CHLL7101 Methods in Chinese Linguistics
3 credits
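This course aims to give a systematic introduction to the research approaches and methods of linguistics and to some important linguistic theories, and to guide students in using scientific tools and methods to conduct systematic synchronic or diachronic research on Chinese. The course covers general and specialized methods of linguistic research, the instrumental concepts frequently used in linguistic research, the analytical techniques adopted by the various schools of linguistics, and the linguistic theories developed by scholars in China.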
AHGC7308 Special Topics in Applied Linguistics I
3 credits
This course is designed to give visiting scholars and existing or future staff the opportunity to offer courses in their particular area of specialization. The topic and content of the course will vary from year to year depending on the availability of expert staff. Examples of specialized topics that may be offered include data-driven learning, computer-adaptive language testing, automated scoring, and corpus-based EAP (English for academic purposes).
PHIL7101 Topics in Semantics
3 credits
This course is an introduction to the formal analysis of natural language semantics as situated within the larger domain of linguistics. Students will be introduced to the main goals and methods of formal semantics, with a special focus on some of its core empirical results, and to some mathematical concepts underlying semantic interpretation systems (sets, relations, functions). Topics include: predication, the semantics of quantificational determiners, indexicals, semantic binding, tense, and modality.