MSc in Data Science with Specialization in Computational Linguistics
Offered by FAH
The following courses are compulsory:
AHGC7315 Language and Linguistics
3 credits
This course introduces key concepts and approaches to the study of language and linguistics. It provides an overview, with basic terminology, of the major sub-fields of linguistics, investigating the nature, history, and structure of language, and how language relates to the mind, society, and education. It provides the basis of investigation in subsequent courses in the MA in SLA program. Students are encouraged to reflect on their own language experience and apply the theories covered in the course to their own linguistic context.
AHGC7398 Project Report
6 credits
A “Project Report” addresses a particular issue in a specific context and may incorporate elements of data collection, data processing, data analysis and data solution. Students demonstrate their ability to examine a particular context reflectively; identify and define an issue in that context; access, summarize, and synthesize current literature relevant to the issue; and develop a justified approach to addressing it. Topics from across linguistics and applied linguistics are acceptable.
Choose 3 required elective courses from the following:
CISC7021 Applied Natural Language Processing
3 credits
This course covers both fundamental and advanced topics in Natural Language Processing (NLP), which deals with the application of computational models to text data. The course examines core NLP tasks and techniques, including minimum edit distance, language modelling, Naive Bayes, Maximum Entropy, text classification, sequence labelling, POS tagging, syntax parsing and computational lexical semantics. Modern NLP applications, such as information retrieval and statistical machine translation, are also explored. Students will learn how to formulate and investigate research questions on related topics.
ENGL7022 Corpus Linguistics
3 credits
This course introduces students to the practices and issues involved in corpus linguistics. Its aims are to demonstrate the practical use of online corpora to explore questions asked by scholars of language and literature, and to address some of the main theoretical issues raised by corpus design and analysis. Students familiarize themselves with some of the available online corpora and practise using concordancers and search tools that give information about frequencies, collocations, colligations, etc. The classes cover language analysis at the levels of lexis, grammar and discourse, as well as applications such as lexicography and language teaching. As the course progresses, students consider questions of representation, size, transcription and tagging in the building of a customized corpus.
AHGC7039 Computer-aided Translation
3 credits
This course reviews state-of-the-art technology in translation. Alongside Translation Memory, which allows users to leverage existing data, deep learning technologies have enabled Machine Translation to produce fully automatic translations for some genres of writing. The course includes an extensive practical component using these technologies and explores the assessment of Machine Translation quality, emerging possibilities and issues. The use of the World Wide Web, Web-based translation aids and other applications for producing rapid, localized and high-quality translation is also discussed.
AHGC7303 Evaluation and Assessment of Language Use
3 credits
This course is an introduction to the field of language assessment. Topics covered include the purpose and context of testing, assessing skills and components of language ability, content analysis of test instruments, statistical analysis of test performance data, and fairness and justice in language assessment.
CHLL7101 Methods in Chinese Linguistics
3 credits
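This course systematically introduces research approaches and methods in linguistics, together with a number of important linguistic theories, and guides students in using scientific tools and methods to conduct systematic synchronic or diachronic research on the Chinese language. Content includes general and specialised methods of linguistic research, instrumental concepts frequently used in linguistic research, the analytical techniques adopted by the various schools of linguistics, and linguistic theories developed by scholars in China.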
AHGC7308 Special Topics in Applied Linguistics I
3 credits
This course is designed to give visiting scholars and existing or future staff the opportunity to offer courses in their particular area of specialisation. The topic and content of the course will vary from year to year depending on the availability of expert staff. Examples of specialised topics that may be offered include data-driven learning, computer-adaptive language testing, automated scoring, and corpus-based EAP (English for academic purposes).
PHIL7101 Topics in Semantics
3 credits
This course is an introduction to the formal analysis of natural language semantics as situated within the larger domain of linguistics. Students will be introduced to the main goals and methods of formal semantics, with a special focus on some of its core empirical results, and to some mathematical concepts underlying semantic interpretation systems (sets, relations, functions). Topics include: predication, the semantics of quantificational determiners, indexicals, semantic binding, tense, and modality.