I'm an interdisciplinary researcher with a wide range of interests and a strong background in Computational Linguistics and Natural Language Processing. I'm currently the PI of a Marie Skłodowska-Curie Action, funded by the European Commission and hosted at Palacký University Olomouc in the Czech Republic. This project aims to improve the tasks of Parsing and Grammatical Error Detection (GED) for Mandarin Chinese using computational grammars (see more).
I recently completed my PhD in the Interdisciplinary Graduate Programme at Nanyang Technological University (NTU), Singapore. My thesis, titled Using Rich Models of Language in Grammatical Error Detection, looked at exploiting computational grammars to improve GED for English and Mandarin Chinese. To this end, I created a new kind of learner corpus (Learner Treebanks), in which error labels are placed deep within the syntactic structure instead of at the surface level of the sentence.
Before starting my PhD, I was a Research Associate at the Computational Linguistics Lab, also at NTU, Singapore, under the supervision of Francis Bond. During this time, I worked on tasks such as Parsing, Word Sense Disambiguation and Sentiment Analysis. I also acquired a lot of experience in building language resources (corpora, wordnets, treebanks, etc.). In particular, I became a main contributor to the Open Multilingual Wordnet, a project pushing an open data agenda for language resources, especially wordnets.
I'm a member of DELPH-IN, sharing its communal commitment to the open-source development of NLP tools for high-quality (linguistically motivated) syntactic and semantic parsing. I'm also a member of the Global Wordnet Association, contributing to open-source research on computational lexical semantics.
I have a broad range of research interests, including Parsing and Generation, Computational Lexicography, Computer Assisted Language Learning, Word Sense Disambiguation, Sentiment Analysis, Machine Translation, as well as general Mandarin Chinese and Japanese Linguistics.
The main focus of my research is to model diverse aspects of linguistic knowledge so that they can be applied to different AI tasks (e.g. Sentiment Analysis, Word Sense Disambiguation, Machine Translation, Computer Assisted Language Learning, etc.). I work mainly with English and Mandarin Chinese, but I've also worked with other languages such as Japanese, Portuguese, Kristang, Cantonese, Coptic, Indonesian and Abui.
For more detail about my work, you can click here to see a list of some of my current and previous projects.