Exploiting Morphology in Identifying Multiword Expressions from Multilingual Parallel Corpora

Daniel Hurwitz, M.Sc. Thesis Seminar
Wednesday, 25.1.2012, 12:30
Taub 601
Prof. A. Itai

Our research discusses multi-word expressions (MWEs) such as "kick the bucket", "hot dog", "by and large", "look up", and "spill the beans". Identification of MWEs has been an active subject of research in recent years. We present a method to identify MWEs in Hebrew using a new concept we term Language Isolation. We focus on dissecting (or "isolating") the morphological properties of words to discover potential MWEs, and we show that this method improves the alignment of multilingual (Spanish-English-Hebrew) parallel corpora. We then use our method to acquire Hebrew MWEs.

