Computational Linguistics Group

Shuly Wintner, Director of Laboratory

 

 

The Computational Linguistics Laboratory is involved in research and teaching in diverse areas of computational linguistics and natural language processing. Of primary interest are investigations related to Hebrew, as well as to other Semitic languages, notably Arabic.


The laboratory is headed by Dr. Shuly Wintner and includes postdoctoral researchers Nurit Melnik and Yuval Krymolowski; Ph.D. student Yael Sygal; and M.Sc. students Ezra Daya, Daniel Feinstein, Amit Kirschenbaum, Danny Shacham and Shlomo Yona.

 

Research Pursued by this Laboratory Includes

Hebrew to English Machine Translation
Hebrew Morphological Disambiguation
Computational Grammar of Inverted Constructions in Modern Hebrew
WordNet for Hebrew

Hebrew to English Machine Translation

  Researchers in Haifa: Yuval Krymolowski and Shuly Wintner. This project is a collaboration with a team at the Language Technologies Institute, Carnegie Mellon University, headed by Alon Lavie.
  This project developed a preliminary Hebrew-to-English Machine Translation (MT) system under a transfer-based framework specifically designed for rapid MT prototyping for languages with limited linguistic resources. The task is particularly challenging for two main reasons: the high lexical and morphological ambiguity of Hebrew and the dearth of available resources for the language. The system uses existing, publicly available resources and adapts them in novel ways to support the MT task. It is built from two separate modules: a transfer engine, which produces a lattice of possible translation segments, and a decoder, which searches the lattice and selects the most likely translation according to an English language model. A set of manually crafted transfer rules is used to improve the translations. Performance is evaluated using state-of-the-art measures.
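
  As an illustration of the decoding step only, the following Python sketch searches a small lattice of translation segments for the path that scores best under a bigram English language model. It is not the project's decoder: the names `decode` and `toy_logprob`, the lattice edges and the log-probabilities are all invented for the example, and the sketch assumes that every edge moves forward in the source sentence.

```python
from collections import defaultdict

# Hypothetical bigram log-probabilities; a real system would estimate
# these from a large English corpus.
TOY_BIGRAMS = {
    ("<s>", "the"): -0.5,
    ("the", "boy"): -1.0,
    ("the", "child"): -2.0,
    ("boy", "ate"): -1.5,
    ("child", "ate"): -1.5,
    ("ate", "an"): -1.0,
    ("an", "apple"): -0.5,
}


def toy_logprob(prev, word):
    """Log-probability of `word` after `prev`, with a flat penalty for unseen pairs."""
    return TOY_BIGRAMS.get((prev, word), -10.0)


def decode(edges, n, logprob):
    """Return the best-scoring English word list covering source positions 0..n.

    `edges` holds (start, end, words) triples: `words` is one candidate
    translation of the source span [start, end), and end > start is assumed.
    """
    by_start = defaultdict(list)
    for start, end, words in edges:
        by_start[start].append((end, words))

    # chart[pos][last_word] = (score, words so far). With a bigram model the
    # last word is all the history the search needs to keep apart.
    chart = [dict() for _ in range(n + 1)]
    chart[0]["<s>"] = (0.0, [])
    for pos in range(n):
        for last, (score, so_far) in list(chart[pos].items()):
            for end, words in by_start[pos]:
                new_score, prev = score, last
                for word in words:
                    new_score += logprob(prev, word)
                    prev = word
                if prev not in chart[end] or new_score > chart[end][prev][0]:
                    chart[end][prev] = (new_score, so_far + list(words))

    if not chart[n]:
        return None  # no path covers the whole sentence
    return max(chart[n].values(), key=lambda entry: entry[0])[1]


# A made-up lattice for a three-token source sentence.
TOY_LATTICE = [
    (0, 1, ("the", "boy")),
    (0, 1, ("the", "child")),
    (0, 2, ("the", "boy", "ate")),
    (1, 2, ("ate",)),
    (1, 2, ("food",)),
    (2, 3, ("an", "apple")),
]

best = decode(TOY_LATTICE, 3, toy_logprob)
```

  Keeping one chart entry per position and last word makes the search exact for a bigram model; a higher-order language model would need a longer history in the state.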


Hebrew Morphological Disambiguation

  Researchers: Danny Shacham and Shuly Wintner
  Morphological analysis is a crucial stage in a variety of natural language processing applications. For languages with complex morphology, even shallow applications such as search engines, information retrieval or question answering, let alone heavier applications such as machine translation, require morphological analysis and disambiguation as a first step. The lack of a morphological disambiguation module for languages such as Hebrew or Arabic handicaps the performance of many other applications. The goal of this project is to develop a morphological disambiguation module that can rank the analyses produced by a state-of-the-art morphological analyzer.
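
  To make the ranking task concrete, here is a minimal Python sketch of a context-free baseline that orders the candidate analyses of one word by how often each lemma was seen in annotated text. It is an assumption-laden illustration, not the module described above: the function `rank_analyses`, the dictionary layout of an analysis and the counts are all hypothetical. The example word is the unvocalized form spr, which can be read as sefer (book), safar (counted) or sapar (barber), among others.

```python
from collections import Counter


def rank_analyses(analyses, lemma_counts):
    """Sort candidate analyses from the most to the least frequent lemma.

    `analyses` is a list of dicts as a morphological analyzer might return
    them (hypothetical layout); `lemma_counts` is a Counter of lemma
    frequencies, so unseen lemmas count as zero.
    """
    return sorted(analyses, key=lambda a: lemma_counts[a["lemma"]], reverse=True)


# Candidate analyses for the unvocalized form "spr" (illustrative subset).
CANDIDATES = [
    {"lemma": "sapar", "pos": "noun", "gloss": "barber"},
    {"lemma": "safar", "pos": "verb", "gloss": "counted"},
    {"lemma": "sefer", "pos": "noun", "gloss": "book"},
]

# Invented frequencies standing in for counts from an annotated corpus.
TOY_COUNTS = Counter({"sefer": 50, "safar": 8, "sapar": 2})

ranked = rank_analyses(CANDIDATES, TOY_COUNTS)
ranked_glosses = [analysis["gloss"] for analysis in ranked]
```

  A baseline like this ignores the surrounding words entirely; a practical disambiguator would also have to use context to choose between readings.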


Computational Grammar of Inverted Constructions in Modern Hebrew

 

Researcher: Nurit Melnik

 

Verb-initial constructions are those in which the verb appears in clause-initial position and is followed by the subject. Under the assumption that the default word order in Modern Hebrew is subject-verb-object, this type of construction is considered inverted. A formal analysis of verb-initial constructions in the framework of Head-Driven Phrase Structure Grammar (HPSG) is presented. An important feature of HPSG, which distinguishes it from competing frameworks such as Chomsky's Government and Binding theory and its variants, is its underlying mathematical formalism. As a result, grammatical theories in HPSG can be implemented and consequently tested against “real” data. The main objective of this project is to develop a computational implementation of the grammar and to test it against corpus data.
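
One reason HPSG analyses lend themselves to implementation is that they are stated over feature structures, which are combined by unification. The Python sketch below shows a deliberately simplified version of that operation, using nested dictionaries with no types and no structure sharing, both of which a real HPSG system would need. The function `unify` and the feature names (HEAD, AGR, PER, NUM, GEN) are chosen for illustration and are not taken from the grammar described here.

```python
def unify(a, b):
    """Unify two feature structures; return None when they conflict.

    Feature structures are nested dicts whose leaves are atomic strings.
    This simplified version has no types and no reentrancy.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # the two structures disagree on this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values unify only if they are identical.
    return a if a == b else None


# What a (hypothetical) verb requires of its subject.
SUBJECT_REQUIREMENT = {"HEAD": "noun", "AGR": {"PER": "3", "NUM": "sg"}}

# Two candidate subjects: one compatible, one with clashing number.
SINGULAR_SUBJECT = {"HEAD": "noun", "AGR": {"NUM": "sg", "GEN": "fem"}}
PLURAL_SUBJECT = {"HEAD": "noun", "AGR": {"NUM": "pl", "GEN": "fem"}}

agreeing = unify(SUBJECT_REQUIREMENT, SINGULAR_SUBJECT)
clashing = unify(SUBJECT_REQUIREMENT, PLURAL_SUBJECT)
```

Here `agreeing` combines the information from both structures, while `clashing` is `None` because the number values conflict. This is the sense in which a formalized grammar can be tested mechanically against data.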


WordNet for Hebrew

 

Researchers in Haifa: Danny Shacham, Noam Ordan, Iris Eyal and Shuly Wintner (and previously, Margalit Zabludowski).

 

This project is jointly conducted with a team at the TCC Division at the Bruno Kessler Foundation (formerly ITC-irst) in Trento, Italy. WordNet is an online lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets. Following the success of the English WordNet project, similar networks have been developed for a variety of languages. In particular, researchers at the Bruno Kessler Foundation have developed a methodology for parallel development of multilingual WordNets. The system, called MultiWordNet, contains information on several aspects of multilingual dictionaries, including lexical relationships between words, semantic relations over lexical concepts and several mappings of lexical concepts in different languages.


The goal of this project is to use the MultiWordNet methodology to construct a Hebrew WordNet similar to the one developed at the Bruno Kessler Foundation, so that it will be aligned with English, Italian and Spanish.
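
The organization into synonym sets, relations and cross-language mappings described above can be pictured with a small Python sketch. It is a toy model under stated assumptions, not the MultiWordNet data format: the `Synset` class, the identifiers `n-dog` and `n-animal`, the helper functions and the handful of transliterated Hebrew entries are all invented for illustration. The only idea it takes from the text is that one lexical concept can carry words from several languages and can be linked to other concepts.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """One lexical concept with its words in several languages."""
    synset_id: str
    lemmas: dict  # language code -> list of words for this concept
    hypernyms: list = field(default_factory=list)  # ids of more general concepts


def translate(word, source, target, synsets):
    """Collect target-language words from every synset containing `word`."""
    found = []
    for synset in synsets.values():
        if word in synset.lemmas.get(source, []):
            for lemma in synset.lemmas.get(target, []):
                if lemma not in found:
                    found.append(lemma)
    return found


def hypernym_lemmas(synset_id, language, synsets):
    """Words, in `language`, of the concepts directly above `synset_id`."""
    lemmas = []
    for parent_id in synsets[synset_id].hypernyms:
        lemmas.extend(synsets[parent_id].lemmas.get(language, []))
    return lemmas


# A two-concept toy network (hypothetical identifiers and entries).
SYNSETS = {
    "n-dog": Synset(
        "n-dog",
        {"en": ["dog"], "it": ["cane"], "he": ["kelev"]},
        hypernyms=["n-animal"],
    ),
    "n-animal": Synset(
        "n-animal",
        {"en": ["animal"], "it": ["animale"], "he": ["chaya"]},
    ),
}

hebrew_to_english = translate("kelev", "he", "en", SYNSETS)
hebrew_parent = hypernym_lemmas("n-dog", "he", SYNSETS)
```

Because the languages share concept identifiers in this sketch, a relation stored once (here, dog is a kind of animal) is available in every language that has words for both concepts.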
