Hybrid-based Approach to Handle Irregular Verb-Subject Agreements in English-Arabic Machine Translation  
Mohammed Abu Shquier1
*1, University of Tabuk, Email :
Abstract. Arabic is a highly inflectional language with a rich morphology, relatively free word order, and two types of sentences: nominal and verbal. Arabic natural language processing in general is still underdeveloped, and Arabic natural language generation is even less developed [32]. Word order plays an important role in translation between languages. This paper presents work in progress that examines the implications of using verb-subject-object (VSO) and subject-verb-object (SVO) word order when dealing with the agreement requirements of irregular verbs in machine translation (MT). Several distinguishing features of Arabic that are pertinent to MT are explored in detail, with reference to the difficulties they may present. Irregular verbs can be defined as verbs that deviate from the basic patterns in all or some cases [31]; the definition covers doubled, hamzated, and weak verbs. There are four categories of weak verbs, depending on the position of the weak letter (vowel) in the root: first letter, middle letter, last letter, or more than one letter. The paper presents a formalism for selecting the appropriate word order, based on rules and examples that encode part of the morphological knowledge of Arabic concerning irregular verbs and their derivatives. We first perform a thorough study of the irregular verbs of Arabic and propose a model based on set theory and ontologies. We then show how this model can be used in applications, including NLP applications.
Approach: The main objective of this research is to strengthen a hybrid-based MT system (EA-HBMT) in order to improve the quality of MT from English to Arabic. The Arabic lexicon is supported by a strong theoretical framework and implemented using robust tools. Rules are used to recognise the derivational and inflectional nature of the Arabic language. Transfer-based MT is used to obtain an intermediate representation that captures the “meaning” of the original sentence in order to generate the correct translation. An example-based technique is also used to handle the irregular cases. Semantic processing is conducted mainly to detect the statements that require the SVO construction rather than VSO.
Results: We built a module to detect irregular verbs, i.e., doubled, hamzated, mithal, hollow, defective, and enfolding verbs. A set of 30 rules has been formulated based on the tense of the verb; the position of the vowel letter in the root; first-, second-, or third-person representation; number and gender features; and the diacritic preceding the vowel letter, i.e., nominative, accusative, or genitive case. The proposed module was evaluated on real test data and achieved satisfactory results.
Keywords: agreement; irregular verb; hamzated verb; doubled verb; hollow verb; defective verb; EA-HBMT
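
As an illustration of the irregular-verb categories named in the abstract, the following Python sketch labels a three-letter root by the position of its weak, hamzated, or repeated radicals. It is not the EA-HBMT detection module and does not reproduce the 30 agreement rules. The function name classify_root, the two letter sets, and the restriction to undiacritized triliteral roots are assumptions made only for this example.

```python
# Illustrative sketch only: position-based labelling of a triliteral root.
# The letter sets below are simplifying assumptions, not the paper's lexicon.

WEAK_LETTERS = {"و", "ي"}                         # waw and ya
HAMZA_FORMS = {"ء", "أ", "إ", "ؤ", "ئ", "آ"}      # hamza and its seats


def classify_root(root):
    """Return the set of irregularity labels for an undiacritized 3-letter root."""
    if len(root) != 3:
        raise ValueError("this sketch handles triliteral roots only")

    labels = set()

    # Doubled: the second and third radicals are identical.
    if root[1] == root[2]:
        labels.add("doubled")

    # Hamzated: any radical is a hamza.
    if any(radical in HAMZA_FORMS for radical in root):
        labels.add("hamzated")

    # Weak verbs: the category depends on where the weak letter sits.
    weak_positions = [i for i, radical in enumerate(root) if radical in WEAK_LETTERS]
    if len(weak_positions) >= 2:
        labels.add("enfolding")      # more than one weak radical
    elif weak_positions == [0]:
        labels.add("mithal")         # weak first radical
    elif weak_positions == [1]:
        labels.add("hollow")         # weak middle radical
    elif weak_positions == [2]:
        labels.add("defective")      # weak last radical

    if not labels:
        labels.add("regular")
    return labels
```

Under these assumptions, a root with a weak middle radical such as قول is labelled hollow, and a root with two weak radicals such as وقي is labelled enfolding. A root can receive more than one label, for example when it is both doubled and weak, which is why the function returns a set rather than a single category.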

