I work as the lead NLP/ML scientist at Contexta360. I earned my PhD at the University of Amsterdam as a member of the Language Technology Lab, where I worked on Statistical Machine Translation under the supervision of Dr. Christof Monz and Prof. Maarten de Rijke. Before starting my PhD, I worked as a research assistant in the Natural Language and Text Processing Laboratory at the University of Tehran, where we developed the Faraazin machine translation system.
Earlier approaches indirectly studied the information captured by the hidden states of recurrent and non-recurrent neural machine translation models by feeding them into different classifiers. In this paper, we look at the encoder hidden states of both Transformer and recurrent machine translation models from the nearest-neighbors perspective. We investigate to what extent the nearest neighbors share information with the underlying word embeddings, as well as with related WordNet entries. Additionally, we study the underlying syntactic structure of the nearest neighbors to shed light on the role of syntactic similarities in bringing the neighbors together.
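The nearest-neighbors analysis mentioned above can be sketched as a simple cosine-similarity lookup over encoder states. This is only an illustration with toy vectors; the function name and data are hypothetical, and in practice the states would come from a trained NMT encoder.

```python
import numpy as np

def nearest_neighbors(states, words, query_idx, k=3):
    """Return the k words whose encoder states are closest (by cosine
    similarity) to the state of the query word."""
    q = states[query_idx]
    # Cosine similarity between the query state and every state.
    sims = states @ q / (np.linalg.norm(states, axis=1) * np.linalg.norm(q) + 1e-9)
    sims[query_idx] = -np.inf  # exclude the query token itself
    order = np.argsort(-sims)[:k]  # indices of the k most similar states
    return [words[i] for i in order]

# Toy example: random stand-ins for real encoder hidden states.
rng = np.random.default_rng(0)
states = rng.normal(size=(5, 8))
words = ["cat", "dog", "runs", "walks", "tree"]
print(nearest_neighbors(states, words, query_idx=0))
```

The retrieved neighbor lists can then be compared against embedding-space neighbors or WordNet relations, as in the abstract above.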
Attention in neural machine translation makes it possible to encode the relevant parts of the source sentence at each translation step. As a result, attention is often treated as an alignment model as well. However, no prior work specifically studies attention and analyzes what attention models learn, so the question remains how attention is similar to, or different from, traditional alignment. In this paper, we provide a detailed analysis of attention and compare it to traditional alignment. We examine whether attention only models translational equivalence or captures more information, and we show that attention differs from alignment in some cases and captures useful information beyond alignments.
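One simple way to compare attention against a traditional alignment is to take the most-attended source position for each target word and check whether it appears among the gold alignment points. This is a minimal illustrative sketch, not the paper's actual measure; the helper name and data are hypothetical.

```python
import numpy as np

def attention_alignment_match(attn, gold):
    """Fraction of target positions whose highest-attention source position
    appears among the gold-aligned source positions (illustrative metric)."""
    hard = attn.argmax(axis=1)  # most-attended source index per target position
    hits = [hard[j] in gold.get(j, set()) for j in range(attn.shape[0])]
    return float(np.mean(hits))

# Rows: target positions; columns: source positions (toy attention matrix).
attn = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.8, 0.1]])
gold = {0: {0}, 1: {2}}  # gold word alignment: target j -> set of source i
print(attention_alignment_match(attn, gold))  # only target position 0 agrees
```

Cases where this agreement is low are exactly where attention may be capturing information other than translational equivalence.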
We propose two models that use shorter sub-phrase pairs of an original phrase pair to smooth phrase reordering distributions. In the first model, we follow the classic idea of backing off to shorter histories, commonly used in language model smoothing. In the second model, we use syntactic dependencies to identify the most relevant words in a phrase to back off to. We show how these models can easily be applied to existing lexicalized and hierarchical reordering models. The results show that not all words inside a phrase pair are equally important in defining phrase reordering behavior, and that shortening towards the important words reduces the sparsity problem for long phrase pairs.
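The back-off idea can be sketched as follows: estimate the reordering distribution from the full phrase when it is frequent enough, otherwise fall back to a shorter sub-phrase. This is a simplified sketch under my own assumptions; here the back-off simply drops the leftmost word, whereas the paper's second model uses syntactic dependencies to keep the most relevant words, and the real models interpolate rather than hard-switch.

```python
from collections import Counter, defaultdict

def backoff_reordering_prob(counts, phrase, orientation, min_count=5):
    """Estimate p(orientation | phrase), backing off to shorter sub-phrases
    when the observed counts are too sparse (illustrative sketch)."""
    words = phrase.split()
    while words:
        key = " ".join(words)
        total = sum(counts[key].values())
        if total >= min_count:
            return counts[key][orientation] / total
        words = words[1:]  # back off to a shorter sub-phrase
    return 1.0 / 3  # uniform over {monotone, swap, discontinuous}

# Toy counts: the full phrase is unseen, but its sub-phrase is frequent.
counts = defaultdict(Counter)
counts["big house"].update({"monotone": 6, "swap": 2})
print(backoff_reordering_prob(counts, "the big house", "monotone"))  # backs off to "big house"
```

For long phrase pairs, most full phrases are rare, so the estimate almost always comes from a shorter, better-attested sub-phrase, which is precisely the sparsity reduction described above.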
In this work, we propose a fully automated approach for constructing a Persian WordNet. Our acquired WordNet has a precision of 90.46%, a considerable improvement over previous automatically built Persian WordNets. Just send me an email if you want the WordNet.
Formerly, I was a member of the machine translation development team in the Natural Language and Text Processing Laboratory at the University of Tehran. There, we developed a hybrid machine translation system that combines transfer-based models with statistical approaches.