\chapter{MT Battle: SMT Versus RBMT} % PROVIDE 5 CITATIONS, 8 TOTAL
To compare the two \textit{Machine Translation} (MT) models discussed in this paper, SMT (Statistical Machine Translation) and RBMT (Rule-Based Machine Translation), it is important to understand how each model is implemented, what kind of raw data it requires, and what kind of results to expect from it. It is also crucial to consider the amount of effort it would take to improve or scale each of the two models.
\section{Data Usage and Model Implementation}
The first main difference between SMT and RBMT is the type of data used in their implementations. The Moses SMT model that I implemented uses parallel corpora (pairs of translated sentences) from both languages as its primary input data. The RBMT model that I recently implemented in \textit{Python}, on the other hand, uses a Kinyarwanda--English dictionary containing mostly single words and a few ``short expressions''.
These differences in the kind of data used dictate how each model works and how it is implemented. As I explained in chapter one, SMT models translate a sentence by using statistical patterns derived from analyzing the parallel corpora provided as input\cite{Forcada2011}. Therefore, a good SMT model may correctly translate a sentence even if it has never seen it before\cite{Forcada2011}. In contrast, RBMT models work by translating each word in the provided sentence, which makes it impossible for an RBMT-based model to fully translate a sentence containing a word that is not in its translation dictionary. These and other advantages and limitations of RBMT models versus SMT models are discussed in the following section, which focuses mainly on the efficiency and the effectiveness of each model.
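To make the word-by-word mechanism concrete, the following is a minimal sketch of the lookup loop at the heart of such an RBMT model. The dictionary format (a plain Python dict) and the pass-through fallback for unknown words are illustrative assumptions, not the exact implementation used in this project.
\begin{verbatim}
def translate(sentence, dictionary):
    """Translate word by word using a bilingual dictionary."""
    output = []
    for word in sentence.lower().split():
        # A word missing from the dictionary cannot be translated;
        # here it is simply passed through unchanged.
        output.append(dictionary.get(word, word))
    return " ".join(output)

# Toy Kinyarwanda -> English dictionary, for illustration only.
kin_to_en = {"amazi": "water", "meza": "good"}
print(translate("Amazi meza", kin_to_en))  # -> "water good"
\end{verbatim}
Note that even with full dictionary coverage, the output above (``water good'') preserves the Kinyarwanda word order; producing fluent English would require additional reordering rules.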
\section{Efficiency and Effectiveness}
According to the \textit{Cambridge English Dictionary}, \textit{efficiency} is defined as ``the condition or fact of producing the results you want without waste''\cite{cambridgeefficiency}, and \textit{effectiveness} is defined as ``the ability to be successful and produce the intended results''\cite{cambridgeeffectiveness}. Comparing SMT and RBMT on \textit{efficiency} and \textit{effectiveness}, one may be tempted to conclude that SMT is the absolute winner. Looking more closely, however, one starts to realize that even though SMT is more efficient than RBMT, since there is little wasted effort in training an SMT model compared to an RBMT model\footnote{In the training and tuning process of an SMT model, the aim of each step is to improve the translation model. RBMT models, on the other hand, contain several repetitive, and sometimes conflicting, conditional rules that may result in reduced efficiency in many cases.}, it is not easy to judge the effectiveness of both models because of the definition of this word itself.
To get the full picture, it is crucial to define the ``intended results'' before comparing the effectiveness of these two models. In this case, the ``intended results'' can either be set based on the intended usage of each model, or adjusted to be roughly the same (and therefore comparable) for both models. As the latter case is simpler, and therefore easier to analyze, let us assume that both models have analogous ``intended results''; that is, we assume we have similar (comparable) standards for evaluating the results of both models. If the intended goal of the translation model is to return a sentence that makes sense in the target language, SMT would be the absolute winner, as it takes into consideration the context in which words or groups of words are used. However, if all we cared about was the number of translated words in the output sentence, RBMT would likely win this round, because all an RBMT model does is look up a word or an expression in a translation dictionary and return its equivalent translation if it exists.
For example, if the dictionary contained all possible words, an RBMT model would translate every word, but that does not mean that the returned translation is correct: correctness also depends on the implementation of the translation rules and on other factors such as word alignment and the grammar of each language. As the two scenarios discussed above show, the effectiveness of SMT versus RBMT models depends not only on the kind (and size) of the input data and the model implementation, but also on the ``intended results''.
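The second criterion can be made precise with a simple word-coverage measure. The sketch below is one hypothetical way to compute it (the function name and dictionary format are my assumptions, not a standard metric): the fraction of source words that have a dictionary entry.
\begin{verbatim}
def word_coverage(sentence, dictionary):
    """Fraction of words in the sentence that the RBMT
    dictionary can translate."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    covered = sum(1 for w in words if w in dictionary)
    return covered / len(words)
\end{verbatim}
By this measure alone, an RBMT model with an exhaustive dictionary scores perfectly, regardless of how readable its output is.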
\section{Improvement and Scalability}
Now that I have talked about overall performance and model results, it is important to discuss how both models can be improved and developed for use on a larger scale. There are two obvious ways in which these two models can be improved: the first is improving the quality and increasing the quantity of the input data, and the second is enhancing the internal implementation of each model. In my experience, the easier and more effective way to improve an SMT model is to use more high-quality data in the training, tuning, and testing stages.
In contrast, improving an RBMT model requires revising and adjusting the existing code that handles the translation process and adding new use cases where necessary. This can be challenging for anyone who, like me, is not a linguistics expert. During this project, even though I was dealing with Kinyarwanda and English, two languages that I speak and write comfortably, I was surprised by the number of grammatical and lexical rules I did not know or understand well enough when I tried writing code to improve my English-to-Kinyarwanda RBMT model.
% cite 3 sources in below paragraph
In addition to that, one can imagine the complications caused by having to implement all the rules of an RBMT model manually. To that, add the cumulative burden of dealing with multiple languages: unlike SMT models, whose training pipeline can be re-used across language pairs, the translation rules in RBMT differ from one language pair to another\cite{Forcada2011}. For example, the rules used to translate English to Kinyarwanda would be very different from the rules for translating English to Chinese. Imagine writing thousands of lines of code for each language pair, then starting over again for another pair.
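The sketch below illustrates why this scales so poorly. The rule format (pattern/replacement pairs keyed by language pair) is a hypothetical simplification of my own; a real RBMT system encodes far richer linguistic structure, but the maintenance problem is the same: every pair needs its own rule set, and each new rule must be checked against every existing rule for that language.
\begin{verbatim}
import re

# Hypothetical rule tables: each language pair needs its own
# set, and rules within a set can conflict with one another.
RULES = {
    ("en", "rw"): [
        # Kinyarwanda adjectives follow the noun, so reorder
        # "ADJ NOUN" to "NOUN ADJ" (a toy rule for one phrase).
        (r"\bgood water\b", "water good"),
    ],
    ("en", "zh"): [
        # An entirely separate rule set must be written here.
    ],
}

def apply_rules(sentence, src, tgt):
    for pattern, replacement in RULES.get((src, tgt), []):
        sentence = re.sub(pattern, replacement, sentence)
    return sentence

print(apply_rules("good water", "en", "rw"))  # -> "water good"
\end{verbatim}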
A system like this is not only repetitive and inefficient, but also difficult to scale. For example, if you had an existing, working RBMT-based translation system that supported multiple languages, and all you wanted was to add a new rule to one of the languages, you would have to check all the other rules in all the pairs containing that language to make sure that none of them conflicts with the new rule; and if such rules existed, you would either have to modify or delete them. This is why translation services such as \textit{Google Translate} use SMT instead of RBMT\cite{koehn2003statistical}, and why many \textit{Machine Translation} scholars are spending more time on research related to statistical machine learning and its application to natural language translation\cite{koehn2007moses}.