Listing 1 - 10 of 13 | << page >> |
The aim of this study is to contribute to one of the objectives of the Smart Computer-Aided Translation Environment (SCATE) project, namely the improvement of translation technology, and ultimately of translation efficiency, through better integration of translation memories (TMs) and machine translation (MT). The TM-MT integration method developed here works by constraining the MT output to include certain consistently aligned sub-segments stemming from one or more TM matches. The phrase-based statistical machine translation package Moses is used as the baseline MT system. Tests on three datasets involving translation from English into Dutch demonstrate the potential benefits of the approach, with significantly better BLEU, METEOR and TER scores reported for all datasets. To further improve performance, various suggestions for additions to the TM-MT integration system are considered.
Corpus cleaning is essential when conducting research, but the right method can be difficult to find because much depends on the goal of the corpus. This research aims to provide a method for identifying files in a corpus that contain no language. In addition, it attempts to break the concept of language down further into textual and non-textual language. To this end, definitions of these three concepts are provided so that they can be converted into (linguistic) features. These features are then used to train and test classification models, in this case logistic regression models and decision trees. The operationalisation of the definitions and features can, however, strongly influence the results of any classifier. The results indicate that linguistic parsing improves the scores, whereas balancing the class weights of the underrepresented labels mainly lowers them. Only in multi-class classification does balancing the data ensure better identification of the underrepresented classes. Both logistic regression and decision trees yielded strong results and are therefore useful for corpus cleaning, particularly the models based on unbalanced and parsed data.
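To illustrate what balancing the class weights of underrepresented labels can mean in practice, here is a minimal sketch of one common heuristic: each class is weighted by n_samples / (n_classes * class_count), so rarer classes count more during training. This is the formula scikit-learn applies for class_weight="balanced". The abstract does not say which toolkit or weighting scheme was used, so the function name, the label names and the toy label counts below are all assumptions made for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical label distribution for a small corpus (not from the study).
labels = ["textual"] * 6 + ["non_textual"] * 3 + ["non_language"] * 1
weights = balanced_class_weights(labels)

for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label}: {weight:.3f}")
```

With weights of this kind, every class contributes the same total weight to the training objective, which is one way an underrepresented class can become easier to identify at some cost to overall scores.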