The International Conference on Artificial Intelligence and Applied Mathematics in Engineering, Warsaw, Poland, 25-27 September 2024, pp. 70-86, (Full Text Paper)
Machine translation refers to the automatic translation of text from one language into another by computers. Since the translation is carried out between two languages, translation quality is usually closely tied to the size of the bilingual data that is used. In this work, neural machine translation was investigated for the low-resource Turkish-English language pair. The aim is to use grammatical features, in addition to the sentences themselves, in the translation process. To this end, part-of-speech (POS) tags of the Turkish sentences are added to the translation system in order to improve translation quality. The Zemberek, BERTTurk, multilingual BERT, RoBERTa and DistilBERT models are used for part-of-speech tagging.
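As an illustration of the tagging step, POS tags for Turkish source sentences could be produced with a BERT-style token classifier. The snippet below is a minimal sketch using the Hugging Face transformers token-classification pipeline; the model name is a placeholder assumption, not the exact checkpoint used in the paper, and Zemberek, being a Java morphology toolkit, would instead be queried through its own morphological analysis API.

```python
# Minimal sketch: POS tagging of a Turkish sentence with a BERT-style tagger.
# The model name below is a hypothetical placeholder, not the checkpoint from the paper.
from transformers import pipeline

tagger = pipeline(
    "token-classification",
    model="my-org/berturk-pos-tagger",   # hypothetical fine-tuned POS model
    aggregation_strategy="simple",       # merge word-piece predictions per word
)

sentence = "Kitabı dün akşam okudum."
predictions = tagger(sentence)

# Pair each surface word with its predicted POS tag, e.g. ("Kitabı", "NOUN").
tagged = [(p["word"], p["entity_group"]) for p in predictions]
print(tagged)
```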
In the translation model, the Transformer architecture was used in a multi-feature form. When the results are analyzed, the model built with Zemberek POS tags reaches BLEU 25.13, ChrF 52.41, and METEOR 58.22, compared with 24.15, 51.87, and 57.45 for the standard model without any POS tags. The part-of-speech tags obtained with Zemberek were thus observed to improve translation quality.
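One common way to realize such multi-feature input, offered here as a plausible reading of the setup rather than a confirmed description of the authors' implementation, is to embed each source token's POS tag separately and combine it with the token embedding before the Transformer encoder. The sketch below assumes summed embeddings and made-up vocabulary sizes, and omits positional encoding for brevity.

```python
# Minimal sketch of a multi-feature (token + POS) source encoder in PyTorch.
# Vocabulary sizes, dimensions, and the summation of embeddings are assumptions,
# not the exact configuration reported in the paper.
import torch
import torch.nn as nn

class FactoredEncoder(nn.Module):
    def __init__(self, vocab_size=32000, pos_size=20, d_model=512, nhead=8, layers=6):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)  # word/subword embeddings
        self.pos_emb = nn.Embedding(pos_size, d_model)     # POS-tag embeddings
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)

    def forward(self, token_ids, pos_ids):
        # Combine the two feature streams by summing their embeddings
        # (positional encoding omitted in this sketch).
        x = self.tok_emb(token_ids) + self.pos_emb(pos_ids)
        return self.encoder(x)

# Toy usage: a batch of 2 sentences, 5 tokens each, with aligned POS-tag ids.
tokens = torch.randint(0, 32000, (2, 5))
pos_tags = torch.randint(0, 20, (2, 5))
memory = FactoredEncoder()(tokens, pos_tags)
print(memory.shape)  # torch.Size([2, 5, 512])
```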
When the results of the translation models built with part-of-speech tags obtained from the BERT-based models are analyzed, it can be seen that they do not contribute to translation quality.
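For reference, the reported metrics can be computed with common open-source tools. The snippet below is a minimal sketch using sacrebleu for BLEU and ChrF and NLTK for METEOR, with toy hypothesis and reference sentences standing in for the actual test set.

```python
# Minimal sketch: scoring translations with BLEU, ChrF, and METEOR.
# The example sentences are toy data, not the paper's test set.
import sacrebleu
from nltk.translate.meteor_score import meteor_score  # requires nltk.download("wordnet")

hypotheses = ["the cat sat on the mat"]
references = ["the cat is sitting on the mat"]

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references])

# NLTK's METEOR works on tokenized sentences, one hypothesis at a time,
# and returns a value in [0, 1]; scale by 100 to match the 0-100 reporting convention.
meteor = meteor_score([references[0].split()], hypotheses[0].split())

print(f"BLEU: {bleu.score:.2f}  ChrF: {chrf.score:.2f}  METEOR: {100 * meteor:.2f}")
```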