IMPROVING UZBEK MACHINE TRANSLATION THROUGH PARALLEL CORPORA: CHALLENGES AND SOLUTIONS

Authors

  • Botirova Nilufar Salimjon kizi

https://doi.org/10.5281/zenodo.15590463

Keywords

Corpus, corpus linguistics, parallel corpus, translation corpus, comparable corpus, segmentation, machine translation

Abstract

This thesis examines the role of parallel corpora in modern translation studies, focusing on how they improve machine translation systems for the Uzbek language. Parallel corpora, which consist of texts in two or more languages aligned at the sentence or paragraph level, are essential for training neural machine translation systems. The thesis outlines the main challenges in creating high-quality parallel corpora, particularly for underrepresented languages such as Uzbek: limited available resources, contextual mismatches, segmentation and alignment errors, and copyright restrictions. It then discusses several solutions, including building open-access databases, leveraging existing machine translation systems, using modern alignment tools, and crowdsourcing. It also highlights the potential of parallel corpora to advance translation quality, support linguistic research, and promote global recognition of the Uzbek language. Ultimately, the thesis argues that parallel corpora are not only a scientific resource but also a technological tool that bridges the gap between human translators and machine translation systems.
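To make the notions of sentence-level alignment and alignment errors concrete, here is a minimal Python sketch. It is an illustration only, not the method used in the thesis: the `SentencePair` type, the `flag_suspect_pairs` helper, the `max_ratio` threshold of 3.0, and the toy Uzbek-English sentences are all assumptions introduced for this example. It stores a corpus as aligned sentence pairs and flags pairs whose character lengths differ so much that a misalignment is likely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentencePair:
    """One aligned unit of a parallel corpus (hypothetical structure)."""
    source: str  # Uzbek sentence
    target: str  # English sentence


def length_ratio(pair: SentencePair) -> float:
    """Ratio of the longer side to the shorter side, in characters."""
    src = len(pair.source.strip())
    tgt = len(pair.target.strip())
    if src == 0 or tgt == 0:
        # An empty side can never be a valid alignment.
        return float("inf")
    return max(src, tgt) / min(src, tgt)


def flag_suspect_pairs(pairs, max_ratio=3.0):
    """Return pairs whose length ratio exceeds max_ratio.

    max_ratio is an assumed, untuned threshold; a real pipeline would
    calibrate it on held-out data for the language pair.
    """
    return [p for p in pairs if length_ratio(p) > max_ratio]


# Toy data invented for illustration; the last pair is deliberately misaligned.
toy_corpus = [
    SentencePair("Men kitob o'qiyapman.", "I am reading a book."),
    SentencePair("Bugun havo issiq.", "The weather is hot today."),
    SentencePair("Rahmat.", "The conference will take place in Tashkent next month."),
]

if __name__ == "__main__":
    for pair in flag_suspect_pairs(toy_corpus):
        print("Possible misalignment:", pair.source, "|", pair.target)
```

A length-ratio check is only a coarse first filter. It catches gross alignment or segmentation errors but not pairs of similar length with different meanings, which is where embedding-based alignment tools would come in.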

About the author

Botirova Nilufar Salimjon kizi

PhD student, Uzbekistan State World Languages University

Published

2025-06-05

How to cite

Botirova Nilufar Salimjon kizi. (2025). IMPROVING UZBEK MACHINE TRANSLATION THROUGH PARALLEL CORPORA: CHALLENGES AND SOLUTIONS. ZAMONAVIY TILSHUNOSLIK ISTIQBOLLARI: MUAMMOLAR VA YUTUQLAR MAVZUSIDA XALQARO ILMIY AMALIY ANJUMAN, 1(6), 378–381. https://doi.org/10.5281/zenodo.15590463

Section

SECTION 3. DIGITAL TECHNOLOGIES AND ARTIFICIAL INTELLIGENCE IN LINGUISTICS
