PROBLEMS OF TAGGING NAMED ENTITIES IN UZBEK-LANGUAGE TEXTS
https://doi.org/10.5281/zenodo.19873696
Keywords
named entity recognition, automatic tagging, Uzbek language, corpus linguistics, neural networks, natural language processing.
Abstract
This article analyzes the main problems that arise in the automatic tagging of named entities in Uzbek-language texts. In particular, it examines the linguistic and technical difficulties of identifying person names, place names, organizations, and other special entities and classifying them correctly. The agglutinative nature of Uzbek, its free word order, and its rich morphology create additional difficulties for named entity recognition. The article also discusses the advantages and limitations of rule-based, statistical, and neural approaches to named entity tagging. Limited resources, the shortage of annotated corpora, and the scarcity of standardized lexicons and ontologies are identified as the main problems in this area. The study substantiates the need to use hybrid approaches and to expand national linguistic resources in order to improve the effectiveness of named entity recognition and tagging for Uzbek.
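To illustrate why agglutinative morphology complicates named entity tagging, the following is a minimal sketch of a rule-based component: a gazetteer lookup that falls back to stripping one case suffix. It is not the method of this article. The gazetteer entries, the suffix subset, the `B-`/`O` tag labels, and the function names `candidate_stems` and `tag_tokens` are all assumptions chosen for illustration.

```python
import re

# Toy gazetteer (hypothetical entries); a real system would load a curated lexicon.
GAZETTEER = {"toshkent": "LOC", "samarqand": "LOC", "navoiy": "PER"}

# Illustrative subset of Uzbek case suffixes, longest first so "-ning" is tried before "-ni".
CASE_SUFFIXES = ("ning", "dan", "ni", "ga", "da")


def candidate_stems(token):
    """Yield the lowercased token, then each form with one case suffix removed."""
    word = token.lower()
    yield word
    for suffix in CASE_SUFFIXES:
        stem = word[: -len(suffix)]
        # Require a stem of at least three letters to limit over-stripping.
        if word.endswith(suffix) and len(stem) >= 3:
            yield stem


def tag_tokens(text):
    """Return (token, tag) pairs using gazetteer lookup with a suffix fallback."""
    tagged = []
    # Simplified tokenizer: word characters plus common apostrophe variants.
    for token in re.findall(r"[\w'ʻ’‘]+", text):
        label = next(
            (GAZETTEER[s] for s in candidate_stems(token) if s in GAZETTEER),
            None,
        )
        tagged.append((token, f"B-{label}" if label else "O"))
    return tagged


if __name__ == "__main__":
    # "I went from Tashkent to Samarkand."
    for token, tag in tag_tokens("Men Toshkentdan Samarqandga bordim"):
        print(token, tag)
```

An exact-match lookup alone would miss `Toshkentdan` and `Samarqandga`, because the ablative and dative suffixes are attached to the names. The sketch tries the surface form first so that names which merely end in a suffix-like string are not stripped needlessly. It does not handle stacked suffixes, multi-word entities, or ambiguity, which is where statistical or neural components would plausibly be combined with such rules in a hybrid system.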
