DEMOCRATIZING CORPUS-BASED TRANSLATION STUDIES: DESIGN AND IMPLEMENTATION OF AN OPEN-SOURCE ANALYTICAL PLATFORM
https://doi.org/10.5281/zenodo.19905352
Keywords
corpus-based translation studies, parallel concordance, WebBootCaT, open-source software, computational linguistics, Sketch Engine, natural language processing, low-resource languages.
Abstract
This article describes the design, implementation, and empirical application of a custom-built, open-source corpus analysis platform tailored to Translation Studies. Although the theoretical case for Corpus-Based Translation Studies (CBTS) is well established, a methodological paradox persists: commercial platforms with advanced capabilities sit behind prohibitive paywalls, whereas open-source computational libraries require advanced programming expertise, which effectively excludes many linguists. The study introduces a web-based architecture that replicates and extends core commercial functionalities, including Parallel Concordance, Word Sketches, and an integrated WebBootCaT module for automated web corpus generation. An empirical case study on a 50,330-pair English-Uzbek parallel dataset demonstrates the platform's utility for comparing machine translation output with human reference texts and highlights its capacity to detect translationese and terminological inconsistencies.
