Getting The Data Right
BUILDING A COMPREHENSIVE DATA HUB
Imagine having a centralized gateway that provides access to Uzbek language data from across the globe. This is the overarching objective of the "Getting the Data Right" project, and achieving this goal is within our grasp.
Head of Centre
Doctor of Sciences (DSc), Associate Professor. Tel: +998914040620. E-mail: shaxlo.xamrayeva@navoiy-uni.uz
PhD, Associate Professor. Tel: +998998789451. E-mail: elov@navoiy-uni.uz
Under project IL-402104209, which aims to develop a software tool for automatic processing of information retrieval systems (Google, Yandex, Google Translate) for the Uzbek language, an "Uzbek Language Morphological Analyzer" has been developed. The initial version of this analyzer was created in 2022 and is accessible at uznatcorpara.uz.
The goal of project IL-402104209 is to create a software tool for automatic processing of information retrieval systems (Google, Yandex, Google Translate) for the Uzbek language. This work involves developing a morpholexicon and a morphological analyzer for the Uzbek language. The project also aims to report its results in scientific articles published in national and international journals and to publish a printed version of the morphological dictionary.
This project was selected in the "Women's Grants for Science" competition in 2021 and is funded by the Ministry of Higher Education, Science, and Innovation of the Republic of Uzbekistan. It is planned to run from 2022 to 2024 with a budget of 1,200,000,000 Uzbekistani soms.
In brief, the project creates an automatic processing tool for information retrieval systems (Google, Yandex, Google Translate) for the Uzbek language, with a focus on improving the quality of machine translation and enabling automatic morphological and semantic analysis of corpus units.
The expected outcomes of the project include the development of a morphological database for the Uzbek language, the creation of automatic processing tools, and the improvement of machine translation quality. The project also aims to contribute to the development of natural language processing technologies for the Uzbek language and enhance the quality of information processing in systems like Google, Yandex, and Google Translate.
Project Leader: Dr. Shahlo Mirdjonovna Hamroyeva, Doctor of Philology (DSc), Associate Professor. She has published 69 scientific articles in Scopus-indexed journals in the past few years.
The project has received three patents for the creation of national language corpora.
Published Monographs:
"Linguistic Support for the Uzbek Language Morphological Analyzer" - Tashkent: Globe Edit, 2020. ISBN: 978-620-0-61728-6
"Fundamentals of Creating a Linguistic Database for the Morphological Analyzer of the Uzbek Language" - LAP LAMBERT Academic Publishing, 2021. ISBN: 978-620-3-19504-0
"Linguistic Foundations of Creating the Uzbek Language Authorship Corpus" - GlobeEdit, February 7, 2020. 260 pages. ISBN-10: 6200515077; ISBN-13: 978-6200515070
About the Implementing Organization: Alisher Navo’i Tashkent State University of Uzbek Language and Literature was established in 2016. In 2018, it signed an agreement to merge with the Institute of Uzbek Language, Literature, and Folklore of the Academy of Sciences of Uzbekistan, forming a joint institute. An academic lyceum and an experimental testing laboratory were also established as part of the university's expansion.
PARTNER UNIVERSITY
Istanbul Technical University, one of the oldest technical universities in the world, was established under the name “Mühendishane-i Bahr-i Hümayun” by Sultan Mustafa the Third. The first technical university in Turkey, İTÜ is closely identified with engineering and architecture education. İTÜ pioneered innovation movements during the Ottoman Empire period and left its mark on the development, modernization, and management of the country during the Republic period. Its contributions reach every part of Turkey, from cities to villages: roads and bridges, factories and dams, communication networks and power plants. In addition to contributing its brainpower to the country's development for more than two centuries, İTÜ has left its mark in many areas by educating countless scientists, businesspeople, politicians, and bureaucrats.
Here are some key members of the team:
Dr. Botir Boltayevich Elov (PhD), Head of the Department of Computer Linguistics and Digital Technologies at Tashkent State University of Uzbek Language and Literature. Contact: ebb@mail.ru.
Dr. Shahlo Mirdjonovna Xamrayeva (DSc), Associate Professor at the Department of Computer Linguistics and Digital Technologies at Tashkent State University of Uzbek Language and Literature.
Dr. Ruhillo Habibovich Alayev (PhD), Senior Lecturer at the Department of Information Security at Uzbekistan National University.
Dr. Oqila Xolmoʻminovna Abdullayeva (PhD), Associate Professor and Senior Lecturer at the Department of Computer Linguistics and Digital Technologies at Tashkent State University of Uzbek Language and Literature. Contact: abdullayevaoqila@gmail.com.
Zilola Yuldashevna Xusainova, Senior Lecturer at the Department of Computer Linguistics and Digital Technologies at Tashkent State University of Uzbek Language and Literature.
Malika Odil qizi Suyunova, a second-year master's student specializing in Computer Linguistics at Tashkent State University of Uzbek Language and Literature. Contact: malikasuyunova0@gmail.com.
Master's students specializing in Computer Linguistics and students of the Computer Linguistics program.
These individuals and their expertise were instrumental in the successful implementation of the project related to computer linguistics and digital technologies for the Uzbek language.
EVENTS
In April 2023, an innovation project titled "Development of Software Tools for Automatic Processing of Information Retrieval Systems (Google, Yandex, Google Translate) for the Uzbek Language: Morpholexicon and Morphological Analyzer" was implemented at Azerbaijan State Pedagogical University in the field of computer linguistics.
The project achieved the following results:
Study and analysis of teaching methods for the Azerbaijani language, as well as comparative grammar of Turkic languages, resulting in the development and analysis of several textbooks.
Collaboration with Uzbekistan on various projects, including joint conference agendas and discussions on important issues.
Development of regulations related to scientific and cultural cooperation between Uzbekistan and Azerbaijan.
In May 2023, the 3rd Annual Republican Scientific-Practical Conference on "Theoretical and Practical Issues of the Uzbek Language National and Educational Corpus" was held, with the following outcomes:
Tasks were set for the creation of the Uzbek language national corpus, which includes expanding the dictionary base, creating 15 linguistic, domain-terminological, and explanatory dictionaries, establishing specialized departments in higher education institutions for philology majors, and gathering all scientific, theoretical, and practical information related to the Uzbek language in an electronic format.
Challenges in creating a formal grammar of the Uzbek language, using corpora in language teaching, and automatic natural language processing were actively discussed.
In May 2023, a scientific and educational trip was made to St. Petersburg State University to develop expertise in corpus linguistics, with a focus on implementing the innovation project mentioned earlier.
In September 2023, participation in the 8th International Conference on Computer Science and Engineering (UBMK2023) led to the following results:
Discussion of issues related to automatic processing of the Uzbek language, including problems with POS tagging, automatic disambiguation of homographs in text, development of parallel corpora, and the use of hidden Markov models (HMMs) for word tagging.
Discussion of mutual cooperation with Dr. Himmet Buke, an associate professor in the Department of Turkish Language and Literature at Mehmet Akif Ersoy University in Turkey.