
In 15 years, the Kyrgyz may begin speaking and writing in Kazakh, warned Kyrgyz Wikipedian Chorobek Saadanbekov, as reported by the Qaz365.kz news portal.
Sister languages
According to Saadanbekov, when users submit prompts in Kyrgyz to ChatGPT, the system sometimes inserts Kazakh words into its responses. He attributes this to ChatGPT's training data, which he says contains significantly more text and vocabulary in Kazakh than in Kyrgyz. As a result, when the system lacks the necessary data in Kyrgyz, it substitutes Kazakh words.
“If we don’t pay enough attention to developing the Kyrgyz language online, in 15 years, we may start speaking and writing in Kazakh, and we won’t even notice it,” he explained. “No matter how many language policies we draft, if ChatGPT continues to replace Kyrgyz data with Kazakh, those substitutions will gradually become entrenched.”
Kazakh and Kyrgyz are both Turkic languages of the Kipchak branch and share much of their grammar and vocabulary. Although they are distinct languages, they are highly similar at both the acoustic and linguistic levels. This close resemblance can sometimes cause large language models (LLMs), which power artificial intelligence tools like ChatGPT, to confuse the two.
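One rough way to spot the kind of substitution Saadanbekov describes is to look at spelling. The Kazakh Cyrillic alphabet contains six letters (ә, ғ, қ, ұ, һ, і) that the Kyrgyz Cyrillic alphabet does not use, so a word containing any of them is unlikely to be standard Kyrgyz. The Python sketch below is only an illustration of that idea, not a description of how ChatGPT or any real language-identification tool works; the function name `flag_kazakh_looking_words` is hypothetical.

```python
import re

# Letters found in the Kazakh Cyrillic alphabet but not in the Kyrgyz one:
# ә, ғ, қ, ұ, һ, і (written as escapes to avoid look-alike Latin characters).
KAZAKH_ONLY_LETTERS = set("\u04d9\u0493\u049b\u04b1\u04bb\u0456")


def flag_kazakh_looking_words(text: str) -> list[str]:
    """Return the words in `text` that contain at least one Kazakh-only letter.

    This is a spelling heuristic only: it misses Kazakh words written
    entirely with letters shared by both alphabets, and it will also
    flag Kazakh proper names quoted in otherwise correct Kyrgyz text.
    """
    words = re.findall(r"\w+", text.lower())
    return [word for word in words if KAZAKH_ONLY_LETTERS & set(word)]
```

Run on a chatbot's Kyrgyz-language answer, a non-empty result would suggest that some words deserve a second look from a native speaker. An empty result proves nothing, since many Kazakh and Kyrgyz words differ without using any of these six letters.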
High-resource vs. low-resource languages
Most AI models are trained predominantly on English data. As a result, their performance tends to be strongest in English and other high-resource languages, i.e., those spoken by large populations and widely used online. In contrast, users can expect more errors and inconsistencies when using AI tools in less-represented languages.
ChatGPT is an AI chatbot developed by OpenAI and built on large language models. It can understand human language, answer questions, generate text, translate, explain information and offer advice on a wide range of topics. It supports almost 100 languages, including both widely spoken and lesser-known ones. However, it performs best in high-resource languages such as English, Spanish, French, German, Chinese and Russian.
Kursiv.media previously reported that Kazakhstan and OpenAI have agreed to collaborate on integrating AI into public services and education. Negotiations are underway to introduce KazLLM, a large language model for the Kazakh language, into OpenAI products, as well as to create AI-powered e-government services and launch educational programs.
In mid-May, Kazakhstan’s Ministry of Digital Development denied that it had banned civil servants from using ChatGPT, despite reports to the contrary from some media outlets.