From Lab to Public: Young Kazakh scientist shares vision for future projects

25 July, 10:58

UMAY is the first digital human prototype designed by the Institute of Smart Systems and Artificial Intelligence (ISSAI) at Nazarbayev University (NU). Young Kazakh scientist Zhanat Makhataeva works on the project, El.kz reports.

Question 1: Tell us more about “UMAY”. What is it? What is it for? What social significance does it have?

Answer: UMAY is ISSAI's pioneering digital human prototype. It incorporates several groundbreaking technologies developed by ISSAI, including a state-of-the-art Kazakh text-to-speech model, advanced Kazakh automatic speech recognition, the powerful TILMASH neural translation engine, cutting-edge ISSAI facial AI models and ISSAI photorealistic avatar technology. As a virtual assistant, UMAY has the potential to complement the human workforce, raising the quality and lowering the cost of services that involve human-computer interaction. In the long run, this can benefit both people and the economy.

Question 2: How did the idea of working on a digital prototype come about? Can you call this project a success? Who is involved in the project?

Answer: People have always wanted to create digital humans. Roboticists, for example, wanted to build humanoid robots that look like human beings. Only in the last two years has it become possible to merge large language models (LLMs) and other AI capabilities with photorealistic models to generate avatars. This lets you interact with AI naturally. It is a new way of interacting with AI and information.

Yes, as a scientific project it is very successful. For the first time, we showed that LLMs, AI and photorealistic avatars can be merged to provide natural interaction with information and AI. We now hope to move from the laboratory to the public. The previous UMAY prototype used ChatGPT as the avatar's brain, so this year we are focusing on creating KAZ LLM. We foresee virtual teaching assistants being integrated into local schools.

This project is about system integration. It combines automatic speech recognition (ASR), text-to-speech (TTS) technology, KAZ LLM and the creation of a digital persona for the photorealistic avatar, including gestures and animation. Currently, 30 data scientists are involved in the project. In particular, I work on it as a senior data scientist, together with Rakhat Meiramov as a data scientist, Mansur Galymov as a graduate research assistant, and NU undergraduates as undergraduate research assistants.

Question 3: Is it possible to commercialise this project? What prospects do you see?

Answer: Yes, it is possible to commercialise it. Until last year, ISSAI focused on academic research and scientific publications. We have since transformed and now intend to commercialise our work.

Digital screens with virtual avatars can be installed on streets or in shops to assist people. In the future, workplaces will have as many digital personas as humans. This technology will be everywhere, because it transforms how humans interact with computers and technology.

There will always be situations in which humans interact with computers, technology and AI, so we will use virtual avatars to make that interaction more natural.

Unusually, we are among the early adopters of this new technology. Kazakhstan usually adopts technology that has already been developed. Thanks to AI and the state's emphasis on supporting AI technology, Kazakhstan is now taking part in this race.

Question 4: What other projects is ISSAI involved in?

Answer: ISSAI has projects in biometrics, such as thermal face recognition, and in natural language processing, such as Kazakh speech-to-text, Kazakh text-to-speech and the TILMASH neural translation engine. ISSAI's goal for 2024 is to bring the Republic of Kazakhstan into the age of generative AI. We want to create our own KAZ LLM. The obvious starting point is generating and interpreting text in Kazakh, English and Russian. The next step would be a vision model capable of interpreting and generating images.

Question 5: Tell us about yourself, your scientific path, the projects you have worked on.

Answer: I am Zhanat Makhataeva, a postdoc and a senior data scientist at the Institute of Smart Systems and Artificial Intelligence (ISSAI) at NU, and a Microsoft Research PhD Fellow 2021-2022.

I did my PhD at NU's School of Engineering and Digital Sciences, after graduating from NU in 2019 with a Master of Science in Robotics and Mechatronics. Earlier, in 2017, I completed my B.Sc. in Robotics at NU.

My research interests include augmented reality, cognitive assistive systems, artificial intelligence, human memory augmentation, cognitive ergonomics, system test and evaluation, and human information processing.

My PhD supervisor is Prof. Huseyin Atakan Varol, and my PhD dissertation is titled "Augmented Reality-Based Human Memory Enhancement Using Artificial Intelligence".

During my studies at NU, I developed ExoMem, the first human memory augmentation system, and a system for visualising safe and dangerous zones around a working robot in an industrial environment. I am currently developing a prototype of a virtual teaching assistant that can be integrated into local schools.

EL.KZ