
How India Is Using AI To Build The Internet For Local Languages



Imagine a future in which there is no language barrier and everyone receives education in their native language.


With the medium of instruction firmly within their comfort zone, billions of Indians would be able to concentrate on acquiring knowledge and skills rather than navigating a minefield of unfamiliar language.


Similarly, imagine a time when people are better able to access services in crucial areas such as healthcare, banking, and the law, without having to worry about communication, thanks to a diverse linguistic platter delivered via accessible technology.


A government initiative in India is supporting and coordinating the efforts of a diverse ecosystem of players, spanning from researchers to startups, to ensure that language is not a barrier for the most populous country in the world.


Many may believe that the Internet is as ubiquitous as air. We are so reliant on it that we gasp for a mobile data or WiFi signal whenever coverage thins, for example, in the hills or in remote areas.


Obviously, this perception does not correspond with reality. The International Telecommunication Union (ITU) estimated that only about 66 per cent of the world's population was using the internet in 2022. That is hardly "ubiquitous."


Countries categorised as "least developed" (LDCs) have less internet connectivity than other nations. Last year, only 36 percent of the LDC population was online.


While this gap in internet access and use is worrisome in light of the internet's great potential for emancipation, it is a gap that is likely to be bridged in time, as evidenced by the efforts of companies constructing internet-enabled satellite constellations in space.


However, there is a distinct type of Internet "access" that is similarly necessary but inadequately provided for, particularly in a country as diverse as India.


This access, or digital bridge, is not extended over physical cables or electromagnetic signals, but rather the written and spoken word.


Over 54 per cent of all websites whose content language is known are written in English, making it the most prevalent language on the web. That is a strikingly high share, considering that only about 18 per cent of the world's population speaks the language at all, close to 1.5 billion people.


In contrast, Hindi, the world's third-most-spoken language and India's most-spoken, with approximately 602 million speakers worldwide, over 7.5 per cent of the world's population, does not even make the top 20 list of content languages for websites. Barely 0.1 per cent of all websites are written in Hindi.


Hindi is only one of the 22 languages listed in the Eighth Schedule of the Constitution of India. Counted by the languages taught in schools, the number rises to between 69 and 72. Counted by the languages and dialects supported by the radio network, it is 146. The 2011 Census of India recorded 1,369 distinct mother tongues.


In light of India's vast linguistic diversity and the fact that approximately 70% of the population does not speak or comprehend English adequately, there is an immediate and compelling case for creating an Internet that supports India's many languages.


This is especially true given the increasing number of Internet consumers in the country. India could have as many as 900 million active internet users by 2025, defined as those who access the internet at least once a month. This number is only anticipated to increase.


Therefore, the lack of content in India's numerous local languages is less of an inconvenience and more of an oversight. Expanding internet access means not only extending broadband connectivity but also overcoming the language barrier.


Recognising this language barrier, and the potential in bridging it for Indians and Indian-language speakers, the Ministry of Electronics and Information Technology (MeitY) of the Government of India has been implementing a novel initiative titled Mission Bhashini.


Bhashini, an acronym for BHASHa INterface for India, was first mentioned as the "National Language Translation Mission (NLTM)" in the Budget 2021-22 speech of Finance Minister Nirmala Sitharaman.


"This (mission) will enable the Internet's wealth of governance- and policy-related information to be made available in major Indian languages," Sitharaman explained.


The mission was established following a recommendation from the Science, Technology, and Innovation Advisory Council to the Prime Minister (PM-STIAC).


The plan was to use technology to translate content into Indian languages, with the goal of making science and technology accessible to all Indian citizens in their native dialect.


After Sitharaman's announcement in February 2021, MeitY approved the language translation mission in October of that year. The Digital India Bhashini Mission, as it was subsequently dubbed, was allocated Rs 495.51 crore for three years.


The mission's implementation began in February 2022, and Prime Minister Narendra Modi formally launched it on July 4, 2022, during the Digital India Week 2022 opening ceremony in Gandhinagar, Gujarat.


The objective of Bhashini is to facilitate simple access to the internet and digital services for all Indians in their native language, as well as to expand the availability of online content in Indian languages.


The fundamental method for achieving this objective is language translation through technology.


To achieve this goal, Bhashini has established an ecosystem, compiled the data and models contributed by that ecosystem into a shared open repository, and promoted the development of products and services in Indian languages built on top of it. This is an ongoing process.


The Bhashini ecosystem includes government, academia, research organisations, startups, industry, and even citizens, who are India's natural language repositories.


Work within the ecosystem is amassing language data that researchers can use to develop artificial intelligence (AI) language models, upon which entrepreneurs, industry, and government will build innovative products and services for citizens.


In this manner, Bhashini seeks to facilitate a future that is more inclusive and empowering for all Indians.


According to Amitabh Nag, chief executive officer of the Digital India Bhashini Division, the translation mission will not only overcome the language divide but also bridge the digital and literacy divides along the way.


MeitY's Digital India Bhashini Division, an independent corporate division within the Digital India Corporation, is responsible for the implementation of Bhashini.


"We're enabling voice understanding, which means we're providing the tools for the machine to understand speech, automatically recognise it, and translate it to a person who understands another language," explains Nag.


In addition, by facilitating speech-to-speech communication via machines, Bhashini will be able to remove the barrier associated with literacy — the ability to read and write — by ensuring that one can acquire knowledge and complete tasks solely through speech.


As a result, many of the digital public infrastructures and goods that are already in place for the benefit of citizens will be able to overcome the "last mile" barrier and reach anyone who was previously prevented from accessing them due to language, literacy, or technology barriers.


Bhashini will also have cultural significance by aiding in the preservation of India's rich knowledge and culture, which are alive and well in all of its native languages but absent from the English lexicon.


Once language is no longer a barrier to communication between individuals, Bhashini will also facilitate greater innovation by expanding collaborative research and development across India.


In addition, Bhashini will create new economic opportunities for India's aspirational economy.


Indian businesses, including many young startups and particularly digital natives, will gain immediate access to a vastly expanded non-English-speaking market. This will be of particular benefit to the innovative startups that are proliferating in smaller cities and villages.


Technology is the primary delivery mechanism for language translation services.


More than seventy prominent technology partners are affiliated with Bhashini through approximately eleven consortiums, and they are all working to collect language data and develop the necessary AI models for language translation.


Many Indian Institutes of Technology (IITs), the International Institutes of Information Technology (IIITs), the Indian Institute of Science (IISc), several Centres for Development of Advanced Computing (C-DACs), and some National Institutes of Technology (NITs) are among the technology partners.


With some of the brightest minds in technology joining forces to help India overcome the language barrier, Bhashini is a genuinely formidable display of India's technological prowess, brought together under MeitY.


Nag believes that Bhashini represents one of the world's largest use cases for AI/machine learning, data science, and APIs. APIs (application programming interfaces) allow applications to securely and readily exchange data and functionality.
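To make the API idea concrete, here is a minimal sketch, in Python, of how an application might call a language-translation service over a REST API. The endpoint, payload fields, and language codes below are hypothetical placeholders for illustration only, not the actual Bhashini API.

```python
import requests

# Hypothetical endpoint and field names -- placeholders, not the actual Bhashini API.
API_URL = "https://translation.example.org/v1/translate"

def translate(text: str, source_lang: str, target_lang: str) -> str:
    """Send text to a translation service and return the translated string."""
    payload = {
        "input": text,
        "source_language": source_lang,  # e.g. "en" for English
        "target_language": target_lang,  # e.g. "hi" for Hindi
    }
    response = requests.post(API_URL, json=payload, timeout=10)
    response.raise_for_status()
    return response.json()["output"]

if __name__ == "__main__":
    print(translate("Where is the nearest bank?", "en", "hi"))
```

The point of such an interface is that any application, whether a chatbot, a payments app, or a government portal, can add translation by making a single network call rather than hosting its own models.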


The primary objective is to translate from one Indian language to another, including text, speech, and video.


Automatic speech recognition (ASR), optical character recognition (OCR), natural language understanding, machine translation (MT), and text-to-speech (TTS) are the technologies that serve this purpose the most.


ASR, or speech-to-text, for example, is the technology that enables computers to identify and recognise spoken words and convert them into readable text.


Text-to-speech (TTS), on the other hand, is the process of converting text input into speech output.


Given a large data corpus and high-performance computing infrastructure, technologies such as automatic speech recognition (ASR) and text-to-speech (TTS) can be used to build advanced machine translation systems, preferably in India and for India.
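Put together, these components form a speech-to-speech pipeline: ASR converts source-language audio into text, machine translation maps that text into the target language, and TTS renders the result as audio. The sketch below, which assumes stand-in functions for each model, illustrates the flow; it is not a specific Bhashini implementation.

```python
from typing import Callable

# Each stage is modelled as a plain function; in a real system these would wrap
# trained models (an ASR model, a machine translation model, a TTS synthesiser).
ASRFn = Callable[[bytes], str]           # source-language audio -> text
MTFn = Callable[[str, str, str], str]    # text, source lang, target lang -> text
TTSFn = Callable[[str, str], bytes]      # text, language -> synthesised audio

def speech_to_speech(audio: bytes, src: str, tgt: str,
                     asr: ASRFn, mt: MTFn, tts: TTSFn) -> bytes:
    """Chain ASR -> machine translation -> TTS to translate spoken input."""
    transcript = asr(audio)                # e.g. Hindi speech -> Hindi text
    translated = mt(transcript, src, tgt)  # Hindi text -> Tamil text
    return tts(translated, tgt)            # Tamil text -> Tamil speech
```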


Professor Hema A. Murthy of the Department of Computer Science and Engineering at IIT Madras, who has been primarily engaged in speech efforts for years, explains that one of the primary goals of Bhashini is to develop indigenous deep tech, be it LLM (large language model), speech, machine translation, or text-to-speech, and not to rely on ChatGPT or other OpenAI models.


"While all of our models use deep technology," the researcher explains, "the fundamental difference is the use of a culture-specific approach, language family approach, to use deep technology judiciously; this ensures that we can perform good deep technology with small amounts of data, requiring less compute power and raw power."


Since the 1980s, work on Indian language machine translation has been ongoing.


MaTra, developed at the National Centre for Software Technology (now C-DAC Mumbai), and Anglabharati, Anubharati, and Anusaaraka, developed at IIT Kanpur, are notable early initiatives. Later, Anusaaraka relocated to IIIT Hyderabad.


In recent years, however, advances in deep learning and processing capacity have propelled AI Indian language work, which has been bolstered by the emergence of a nation-wide unification effort in Bhashini.


Before the mission's announcement in February 2021, at least three years of preparation were invested.


The initial spark was provided by two pilot initiatives. One was sponsored by the Office of the Principal Scientific Advisor (PSA) and the other by MeitY; both were supervised by Professor Dipti Misra Sharma of IIIT Hyderabad, with IIT Madras and IIT Bombay as partners.


Under the PSA pilot, for which Professor Rajeev Sangal of IIIT Hyderabad deserves credit, a speech-to-speech translation system was developed for translating video lectures created under the National Programme on Technology Enhanced Learning (NPTEL) into Indian languages.


Professor Murthy explains, "Within a year, it was demonstrated that it is possible to use technology to make this a reality."


The Ministry of Education subsequently collaborated with IIIT Hyderabad on the Swayam initiative.


The project entailed the transcription, translation, and subtitling of 82 courses, totaling approximately 1,600 hours of video content, in English and eight Indian languages, covering subjects such as law, taxes, and the environment.


Eventually, Swayam and NPTEL lectures were reproduced in multiple Indian languages utilising the technology developed during the pilot phase.


The engagement of IIIT Hyderabad was natural. Its Language Technologies Research Centre has long been at the forefront of technological innovation aimed at bridging the language gap.


According to Dr. S K Srivastava, the MeitY-sponsored pilot focused on the "development of ASR and speech synthesis systems at IIT Madras, translation among Indian languages at IIIT Hyderabad, and translation from English to Indian languages at C-DAC and IIT Madras."


Prior to Bhashini, MeitY had supported research in language technology through the Technology Development for Indian Languages (TDIL) programme, which has been running since 1991.


The TDIL was established with the intention of developing information processing tools and techniques to facilitate human-machine interaction in Indian languages, as well as creating technologies to access multilingual knowledge resources.


Dr. Swaran Lata, the former leader of TDIL and now a consultant for the Bhashini mission, initiated mission-mode consortium initiatives for these efforts, thereby fostering collaboration.


A notable TDIL initiative was the Mandi initiative, which resulted in a system that helps farmers keep track of the latest prices for agricultural commodities and the weather using only a feature phone and their native language.


In another instance, a TTS synthesis system integrated with a screen reader was developed, which enables visually challenged people to interpret and perform computer operations with an audio interface. Supporting 13 Indian languages, the system is integrated into some government websites.


Two additional TDIL initiatives are noteworthy: Sampark and Anuvadaksh.


The Sampark machine translation system was developed by a consortium of institutes led by IIIT Hyderabad for translation from one Indian language to another, covering nine languages.


Anuvadaksh was developed by a consortium led by C-DAC Pune for translation from English to some Indian languages like Gujarati, Oriya, Tamil, Bodo, and Bengali.


Such R&D work served as a precursor to Bhashini, with many of the partner institutes now using that experience to develop language translation tools under the national mission.


The Bhashini mobile application, available on both the Android and iOS mobile operating systems, and the web service Anuvaad are both helpful and easy to use.


The mobile app can be used to translate text and speech from one Indian language to another, and it has a conversation feature with which two people speaking different languages can communicate with each other in near real time.


The text translation feature on Bhashini supports 11 languages, including Assamese, Gujarati, Kannada, Punjabi, and Tamil, with a further 11 languages in the beta stage.


The voice translation and conversation features support the same 11 languages, with two languages, Bodo and Manipuri, in the beta stage.


Anuvaad is a web service — meaning it can be accessed on a web browser — supporting text-to-text and speech-to-speech translation in 13 languages.


IIT Madras and IIT Bombay host similar translation tools developed under Bhashini.


The Speech Lab at IIT Madras (NLTM R&D) features four tools on its portal — ASR, speech-to-speech (S2S), text-to-speech (T2S), and video-to-video (V2V). (The names indicate the nature of translation.)


The head of the Speech Lab is Professor Srinivasan Umesh, who is leading the ASR efforts for the Bhashini mission and is a co-coordinator for the speech technology consortium of 21 institutions.


Previously, he led a multi-institution consortium to develop ASR systems in Indian languages in the agriculture domain, between 2010 and 2016.


IIT Bombay’s text-to-text and speech-to-speech translation systems are also available as web services. The languages supported are Hindi, English, Marathi, and Nepali.


The systems were developed by the Computation for Indian Language Technology (CFILT), a distinguished centre for natural language processing (NLP) under the leadership of Professor Pushpak Bhattacharyya.


The premier institute has two bidirectional text-to-text machine translation systems, called Ishaan and Vidyaapati, in the development stage. They will cover languages such as Assamese, Bodo, Manipuri, Nepali, Konkani, and Maithili.


In a somewhat different vein, a conversational AI has been developed under Bhashini through experimental integration with WhatsApp and OpenAI's ChatGPT.


Using the Bhashini WhatsApp chatbot, one can ask a question in their language and receive a response in the same language. The input can either be text or voice and the output arrives in the form of both text and voice. Currently, Hindi, Gujarati, and Kannada are supported.


One can use the chatbot to enquire about a government product or service. In a popular demonstration of the chatbot, one plays a farmer and asks questions about the PM Kisan Yojana in Hindi and hears back responses in Hindi.
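One plausible way to structure such a chatbot, assuming a translation service and an English-language LLM as building blocks, is to translate the user's query into English, generate an answer, and translate it back into the user's language. The sketch below uses placeholder functions; it is not the actual Bhashini or WhatsApp integration.

```python
from typing import Callable

def answer_in_user_language(query: str, lang: str,
                            translate: Callable[[str, str, str], str],
                            llm_answer: Callable[[str], str]) -> str:
    """Translate the query to English, answer it with an LLM, translate back."""
    english_query = translate(query, lang, "en")   # e.g. Hindi question -> English
    english_answer = llm_answer(english_query)     # LLM responds in English
    return translate(english_answer, "en", lang)   # English answer -> Hindi
```

For voice input and output, the same flow would be wrapped with ASR at the front and TTS at the back, as in the speech-to-speech pipeline sketched earlier.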


This chatbot speaks to an earlier point on how Bhashini is able to bridge the language, literacy, and digital divide and make people’s lives easier.


Several AI language applications based on Bhashini APIs have taken shape over the last couple of years, thanks to the more than 800 AI models on the national platform, many of which are in the process of being integrated into various services.


Jugalbandi is a free and open platform that combines the power of ChatGPT and Indian language translation models.


It drives WhatsApp and Telegram chatbots through which anyone can ask about 121 government schemes in 10 Indian languages. Farmers are positioned as the primary beneficiaries.


In the future, Jugalbandi can potentially power WhatsApp and Telegram chatbots to help democratise access to legal information and bring quality healthcare to citizens.


It has received the endorsement of Microsoft chief executive Satya Nadella. “The rate of diffusion of this next generation of AI is unlike anything we've seen, but even more remarkable is the sense of empowerment it has already unlocked in every corner of the world, including rural India,” Nadella said earlier this year.


Anuvaad, developed by AI4Bhārat at IIT Madras, is an AI-based open-source platform for the translation of documents into Indic languages at scale. The service supports 13 languages and leverages OCR and neural machine translation (NMT) to accomplish end-to-end document translation.
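At a high level, end-to-end document translation of this kind chains two stages: OCR recognises the text on each scanned page, and machine translation converts that text into the target language. The sketch below captures the shape of such a pipeline with stub functions; it is an illustrative assumption, not Anuvaad's actual code.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Page:
    image: bytes           # scanned page image
    text: str = ""         # filled in by OCR
    translation: str = ""  # filled in by machine translation

def translate_document(pages: List[Page],
                       ocr: Callable[[bytes], str],
                       translate: Callable[[str], str]) -> List[Page]:
    """OCR each scanned page, then machine-translate the recognised text."""
    for page in pages:
        page.text = ocr(page.image)              # page image -> recognised text
        page.translation = translate(page.text)  # source text -> target language
    return pages
```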


Its maker, AI4Bhārat, has been building open-source language AI for Indian languages, including datasets, models, and applications, as a public good. It serves as the Data Management Unit for Bhashini.


From its inception in 2019, using AI to connect Indians who speak different languages has been one of AI4Bhārat’s focus areas. Its technology efforts are primarily led by Professor Mitesh Khapra, Pratyush Kumar, and Microsoft researcher Anoop Kunchukuttan.


Their tool Anuvaad has found use in the Supreme Court and High Courts of India as “SUVAS,” in the Supreme Court of Bangladesh as “Amar Vasha,” and in the National Council of Educational Research and Training (NCERT) under the Ministry of Education as “Diksha.”


Anuvaad has helped digitise and translate more than 20,000 legal documents already. It was developed for the judicial domain, but has found general-purpose use over time, as well.


It is also likely to be integrated with the Unique Identification Authority of India (UIDAI), as indicated by UIDAI deputy director general Alok Shukla, speaking on ‘Bhashini and multilingual internet’ earlier this year on the occasion of Universal Acceptance Day.


“I was impressed with the quality of the translation,” Shukla said, adding “Definitely, Bhashini is going to help us.”


The proliferation of video content on the web has raised the need for transcription. For this purpose, there is Chitralekha, also developed by AI4Bhārat for Bhashini.


Chitralekha is an open-source platform for video subtitling across various Indic languages. It supports multiple input sources, such as YouTube or local storage.


The platform automatically creates time-stamped transcription cards, which can be edited, and enables translating the transcription into English and 12 Indian languages.
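The time-stamped transcription cards map naturally onto standard subtitle formats. As a rough illustration, the sketch below renders a list of timed segments as an SRT subtitle file; the segment structure is an assumption for illustration, not Chitralekha's internal format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Segment:
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # transcribed (or translated) line

def _fmt(t: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 00:01:02,500."""
    ms = int(round(t * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def to_srt(segments: List[Segment]) -> str:
    """Render time-stamped segments as the text of an SRT subtitle file."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(f"{i}\n{_fmt(seg.start)} --> {_fmt(seg.end)}\n{seg.text}\n")
    return "\n".join(blocks)

print(to_srt([Segment(0.0, 2.5, "नमस्ते"), Segment(2.5, 5.0, "Welcome to the lecture")]))
```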


At a time when so much learning happens over video, a platform like Chitralekha can make learning accessible to anyone regardless of the video’s original language. It is being used to make the higher education course content under NPTEL available in various Indian languages.


The IIT Bombay initiative Project Udaan is similarly helping overcome the language barrier in the Indian higher education system.


Led by Professor Ganesh Ramakrishnan, Udaan is enabling the translation of textbooks and learning materials across English and all Indian languages. Translation from English to Marathi and Malayalam is currently underway.


The project is expected to benefit students from non-English-medium schools, who account for more than 65 per cent of the over 1 crore students appearing every year for the Class 10 and 12 exams of various school boards.


Using their machine translation framework, Udaan has been able to speed up the process of translating technical books, as acknowledged by the All India Council for Technical Education (AICTE).


For Bhashini, voice is the way forward across areas — payments, banking, education, health care, and retail, to name a few.


“We believe that voice-enabled technologies are going to be the future, they are going to be the next big revelation,” Professor Umesh of the Speech Lab, IIT Madras, has said.


While language translation over voice is already a reality, making payments using voice is a particular area of interest for Bhashini. Digital payments are, after all, soaring in popularity, with the value of digital transactions in India expected to reach $135.2 billion by the end of 2023.


The National Payments Corporation of India (NPCI), an umbrella organisation for all retail payment systems in India, is reportedly working with AI4Bhārat to develop a system for voice-based merchant payments and peer-to-peer transactions in Indian languages.


This will enable feature phone users to benefit from digital payment innovations like the Unified Payments Interface (UPI), currently enjoyed only by smartphone users.


“We are working hard to actually ensure that we are in a position to make UPI speech-enabled. ‘Dus rupay pay karo (pay Rs 10)’ is the idea and you get a response by voice,” Nag says, adding, “We are not far away.”
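A voice payment of this kind hinges on turning the ASR transcript into a structured intent, for instance extracting the action ("pay") and the amount from "dus rupay pay karo". The toy parser below, which uses a small lookup of Hindi number words, is only a hedged illustration of that step; a production system would rely on a trained language-understanding model and the UPI stack rather than anything like this.

```python
import re

# Illustrative mapping of a few Hindi number words to values; a real system would
# rely on a trained language-understanding model rather than a lookup table.
HINDI_NUMBERS = {"ek": 1, "do": 2, "paanch": 5, "dus": 10, "bees": 20, "sau": 100}

def parse_payment_command(transcript: str):
    """Extract a payment intent and amount from an ASR transcript
    such as 'dus rupay pay karo' ('pay Rs 10')."""
    words = transcript.lower().split()
    if "pay" not in words and "bhejo" not in words:
        return None                      # not recognised as a payment request
    for word in words:
        if word in HINDI_NUMBERS:
            return {"intent": "pay", "amount_inr": HINDI_NUMBERS[word]}
        if re.fullmatch(r"\d+", word):   # amount recognised as numerals
            return {"intent": "pay", "amount_inr": int(word)}
    return None

print(parse_payment_command("dus rupay pay karo"))  # {'intent': 'pay', 'amount_inr': 10}
```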


Similarly, Bhashini is working with the Reserve Bank of India (RBI) to enable voice-based banking, and with the Open Network for Digital Commerce (ONDC) for voice-based retail.


Bhashini is also looking to make the national telemedicine platform eSanjeevani multilingual.


eSanjeevani, which facilitates quick and easy access to medical professionals over the smartphone, is bridging the digital health divide. Bhashini can elevate it further by helping bridge the language and literacy divide as well.


Education is a crucial potential area of impact for Bhashini. The National Education Policy (NEP) 2020 recommends the use of the mother tongue as a medium of instruction in schools until at least Class 5, but preferably up to Class 8 and beyond.


Even under the Right to Education Act of 2009, it is stated that the “medium of instructions shall, as far as practicable, be in child’s mother tongue.”


Bhashini is a natural vehicle for these goals. “Making educational apps multilingual will be in line with NEP, and perhaps make the unskilled youth skilled,” Professor Murthy says, noting that India is expected to have the largest number of unskilled youth under 25 in 2029.


As for government services, India’s mobile governance app UMANG, short for Unified Mobile Application for New-age Governance, is already accomplishing direct benefit transfer using Bhashini APIs.


The day is also not far when one will be able to make voice-based complaints on the grievance redressal app, the Centralised Public Grievance Redress and Monitoring System (CPGRAMS).


Beyond its far-reaching impact on individuals, Bhashini might also play a significant role on the regional and global stage.


Prime Minister Narendra Modi spoke earlier this month about sharing Bhashini within the Shanghai Cooperation Organisation (SCO) — an intergovernmental organisation comprising eight member states that speak different languages.


While addressing the SCO Summit 2023 in Hindi, he said, "We would be delighted to share India's AI-based language platform Bhashini with everyone to remove language barriers within SCO. It can become an example of digital technology and inclusive growth.”


Such is the wide sweep of Bhashini — from enabling an individual at home to upskill using just their native tongue to facilitating critical communication among world leaders on the most pressing matters affecting a region or the world.


It can help build an India where knowledge is accessible to all, with its citizens truly included, empowered, and aatmanirbhar (self-reliant) in the real sense.
