Artificial intelligence (AI) and data science (DS) centers are becoming ubiquitous in academic institutions around the globe. These centers serve to focus research efforts and bring together large teams to address important problems. AI centers in more mature research ecosystems tend to be multi-institutional, such as the Alan Turing Institute in the U.K. with 13 academic partners12 and Mila in Montreal with four academic partners and numerous industry partners.8 Often such centers are also focused on a specific theme, such as the 18 AI institutes funded by NSF.10 In contrast, the centers in India tend to be contained within a single institute, which allows the institute to designate AI/DS as a growth area and an area of institutional interest. The centers tend to be active in content creation, teaching, and consulting in addition to their research activities.
In this article, we look at the activities of centers located at seven top Indian institutes, namely the Kohli Centre on Intelligent Systems (KCIS) at IIIT Hyderabad (IIITH) established in 2015, the Robert Bosch Centre for Data Science & AI (RBCDSAI) at IIT Madras (IITM) established in 2017, the Centre of Excellence in AI (CAI) at IIT Kharagpur (IITKGP) established in 2018, the Yardi School of Artificial Intelligence (ScAI) at IIT Delhi (IITD) established in 2020, the Centre for Machine Intelligence and Data Science (C-MInDS) at IIT Bombay (IITB) established in 2020, the NV AI center and KRIYA at IIT Hyderabad (IITH) established in 2020, and the Kotak-IISc AI-ML Centre at IISc established in 2021. These centers are funded at various levels by alumni, philanthropic contributions, industry, and government. There is significant enthusiasm for setting up these centers, and the funding for an individual center ranges from 3M–12M USD over five to six years. A broader list of AI/ML centers is available online at https://dl.acm.org/doi/10.1145/3556634.
Why centers? AI&DS grew as a subfield of computer science (CS), and most CS departments house large AI groups that are active in research and offer hugely popular AI courses. Despite that, most institutions opted for new AI centers to gain greater flexibility in focusing resources (funds, faculty, space) without being constrained by the need to balance the existing subfields of computer science in traditional departments. An AI-focused curriculum includes full-fledged courses on statistics, linear algebra, machine learning, reinforcement learning, robotics, natural language processing, computer vision, cognitive neurosciences, and AI ethics, which are generally compressed into a handful of courses in a traditional CS program. A separate center is also more likely to attract researchers from multiple disciplines, further increasing the practical impact of the field across different industrial sectors.
According to a study conducted by NASSCOM in February 2022,13 the Indian industry currently has a demand-supply gap of 150,000 AI/ML professionals, with demand outstripping supply by more than 50%. The AI&DS centers have taken up the role of training fresh talent to close this gap. Academic programs offered by various institutions range from a minor degree to full-fledged undergraduate, master's, and doctoral degrees in AI&DS. Table 1 presents a compendium. Together these serve both to increase the AI/ML exposure of students in existing engineering degrees and to provide in-depth specialization for others. An example of the former is the Interdisciplinary dual degree program (IDDDP), which allows existing IIT BTech students in any engineering discipline to earn a dual master's degree in AI&DS after a few courses and an additional year-long thesis project. An example of the latter is the full-fledged BTech in AI program introduced by IITH. Full-fledged master's programs in AI and affiliated topics are offered by most institutions. The master's and Ph.D. programs in most centers are open to students across all engineering disciplines and are not restricted to CS graduates. An important outcome is that students now have a pathway to switch disciplines, which is traditionally difficult in India.
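The NASSCOM figures quoted at the start of the paragraph above can be turned into rough bounds on total supply and demand. The short Python sketch below assumes that "outstripping supply by more than 50%" means the gap exceeds half of the supply; that reading, and every name in the code, is our own assumption and not part of the study.

```python
# Figures quoted in the text; the interpretation below is an assumption.
GAP = 150_000            # demand minus supply (number of professionals)
MIN_EXCESS_RATIO = 0.5   # assumed meaning: gap / supply > 0.5


def supply_upper_bound(gap: float, min_excess_ratio: float) -> float:
    """If gap / supply > ratio, then supply < gap / ratio."""
    return gap / min_excess_ratio


max_supply = supply_upper_bound(GAP, MIN_EXCESS_RATIO)
max_demand = max_supply + GAP  # demand = supply + gap

print(f"Supply is below {max_supply:,.0f}")
print(f"Demand is below {max_demand:,.0f}")
```

Under this reading, the available AI/ML workforce would be fewer than 300,000 people against a demand of fewer than 450,000.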
Most AI-related master's degree programs gain instant popularity. At IITD, approximately 2,000 students applied for 40 seats in the M.Tech/MSR in Machine Intelligence & Data Science in its first offering (2022). In the same year, approximately 220 students applied for the Ph.D. seats. For the undergraduate AI programs, incoming students are concerned about the implications of specializing too soon, and the corresponding CS/EE programs continue to be preferred over the BTech AI programs.
Academic outreach. Another notable activity of the centers is academic outreach designed to extend the benefits of AI&DS training beyond the handful of students admitted to their elite parent institutions. IITM recently launched a first-of-its-kind online BSc in Programming and Data Science. More than 12,500 students are now enrolled in the program, and enrollment is expected to reach 50,000 at steady state.
Another important role of the centers is to contribute toward reskilling working professionals in AI/ML. According to a recent study by NASSCOM,13 out of the total 2.14 million graduates in STEM in 2020–2021, only 10,000–15,000 had a curriculum that included AI/ML, underlining the major skill gap. Working professionals are enrolling in full-time master's programs in AI or learning the skills through short-term training workshops, and many AI/ML faculty conduct training courses for employees of companies. For example, many armed forces personnel are undertaking the M.Tech AI program at IITH during their two-year sabbatical.
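To put the NASSCOM graduate figures above in proportion, the following small Python sketch computes the share of 2020–2021 STEM graduates whose curriculum included AI/ML. The inputs are the numbers quoted in the text; the constant names and the helper `share_percent` are ours and purely illustrative.

```python
# Figures quoted in the text (NASSCOM study); names are illustrative.
TOTAL_STEM_GRADUATES = 2_140_000
AI_ML_GRADUATES_LOW = 10_000
AI_ML_GRADUATES_HIGH = 15_000


def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


low = share_percent(AI_ML_GRADUATES_LOW, TOTAL_STEM_GRADUATES)
high = share_percent(AI_ML_GRADUATES_HIGH, TOTAL_STEM_GRADUATES)

print(f"AI/ML-trained share of STEM graduates: {low:.2f}% to {high:.2f}%")
```

On these numbers, fewer than one in a hundred STEM graduates had AI/ML in their curriculum.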
Faculty participation. Many of these centers are designed as interdisciplinary centers. While there are groups of faculty who work in AI/DS areas in CS and EE departments in all the institutes, these centers have faculty from multiple other departments who contribute to their functioning. For example, IITB-C-MInDS has around 78 affiliate faculty members from 18 departments and IITM-RBCDSAI has 32 faculty members from 12 departments. These centers are now looking to hire faculty who are primarily affiliated with the center, as well as engage with external researchers through various associate and adjunct roles.
Research activities. Research publishing in India is slowly ramping up (see Table 2), driven partly by the consolidation of efforts in various centers. The major AI subtopics of publication include deep neural architectures, learning theory, optimization, classification and regression, reinforcement learning algorithms, adversarial learning and robustness, and SNLP applications.
Table 2. Major research topics studied by Indian AI researchers in the last three years.8
The interdisciplinary topics on which there is active work include computational biology, computational finance, Industry 4.0 applications, AI/ML in healthcare, material design, and process engineering. While some centers have focused areas, such as the Kotak-IISc center that will work on finance and AI, the focus at other centers depends on the core strengths and background of the faculty involved.
Given the lack of datasets tailored to the Indian context, centers such as those at IITM, IITB, and IIITH are also actively creating and publishing local data, such as Indian language datasets6,7,11 and Indian traffic data.3,9
The older centers have also worked on creating a larger social impact. For example, IITM-RBCDSAI is involved in projects such as GarbhINI (pre-term risk categorization of pregnant women)1 and India Data Commons (India-level data aggregated from various sources).5 IIITH-KCIS has developed an Indian Driving Dataset3 that will help in understanding and addressing challenges related to autonomous navigation. IITM-RBCDSAI and other centers have also been contributing to the development of national- and state-level AI policies, such as the National AI strategy published by the NITI Aayog (the primary government think tank on policy), sector-specific AI recommendations by various departments (commerce, defense, cybersecurity, among others), and the data policy for the state of Tamil Nadu.
Industrial interaction. The centers are becoming a one-stop solution for the varied needs of industries and are therefore becoming a major hub for various industry-led collaborations. These take the form of research projects funded by the industry, interfacing with startups, and training programs aimed at working professionals. Given the interdisciplinary nature of the centers, companies from the tech, healthcare, manufacturing, finance, and other domains are joining forces with them to lead AI adoption in their respective domains. These activities depend heavily on the academic and research programs undertaken by the centers. One of the primary obstacles to strong academia-industry collaboration in India has been the industry's lack of exposure to academic expertise and research. Research centers are one way to enable more structured interaction between the two.
As with academic research centers elsewhere, there are two major challenges impeding the rapid growth of the centers in India. The first is the difficulty in attracting faculty and graduate students. The industry offers not only more attractive salaries but also exciting research opportunities and access to proprietary datasets. This is further exacerbated in India by the much lower compensation in academia, especially for graduate students. The second challenge is the lack of large and sustained funding from the government or from the industry. This makes it impossible to undertake multiyear focused research projects. The Indian government has realized this and has announced multiyear support for several new centers that will start functioning in the coming years.2
1. Damaraju, N. et al. Development of second and third-trimester population-specific machine learning pregnancy dating model (Garbhini-GA2) derived from the GARBH-Ini cohort in north India. medRxiv 2021; https://bit.ly/3d1NPSI
2. DST. 25 Technology Innovation Hubs across the country through NM-ICPS are boosting new and emerging technologies to power national initiatives; https://bit.ly/3yntQGd
3. IDD. India Driving Dataset; https://bit.ly/3Q4RahK
4. IKDD. Papers from India in core A* Data Science venues; https://bit.ly/3oIUqDR
5. India Data Commons; https://bit.ly/3PSJ9gr
6. Kakwani, D. et al. IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages. Findings of the Association for Computational Linguistics (Jan 2020), 4948–4961; DOI:10.18653/v1/2020.findings-emnlp.445
7. Kunchukuttan, A., Mehta, P. and Bhattacharyya, P. The IIT Bombay English-Hindi Parallel Corpus, 2018; https://arxiv.org/abs/1710.02855
8. Mila. Institutional Partners; https://bit.ly/3bnm8mO
9. Mittal, D. et al. Training a deep learning architecture for vehicle detection using limited heterogeneous traffic data. In Proceedings of the 10th Intern. Conf. Communication Systems & Networks (Bengaluru, India, Jan. 3–7, 2018), 589–294; DOI:10.1109/COMSNETS.2018.8328279
10. NSF. NSF partnerships expand National AI Research Institutes to 40 states; https://bit.ly/3PPxS0j
11. Ramesh, G. et al. Samanantar: The Largest Publicly Available Parallel Corpora Collection For 11 Indic Languages. Trans. Assoc. Computational Linguistics 10 (Feb. 2022), 145–162; https://doi.org/10.1162/tacl_a_00452
12. The Alan Turing Institute. Current partnerships and collaborations; https://bit.ly/3Q9Ksal
13. NASSCOM. India's Tech Industry Talent: Demand-Supply Analysis; https://bit.ly/3oLOvy2
©2022 ACM 0001-0782/22/11