OpenAI launches IndQA, groundbreaking AI benchmark for Indian cultural understanding.

Beyond translation, OpenAI's IndQA benchmark uses expert-crafted questions to test AI's nuanced understanding of diverse Indian cultures.

November 4, 2025

In a significant move to address the linguistic and cultural gaps in artificial intelligence, OpenAI has developed IndQA, a new benchmark designed to evaluate how effectively AI models can comprehend and reason about the nuances of Indian languages and culture. This initiative represents a departure from traditional evaluation metrics that predominantly focus on translation or simple multiple-choice questions. Instead, IndQA is engineered to test an AI's deeper understanding of context, history, and culturally specific scenarios, a critical step toward making AI truly beneficial for a global user base. The development of this benchmark is a core component of OpenAI’s broader strategic investment in India, a nation it identifies as its second-largest and fastest-growing market.
The creation of IndQA was motivated by the limitations of existing multilingual benchmarks such as MMMLU. These benchmarks are becoming saturated: top-tier models now achieve near-perfect scores on them, which makes it difficult to measure meaningful progress.[1][2][3] Furthermore, these older benchmarks often consist of content translated from English, a process that fails to capture the unique cultural and linguistic intricacies of other regions. IndQA addresses this deficit by being built from the ground up with genuine Indian context. The benchmark comprises 2,278 questions spanning 12 languages (Bengali, English, Hindi, Hinglish, Kannada, Marathi, Odia, Telugu, Gujarati, Malayalam, Punjabi, and Tamil) and covers 10 distinct cultural domains.[4] These domains are Architecture & Design, Arts & Culture, Everyday Life, Food & Cuisine, History, Law & Ethics, Literature & Linguistics, Media & Entertainment, Religion & Spirituality, and Sports & Recreation.[4] This scope tests models on their ability to reason about a wide array of topics pertinent to life in India.
A cornerstone of the IndQA project is its collaborative and rigorous development process. OpenAI partnered with 261 domain experts from across India, including a diverse group of linguists, journalists, artists, professors, and practitioners, to author the questions.[2] The caliber of these experts underscores the benchmark's depth; contributors include a Nandi Award-winning Telugu actor and screenwriter, a scholar of Kannada linguistics and dictionary editor, an International Chess Grandmaster, and a Tamil writer and cultural activist, among others.[1] Each question was drafted to be a reasoning-focused prompt tied to the experts' specific regions and specialities.[2] A critical step in ensuring the benchmark's difficulty and longevity was a process of "adversarial filtering." Every question was tested against OpenAI's most powerful models, such as GPT-4o and GPT-5, and only the questions that these advanced models failed to answer satisfactorily were retained.[4][2] This method ensures that IndQA has sufficient "headroom," allowing it to remain a challenging and useful tool for tracking future AI progress. Evaluation is not based on a simple right or wrong answer but on a detailed, rubric-based approach where each response is graded against specific criteria established by the domain experts.[4]
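The two mechanisms described above, rubric-based grading and adversarial filtering, can be illustrated with a minimal Python sketch. It is not OpenAI's implementation. The `Criterion` class, the point weights, the 0.5 pass threshold, and the rule that every tested model must fall below that threshold are all assumptions made for illustration; the article states only that responses are graded against expert-written criteria and that questions the tested models answered satisfactorily were dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One expert-written rubric item (hypothetical structure)."""
    description: str
    points: int  # assumed weight; the article does not describe weighting


def rubric_score(criteria, met):
    """Return the fraction of available rubric points a response earned.

    met[i] is True when the response satisfied criteria[i]. Deciding
    whether a criterion is met (by a human or model grader) is outside
    this sketch.
    """
    total = sum(c.points for c in criteria)
    earned = sum(c.points for c, ok in zip(criteria, met) if ok)
    return earned / total if total else 0.0


def adversarial_filter(question_ids, model_scores, pass_threshold=0.5):
    """Keep only questions on which every tested model scored below the threshold.

    model_scores maps question id -> {model name: rubric score in [0, 1]}.
    """
    return [
        qid
        for qid in question_ids
        if all(score < pass_threshold for score in model_scores[qid].values())
    ]


# Hypothetical rubric for a single question.
rubric = [
    Criterion("Identifies the correct regional tradition", 3),
    Criterion("Explains its historical origin", 2),
    Criterion("Responds in the language of the question", 1),
]
example_score = rubric_score(rubric, [True, False, True])  # 4 of 6 points

# Hypothetical per-model scores for three candidate questions.
scores = {
    "q1": {"model_a": 0.2, "model_b": 0.4},
    "q2": {"model_a": 0.9, "model_b": 0.3},
    "q3": {"model_a": 0.1, "model_b": 0.0},
}
retained = adversarial_filter(["q1", "q2", "q3"], scores)  # q2 is dropped
```

In this sketch, `q2` is discarded because one model already answers it well, which is how filtering of this kind preserves headroom for future models.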
The launch of IndQA is deeply intertwined with OpenAI's expanding strategic commitment to India. The company has articulated an "India-first" approach, recognizing the country's immense potential with its vast population of internet users. This strategy extends beyond the new benchmark and includes recent initiatives such as offering a free year-long subscription to ChatGPT Go for all users in India, establishing its first Indian office in New Delhi to be closer to policymakers, and planning a major 1-gigawatt data center in the country. OpenAI has also engaged in collaborations with the Indian government's IndiaAI Mission, signaling a long-term investment in the nation's burgeoning AI ecosystem. By developing a tool that directly addresses the need for culturally competent AI, OpenAI not only aims to improve its models for Indian users but also positions itself as a key partner in India's technological future. The performance of current models on IndQA, with even the best scoring below 40%, highlights the significant work that remains in developing truly multilingual and multicultural AI systems.[5]
The introduction of IndQA carries substantial implications for the global AI industry. It sets a new standard for how AI capabilities should be measured in non-English contexts, moving beyond simplistic translation accuracy to encompass genuine cultural understanding. This could spur the development of similar, locally grounded benchmarks for other languages and regions, fostering the creation of more equitable and globally relevant AI technologies. For India, the benchmark provides a clear metric to gauge the effectiveness of various AI models vying for market share, offering a tool to assess which systems truly understand the subcontinent's diversity. While the ultimate impact of IndQA will depend on its public accessibility and adoption as an industry standard, its creation marks a pivotal acknowledgment that for artificial intelligence to benefit all of humanity, it must first learn to understand humanity in all its rich cultural complexity.
