Google's Gemini AI Makes History: Outscores Top Human in Elite IIT-JEE Exam

AI crosses a major human intelligence barrier, outperforming the top student on the notoriously difficult IIT-JEE.

July 2, 2025

In a landmark achievement for artificial intelligence, Google's Gemini 2.5 Pro model has reportedly outperformed the top human candidate in one of the world's most challenging entrance examinations, the Indian Institute of Technology Joint Entrance Examination (IIT-JEE) Advanced. According to a technical report from ByteDance, which tested several leading AI models, Gemini 2.5 Pro achieved a remarkable score of 336.2 out of 360.[1] This score surpasses the 332 marks obtained by Rajit Gupta, the All India Rank 1 holder in the 2025 exam.[2][3][4][5][6] The event marks a significant milestone in the development of AI's reasoning and problem-solving capabilities, raising profound questions about the future of education, assessment, and the nature of intelligence itself.
The IIT-JEE Advanced is renowned for its grueling difficulty, testing students in physics, chemistry, and mathematics with questions that require deep conceptual understanding and complex analytical skills.[1] Acceptance rates into the prestigious IITs are typically below 2%, making it one of the most selective examinations globally.[1] The test administered to the AI models, including Gemini 2.5 Pro, was the 2025 Advanced paper.[1] ByteDance's evaluation supplied the questions as image inputs to test the models' multimodal and reasoning abilities, and assigned marks according to the official rules, which include penalties for incorrect answers.[1] Each objective question was sampled five times and the scores averaged.[1] The results placed Google's model at the top with 336.2 marks, followed by ByteDance's own Seed 1.6-Thinking model with 329.6 marks, Anthropic's Claude Opus 4 at 314.4, and OpenAI's o4-mini-high with 308.4.[1]
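To make the scoring procedure concrete, here is a minimal Python sketch of averaging negatively marked answers over five samples per question. It is an illustration only, not ByteDance's evaluation code: the function names, the toy three-question paper, and the +3/-1 marking scheme are all assumptions, and the real exam uses different marking rules for different question types. It does show how an averaged total can be fractional, as 336.2 is.

```python
from statistics import mean

# Hypothetical marking scheme (assumption): +3 for a correct answer,
# -1 for a wrong one, 0 for a blank. The real paper varies by section.
def score_attempt(answer, correct, marks=3, penalty=-1):
    """Score a single sampled answer; None means the question was left blank."""
    if answer is None:
        return 0
    return marks if answer == correct else penalty

def average_question_score(samples, correct):
    """Average the marks earned across all samples of one question."""
    return mean(score_attempt(a, correct) for a in samples)

def total_score(paper):
    """Sum the per-question averages over the whole paper."""
    return sum(average_question_score(q["samples"], q["correct"]) for q in paper)

# Toy paper: three questions, each answered five times by a model.
paper = [
    {"correct": "B", "samples": ["B", "B", "B", "B", "B"]},   # always right
    {"correct": "D", "samples": ["D", "D", "A", "D", "D"]},   # one wrong answer
    {"correct": "A", "samples": ["A", None, "A", "A", "C"]},  # one blank, one wrong
]

print(round(total_score(paper), 1))  # 6.8
```

Under this scheme a single wrong sample costs more than a blank one, so a model that guesses inconsistently loses marks even on questions it usually gets right.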
This is not the first instance of an AI model tackling the formidable JEE Advanced. Earlier reports highlighted an experiment by an IIT Kharagpur engineer where OpenAI's ChatGPT o3 model scored an impressive 327 out of 360 on a mock version of the 2025 paper, which would have corresponded to an All India Rank of 4.[7][8][9] That test was conducted under strict conditions, with the AI prompted to act as an aspirant without access to external tools or web searches.[7][8] These repeated successes by different AI models underscore a rapid acceleration in their cognitive abilities, moving beyond simple information retrieval to sophisticated, multi-step reasoning akin to human problem-solving. Other accounts have specifically pointed to Gemini 2.5 Pro's exceptional mathematical prowess, with one analysis claiming it achieved a perfect score on the mathematics section of the 2025 paper by methodically decomposing and solving each problem.[10]
The implications of an AI besting human candidates in such a high-stakes, intellectually demanding exam are vast and multifaceted. For the AI industry, it serves as a powerful demonstration of the progress in developing advanced reasoning capabilities.[10][11][12] Google itself has highlighted that Gemini 2.5 Pro is designed as a "thinking model" capable of reasoning through its thoughts before responding, leading to enhanced accuracy and performance on complex tasks.[13][12] This ability to tackle novel and complex problems, as seen in the JEE Advanced, signals a significant leap from earlier AI, which sometimes struggled with such challenges. The achievement validates the architectural improvements and training methodologies being employed by leading AI labs, pushing the boundaries of what machines can accomplish.
The performance of Gemini 2.5 Pro on the IIT-JEE Advanced exam marks a notable shift, with AI moving from a tool for processing information toward a potential partner in complex problem-solving. The top human student, Rajit Gupta, secured his place at an IIT with a score of 332, while the AI's reported average of 336.2 edges past it by about four marks.[2][1] This development is likely to fuel further research and investment into AI's reasoning abilities and its integration into various fields, particularly education. It invites a re-evaluation of how human ability is assessed and points toward a future where human-AI collaboration could drive innovation and discovery, while also prompting critical discussions about the role and regulation of such powerful technology in society.
