IBM Embeds Dual-Accelerator AI in Mainframes for Real-Time Fraud Defense

IBM's dual-accelerator AI transforms mainframes into real-time fraud defense and data insight powerhouses for finance.

July 8, 2025

International Business Machines Corp. is reinforcing the mainframe's central role in the global financial system by embedding powerful artificial intelligence capabilities directly into its newest hardware. This strategic move, centered around a dual-accelerator approach in its latest z-series mainframes, aims to empower highly regulated industries like banking and finance to conduct real-time fraud detection and gain deeper insights from the massive volumes of unstructured data they manage. The initiative highlights a significant shift in enterprise computing, where AI is no longer a peripheral function but a core component of mission-critical transaction processing. This evolution allows financial institutions to analyze 100% of their transactions for anomalies as they occur, a feat previously hampered by latency issues when moving data to separate AI processing systems.
At the heart of IBM's strategy is the Telum processor, the company's first to feature an on-chip accelerator for AI inferencing.[1][2] First introduced in the IBM z16 mainframe, the Telum chip was designed over three years to bring deep learning capabilities to enterprise workloads.[2][3] This allows for real-time analysis of transactions for applications in banking, finance, insurance, and trading.[2] The key innovation is the ability to run AI models at scale and with extremely low latency, directly where the data resides.[4][3] The z16, powered by Telum, can process up to 300 billion inference requests per day with a latency of just one millisecond.[3] This capability is crucial for financial institutions that have historically been able to analyze only a small fraction of their high-volume transactions for fraud due to the delays involved in sending data off-platform for analysis.[5][3] By integrating the AI accelerator on the same chip as the central processing cores, IBM has eliminated this bottleneck, enabling banks to detect fraud during the transaction itself.[1][6]
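To put the quoted throughput figure in more familiar units, the short Python sketch below converts 300 billion requests per day into an average per-second rate. It then applies Little's law (an addition for illustration, not something the article states) to estimate how many requests would be in flight at one millisecond of latency. It assumes the load is spread evenly across the day, which real transaction traffic is not, so treat the results as rough averages rather than measured system behavior.

```python
# Back-of-the-envelope conversion of the throughput figure quoted in the article.
REQUESTS_PER_DAY = 300_000_000_000   # "up to 300 billion inference requests per day"
LATENCY_SECONDS = 0.001              # "a latency of just one millisecond"
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400 seconds


def average_requests_per_second(requests_per_day: int) -> float:
    """Average rate, assuming load is spread evenly over the day (a simplification)."""
    return requests_per_day / SECONDS_PER_DAY


def average_requests_in_flight(rate_per_second: float, latency_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate x time spent in the system."""
    return rate_per_second * latency_seconds


rate = average_requests_per_second(REQUESTS_PER_DAY)
in_flight = average_requests_in_flight(rate, LATENCY_SECONDS)

print(f"{rate:,.0f} requests per second on average")
print(f"{in_flight:,.0f} requests in flight at 1 ms latency")
```

Under these assumptions the daily figure works out to roughly 3.5 million inference requests per second, with a few thousand requests being scored at any given instant.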
Building on this foundation, IBM has unveiled a next-generation dual-accelerator strategy with its IBM z17 platform. The system is built around the Telum II processor and a new, separate AI accelerator card called the IBM Spyre Accelerator. The Telum II, built on 5nm technology, offers significant performance gains with more cores, a higher clock speed, and a 40% increase in on-chip cache.[7] The integrated AI accelerator on the Telum II is expected to have four times the compute power of its predecessor.[7] The Spyre accelerator, delivered on a PCIe card, is specifically designed to handle more complex AI workloads, including those involving large language models (LLMs) and generative AI.[8] This two-pronged approach allows the on-chip accelerator to handle the high-speed, low-latency inferencing required for in-transaction fraud detection, while the Spyre card can tackle more computationally intensive tasks.[8] This "ensemble" approach enables the system to combine multiple AI models to achieve more accurate and robust results, a critical need for advanced use cases like anti-money laundering and sophisticated fraud detection.[9]
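One way to picture the "ensemble" idea is as a two-stage scorer: a cheap model scores every transaction, and only ambiguous cases are escalated to a heavier model whose score is blended in. The Python sketch below illustrates that pattern only. Every name in it (`Transaction`, `fast_score`, `deep_score`, `ensemble_score`), along with the features, thresholds, and weights, is hypothetical. The two scoring functions are toy stand-ins for trained models, and nothing here represents IBM's software or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    amount: float         # transaction amount in currency units
    merchant_risk: float  # hypothetical precomputed feature in [0, 1]
    memo: str             # free-text field, standing in for unstructured data


def fast_score(tx: Transaction) -> float:
    """Toy stand-in for a small, low-latency model run on every transaction."""
    amount_signal = min(tx.amount / 10_000.0, 1.0)
    return 0.5 * amount_signal + 0.5 * tx.merchant_risk


def deep_score(tx: Transaction) -> float:
    """Toy stand-in for a larger model that also reads unstructured text."""
    suspicious_terms = ("gift card", "urgent", "wire")
    hits = sum(term in tx.memo.lower() for term in suspicious_terms)
    return min(hits / 2.0, 1.0)


def ensemble_score(tx: Transaction, low: float = 0.3, high: float = 0.7,
                   deep_weight: float = 0.5) -> tuple[float, bool]:
    """Return (score, escalated). Only ambiguous first-pass scores are escalated."""
    first = fast_score(tx)
    if first < low or first > high:
        return first, False  # confident either way: skip the heavier model
    second = deep_score(tx)
    return (1.0 - deep_weight) * first + deep_weight * second, True


routine = Transaction(amount=50.0, merchant_risk=0.1, memo="coffee")
ambiguous = Transaction(amount=5_000.0, merchant_risk=0.5, memo="URGENT wire transfer")
blatant = Transaction(amount=20_000.0, merchant_risk=0.9, memo="electronics")

for tx in (routine, ambiguous, blatant):
    score, escalated = ensemble_score(tx)
    print(f"{tx.memo!r}: score={score:.4f}, escalated={escalated}")
```

The design choice being illustrated is that the expensive model runs only on the uncertain middle band, so the common case stays on the low-latency path while borderline transactions get a second opinion.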
The implications for the financial industry are profound. Globally, an estimated 70% of financial transactions by value run on IBM Z systems.[5][10] The ability to apply AI to every single transaction in real-time could dramatically reduce fraud losses. One report commissioned by IBM estimated that if all institutions running on IBM Z utilized these new AI capabilities, they could prevent an additional $190 billion in fraud annually.[5] Beyond fraud, the technology opens up new possibilities for analyzing unstructured data—such as text from emails, documents, and customer interactions—to gain insights for risk assessment, compliance monitoring, and personalized customer service.[9] Furthermore, IBM is positioning the mainframe as a cornerstone of the hybrid cloud environment, allowing secure integration between on-premises data and cloud services.[6] This is particularly important for highly regulated industries that prioritize security and data sovereignty.[11][12] The systems are also designed with an eye toward future threats, incorporating quantum-safe cryptography to protect data from the potential of future quantum computers to break current encryption standards.[4][13][3]
In conclusion, IBM's integration of advanced, dual-accelerator AI technology into its mainframes represents a pivotal development for the global financial ecosystem. By bringing powerful AI inferencing capabilities directly to the transactional core, IBM is addressing the critical need for real-time analysis and fraud detection at an unprecedented scale. This move not only solidifies the mainframe's enduring relevance in an era of cloud computing but also transforms it into a forward-looking platform for AI-driven innovation.[14] For banks and financial institutions, this translates into a powerful tool to combat financial crime, manage risk more effectively, and unlock new value from their vast data reserves, all within the secure and reliable environment they depend on for their most critical operations.[5][10] The evolution of the mainframe into a high-performance AI engine ensures its central role in powering the world's financial transactions for the foreseeable future.[8]
