Is AI Cost Finally Controllable?
Gaudi 3 on IBM Cloud aims to deliver cost-effective AI, but will it?
Key Highlights
- IBM Cloud now offers Intel Gaudi 3 AI accelerators for production AI workloads.
- This collaboration aims to improve cost performance for enterprise AI deployments.
- Multiple deployment options are available, including VPC instances and container worker nodes.
- Integration with IBM watsonx.ai and Red Hat OpenShift is planned.
The News:
IBM announced the availability of Intel Gaudi 3 AI accelerators on IBM Cloud, aiming to provide a more cost-effective solution for deploying and scaling enterprise AI workloads. This offering is currently available in the Frankfurt and Washington, D.C. regions, with Dallas availability planned for Q2 2025. This collaboration seeks to address the growing cost concerns associated with AI infrastructure.
Analyst Take
While IBM doesn’t have the sheer scale and service breadth of the other large hyperscalers, the company does have a loyal following, especially in highly regulated industries such as banking and finance. The company is also increasingly developing a full-stack approach to AI with Granite language models and InstructLab. Against this backdrop, the introduction of Intel Gaudi 3 AI accelerators on IBM Cloud represents a significant advancement in generative AI, directly addressing the financial challenges associated with its expansion. The IBM AI in Action 2024 report highlights AI’s role in revenue growth alongside the substantial infrastructure costs it incurs. The collaboration between Intel and IBM aims to redefine the economic landscape of AI deployment by offering optimized inferencing and fine-tuning at potentially reduced costs, appealing to enterprises focused on innovation and financial prudence. Positioning itself as a cost-efficient option is a wise move by IBM and could be seen as a competitive advantage within the AI domain.
What Was Announced
Intel Gaudi 3 accelerators are now available on IBM Cloud, offering various deployment options to suit diverse enterprise requirements. Clients can utilize a standalone server within IBM Cloud Virtual Private Cloud (VPC), maintaining control over specialized software stacks supported by Red Hat Enterprise Linux AI image options. Gaudi 3 will be integrated as a worker node for Red Hat OpenShift AI clusters and Red Hat OpenShift on IBM Cloud by Q2 2025. Enterprises can also integrate their watsonx.ai software license into a Gaudi 3-based VPC instance, ensuring autonomy. Deployable architectures (DAs) are planned for the second half of 2025, bundling Gaudi 3 with watsonx software, VPC, and Red Hat OpenShift to streamline deployment. This comprehensive approach emphasizes flexibility and adaptability.
The partnership, leveraging Red Hat’s open-source expertise, aims to create an integrated platform free from proprietary constraints. Intel Trust Domain Extensions (TDX) on IBM Cloud Virtual Servers for VPC enhance security, crucial for data-sensitive environments. These features provide a strong foundation for enterprises managing AI scalability.
Looking Ahead
This announcement, for me at least, underscores the importance of cost performance in AI’s future. While Gaudi 3 is positioned as affordable, its real-world cost effectiveness has yet to be proven in production workloads. The announcement also reflects a trend towards specialized AI hardware in public cloud ecosystems, where accelerators must work across both stand-alone servers and containerized nodes. The integration with IBM’s watsonx.ai and Red Hat OpenShift is noteworthy here and seeks to create a unified experience between development and deployment.
Successful execution and timely delivery of promised milestones, particularly container worker nodes and deployable architectures, will determine the collaboration’s success. The timely and precise rollout of these advancements could provide a competitive edge for IBM, Intel, and their enterprise clients. Looking at IBM’s recent moves, with the CoreWeave announcement being one example, the company is steadily advancing in AI, albeit without much fanfare.
Steven Dickens | CEO HyperFRAME Research
Regarded as a luminary at the intersection of technology and business transformation, Steven Dickens is the CEO and Principal Analyst at HyperFRAME Research.
Ranked consistently among the Top 10 Analysts by AR Insights and a contributor to Forbes, Steven's expert perspectives are sought after by tier one media outlets such as The Wall Street Journal and CNBC, and he is a regular on TV networks including the Schwab Network and Bloomberg.