Zyphra Demos Training on AMD GPUs Powered by IBM Cloud

Zyphra released a technical report describing how the company demonstrated large-scale training on AMD GPUs and networking.

Source: Zyphra

The paper introduces ZAYA1, a large-scale Mixture-of-Experts (MoE) foundation model trained entirely on an integrated AMD platform (AMD Instinct GPUs, AMD Pensando networking interconnect and ROCm software stack), and presents that platform as a viable, high-performance, production-ready alternative for AI training. Image courtesy: Zyphra

By DE Editors

November 24, 2025

Zyphra reports a major milestone in its AI infrastructure and model development with the release of a technical report showing how the company has demonstrated large-scale training on AMD GPUs and networking.

"Efficiency has always been a core guiding principle at Zyphra. It shapes how we design model architectures, develop algorithms for training and inference, and choose the hardware with the best price-performance to deliver frontier intelligence to our customers," says Krithik Puthalath, CEO of Zyphra. "ZAYA1 reflects this philosophy and we are thrilled to be the first company to demonstrate large-scale training on an AMD platform. Our results highlight the power of co-designing model architectures with silicon and systems, and we're excited to deepen our collaboration with AMD and IBM as we build the next generation of advanced multimodal foundation models."

ZAYA1 represents a large-scale pretraining run of an MoE model on an AMD platform, demonstrating that the AMD AI ecosystem is ready to power frontier-class AI development end to end.

Zyphra co-designed ZAYA1 around AMD silicon, introducing innovations such as an advanced routing architecture, Compressed Convolutional Attention (CCA), and lightweight residual scaling to achieve higher training throughput and more efficient inference through improved expert use.

"Zyphra's work with AMD and IBM demonstrates how an open platform built on AMD Instinct GPUs and AMD Pensando networking can deliver breakthrough performance and efficiency for large-scale AI," says Philip Guido, executive vice president and chief commercial officer, AMD.

Building on prior joint work, Zyphra worked closely with AMD and IBM to design and deploy a large-scale training cluster powered by AMD Instinct GPUs with an AMD Pensando networking (Ethernet) interconnect. The jointly engineered AMD and IBM cluster, announced earlier this quarter, combines AMD Instinct MI300X GPUs with IBM Cloud's high-performance fabric and storage architecture, providing the foundation for ZAYA1's large-scale pretraining.

"As AI creates opportunities for enterprises to innovate, foundation models are key to unlocking accelerated development, efficiency and productivity," says Alan Peacock, general manager of IBM Cloud. "We are proud to deliver IBM's scalable AI infrastructure as the foundation for ZAYA1's large-scale model and are excited to continue collaborating with AMD on AI model development across our mutual clients."

The collaboration demonstrates how Zyphra's advanced AI research and optimized software stack, combined with the AMD platform powered by IBM's infrastructure through IBM Cloud, can deliver the performance needed for frontier-scale AI model development.

For more information, see the technical report on arXiv, the Zyphra technical blog post, and the AMD blog post.

About Zyphra

Zyphra is a full-stack, open-source superintelligence company based in San Francisco. Zyphra's core research thesis toward general superintelligence is focused on developing next-generation multimodal architectures for long-context reasoning, long-term memory, and continual learning. The company is building two products: Zyphra Inference Cloud, an API inference platform for Zyphra's multimodal models (language, audio and vision) and other open-source models; and Maia, an intelligent assistant for teams that enhances collaboration by bringing search, communication, and productivity tools together in one platform.

Sources: Press materials received from the company and additional information gleaned from the company’s website.
