Zyphra Demonstrates First Large Scale Training on Integrated AMD Compute and Networking Powered by IBM Cloud
Joint collaboration between Zyphra, AMD, and IBM delivers ZAYA1, the first large-scale Mixture-of-Experts foundation model trained entirely on an AMD platform using AMD Instinct MI300X GPUs, AMD Pollara networking, and ROCm software.
SAN FRANCISCO, Nov. 24, 2025 /PRNewswire/ -- Zyphra today announced a major milestone in its AI infrastructure and model development with the release of a technical report showing how Zyphra has demonstrated large-scale training on AMD GPUs and networking. The paper introduces ZAYA1, the first large-scale Mixture-of-Experts (MoE) foundation model trained entirely on an integrated AMD platform (AMD Instinct™ GPUs, AMD Pensando™ networking interconnect, and ROCm software stack), and presents that platform as a viable, high-performance, production-ready alternative for frontier-scale AI training.
Despite operating at a fraction of the active parameter count, ZAYA1-base (8.3B total parameters, 760M active) achieves performance comparable to leading models such as Qwen3-4B (Alibaba) and Gemma3-12B (Google), and outperforms models including Llama-3-8B (Meta) and OLMoE across reasoning, mathematics, and coding benchmarks.
"Efficiency has always been a core guiding principle at Zyphra. It shapes how we design model architectures, develop algorithms for training and inference, and choose the hardware with the best price-performance to deliver frontier intelligence to our customers," said Krithik Puthalath, CEO of Zyphra. "ZAYA1 reflects this philosophy and we are thrilled to be the first company to demonstrate large-scale training on an AMD platform. Our results highlight the power of co-designing model architectures with silicon and systems, and we're excited to deepen our collaboration with AMD and IBM as we build the next generation of advanced multimodal foundation models."
Mixture-of-Experts (MoE) models have become the foundational architecture for modern frontier AI systems, using specialized expert networks that activate dynamically to deliver greater efficiency, scalability, and reasoning performance than traditional dense architectures. This paradigm shift defines today's leading frontier models, including GPT-5, Claude-4.5, DeepSeek-V3, and Kimi2, all of which leverage MoE designs to expand capability while optimizing compute utilization. ZAYA1 represents the first large-scale pretraining of an MoE model on an AMD pl...