| Page 76 | Kisaco Research

Arm Neoverse is designed to meet these evolving needs, offering high compute density, exceptional energy efficiency, and a strong total cost of ownership (TCO). As host processors, Neoverse-based CPUs integrate seamlessly with GPUs and AI accelerators to enable flexible, power-efficient, and high-performance deployments across heterogeneous AI platforms capable of managing the complexity and coordination required by agentic AI systems.

In this session, we’ll demo an agentic AI application running on an AI server powered by Arm Neoverse as the host node. The application coordinates multiple agents to accelerate decision-making and streamline workload execution. We’ll also highlight the advantages of running agentic AI on heterogeneous infrastructure, explain why Arm CPUs are ideal as host processors, and demonstrate how Arm provides a scalable, efficient foundation for real-world enterprise and cloud environments.

Author:

Na Li

Principal Solution Architect
Arm

Na Li is Principal AI Solution Architect for the Infrastructure Line of Business (LOB) at Arm. She is responsible for creating AI solutions that showcase the value of Arm-based platforms, and she has around 10 years of experience developing AI applications across various industries. Originally trained as a computational neuroscientist, she received her PhD from the University of Texas at Austin.

AI inference costs are high and workloads are growing, especially when low latency is required. We demonstrate NorthPole's energy efficiency and high throughput for low-latency edge and datacenter inference tasks.

Author:

John Arthur

Principal Research Scientist
IBM

John Arthur is a principal research scientist and hardware manager in the brain-inspired computing group at IBM Research - Almaden. He has been building efficient, high-performance brain-inspired neural network chips and systems for the last 25 years, including Neurogrid at Stanford and both TrueNorth and NorthPole at IBM. John holds a PhD in bioengineering from the University of Pennsylvania and a BS in electrical engineering from Arizona State University.

Author:

Manuel Botija

VP, Product Management
Axelera

Manuel Botija is an engineer with degrees from Telecom Paris and Universidad Politécnica de Madrid. Over the past 17 years, he has led product innovation in semiconductor startups across Silicon Valley and Europe. Before joining Axelera, Manuel served as Head of Product at GrAI Matter Labs, which was acquired by Snap Inc.

Outdated x86 CPU/NIC architectures bottleneck AI systems, limiting Generative AI's true potential. NeuReality's NR1® chip combines two entirely new categories of silicon, an AI-CPU and an AI-NIC, in a single chip, fundamentally redefining AI data center inference. It removes these bottlenecks, boosting Generative AI token output by up to 6.5x at the same cost and power as x86 CPU systems, making AI more affordable and accessible for businesses and governments. The NR1® works in harmony with any AI accelerator or GPU, maximizing GPU utilization, performance, and system energy efficiency. The NR1® Inference Appliance, with built-in software, an intuitive SDK, and APIs, comes preloaded with out-of-the-box LLMs such as Llama 3, Mistral, DeepSeek, Granite, and Qwen for rapid, seamless deployment with significantly reduced complexity, cost, and power consumption at scale.

Author:

Moshe Tanach

Co-Founder & CEO
NeuReality

Moshe Tanach is Founder and CEO at NeuReality.

Before founding NeuReality, he served as Director of Engineering at Marvell and Intel, leading complex wireless and networking products to mass production.

He also served as Vice President of R&D at DesignArt-Networks (later acquired by Qualcomm), developing 4G base station products.

He holds a Bachelor of Science in Electrical Engineering (BSEE), cum laude, from the Technion in Israel.

Five Forces Reshaping AI Infrastructure in 2025

Over the last six months, we held two dozen closed-door interviews with the people who pour the concrete, sign the power-purchase agreements, and deploy the GPUs that drive today's AI boom. They ranged from Fortune-100 cloud operators and traditional utilities to private-equity financiers and immersion-cooling specialists. Taken together, the conversations reveal a market in hyper-growth mode but constrained by physics (power density, transmission capacity, thermal limits) and by a brutally tight equipment supply chain. Five forces rise above the noise and will shape every capital-allocation decision in AI infrastructure during 2025.