AI/ML

DDN, Nvidia team up to cut inference costs and boost GPU utilization

DDN is aiming to make AI usable by enterprises and governments as they adopt inference and deploy AI-driven processes and pipelines as part of their operations.

It is supporting Nvidia hardware and software with its storage systems: providing the Horizon AI control plane for enterprises, partners, and service providers; supplying industry-specific AI pipeline blueprints; and partnering to deliver sovereign AI.

Alex Bouzari, CEO and Co-Founder at DDN, said: "The next phase of AI isn't about buying more GPUs – it's about operationalizing them. Trillions in AI infrastructure investment must translate into real economic output, and that requires a unified control plane. DDN Horizon transforms fragmented GPU clusters into secure, multi-tenant, revenue-ready AI platforms. Whether for sovereign AI, cloud providers, or enterprises, orchestration is the multiplier that turns infrastructure into an economic engine."

Alex Bouzari

Inference is compute- and storage-intensive, and DDN wants to lower the cost per token by reducing GPU idle time. Its software-defined AI data services are now aligned with Nvidia's STX reference architecture and run directly on BlueField-4 data processing units (DPUs). With this, DDN says it delivers direct GPU-to-data paths that reduce latency, lower power consumption, and increase effective GPU utilization across both training and inference workloads.
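
The cost-per-token claim follows from simple arithmetic: the less a GPU idles on I/O, the more tokens its fixed hourly cost is spread across. A minimal sketch, with hypothetical prices and throughput (none of these figures come from DDN):

```python
# Illustrative only: how GPU idle time inflates cost per token.
# All numbers (hourly rate, peak throughput, utilization) are invented.

def cost_per_token(gpu_hour_cost: float, peak_tokens_per_hour: float,
                   utilization: float) -> float:
    """Effective cost per token when the GPU is busy only `utilization` of the time."""
    effective_tokens = peak_tokens_per_hour * utilization
    return gpu_hour_cost / effective_tokens

# Example: a $3/hour GPU that can emit 1M tokens/hour when fully busy.
low = cost_per_token(3.0, 1_000_000, 0.40)   # 40% utilization (I/O stalls)
high = cost_per_token(3.0, 1_000_000, 0.80)  # 80% utilization (stalls removed)
print(f"{low:.2e} vs {high:.2e} per token")  # cost halves when utilization doubles
```

Doubling utilization halves cost per token, which is why the vendors frame storage latency as an economics problem rather than a capacity problem.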

DDN says its object-based Infinia OS accelerates inference performance and improves AI factory economics by eliminating data bottlenecks at scale, with:

  • Up to 27x faster KV cache loading with a distributed acceleration fabric and deep Nvidia Dynamo integration
  • Sub-millisecond latency that eliminates I/O stalls and maximizes GPU utilization at production scale
  • Double-digit reductions in cost per token, materially improving inference economics
  • Removal of KV cache memory capacity as the critical path for large context windows and agentic AI workloads
  • Alignment with Vera Rubin architecture goals, enabling up to 10x lower inference token cost
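
The KV cache claims rest on a simple mechanism: if the attention key/value state for a previously seen prompt prefix can be loaded from a fast external tier instead of recomputed, the expensive prefill pass is skipped. A minimal sketch of that idea (hypothetical code, not the Infinia or Dynamo API):

```python
# Minimal sketch (hypothetical, not DDN's or Dynamo's actual API): an external
# KV store lets an inference server skip prefill recomputation for repeated prompts.
import hashlib

kv_store: dict[str, bytes] = {}  # stands in for a shared cache tier

def prefill(prompt: str) -> bytes:
    # Placeholder for the expensive attention key/value computation.
    return hashlib.sha256(prompt.encode()).digest()

def get_kv(prompt: str) -> tuple[bytes, bool]:
    """Return (kv_cache, was_cached). A hit avoids the prefill pass entirely."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in kv_store:
        return kv_store[key], True   # cache hit: load instead of recompute
    kv = prefill(prompt)             # cache miss: pay the full prefill cost once
    kv_store[key] = kv
    return kv, False

_, hit1 = get_kv("summarize this 100k-token report")
_, hit2 = get_kv("summarize this 100k-token report")
print(hit1, hit2)  # False True
```

The faster the cache tier loads, the larger the context windows for which loading beats recomputing, which is what the "27x faster KV cache loading" figure is aimed at.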

Infinia is available on Oracle Cloud Infrastructure via Oracle Cloud Marketplace, supporting production-ready deployment in minutes, DDN claims.

The alternative Lustre-based EXAScaler parallel filesystem OS features:

  • Comprehensive multi-tenancy architecture for production AI factories
  • Per-tenant KMIP encryption, quota enforcement, and API-driven VLAN lifecycle management for secure isolation at scale
  • Self-service provisioning and full API control
  • Instant tenant onboarding, modification, and retirement via API call
  • Runs on customer-selected standard servers, decoupling performance from all-flash hardware dependency – meaning HDDs can be used as well as SSDs
  • Up to 10x more training throughput on infrastructure customers already own and control
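
The tenant lifecycle described above (onboard, modify, retire via a single API call) can be pictured with a small control-plane sketch. Everything here is invented for illustration; it is not DDN's actual EXAScaler API:

```python
# Hypothetical sketch of API-driven tenant lifecycle management.
# Field and method names are invented, not DDN's real interface.
from dataclasses import dataclass

@dataclass
class Tenant:
    name: str
    quota_tb: int       # quota enforcement
    vlan_id: int        # VLAN lifecycle management
    kmip_key_id: str    # per-tenant KMIP encryption key
    active: bool = True

class TenantManager:
    """Stands in for the self-service provisioning control plane."""
    def __init__(self) -> None:
        self.tenants: dict[str, Tenant] = {}

    def onboard(self, name: str, quota_tb: int, vlan_id: int, kmip_key_id: str) -> Tenant:
        tenant = Tenant(name, quota_tb, vlan_id, kmip_key_id)
        self.tenants[name] = tenant
        return tenant

    def modify(self, name: str, **changes) -> Tenant:
        tenant = self.tenants[name]
        for attr, value in changes.items():
            setattr(tenant, attr, value)
        return tenant

    def retire(self, name: str) -> None:
        self.tenants[name].active = False

mgr = TenantManager()
mgr.onboard("acme-ai", quota_tb=500, vlan_id=120, kmip_key_id="key-acme-01")
mgr.modify("acme-ai", quota_tb=750)  # grow the quota in place
mgr.retire("acme-ai")                # retirement via one call
print(mgr.tenants["acme-ai"].quota_tb, mgr.tenants["acme-ai"].active)
```

The point of the pattern is that isolation primitives (key, VLAN, quota) are bound to the tenant object, so one call creates or tears down all of them together.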

All three offerings – the software-defined AI data services, AI-focused Infinia, and EXAScaler systems – will be generally available by the summer.

DDN's Horizon

Horizon is an orchestration platform that operationalizes AI-as-a-Service across the AI lifecycle with a unified control plane.

Horizon, DDN says, orchestrates compute (GPUs and CPUs), high-performance data services, networking, and end-to-end AI pipelines to provide a Platform-as-a-Service experience.

It transforms large-scale GPU and data infrastructure into a secure, multi-tenant, revenue-ready AI platform, enabling service providers to move beyond basic GPU rental and deliver full AI-as-a-Service offerings through:

  • Self-service provisioning of compute, storage, and AI workspaces
  • Policy-driven governance and tenant isolation
  • Lifecycle management from training through inference
  • Integrated usage tracking, chargeback, and billing
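
The usage tracking and chargeback item reduces to metering per-tenant consumption against a price card. A toy sketch, with invented meter names and rates (DDN has not published Horizon's billing model):

```python
# Illustrative chargeback sketch: meter names and rates are hypothetical.
from collections import defaultdict

RATES = {"gpu_hour": 2.50, "storage_tb_month": 20.00}  # assumed price card

usage: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float))

def record(tenant: str, meter: str, amount: float) -> None:
    """Accumulate metered consumption per tenant."""
    usage[tenant][meter] += amount

def invoice(tenant: str) -> float:
    """Price the tenant's accumulated usage against the rate card."""
    return sum(RATES[meter] * qty for meter, qty in usage[tenant].items())

record("lab-a", "gpu_hour", 1200)        # training run
record("lab-a", "gpu_hour", 300)         # inference endpoints
record("lab-a", "storage_tb_month", 40)
print(f"lab-a owes ${invoice('lab-a'):,.2f}")  # 1500*2.50 + 40*20 = $4,550.00
```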

Enterprise Horizon customers can use Horizon to deliver a unified private AI cloud model with centralized governance and policy enforcement, tenant isolation and observability, and transparent chargeback and cost allocation.

Horizon supports sovereign AI as governments and national operators can deliver locally hosted AI platforms for regulated industries. They can ensure data, models, and work products remain within national borders, and accelerate AI adoption across banking, telecom, public sector, research, and startup ecosystems.

DDN claims that, by simplifying provisioning at scale, Horizon allows emerging Nvidia Cloud Partners (NCPs) to compete directly with hyperscalers. They maintain local control and differentiated service models, and can launch revenue-ready AI clouds in weeks, not quarters.

IndustrySync pipelines

DDN launched two new IndustrySync Pipelines – validated, industry-specific AI blueprints – initially available in the Financial Services and Life Sciences verticals. IndustrySync Pipelines are pre-integrated, production-validated workflows that are available immediately for DDN Enterprise AI HyperPOD deployments, with expansion available to Nvidia DGX SuperPOD and HGX environments, as well as sovereign AI factories. Each pipeline is designed as a flexible, extensible foundation, built to adapt to specific environments, data, and workflows.

They can be deployed on existing customer infrastructure in days rather than months, and are aligned to industry-specific business outcomes, running on DDN and Nvidia's validated, fully orchestrated technology stack.

The IndustrySync Financial Services (FSI) Pipeline enables:

  • Up to 150x faster risk simulations, converting overnight batch jobs into continuously updated intraday intelligence
  • Expected shortfall and risk regime calculations refreshed every five minutes across all GPUs
  • Faster response to market volatility with real-time capital and liquidity insights
  • Accelerated move from AI experimentation to production deployment – in days, not months
  • Deterministic performance and built-in governance to meet strict regulatory and risk requirements
  • Immediate infrastructure readiness with a data layer operational from day one
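
Expected shortfall (ES), the measure the pipeline refreshes every five minutes, is the average loss in the worst tail of a scenario distribution. A self-contained sketch with made-up portfolio numbers (DDN's pipeline would run this over GPU-generated Monte Carlo scenarios, not Python lists):

```python
# Sketch of an expected-shortfall (ES) calculation; portfolio numbers are invented.
# ES at confidence level alpha is the mean loss in the worst (1 - alpha) tail.
import random

def expected_shortfall(losses: list[float], alpha: float = 0.975) -> float:
    """Average of the losses at or beyond the alpha-quantile (worst tail)."""
    ordered = sorted(losses, reverse=True)           # worst losses first
    tail_n = max(1, int(len(ordered) * (1 - alpha)))
    return sum(ordered[:tail_n]) / tail_n

random.seed(7)
# Monte Carlo P&L scenarios; intraday refresh means regenerating this batch
# every few minutes instead of in an overnight batch job.
scenarios = [random.gauss(0, 1_000_000) for _ in range(100_000)]
es = expected_shortfall(scenarios, alpha=0.975)
print(f"97.5% ES: ${es:,.0f}")
```

Moving from an overnight batch to a five-minute refresh does not change this formula, only how often the scenario set feeding it is regenerated.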

The Life Sciences Pipeline is a production-ready blueprint, built on Nvidia industry reference architectures and integrated with Nvidia's BioNeMo platform, enabling research institutions to operationalize AI-driven drug discovery at scale. It is designed to:

  • Accelerate genomics, protein structure analysis, and foundation model workflows
  • Shorten time from data ingestion to model training and inference
  • Reduce time-to-insight for therapeutic candidate discovery
  • Enable multi-modal biomedical AI at production scale
  • Eliminate infrastructure rebuilds between experimentation and deployment
  • Complete deployment in days, not months

DDN is launching a 90-day IndustrySync Financial Services Early Adopter Program. Applications are now open, and availability is limited.

It plans to expand IndustrySync Pipelines throughout 2026 with additional industry workflows, including expanded financial services use cases and intelligent video and surveillance pipelines.

Zadara and Aleria

There are two more DDN AI announcements, the first involving a partnership with IaaS provider Zadara to deliver high-performance AI infrastructure for sovereign clouds and multi-tenant AI factories. These are built on Nvidia Reference Designs for multi-tenant clouds. Zadara and DDN will combine Zadara's AI-optimized, cloud-native orchestration and GPU-aware infrastructure platform with DDN EXAScaler's performance for large-scale AI training and inference. In effect, EXAScaler will be the high-performance AI data layer for Zadara's AI Factory and sovereign cloud deployments. Target customers are service providers, telcos, and enterprises.

Secondly, DDN is working with sovereign intelligence platform supplier Aleria on the DDN and Aleria Sovereign AI Factory. This is claimed to be a complete reference architecture based on Nvidia's Vera Rubin DSX AI Factory reference design and its Omniverse DSX Blueprint. It should deliver sovereign, auditable AI infrastructure for governments and highly regulated enterprises.

DDN says EXAScaler and Infinia provide a unified file and object storage platform tuned as a high-performance AI data lake, delivering deterministic GPU utilization over BlueField-accelerated, DOCA-enabled fabrics with strict workload isolation and national data residency enforcement.

Aleria delivers the sovereign intelligence layer – a modular platform spanning Legal AI, HR Intelligence, Finance AI, Board Advisory, and compliance – deployable on-premises or in compliant local clouds with full auditability and no vendor lock-in. Omniverse DSX and DSX SIM function as the engineering control plane – simulating AI factory behavior under real-world load, validating power and thermal envelopes before construction begins, and defining operational boundaries that persist from simulation into live production. 

DDN and Aleria say governments and regulated enterprises gain the same maximum tokens-per-watt performance and accelerated time-to-revenue as the world's most advanced AI factories – while keeping all data and intelligence under domestic control, meeting national security standards, data residency requirements, and federal certification frameworks.

The Sovereign AI Factory is already production-validated. Deployments include maritime coastal surveillance in Indonesia, AI-powered cultural intelligence at the Louvre Abu Dhabi, and citizen-facing applications through the Halas app.

Ten sovereign sites are planned for deployment within the next 24 months, with the first three implementations finalized for operational deployment in 2026. Target regions include the UAE, Oman, Indonesia, Portugal, US, Brazil, and Europe. Vera Rubin and GB300 hardware allocations are already secured.

The DDN Sovereign AI Factory is available now as a reference architecture with a single price list – one SKU, one subscription, purchased directly from DDN. DDN manages the Aleria OS integration and the Nvidia relationship. 

Bootnote

Nvidia BioNeMo is a development platform for AI-driven biology and drug discovery. It includes open models, libraries, datasets, and NIM microservices for the AI lifecycle, enabling researchers and developers to build, customize, and deploy AI applications that drive the next experiment.