Druva uses graph relationships to mine metadata


Interview: Druva has, uniquely, added data graph technology to organize backup metadata for use by its Deep Analysis Agents. These operate within the ambit of DruAI, its intelligent assistant. Various other data protection/cyber resilience suppliers have AI assistants; Cohesity with Gaia, Commvault’s Arlie and Rubrik’s Ruby are examples, as is HYCU’s use of Anthropic’s Claude. These are traditional, vectorizing large language model (LLM)-based systems, which Druva also uses in parts of its DruAI structure. Why did Druva break the mold and adopt this hitherto little-used graph technology?

We sent Druva VP of Product for AI, David Gildea, some questions to find out the answer.

Blocks & Files: What was/were the triggers for using graph database technology to store Druva backup metadata? Other data protection suppliers didn't go this route.

David Gildea, Druva VP of Product for AI.

David Gildea: Security investigations are about following connections: which user touched which file, under what permissions, on which system, and what changed next. That’s a graph problem. You can store metadata in different formats, but graph is the most natural way to capture those relationships and make them usable for AI agents that need context to answer questions reliably.
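The "following connections" idea can be made concrete with a minimal sketch. The entities, relation names, and data below are illustrative assumptions, not Druva's schema; the point is that an investigation is a walk over typed edges:

```python
# Minimal sketch: security metadata as a graph of typed edges.
# Entity names and relationships are illustrative, not Druva's schema.
from collections import defaultdict

edges = defaultdict(list)  # node -> [(relation, node)]

def relate(src, rel, dst):
    edges[src].append((rel, dst))

# "Which user touched which file, under what permissions, on which system"
relate("user:alice", "ACCESSED", "file:payroll.xlsx")
relate("user:alice", "HAS_ROLE", "role:finance-admin")
relate("file:payroll.xlsx", "STORED_ON", "host:fin-srv-01")
relate("host:fin-srv-01", "BACKED_UP_BY", "policy:daily-fin")

def neighbors(node, rel=None):
    """Follow outgoing edges, optionally filtered by relation type."""
    return [dst for r, dst in edges[node] if rel is None or r == rel]

# An investigation is a walk over these relationships:
touched = neighbors("user:alice", "ACCESSED")
hosts = [h for f in touched for h in neighbors(f, "STORED_ON")]
```

In a relational store, each hop in that walk would typically be a join; a graph model makes the hop the primitive, which is why relationship-shaped questions come out naturally.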

Three triggers drove the decision:

1. Security questions are essentially relationship questions.

The biggest questions in cyber resilience aren’t about a single backup or a single file. They’re about the connections across identities, permissions, workloads, policies, and time. When you’re trying to spot blast radius, suspicious access paths, orphaned accounts, or retention gaps, the “shape” of the data is the story. A graph is the natural way to capture and query those relationships quickly and consistently.

2. We want to give customers real-time speed and full context, not after-the-fact reporting.

Legacy “backup intelligence” usually means dashboards and reports that lag reality, plus a lot of manual stitching to get answers. We built Dru MetaGraph so metadata is queryable in place, continuously, with the context intact. That’s what makes it possible to go from “What’s risky?” to “What should I do next?” That gives customers the speed they need during an incident, and it’s reliable and robust for day-to-day compliance and operational decisions.

3. Our SaaS-only architecture is what makes a metadata intelligence layer possible.

We built Druva as a cloud-native SaaS platform from day one, with each customer isolated in their own tenant. That lets us create a tenant-specific intelligence layer where metadata stays encrypted, stays current, and stays private within the customer boundary. 

If your architecture is split across environments and management planes, you can’t get the same always-fresh, unified view without adding layers of complexity and delay. We designed for the outcome: trusted answers at speed, which is absolutely critical during an incident.

Blocks & Files: Could DruAI do what it does with just a generic graph database foundation?

David Gildea: A generic graph database can store relationships. That’s not the hard part. The hard part is turning backup metadata into a living, tenant-specific intelligence layer that’s continuously updated, governed, and ready for agents to query and act on inside the platform.

A graph database is a component. Dru MetaGraph is an end-to-end system. It’s how we collect, normalize, connect, and secure metadata signals across workloads and time, then make them instantly queryable for outcomes like compliance answers, lifecycle actions, and risk investigations. It’s not a “dump metadata here” repository; it’s the platform’s intelligence backbone.

Druva DruAI architecture.

Current context changes everything. If your metadata view is built from exports, batch jobs, or periodic indexing, it drifts. And when it drifts, agents don’t just get slower. They get less reliable, especially in fast-moving investigations where the relationships matter.

Trust and boundaries are built in by design, not added later. Tenant isolation, encryption, policy-aware access, and tight controls are part of the architecture. That’s what lets you use AI to reason over sensitive metadata confidently without creating new exposure.

While others can build something “graph-like,” getting to trusted, always-current, tenant-isolated intelligence that can drive actions is the real challenge. That’s what Dru MetaGraph was built to do.

Blocks & Files: Will other data-protecting, cyber-resilience suppliers find they have to support graph database technology to provide similar or equivalent AI agent capabilities?

David Gildea: Any business that wants true agentic capabilities in cyber resilience has to support relationship intelligence, whether they use graph or not. AI agents for cyber resilience work best when they can connect identities, permissions, policies, copies, and lifecycle events at scale. 

But it’s not just about data models, it’s about end-to-end architecture. A lot of the industry is approaching AI as a layer you add on top. We think that’s backwards for cyber resilience. If the foundation isn’t rock-solid, agents create more noise. While better data delivers better results, it’s just as important that agentic behavior is predictable and reproducible, which means you need a stable data foundation. Other suppliers will have to build that foundation to support the AI agentic capabilities that customers need.

Blocks & Files: Does DruAI run on GPUs or CPUs - or both? And why?

David Gildea: DruAI runs on Amazon Bedrock, so in practice it uses whatever underlying compute AWS provisions for the model and request, typically a mix of GPUs, CPUs, and specialized AI chips. We don’t run or manage dedicated racks of GPUs or CPUs ourselves; AWS abstracts that layer for us.

We chose that approach for three reasons. First, it keeps us from being constrained by fixed hardware, so we can scale up or down with demand without customers having to plan capacity. Second, it lets us stay model-flexible, so we can adopt newer or more efficient models as they become available without re-architecting infrastructure. Third, it gives us tighter control over cost and performance, because usage is metered and we actively govern which models get used for which tasks.

The short version is: AWS manages the hardware, we manage DruAI’s behavior and guardrails, and we track usage and spend closely so the experience stays fast, reliable, and economically sustainable.

Blocks & Files: Does DruAI, with its graph database technology, run into the KV Cache problems that affect vectorizing LLMs and agents?

David Gildea: The KV cache problem was really one of insufficient context: those systems try to identify relationships by loading everything into the cache. By defining the relationships ahead of time in the graph, we can scale our analysis.

Blocks & Files: How does Druva organize its agents? Do they have names? Is there a hierarchy? How is their usage priced? 

David Gildea: DruAI is organized around a capability-based, multi-agent design. Instead of one general-purpose assistant doing everything, we use specialized agents for specific capabilities, coordinated by supervisor agents.

At the core are functional agents with straightforward names:

  • Data Agents handle retrieval and analysis across Druva-accessible data sources, including filtering, aggregation, and summarization.
  • Help Agents are the product experts, grounded in Druva product knowledge, documentation, and best practices so they can guide users and inform other agents.
  • Action Agents execute approved actions, with human-in-the-loop guardrails to ensure security and control for high-impact steps.
  • In-depth Analysis Agents support deeper reasoning and investigation.
  • Generative UX Agents help generate UI outputs, like summaries or guided workflows.

There’s a supervisor agent that coordinates work, but the structure is intentionally flat. Agents can collaborate as needed, and high-stakes steps are gated with explicit user confirmation and oversight controls.
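The structure described above can be sketched as capability-based routing with an approval gate. The agent names follow the interview; the routing logic itself is a hypothetical illustration, not Druva's implementation:

```python
# Illustrative sketch: a supervisor routes work to specialized agents,
# and high-impact actions are gated behind explicit user approval.
# Logic and return values are assumptions, not Druva's code.

def data_agent(request):
    return f"summary for: {request}"

def help_agent(request):
    return f"docs guidance for: {request}"

def action_agent(request, approved):
    # Human-in-the-loop guardrail for high-impact steps.
    if not approved:
        return "blocked: awaiting explicit user confirmation"
    return f"executed: {request}"

ROUTES = {"query": data_agent, "howto": help_agent}

def supervisor(kind, request, approved=False):
    """Route a request to the matching capability; gate actions."""
    if kind == "action":
        return action_agent(request, approved)
    return ROUTES[kind](request)
```

The flat structure shows up in how little the supervisor does: it dispatches and enforces the approval gate, but each agent owns its own capability.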

Underneath the agents, Dru MetaGraph provides the queryable metadata context that helps responses stay grounded and fast.

On pricing, today we don’t meter individual agent usage as separate line items. The goal is to make these capabilities broadly available to improve day-to-day user experience and outcomes. Over time, as we expand the set of outcome-oriented agents and workflows, packaging may evolve, but our intent is to keep it predictable and transparent, and not turn core assistance into a nickel-and-dime model.

Blocks & Files: Could AI agents be patentable software entities? Is this a practical idea?

David Gildea: Potentially, yes, but it depends on what you’re trying to patent. The most defensible patents in this area usually aren’t about the “agent” label; they’re about a specific, novel method: how an agent is orchestrated to solve a defined technical problem, how it grounds decisions in authoritative data, how it applies controls, and how it executes governed actions.

That’s where we see the real intellectual property. The underlying foundation models are widely available, but the hard part is the applied system around them:

  • Orchestration and control logic for breaking a request into steps, routing work to the right capability, and handling edge cases safely.
  • Grounding and verification methods that keep outputs tied to trustworthy sources like metadata context, rather than free-form generation.
  • Governed action patterns, including approval gates for high-impact operations and strong auditability.

Is it practical? In many cases, yes. Companies are already pursuing patents around concrete AI-assisted workflows and control methods, and there’s a reasonable path to protecting real engineering innovations. 

That said, the patent itself isn’t the main point. The bigger defensibility comes from the way the agent is implemented and operated inside a platform: the quality of the data context it can rely on, the controls around it, and the iteration required to make it consistently trustworthy in real environments. Our SaaS delivery model helps here in a grounded way: we can ship improvements consistently and keep behavior standardized, which reinforces an innovation moat over time, whether specific techniques are patented or kept as trade secrets.