De-dupe is his Data Domain

EMC and NetApp running scared

Frank Slootman is Data Domain’s fast-moving and fast-talking CEO. We had a chance to talk with him about whether de-duplication is a feature or a technology worthy enough to stand on its own, plus a couple of other topics.

The background to this is that suppliers such as EMC and others have indicated that data de-duplication is not a technology strong enough to justify storage products on its own. Instead it is a feature or attribute of particular types of storage such as disk-to-disk backup targets (D2D), virtual tape libraries (VTL) and disk-based archives.

Instead of buying a separate de-duplication product, such as one of Data Domain’s DDX arrays, you would buy a VTL, such as the Clariion Disk Library, and find that de-duplication was, of course, included with it.

B&F: What do you think of the ‘de-dupe is a feature’ concept?

Frank Slootman: Why do they say this? There are two reasons. NetApp started this campaign. EMC picked up on it. We came in with a clean sheet. The other guys come with legacy technology, VTLs for example, and have bolted it on as a feature. They do this (de-duplication) typically as a post-process. In their world they’re right. It’s an enabling technology. In our world it is not.

The other reason is that Data Domain is the fastest-growing storage company in the past four years, faster growing than NetApp in its first four years. We’re an enormous threat here. Every quarter the problem gets bigger.

NetApp and EMC hate Data Domain; we’re growing like a weed. We did $45 million last quarter, meaning a near-$200 million run rate. In a couple of years it will be a billion dollar problem for them.

We don’t sell de-duplication; we sell storage. NetApp and EMC give de-dupe away and sell (expensive) storage.
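The run-rate figure quoted above is simple arithmetic, and a quick sketch shows what the billion-dollar prediction implies. The function names here are illustrative, and the calculation assumes steady compound annual growth, which is a simplification rather than anything Data Domain has stated.

```python
def annual_run_rate(quarterly_revenue_musd: float) -> float:
    """Annualize one quarter's revenue (in $ millions) by multiplying by four."""
    return quarterly_revenue_musd * 4


def required_annual_growth(current_musd: float, target_musd: float, years: int) -> float:
    """Compound annual growth rate needed to go from current to target in `years`.

    Assumes constant year-over-year growth (an illustrative simplification).
    """
    return (target_musd / current_musd) ** (1 / years) - 1


run_rate = annual_run_rate(45)  # 45 * 4 = 180, the "near-$200 million" figure
for years in (2, 3):
    growth = required_annual_growth(run_rate, 1000, years)
    print(f"{years} years to $1bn needs about {growth:.0%} growth per year")
```

On these assumptions, a $45 million quarter annualizes to $180 million, and reaching $1 billion within two to three years would mean revenue roughly doubling or more each year.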

B&F: What happens when a Data Domain array is full?

Frank Slootman: Data Domain arrays are expandable; a 6-shelf unit is coming. Data has a life cycle, typically a retention period of 60 to 90 days for most people. Then it goes to die and it gets archived. What do customers do if an array gets full? Buy another one. It’s like water filling buckets; get another bucket.

B&F: How does Data Domain regard MAID technology? (MAID, for massive array of idle drives, is a Copan-invented technology in which most drives in an array are powered down, thus saving lots of electricity and enabling much greater drive packing density.)

Frank Slootman: No, it’s not applicable in our backup environment. Our storage is very active, protecting the data, scrubbing it and checking it all the time. That’s why our systems are so resilient. In an archive environment MAID is possible.

B&F: Who supplies your basic drive arrays?

Frank Slootman: Xyratex builds our shelves. They’re doing a very good job, a fantastic job on disk drive reliability.

B&F: How does Data Domain regard 2.5-inch drive technology?

Frank Slootman: We’re fundamentally a software company. Hardware is about ten percent of our value proposition. People buy software on a stick – and the stick is hardware.

We view the advantage of a smaller form factor as beneficial for performance-oriented or transactional applications that optimize for I/Os per second. Think Oracle or Exchange. A smaller diameter will, at the same RPM, reduce the amount of real estate the head/arm has to traverse. However, these applications tend to write only a small fraction of the medium, typically the outer ring of the platter.

Data Domain optimizes for capacity, or dollars per terabyte. We use every inch of the platter for capacity, unlike some of our competitors whose architectures are disk-bound and need many high-RPM drives to sustain throughput without using the maximum available capacity. Smaller form factors have smaller motors and consume less energy, which is in vogue these days. For Data Domain, we try to optimize the economics of capacity, and so far that has been better done with the larger form factor.

‘Yes’; that’s the answer to the question posed at the start. De-duplication is a feature, but whereas legacy storage array companies have bolted it on to their products, or are doing so, through OEM deals with companies like FalconStor or Diligent, or by acquiring the technology as EMC did when it bought Avamar, Data Domain has de-duplication as a fundamental, designed-in-from-the-start feature. As such it is much better integrated with the array software’s design and architecture.

Data Domain doesn’t sell de-dupe; it sells storage that has de-dupe designed in from the start.
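For readers unfamiliar with what de-duplication done at write time looks like, here is a minimal generic sketch. It assumes fixed-size chunks and SHA-256 fingerprints, and the `DedupeStore` class is hypothetical; it illustrates the general idea only and is not a description of how Data Domain's (or anyone else's) product works.

```python
import hashlib


class DedupeStore:
    """Toy inline de-duplicating store: each unique chunk is kept once."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint (hex digest) -> chunk bytes

    def write(self, data: bytes) -> list:
        """Split data into chunks, store only unseen ones, return the recipe."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # De-duplication happens here, as the data arrives (inline),
            # rather than in a later pass over already-stored data.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its list of chunk fingerprints."""
        return b"".join(self.chunks[d] for d in recipe)

    def stored_bytes(self) -> int:
        """Physical bytes held, counting each unique chunk once."""
        return sum(len(c) for c in self.chunks.values())


store = DedupeStore(chunk_size=4)
backup = b"aaaabbbbaaaa"  # 12 logical bytes, only two unique chunks
recipe = store.write(backup)
print(len(backup), store.stored_bytes())
```

A post-process design would instead land the full 12 bytes on disk first and remove the duplicate chunk in a later pass, which is the distinction Slootman draws in the interview.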

The company’s approach has a lot of traction and it is gaining customers and growing at a great clip. In two or three years it could be, as its CEO predicts and no doubt hopes, a billion dollar company. What on earth is going to stop that happening?

