Analysis

LTO-6 looking to have content addressability

posted on 30 June 2008 16:03


Enterprises want to search their tapes faster

What are the characteristics of tape that are needed when disk is the intermediate backup target and tape is the final resting place for data? Enterprises want to be able to find what information they have on tape much, much faster. The future characteristics of the LTO format should reflect this.

That's the message coming from Adam Thew, HP's StorageWorks Business Unit Manager in the UK. It's pretty much a given that the LTO-5 format will double LTO-4's raw 800GB capacity to reach 1.6TB. But it won't double LTO-4's I/O speed. The use of disk as the primary backup target means that the pressure to shrink the backup window has moved off mid-range tape.

When LTO-5 comes out, the LTO Consortium will have fairly detailed outlines for LTO-6 and LTO-7. We can be reasonably confident that capacities will continue to double, with LTO-6 having 3.2TB capacity and LTO-7 6.4TB. But the main change, in my assessment of what Adam Thew said, will be to add content-addressability.
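The doubling arithmetic behind those figures can be written out as a short sketch. It assumes only what the paragraph above states: LTO-4's 800GB raw capacity and a doubling per generation. The function name is hypothetical, and the results are projections, not announced specifications.

```python
LTO4_RAW_GB = 800  # LTO-4 raw capacity, as quoted in the article


def projected_raw_capacity_gb(generation: int) -> int:
    """Project raw capacity by assuming one doubling per generation after LTO-4."""
    if generation < 4:
        raise ValueError("this sketch only projects forward from LTO-4")
    return LTO4_RAW_GB * 2 ** (generation - 4)


if __name__ == "__main__":
    for gen in (4, 5, 6, 7):
        print(f"LTO-{gen}: {projected_raw_capacity_gb(gen) / 1000:.1f}TB")
```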

LTO-4 already has a non-volatile memory unit in the cartridge and the LTO-4 format defines what needs to be stored in it. The likelihood is that LTO-6 will have an expanded memory-in-cartridge capability and store much more metadata about the files on the cartridge.

However, this on its own won't help users find out what's stored in their LTO tape estate. What will help is LTO tape automation devices that store that information themselves. We're looking at a two-level index here. The first level is in the automated tape device; the second is on the individual cartridges.
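To make the two-level idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the class names, the barcode strings and the "content key maps to tape position" layout are hypothetical, and nothing here reflects an actual LTO or HP design. The library-level index says which cartridge holds a piece of content; the cartridge-level index says where on that cartridge it sits.

```python
from dataclasses import dataclass, field


@dataclass
class Cartridge:
    """Second level: metadata held in the cartridge's own memory (hypothetical layout)."""
    barcode: str
    files: dict = field(default_factory=dict)  # content key -> position on tape


@dataclass
class LibraryIndex:
    """First level: index held by the automated tape device (hypothetical)."""
    key_to_barcode: dict = field(default_factory=dict)
    cartridges: dict = field(default_factory=dict)

    def add(self, cartridge: Cartridge) -> None:
        # Read the cartridge's index once and fold it into the library-wide index.
        self.cartridges[cartridge.barcode] = cartridge
        for key in cartridge.files:
            self.key_to_barcode[key] = cartridge.barcode

    def locate(self, key: str):
        # Level 1: which cartridge? Level 2: where on that cartridge?
        barcode = self.key_to_barcode.get(key)
        if barcode is None:
            return None
        return barcode, self.cartridges[barcode].files[key]
```

The point of the split is that a lookup need not mount any tape until the library index has already named the one cartridge that holds the content.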

IBM, HP and Quantum are discussing the LTO-6 and LTO-7 formats now. Historically the LTO Consortium decrees an LTO-n tape format, with the next two generations' formats pretty well defined too, and the three consortium vendors compete to supply tape drives and automated tape devices for that format.

There is no reason to suppose that this model, so successful up until now, will be changed. From HP's point of view an automated tape device is now the final stage in a backup and archive two-step. As Thew says: "It's a disk-to-disk-to-tape world now." Most customers buying an HP VTL (virtual tape library) or D2D backup system have not got rid of their physical tape libraries.

This means that the physical tape environment in the future will likely have a deduplicating disk-based front end which carries out the backup to disk and deduplicates the data, gathering a tremendous amount of meta data 'on the fly' as it does so. This meta data, or some of it, could form the foundation for content addressability.

As the data on disk gets written to tape then a relevant subset of it could be written to the tape cartridge's flash memory.
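A rough sketch of that flow follows, under stated assumptions: fixed-size chunking, SHA-256 digests and the field names are illustrative choices, not anything HP or the LTO Consortium has described. The disk front end deduplicates chunks and builds a catalog as a by-product; a compact subset of that catalog is what might travel with the cartridge.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only to keep the example readable


def ingest(name: str, data: bytes, store: dict, catalog: dict) -> None:
    """Deduplicate one file into `store` and record its metadata in `catalog`."""
    digests = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # an already-seen chunk is stored only once
        digests.append(digest)
    catalog[name] = {"size": len(data), "chunks": digests}


def cartridge_subset(catalog: dict, names: list) -> dict:
    """Pick a compact subset of the metadata for the files going to one cartridge."""
    subset = {}
    for name in names:
        entry = catalog[name]
        # One content ID per file, derived from its chunk digests (an assumption).
        content_id = hashlib.sha256("".join(entry["chunks"]).encode()).hexdigest()
        subset[name] = {"size": entry["size"], "content_id": content_id}
    return subset
```

Two files with identical contents would get the same content ID in this sketch, which is the property a content-addressable lookup would rely on.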

The consortium members would take advantage of content meta data being stored on the cartridges in their own individual ways and compete at that level. The LTO format would define tape cartridge contents and formats but not tape drive or automated tape device characteristics.

Thew suggests LTO-5 is too close to delivery to have content addressability features added to it now. The implication is that LTOs 6 and 7 will be the vehicles for this, with LTO-6 being the first opportunity to deliver it.

Another LTO future question is whether, in a final resting place world, it makes sense still to use traditional backup file formats? All Adam Thew would say to that is: "An interesting question."

Clearly LTO-5 will be backwards-compatible with LTOs 3 and 4, and LTO-6 with LTOs 5 and 4, but that doesn't preclude the consortium adding a new file format on the tape if it wished.

Thew says that HP is shipping an increasing number of tape terabytes. Tape is not going away. It's green, it's removable off-site and it holds a huge amount of data very cost-effectively. Adding the ability to search for and retrieve information from it through content addressability would make it even more relevant in a business world that is rapidly becoming more subject to compliance regulations and more aware of legal discovery.

Timetables? Assume LTO-7 comes roughly two years after LTO-6, which will be two years after LTO-5, which is probably going to arrive as product in late 2009 or 2010.

Don't expect the feature on DAT, say on DAT 320, the next generation. SMEs are storing more and more data but don't face the same find-the-content pressures as enterprises.

[Chris Mellor.]


tags:  LTO-4 LTO-5 LTO-6 LTO-7 tape