eDiscovery needs smarter content searching
The military have ‘search and destroy’ missions but eDiscovery is about search and detect missions, and they can cost much more. A major eDiscovery case could involve the review of more than 10 million documents.
These are scanned by expensive lawyers looking for smoking guns relevant to the case they are pursuing or defending. It has to be lawyers because they know the case context and details; they will recognise things that are significant as they review hundreds of thousands if not millions of documents.
It’s the new gold rush, with lawyers panning for gold, but the lawyers cost huge amounts of cash. “Some 70 percent of the cost of litigation can be accounted for by the review period,” according to Jan Puzicha, the chief technology officer (CTO) for enterprise search company Recommind. Unlike the real gold rush, where the 49ers covered their own costs, it’s the lawyers’ clients who pay the litigation fees, and they are feeling the pain.
Recommind produces software that uses smarter searching technology, enabling the lawyers’ billable hours on review to be radically reduced. It can pre-scan documents and discard irrelevant ones, bringing only promising material to the expensive lawyers’ eyes.
Company history and technology
Recommind was started up in 2000 by two computer science PhDs involved in artificial intelligence research looking at self-learning analysis of user click streams and static text. Their software joined disparate data sources into one model and looked at which users accessed which groups of documents. It then recommended other documents the users would find interesting.
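As a rough illustration of recommending documents from access patterns, here is a minimal Python sketch. It is not Recommind's algorithm, which the company does not detail here; it simply scores unseen documents by how much the readers who opened them overlap with the current user. The `recommend` function, the `access_log` structure and the file names are all hypothetical.

```python
from collections import Counter

def recommend(access_log, user, top_n=3):
    """Suggest documents opened by users whose reading overlaps with `user`."""
    seen = access_log.get(user, set())
    scores = Counter()
    for other, docs in access_log.items():
        if other == user:
            continue
        overlap = len(seen & docs)  # documents both users have opened
        if overlap == 0:
            continue
        for doc in docs - seen:     # only documents the user has not seen
            scores[doc] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc for doc, _ in ranked[:top_n]]

# Hypothetical access log: user -> set of documents opened.
access_log = {
    "alice": {"contract.doc", "merger.ppt"},
    "bob": {"contract.doc", "merger.ppt", "valuation.xls"},
    "carol": {"contract.doc", "hr_policy.doc"},
    "dave": {"canteen_menu.doc"},
}
print(recommend(access_log, "alice"))
```

Bob shares two documents with Alice and Carol only one, so Bob's extra document ranks first.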
The company graduated to enterprise search and has customers in the legal area and the media industry, such as Bertelsmann.
The technology involves much more than keyword searching of documents. For example, a search of Amazon’s bookstore on ‘networking’ would produce a huge number of potential hits, 151,532 actually. Recommind would subdivide these possibilities into different categories based on statistical analysis. It would separate out social networking from computer networking, and InfiniBand from Ethernet networking.
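The kind of subdivision described above can be sketched with a very simple stand-in technique: bag-of-words vectors, cosine similarity and greedy grouping. This is an assumption for illustration only, not the statistical model Recommind uses; the function names, the threshold value and the sample titles are hypothetical.

```python
import math
from collections import Counter

def vectorise(text):
    """Turn a title into a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_titles(titles, query, threshold=0.3):
    """Greedily place each title in the most similar group, or start a new one."""
    groups = []  # each entry: (centroid word counts, list of member titles)
    for title in titles:
        vec = vectorise(title)
        vec.pop(query, None)  # every hit contains the query term, so ignore it
        best, best_sim = None, threshold
        for group in groups:
            sim = cosine(vec, group[0])
            if sim >= best_sim:
                best, best_sim = group, sim
        if best is None:
            groups.append((Counter(vec), [title]))
        else:
            best[0].update(vec)  # fold the new title into the group centroid
            best[1].append(title)
    return [members for _, members in groups]

titles = [
    "social networking for business marketing",
    "marketing with social networking sites",
    "ethernet networking switch configuration",
    "ethernet switch networking handbook",
]
for members in group_titles(titles, "networking"):
    print(members)
```

With these four made-up titles the social networking books land in one group and the Ethernet books in another, even though all four match the keyword.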
In the eDiscovery context, Recommind Axcelerate eDiscovery software will look at a document store and index it. Users can then search the store for particular words, phrases and contexts, and the software will winnow the wheat from the chaff, as it were.
The product is typically used by a defendant in a legal case to review documents, including e-mails, PowerPoints, Word files, etc., before handing them over in response to a discovery request. This cuts down the number of documents to be reviewed by lawyers and so cuts down their billable hours.
The software will group documents by similarity and pre-code them. It produces a 3 to 5 times speed-up compared to people manually reviewing documents. The coding represents how responsive a document is to a discovery request and also whether it is privileged or not. All responsive and non-privileged documents have to be delivered to the discovery lawyers. The responsiveness is a score based on statistical analysis of documents, which have relevant terms highlighted.
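The pre-coding step can be pictured with a small Python sketch. The scoring here is a deliberately crude assumption (the share of request terms found in a document), standing in for whatever statistical analysis the product performs; `CodedDocument`, `code_document`, `production_set`, the privilege markers and the sample documents are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CodedDocument:
    name: str
    responsiveness: float  # 0.0 to 1.0, higher means more relevant to the request
    privileged: bool

def code_document(name, text, request_terms, privilege_markers):
    """Pre-code one document with a responsiveness score and a privilege flag."""
    words = set(text.lower().split())
    hits = sum(1 for term in request_terms if term in words)
    score = hits / len(request_terms) if request_terms else 0.0
    privileged = any(marker in words for marker in privilege_markers)
    return CodedDocument(name, score, privileged)

def production_set(coded, min_score=0.5):
    """Names of documents that are responsive and not privileged."""
    return [d.name for d in coded
            if d.responsiveness >= min_score and not d.privileged]

request_terms = ["subprime", "loan"]
privilege_markers = ["counsel"]  # assumed marker of lawyer-client material
documents = {
    "memo.doc": "the subprime loan portfolio was discussed",
    "advice.doc": "counsel advice on the subprime loan exposure",
    "lunch.doc": "team lunch on friday",
}
coded = [code_document(n, t, request_terms, privilege_markers)
         for n, t in documents.items()]
print(production_set(coded))
```

Only the memo is both responsive and non-privileged, so it is the only document in the set to hand over; human reviewers would still check the borderline cases.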
The basic idea is to cut down the number of documents delivered to the discovering lawyers, not to facilitate their search for material to help their case.
Puzicha says: “It’s the review process that’s driving the cost, not the hunt for a smoking gun.”
Does he see Autonomy Zantaz in his market? “We do. There is a bit of a cultural mis-match; Autonomy is a hard employer (whereas) Zantaz was a soft employer. People left Zantaz and joined Recommind.”
Returning to the lead role of the review process he said: “Zantaz hasn’t productised this (which is) odd. It’s 70 percent of eDiscovery costs.”
Recommind also has a litigation hold product for the collection and preservation of documents in the event of litigation. The technology is used to produce the smallest number of documents that need to be retained in an unaltered form, and these are preserved or locked down, by being written to write-once-read-many (WORM) media for example. Typically access is limited to a company’s general counsel or that person’s delegates.
The need for litigation hold is a consequence of the rise and rise in litigation. Pharmaceutical companies are being sued around the globe. Lots of financial institutions are facing legal claims as a result of the US subprime mortgage and allied credit-crunch crisis. Puzicha says: “It’s only going to grow.”
E-mail filing and storage
A third product line is concerned with storing e-mails. Puzicha said: “It shares some of the same technology as search but the storage is standard filing with de-duplication – single instances – such as a signature. It’s our own de-duplication and is checksum-based, not being like Diligent or FalconStor de-dupe.”
The aim is for people to file e-mails very quickly, with the Recommind software prompting them where to file each one. For example, in a legal context e-mails would be stored by case. Finance and HR would have different logical storage structures. This technique reduces the size of Exchange e-mail databases.
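Checksum-based single-instance storage of the kind Puzicha describes can be sketched in a few lines of Python. This is an illustration under assumptions, not Recommind's implementation: the choice of SHA-256, the `SingleInstanceStore` class and the way an e-mail is split into parts are all hypothetical.

```python
import hashlib

class SingleInstanceStore:
    """Keep one copy of each distinct e-mail part, keyed by its checksum."""

    def __init__(self):
        self.blobs = {}   # checksum -> content, stored once
        self.emails = {}  # e-mail id -> ordered list of checksums

    def file_email(self, email_id, parts):
        refs = []
        for part in parts:
            digest = hashlib.sha256(part.encode("utf-8")).hexdigest()
            self.blobs.setdefault(digest, part)  # store only if not seen before
            refs.append(digest)
        self.emails[email_id] = refs

    def read_email(self, email_id):
        """Rebuild an e-mail from its stored parts."""
        return [self.blobs[d] for d in self.emails[email_id]]

signature = "Jane Smith, Partner, Example LLP"
store = SingleInstanceStore()
store.file_email("msg-1", ["Please review the draft.", signature])
store.file_email("msg-2", ["Meeting moved to 3pm.", signature])
print(len(store.blobs))  # the shared signature is stored once
```

Two e-mails with four parts between them occupy only three stored items, because the repeated signature hashes to the same checksum.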
Recommind has potent technology to index and group documents, or any kind of textual material, by using statistical analysis to discover relationships between documents and the ways they might be clustered together. It uses this technology to trawl through a company’s vast document store and produce a drilled-down set of documents in response to a discovery request. It also uses the same technology to build a locked-down set of documents if a litigation hold is imposed by the courts.
A third use of the technology is, again, to reduce the size of a document store by intelligently filing e-mails in logical groups according to the context of the mail and by stripping out duplicated items such as signatures.
Recommind products can be viewed as intelligent searching and organising of enterprise content to minimise storage costs and reduce legal fees when faced with eDiscovery approaches. Traditionally, lawyers would sit down and strip mine a document repository. Recommind substitutes far more efficient follow-the-seam mining techniques and so stops client companies having to pay astronomical fees to lawyers engaged in the eDiscovery gold rush.