
- Posted on Dec 12, 2011 at 5:20 PM GMT by Stephen Ludlow in Information Governance
- Tags: auto-classification, federal government, information governance, rm, us
- Categories: Records Management
In the first part of this series, we looked at the impact of the Nov. 28 MEMORANDUM FOR THE HEADS OF EXECUTIVE DEPARTMENTS AND AGENCIES that gives government agencies four months to come up with a plan to improve records management by moving to electronic records management systems - and the argument as to why the Government needs to consider Auto-Classification.
In this second part, we look at what should be the easiest argument to make for Auto-Classification: legacy content.
Legacy content is the poster child for Auto-Classification. If Auto-Classification should be used wherever humans or business processes cannot easily classify content, legacy content epitomizes that case. Legacy content is the vast and growing volume of unstructured information that we have stored in various forms throughout the organization. It has typically been created through the good intentions of IT, to support lines of business and to meet efficiency requirements. Unfortunately, it was also stored before the idea of retention and disposition was on anybody's radar. The most typical scenarios that we hear are:
- We have 150 million emails in a legacy email archiving application
- We have 50 TB of content on file servers
- We have 5,000 backup tapes that we want to get rid of
Throw Away Legacy Content
Organizations are now looking at these legacy stores of information as a source of ongoing cost, a source of potential cost should they be forced to discover the content, and a source of litigation risk where information that could have been legitimately deleted is still hanging around.
The problem is that this is not a simple process. In order to "throw something away," organizations need to ensure that the content is not subject to a litigation hold and not subject to retention policies associated with compliance or internal policy requirements. This means evaluating information to determine where it fits in the RM schedule and whether it pertains to a litigation hold. In a recent statement to the Civil Rules Advisory Committee, Microsoft stated that two-thirds of its 14,805 litigation holds pertained to matters that had not yet been filed. That is an astounding statement on a couple of levels, but the sheer number of litigation holds is staggering, although not likely atypical. Combine this with RM programs that often involve hundreds of RM schedules, and is it any wonder organizations are not accepting the risk of throwing anything away?
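To make the two checks above concrete, here is a minimal sketch in Python of a disposal test. It is an illustration only, not a description of any product: the `Item` fields, the `can_dispose` function and the idea of a single retention period in whole years are all assumptions made for the example, and a real RM schedule will carry far more nuance.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Set


@dataclass
class Item:
    """Hypothetical record for one piece of legacy content."""
    item_id: str
    created: date
    # Retention period from the RM schedule; None means the item has not
    # been mapped to a schedule yet (assumption: one period, in years).
    retention_years: Optional[int] = None
    # Identifiers of litigation holds this item has been linked to.
    hold_ids: Set[str] = field(default_factory=set)


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def can_dispose(item: Item, today: date, active_holds: Set[str]) -> bool:
    """True only if the item is classified, past retention, and on no active hold."""
    if item.retention_years is None:
        # Unclassified content cannot be evaluated, so it is kept.
        return False
    if item.hold_ids & active_holds:
        # Any overlap with an active litigation hold blocks disposal.
        return False
    return today >= add_years(item.created, item.retention_years)
```

Note that the first branch is the whole problem with legacy content: until an item has been placed in the schedule, the answer is always "keep".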
No Easy Button, but…
There is a solution for this conundrum. As usual, it is not just technology, but people and process too. A great starting point for what is typically being called Content Remediation is The Sedona Conference Commentary on Inactive Information Sources. The commentary provides straightforward advice on the steps that can be taken to evaluate whether information can be dispositioned. In the end, the process boils down to a risk versus cost exercise, where the organization must justify decisions on content dispositioning based on its understanding of the information. A good guiding quote is that, "Generally, organizations will not need to physically review the entire contents of an inactive information store in order to make reasonable, good faith and defensible decisions as to whether it contains information that the organization may be required to retain."
This is where Auto-Classification will be critical. Rather than having to look at every document, Auto-Classification can provide the ability to associate content with the enterprise records schedule and, at the same time, use sampling and review to demonstrate good faith and defensible decisions. Auto-Classification extends the subject matter expertise of the Records Manager into huge volumes of information through the use of analytics, supported by testing and sampling. Obviously, there is a lot of work to be done on the process side, including documenting the steps taken, the risks assessed and the decisions made. Content remediation, supported by Auto-Classification, will still not be easy, but the potential savings can be massive.
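As a rough illustration of the sampling and review step, the sketch below draws a random sample of auto-classified items for a Records Manager to check and then puts a confidence interval around the observed agreement rate. Everything here is an assumption for the sake of example: the function names are hypothetical, and the Wilson score interval is simply one common statistical choice, not something prescribed by the Sedona commentary or by any particular product.

```python
import math
import random


def draw_sample(item_ids, size, seed=0):
    """Pick `size` distinct items at random for human review.

    A fixed seed makes the draw repeatable, which helps when the
    sampling step itself has to be documented.
    """
    pool = list(item_ids)
    rng = random.Random(seed)
    return rng.sample(pool, min(size, len(pool)))


def wilson_interval(agreed, reviewed, z=1.96):
    """Wilson score interval for the share of items where the reviewer
    agreed with the automatic classification (z=1.96 is roughly 95%)."""
    if reviewed <= 0:
        raise ValueError("at least one reviewed item is required")
    p = agreed / reviewed
    denom = 1 + z * z / reviewed
    center = (p + z * z / (2 * reviewed)) / denom
    half = z * math.sqrt(
        p * (1 - p) / reviewed + z * z / (4 * reviewed ** 2)
    ) / denom
    return center - half, center + half


# Example: review 400 items out of a 150,000-item store.
sample = draw_sample(range(150_000), 400)
low, high = wilson_interval(agreed=380, reviewed=400)
```

The lower bound is the figure to record alongside the decision: it is a conservative estimate of how often the classification can be expected to match a human reviewer across the whole store, not just the sample.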
If the US Federal Government is serious about addressing its electronic records, it also needs to be serious about the huge volume of legacy content that it will need to address. Auto-Classification needs to be part of the solution. As with any organization we talk to, the critical question is always, "Will this be any easier next year?"
Argument #3 for Auto-Classification, "The Cloud made me do it," will follow soon.
Last updated Dec 13, 2011 at 9:59 AM GMT