Information Governance
Information Governance, Archiving, Records Management, e-Discovery, and Compliance
- Posted on Mar 21, 2012 at 8:15 PM GMT by Elizabeth Kofsky
- 1,528 views
- 0 comments
Implementing an Information Program - Where to Begin?
I am not sure if this has anything to do with the latest heat wave in Ottawa, Canada – yes, we are experiencing summer-like weather in March. Trust me, I am not complaining, but spring is in the air and it seems like it’s making everyone wake up to a very important topic: where to begin with information governance?
I have received a flurry of inquiries over the past week, all with a common theme - where should our organization start? Something that is critical, and reiterated by many, is to ensure an information governance program lies with the Chief Information Officer and Chief Legal Officer, while involving records management and the key stakeholders from different business units. It is a team approach and not something that can be successful if only driven by one team. As opposed to writing an extensive blog post on all the key considerations, I have decided to talk about just one: start with the riskiest content.
Look at your riskiest content first, as it’s not always possible to conquer all requirements at once. There is nothing wrong with a phased approach; proving success in small increments can be a good way to proceed.
One of the riskiest content pieces is email. Why? Simply because no one ever gets rid of it or if they do, it’s done on a whim with no consideration to its content. This type of activity can translate into an enormous exposure to risk due to the likelihood of smoking guns and spoliation.
So, can email be managed? Absolutely. I strongly encourage flexibility, specifically when it comes to classifying email – one approach will not fit all. Some users will want the ability to drag and drop; others will want to choose their classification from picklists or favourites, while others will want the system to automatically classify emails. All methods are encouraged, as you want to make sure you provide a system the end-user will use.
As an example, listen to how NuStar Energy addressed its email management challenges.
NuStar Energy L.P. is a publicly traded, limited partnership based in San Antonio, with 8,417 miles of pipeline; 89 terminal and storage facilities that store and distribute crude oil, refined products and specialty liquids; and two asphalt refineries and a fuels refinery with a combined throughput capacity of 118,500 barrels per day.
With operations in the United States, Canada, Mexico, the Netherlands, the United Kingdom and Turkey, the company uses OpenText Email Management for Microsoft Exchange to help reduce the cost and risk of mismanaged corporate email.
“Having all email in one location, being able to search in one place and put a legal hold in one location instead of potentially seven or eight, is huge for us on the legal end. It’s all managed by OpenText.” - Clint Wentworth, Records and Information Manager, NuStar Energy.
There has been a lot of traction around the topic of auto classification. A very important element is to make sure the offering is defensible at the same time as being transparent. Defensible so that the organization can state to the courts and auditors the processes taking place to test, tweak and monitor the way the information is being classified. Transparent so there is no disruption to the way our businesses work.
Yes, there is a lot to consider when starting to implement an information governance program. A solid strategy, involvement from key stakeholders and addressing your riskiest content first are a sound way to begin!
- Posted on Feb 06, 2012 at 9:00 AM GMT by Elizabeth Kofsky
- 1,604 views
- 0 comments
As I just got back from LegalTech NY, where I spent many hours speaking with General Counsels and IT Directors about their business requirements, I wanted to share and repost a blog, which is relevant to those discussions.
The following is a guest post by Dave Martin - @The_D_Martin
Originally posted on Dave Martin's blog on SharePointProMagazine. Read the other posts in this series here.
Over the years one of the many things I’ve been involved with is governance. To most the word governance is synonymous with compliance, which is then in turn synonymous with records management. After that the focus becomes very specific. What I recommend people do when trying to understand how they should approach governance is to approach it as a strategy and make sure that strategy involves and intertwines three things: people, process and technology.
If this sounds familiar, it was an integral part of the first post I wrote in this series around understanding SharePoint from a big picture perspective. When it comes to governance specifically, there is a certain part of this triumvirate that stands out: the people. We often run headlong into governance deployments without really understanding who needs to be involved before the code hits the servers and processes are under way.
The very first step organizations need to take is defining that small group, who will steer the solution to and through implementation. Obviously IT pops up first as we look to define this working group, and they are unquestionably a very big part as they will be responsible for the technology doing what it needs to do. Another group that should also be considered a bit of a no-brainer is the group or department, or in many cases, the individual responsible for records. This person may be by title the compliance officer, records manager, IT security or legal counsel; regardless, they are responsible for the information policy management of the organization. And last, but certainly not least, we must include someone, or some group, that represents the line-of-business worker, or end user.
Surprisingly, I have seen this last group consistently excluded from the planning process. Not because they are a problem or difficult to work with, but because the people that are actually going to use the solution are often an afterthought, or as IT would consider them: the customer. DO NOT forget to include this group! At the end of the day they will literally make or break the deployment’s success causing problems for both those other groups at the table as they won’t understand the technology (frustrating IT) or they don’t execute according to policy (putting the company at risk).
Once we have the right contributors at the table we can start to define the governance strategy. When people are defining their governance strategy I always promote that they ask themselves a few key questions to help better understand what they want to do, who it will affect and what they need to do it. Once these questions have been answered a plan can be more easily defined.
The first question is: do you understand your content? This is very important and can also be made as a statement: know your content! We have content broadly spread across our environment, not just in SharePoint. If we are planning to move large portions of that content into SharePoint – file share replacement is one of the top uses of SharePoint – think about what you are moving over. Is this relevant data? Is this data that must live under compliance? Is this duplicate data? Is this active data?
This last question is an important one to consider in terms of SharePoint. SharePoint is an active content solution, and a relatively costly place to store content. If you are moving massive volumes of data into SharePoint it just does not make sense to move old, inactive content into SharePoint from a cost perspective. This content should move directly into an archive that lives on a lower and cheaper tier of storage. Once again we must consider “the who” for a second here. Even though we are moving content out of SharePoint and into a more cost effective compliant place we cannot forget that users should be able to access it or restore it (permissions pending) directly from SharePoint.
My next question is: what are your specific compliance requirements? This varies widely from company to company and industry to industry – every company has corporate policies specific to their internal requirements, and many companies have to adhere to industry regulations. SharePoint does a great job of managing the content in SharePoint as records, but does an even better job when supported by partners. As broad as SharePoint’s records capabilities are, when it comes to supporting industry regulations and government guidelines like the Department of Defense 5015.2 (DoD 5015.2), physical records, and records living outside of SharePoint’s native repositories, a third-party add-on solution is a requirement.
And for my last question, we go back to “the who” again: How will we govern the people? Again, for most, information governance has to do with the information, but we must also be sure to govern the people if we are going to be successful. This question relates to how we are enabling people to leverage the core strengths of SharePoint, and this all starts with the creation of Sites and filling them with content. Organizations need a Site provisioning plan in place, or they put the organization as a whole at risk. Site sprawl is not just a myth, it is a reality, but it doesn’t have to be feared. Attaching a lifecycle and policies to a Site at the point of creation will ensure that Sites are connected to the data center and can be managed under the watchful eye of IT. Not only this, but we can now monitor those same sites and move them to the appropriate tier of storage once they have become dormant or inactive. Site provisioning allows organizations to permit the creation of as many or as few sites as required, all in a controlled fashion.
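To make the idea of attaching a lifecycle to a Site a little more concrete, here is a minimal Python sketch. It does not call any SharePoint API; the `Site` record, the 180-day dormancy threshold and the tier names are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy value: a site untouched for this long counts as dormant.
DORMANT_AFTER = timedelta(days=180)


@dataclass
class Site:
    name: str
    owner: str
    last_activity: date
    tier: str = "active"  # illustrative tier names: "active" or "archive"


def apply_lifecycle(sites, today):
    """Move dormant sites to the archive tier and return their names."""
    moved = []
    for site in sites:
        if site.tier == "active" and today - site.last_activity > DORMANT_AFTER:
            site.tier = "archive"
            moved.append(site.name)
    return moved


sites = [
    Site("project-alpha", "jsmith", date(2011, 3, 1)),
    Site("hr-policies", "mlee", date(2012, 1, 20)),
]
moved = apply_lifecycle(sites, today=date(2012, 2, 6))
print(moved)
```

Because the policy is recorded when the Site is created, the same check can run on a schedule without anyone having to remember which Sites exist.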
As you can see, understanding “the who” when defining your governance strategy for SharePoint is a pretty big deal. Not to downplay the value of process or technology, but to use an analogy: it is the person that drives the car down the right road, and it really helps when that person knows where they’re going. Just like a good governance plan for SharePoint, people who drive cars will get to their destination faster if they have good maps.
To find out more, join me on February 21st at noon EST where I’ll be participating in the webinar Extending SharePoint Across Your Information Infrastructure. You’ll learn key concepts required to turn SharePoint into a multifaceted, stable, and powerful IT tool set.
Dave Martin is Director of the Microsoft Solutions Group for OpenText. Email Dave at dmartin@opentext.com or follow him on twitter @The_D_Martin
- Posted on Jan 26, 2012 at 8:00 AM GMT by Elizabeth Kofsky
- 624 views
- 0 comments
If you’re like most law firms and departments, you likely have some pretty major concerns with how to manage the explosive growth of information and enormous costs and risks associated with unmanaged information. Sure, there is a cost to store all that information but it’s a small price to pay when compared to the legal and compliance risk and litigation costs that accompany the growth of that content.
If you’re looking for a bit more balance in your firm or department when it comes to creating, retaining and deleting your information, be sure to visit us at LegalTech next week in New York. We’ll help you assess the risk, value and cost associated with your information governance requirements.
Hope to see you in New York
LegalTech
January 30 - February 1, 2012
The Hilton New York
1335 Avenue of the Americas
New York, NY
Booth #2205
- Posted on Dec 12, 2011 at 5:20 PM GMT by Stephen Ludlow
- 3,589 views
- 0 comments
- Tags: auto-classification, federal government, information governance, rm, us
- Categories: Records Management
In the first part of this series, we looked at the impact of the Nov. 28 MEMORANDUM FOR THE HEADS OF EXECUTIVE DEPARTMENTS AND AGENCIES that gives government agencies four months to come up with a plan to improve records management by moving to electronic records management systems - and the argument as to why the Government needs to consider Auto-Classification.
In this second part, we look at what should be the easiest argument to make for Auto-Classification: legacy content.
Legacy content is the poster-child for Auto-Classification. If Auto-Classification should be used for instances where humans or business process cannot easily classify content, legacy content epitomizes this. Legacy content is that vast, and growing volume of unstructured information that we have stored in various forms throughout the organization. Legacy content has typically been created by the good intentions of IT to support lines of business and to meet efficiency requirements. Unfortunately, it was also stored before the idea of retention and disposition was on anybody's radar. The most typical scenarios that we hear are:
- We have 150 Million emails in a legacy email archiving application
- We have 50 TBs of content on fileservers
- We have 5000 back-up tapes that we want to get rid of
Throw Away Legacy Content
Organizations are now looking at these legacy stores of information as a source of on-going cost, a source of potential cost should they be forced to discover the content, and a source of litigation risk where information that could have been legitimately deleted is still hanging around.
The problem is, this is not a simple process. In order to "throw something away" organizations need to ensure that the content is not subject to a litigation hold and ensure that the content is not subject to retention policies associated with compliance or internal policy requirements. This means evaluating information to determine where it fits in the RM schedule, and it also means evaluating if it pertains to a litigation hold. In a recent statement to the Civil Rules Advisory Committee, Microsoft stated that two-thirds of its 14,805 litigation holds pertained to matters that had not yet been filed. That is an astounding statement on a couple of levels, but the sheer number of litigation holds is staggering, although not likely atypical. Combine this with RM programs that often involve hundreds of RM schedules, and is it any wonder organizations are not accepting the risk of throwing anything away?
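The two checks described above (no litigation hold, retention period expired) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: holds are modelled as a set of custodian names, each item already carries the retention period from its RM schedule, and the `Item` fields are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Item:
    item_id: str
    created: date
    retention_years: int  # assumed to come from the matching RM schedule
    custodians: set = field(default_factory=set)


def add_years(d, years):
    """Add whole years to a date, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def can_dispose(item, held_custodians, today):
    """Return (eligible, reason) for a single item."""
    if item.custodians & held_custodians:
        return False, "litigation hold"
    if today < add_years(item.created, item.retention_years):
        return False, "retention period not expired"
    return True, "eligible"


held = {"bob"}
today = date(2011, 12, 12)
old_mail = Item("mail-001", date(2003, 5, 1), 7, {"alice"})
held_mail = Item("mail-002", date(2003, 5, 1), 7, {"bob"})
recent_mail = Item("mail-003", date(2008, 5, 1), 7, {"alice"})
results = [can_dispose(i, held, today) for i in (old_mail, held_mail, recent_mail)]
```

The hard part in practice is not this check but populating its inputs for millions of legacy items, which is exactly where classification comes in.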
No Easy Button, but…
There is a solution for this conundrum. As usual, it is not just technology, but people and process too. A great starting point for what is typically being called Content Remediation is The Sedona Conference Commentary on Inactive Information Sources. The commentary provides straight-forward advice on the steps that can be taken to evaluate if information can be dispositioned. In the end, the process boils down to a risk vs. cost exercise, where the organization must justify decisions on content dispositioning based on their understanding of the information. A good guiding quote is that, "Generally, organizations will not need to physically review the entire contents of an inactive information store in order to make reasonable, good faith and defensible decisions as to whether it contains information that the organization may be required to retain."
This is where Auto-Classification will be critical. Rather than having to look at every document, Auto-Classification can provide the ability to associate content with the enterprise records schedule, and at the same time, use sampling and review to demonstrate good faith and defensible decisions. Auto-Classification extends the subject matter expertise of the Records Manager into huge volumes of information through the use of analytics and supported by testing and sampling. Obviously, there is a lot of work to be done on the process side, including documenting the steps taken, the risk assessed and the decisions made. Content remediation, supported by Auto-Classification will still not be easy, but the potential savings can be massive.
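As a rough illustration of the "sampling and review" step, the Python sketch below draws a random sample for human review and turns the review result into an estimated accuracy range. It is not how any particular product works; the sample size, the fixed seed and the use of a simple normal-approximation interval are all assumptions made for the example.

```python
import math
import random


def draw_review_sample(item_ids, sample_size, seed=0):
    """Pick items for manual review; a fixed seed keeps the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(list(item_ids), sample_size)


def accuracy_interval(correct, reviewed, z=1.96):
    """Approximate 95% interval for classification accuracy.

    Uses the normal approximation to the binomial, which is reasonable
    for samples of a few hundred items and accuracy not too close to 0 or 1.
    """
    p = correct / reviewed
    margin = z * math.sqrt(p * (1 - p) / reviewed)
    return max(0.0, p - margin), min(1.0, p + margin)


sample = draw_review_sample(range(100_000), 400)
# Suppose reviewers agree with the automatic classification on 380 of 400 items.
low, high = accuracy_interval(correct=380, reviewed=400)
print(f"estimated accuracy between {low:.1%} and {high:.1%}")
```

Recording the seed, the sample and the reviewers' decisions alongside the result is what lets the organization explain later how the number was reached.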
If the US Federal Government is serious about addressing their electronic records, they also need to be serious about the huge volume of legacy content that they will need to address. Auto-Classification needs to be part of the solution. As with any organization we talk to, the critical question is always, "Will this be any easier next year?"
Argument # 3 for Auto-Classification - The Cloud made me do it will follow soon.
- Posted on Dec 08, 2011 at 9:06 AM GMT by Stephen Ludlow
- 1,727 views
- 0 comments
- Tags: auto-classification, federal government, information governance, rm, us
- Categories: Enterprise Content Management, Records Management
There was an interesting confluence of events last week.
- On Nov. 28, President Obama released a MEMORANDUM FOR THE HEADS OF EXECUTIVE DEPARTMENTS AND AGENCIES that gives government agencies four months to come up with a plan to improve records management by moving to electronic records management systems "where feasible."
- On Nov. 30, we announced the availability of OpenText Auto-Classification, the first Auto-Classification solution for Records Management with built-in transparency and defensibility.
We did not plan for these announcements to coincide, but it was definitely good timing. Beyond the obvious additional focus on Records Management the Obama memo has created, a significant argument could be made that the Federal Government will have to adopt Auto-Classification of content in order to be successful in its Records Management programs. In fact, there are at least three arguments:
Argument # 1 for Auto-Classification - We can’t afford not to
Argument # 2 for Auto-Classification - Legacy Content
Argument # 3 for Auto-Classification - The Cloud made me do it
Argument # 1 - We can’t afford not to
The memorandum says, "When records are well-managed, agencies can use them to assess the impact of programs, to reduce redundant efforts, to save money, and to share knowledge within and across their organizations. In these ways, proper records management is the backbone of open Government."
So, with all of these obvious benefits, why is the memo needed at all?
Why aren’t federal agencies addressing Records Management on their own without prompting from the White House and NARA? The truth is, the US Federal government has been largely unsuccessful in managing electronic records, even though more than 90% of all records are now created electronically. In fact, about 95% of federal agencies fail to meet the statutory requirements for maintaining their records according to a NARA self-assessment survey.
Federal Government Agencies are finding out what many organizations struggling with litigation and compliance requirements have known for some time. Records Management is hard! Especially when the scope is expanded to include the plethora of content creating applications and document types that are being used today.
The survey notes a number of issues in the electronic records management programs of many agencies, observing that these agencies:
- Do not ensure that e-mail records are preserved in a recordkeeping system;
- Do not monitor staff compliance with e-mail preservation policies on a regular basis;
- Have policies that instruct employees to print and file e-mail messages;
- Consider system backups a preservation strategy for electronic records, not distinguishing between saving and preserving electronic records;
- Consider compliance monitoring to be the responsibility of IT staff; and
- Are rarely or not at all involved with, or are excluded from altogether, the design, development, and implementation of new electronic systems.
In this list of issues, we see a number of indicators as to why Records Management is hard in US Federal Agencies. We see that it is difficult to declare email as records, that policies are outdated and unenforced, and that archiving is lacking. The list also indicates that Records Managers lack the ability to influence the selection of applications to ensure that they have Records Management, or Record Declaration, capabilities. (BTW - Government Records Managers shouldn’t feel too bad: as Forrester’s Brian Hill points out, this is pretty consistent with what is going on in the private sector too.)
Many of these issues are a result of the difficulty in rolling Records Management out to the knowledge workers that are creating the records. These knowledge workers require:
- Applications where records can be declared and/or captured
- Training on how to classify records and the associated change management
- Monitoring and enforcement
The reason why Auto-Classification will be critical to making Records Management affordable is that the costs associated with training, change management, monitoring and enforcement will be at least an order of magnitude larger than the cost of the technology.
Auto-Classification addresses many of these costs by taking the knowledge worker out of the equation. Auto-Classification evaluates electronic content as it is captured, and using analytics and rules, can associate an RM classification. Auto-Classification reduces the requirement for training and change management, and most importantly, adds consistency and the ability to monitor and enforce records classification that is just not possible when we depend on knowledge workers to classify content.
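To give a feel for the "rules" half of that description, here is a deliberately small Python sketch of rule-based classification. It is not OpenText Auto-Classification and says nothing about how that product works internally; the classification codes and keyword patterns are invented for the example, and a real system would combine rules like these with analytics.

```python
import re

# Hypothetical rules: (RM classification code, pattern). First match wins.
RULES = [
    ("FIN-INVOICE", re.compile(r"\binvoice\b|\bpurchase order\b", re.I)),
    ("HR-PERSONNEL", re.compile(r"\bperformance review\b|\bsalary\b", re.I)),
    ("LEGAL-CONTRACT", re.compile(r"\bagreement\b|\bcontract\b", re.I)),
]
DEFAULT_CLASS = "UNCLASSIFIED"


def classify(text):
    """Return (classification, matched_text) so the decision can be logged."""
    for code, pattern in RULES:
        match = pattern.search(text)
        if match:
            return code, match.group(0)
    return DEFAULT_CLASS, None


invoice_result = classify("Please find attached Invoice 4471 for March.")
lunch_result = classify("Lunch on Friday?")
print(invoice_result, lunch_result)
```

Returning the matched text along with the code is a small design choice that supports the transparency point: every automatic decision carries the evidence it was based on.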
Auto-Classification also addresses the single biggest fear associated with Records Management programs, which is the fear of failure. Having end users declare records, especially with high-volume, ad-hoc content like email requires significant change management. This fear of failure is one of the main reasons why many Records Management programs never get off the ground. Records Managers can use OpenText Auto-Classification, in combination with our leading RM and information lifecycle applications, to develop a programmatic, transparent and defensible approach to classifying content that does not transfer the burden of records management to knowledge workers.
Later this week, I will address
Argument # 2 for Auto-Classification - Legacy Content
Argument # 3 for Auto-Classification - The Cloud made me do it
- Posted on Jul 26, 2011 at 5:05 PM GMT by Jens Huebel
- 4,264 views
- 0 comments
- Tags: cmis, moreq, moreq2010
- Categories: Enterprise Content Management, Records Management
Overview
CMIS deals with accessing content in repositories for Enterprise Content Management, and it tries to cover a broad spectrum of different kinds of repositories. One of the design goals was the capability of exposing existing functionality without the need for changing the underlying implementation. CMIS has built-in flexibility and mechanisms to discover the set of available features. You can even implement CMIS on top of a traditional file system. MoReq2010, on the other hand, is much more specific in scope. MoReq2010 deals with managing records, and this has a number of implications for a repository. To perform records management tasks you will need functions that are not available in every CMIS implementation. For example, CMIS 1.0 does not know about holds, classification or disposal schedules. MoReq2010, for its part, restricts certain operations or forbids them completely (update or delete operations, for example). Still, the two have some design principles in common: both standards define a set of services, sets of metadata and type definitions. Both standards also rely on XML for formal data representations. They differ in that CMIS defines two protocol bindings, SOAP and AtomPub, as wire protocols, and there is no counterpart in MoReq2010. As its primary mechanism for data exchange, MoReq2010 defines interfaces and a data format for exporting and importing data (including mass transfers). Specific interfaces for export/import are not covered in CMIS. Let us first look at some more details and derive some options for potential use cases to get synergies from both standards.
Comparing the services
The second part [Part 2] of my blog series introduced the services in MoReq2010; refer to it for an introductory explanation. Here we will compare what is available in CMIS with what is available in MoReq2010 and see how the two match.
| MoReq2010 | CMIS |
| --- | --- |
| User Group Services | |
| Model Role Service | ACL Service / Policy Service |
| Classification Service | |
| Record Service | Navigation / Object / MultiFiling Service |
| Model Metadata Service | Repository Service |
| Disposal Scheduling Service | |
| Disposal Holding Service | |
| Searching / Reporting Service | Discovery Service |
| Exporting Service | |
| | Versioning Service |
| | Relationship Service |
There is no counterpart for the UserGroupService in CMIS. User Management was considered to be out of scope for CMIS 1.0, because this is a complex topic on its own and there exist other standards for user management. Obviously this leads to some gaps. There is currently a discussion in the CMIS TC around this, so it may change in a future version of the spec [1]. MoReq2010 has specific requirements like unique ids for users and restrictions on how to preserve information after removing users.
The ModelRoleService deals with permissions and defines a concrete permission model based on Access Control Lists (ACLs). CMIS has two concepts for permissions: PolicyService and ACLService. The ACL Service is designed to be open for different implementations and only defines some basic rights. The MoReq2010 model is dynamic in the sense that an administrator can define roles (and assign a set of callable methods in that role). The MoReq2010 permission model could be exposed via CMIS. However, the dynamic nature of MoReq2010 roles is a bit tricky and may lead to confusion for some CMIS clients.
The Classification service, Disposal Scheduling Service and Disposal Holding Service cover specific requirements for records management and do not have a counterpart in CMIS. However not every use case will require the full functionality of these services. Some of the more common use cases can also be covered via CMIS (see below).
The ModelMetadataService deals with metadata and property definitions, which are Type Definitions in CMIS. In CMIS 1.0 each object has exactly one type. MoReq2010 is more flexible and also has a concept for templates. The CMIS Repository Service allows only read access to the types, but type creation is on the roadmap [2]. Another proposed extension called "Secondary Types" will give more flexibility in CMIS [3]. This would allow exposing even more of the MoReq2010 model. It would be possible and useful to have common and standardized type definitions for the MoReq2010 system metadata. Not all property definition fields can be exposed via CMIS (e.g. RetainOnDestruction, PresentationOrder, …) but this should not be a blocker and could be covered through CMIS extensions.
The SearchingReportingService of MoReq2010 corresponds to the CMIS Discovery Service, but the two differ in functionality. MoReq2010 does not define a query language; CMIS does. Reporting capabilities are not covered in CMIS but could be built on top of the existing service. CMIS also does not have saved searches, but a specific object type might be used as a replacement. It would be interesting to build a MoReq2010 compliant implementation of the SearchingReportingService that relies only on the CMIS Discovery Service for the implementation. In this way one implementation could cover a wider set of existing repositories. This is an area where open source implementations would be interesting.
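To illustrate what "CMIS defines a query language" means in practice, here is a small Python sketch that assembles a CMIS 1.0 query string from search criteria, roughly the translation step a search service built on the Discovery Service would need. The helper names are my own, the string escaping follows my reading of the CMIS 1.0 query grammar (backslash before a quote or backslash) and should be checked against the spec, and nothing here talks to a real repository.

```python
def cmis_literal(value):
    """Quote a string for a CMIS query (assumed escaping: \\ and \')."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_query(name_prefix=None, full_text=None):
    """Build a query over the standard cmis:document type.

    name_prefix is passed to LIKE as-is, so any % or _ in it
    will act as a wildcard.
    """
    clauses = []
    if name_prefix is not None:
        clauses.append("cmis:name LIKE " + cmis_literal(name_prefix + "%"))
    if full_text is not None:
        clauses.append("CONTAINS(" + cmis_literal(full_text) + ")")
    query = "SELECT cmis:objectId, cmis:name FROM cmis:document"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    return query


query = build_query(name_prefix="INV-", full_text="overdue")
print(query)
```

The resulting string is what a client would hand to the Discovery Service's query operation through either binding; whether `CONTAINS` is accepted depends on the full-text capability the repository advertises.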
Export and import is a similar story. There is no direct mapping in CMIS for this, but you can implement these services based on CMIS services. MoReq2010 has a lot of requirements on the completeness of data (event history, users, types, etc.). You would need a set of standardized MoReq2010 types in CMIS and it would be interesting to investigate how far you can get providing a common implementation. Probably it will be hard to get full compliance for any existing repository but also a sample implementation as template could be helpful.
For the CMIS RelationshipService there is no counterpart in MoReq2010. However there are concepts in MoReq2010 that can be modeled very well in CMIS using the RelationshipService. An example is the event history. The event history of an object could be modeled as an "EventHistory" relationship type. Having standardized types in CMIS would give us generic functionality in multiple repositories.
MoReq2010 does not support versioning. There are surprisingly few requirements in the specification about content handling in general. The notion of content parts is not supported in CMIS. CMIS only has one content element, but the standard is designed for more flexibility in future extensions.
Overlap or perfect match?
While records management standards cover all the functionality needed for a full-blown records system, there are many use cases requiring only a small subset of this functionality. Sometimes scenarios in an enterprise require interaction between multiple systems and involve records, but are restricted to certain aspects of a record.
A common example might be an ERP system producing records like invoices, personnel files, financial reports etc. The customer of such a system is interested in managing these records in a compliant way. The ERP system has the business context but is not necessarily itself also a records management system. In this case the ERP system has all the information to assign a MoReq2010 classification ("Invoices are kept for 7 years and use classification INV7") and can pass the invoice as a PDF file with metadata and classification to a MoReq2010 compliant repository. The ERP in this case might never be interested in applying a hold or scheduling a disposition. All the records functionality is exposed from the records system and the business system provides the context.
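A minimal Python sketch of the ERP side of this hand-over might look as follows. Only `cmis:objectTypeId` and `cmis:name` are standard CMIS properties; the `rm:record` type and the other `rm:` properties are hypothetical stand-ins for the standardized MoReq2010 types discussed above, and the mapping table is invented apart from the INV7 example.

```python
from datetime import date

# Business knowledge held by the ERP: record kind -> (classification, years kept).
# Only the invoice entry comes from the example in the text.
CLASSIFICATIONS = {
    "invoice": ("INV7", 7),
}


def record_properties(kind, file_name, issued):
    """Build the property set the ERP would send with the PDF."""
    code, years = CLASSIFICATIONS[kind]
    return {
        "cmis:objectTypeId": "rm:record",    # hypothetical record type
        "cmis:name": file_name,
        "rm:classification": code,           # hypothetical property
        "rm:retentionYears": years,          # hypothetical property
        "rm:issuedOn": issued.isoformat(),   # hypothetical property
    }


props = record_properties("invoice", "invoice-4471.pdf", date(2011, 7, 1))
print(props)
```

The ERP only states what the document is; working out the disposal date, applying holds and carrying out the disposition remain the job of the records system.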
For use cases like this both standards can complement each other. CMIS is already available for many repositories and applications and has the advantage of providing a binding. Using the binding you simply can plug both systems together. Another system might just need read access to the document (let's say a call center where customers ask questions about their invoices). The list of available classifications might not change that frequently and can be exported and imported (or transferred as another CMIS type).
If a company has several systems managing records in place, they have to be kept in sync to guarantee the company's record policy. Here many more aspects of a record are involved, and the MoReq2010 standard addresses them much better than a generic standard like CMIS can. Such an application might, for example, want to list all holds, display pending disposal schedules or generate a report with statistics (all covered in MoReq2010).
CMIS may also help in another scenario as an intermediate step if a customer wants to start with classifying today but add the records management system later. These are only a few examples, but there are many use cases where CMIS can be used as a vehicle between applications and repositories where MoReq2010 guarantees the compliance.
Conclusion
There are overlaps between the two standards, but this is not harmful. Instead it can help solve use cases based on existing CMIS implementations or by benefiting from the bindings. MoReq2010 then deals with the advanced records management scenarios requiring more knowledge about records management and guaranteeing compliance of the underlying back-end. Many ideas in these articles should be considered initial ideas, inspiring more thoughts and perhaps some objections as well. None of this is set in stone. It will require more investigation and sometimes a deeper understanding of the spec. But I am sure that we will see a lot more interesting articles and use cases within the next weeks and months. Let's make the best of the two standards by focusing on their strengths where it makes sense.
[Part 1] MoReq2010 And The New RM Debate http://blogs.opentext.com/vca/blog/1.11.623/article/1.26.933/2011/7/7
[Part 2] MoReq2010 A Look Into The Specification http://blogs.opentext.com/vca/blog/1.11.623/article/1.26.935/2011/7/7
[1] User Management in CMIS: http://tools.oasis-open.org/issues/browse/CMIS-724
[2] CMIS Type Mutability: http://tools.oasis-open.org/issues/browse/CMIS-699
[3] CMIS Secondary types proposal: http://tools.oasis-open.org/issues/browse/CMIS-713
- Posted on Jul 11, 2011 at 11:45 AM GMT by Elizabeth Kofsky
- 1,890 views
- 0 comments
Shaping the Connection between Records Management Principles and Information Governance
I have spent the last couple of weeks speaking with customers about their recent implementations. Due to my background and role at OpenText, the customers I engage with always have a Records Management component in their project. What I found remarkable (and am very happy that it’s actually materializing) is the real shift in the organizational culture surrounding Records Management and Information Governance. It won’t be a surprise to anyone that Records Management used to be seen as people in the basement managing dusty boxes – recently, it has found a new, more vital (no pun intended) role within the organization. I believe a key reason for this shift is that organizations are realizing they must evolve and embrace Information Governance (of which RM, Archiving, and eDiscovery are key components). Not only are they embracing Information Governance, they are also realizing the costs and risks they will incur if they don’t.
I just read Barclay Blair’s latest executive brief entitled Justifying Investments in Information Governance which specifically calls out the sources of the costs faced by organizations should they fail to invest in Information Governance. This further substantiates the value of an Information Governance Program.
From recent customer calls, to my delight, I am seeing organizations take the Generally Accepted Records Management Principles (GARP) and apply them to ALL corporate information (meaning, not restricting the fundamental principles of Records Management to “official records” [which we all know only account for a small percentage of all enterprise information]). Looking at two of the eight principles, retention and disposition, I trust you will see the immediate value they bring to all your corporate information.
The principle of retention is about keeping information for as long as required, based on legal, regulatory, fiscal, operational, and historical requirements. This principle needs to be applied to all information so that your organization is mitigating all risks and minimizing all costs associated with information. How can it help? Just look at your shared drives—how many copies/duplicates are residing in your shared drives? How much storage is that taking up? How much does that information increase the risks and costs should your organization be involved in a lawsuit?
You can then add in the principle of disposition, which is about getting rid of information that is no longer required to be maintained by organizational policies or applicable laws. Being able to dispose of information defensibly will minimize the risks and costs associated with all of the information. An easy example to demonstrate the value of disposition is to look at an inbox. How much information within the average inbox is not relevant to the business? How many copies of the same attachment does one normally have? How much information could have been deleted years ago?
Looking at these two principles and examples is just the tip of the iceberg. There is a lot more value for your organization in extending all the Records Management Principles to all your corporate information. There is no doubt in my mind that this approach will result in drastic improvements in how you view and deal with information. I recognize that this will be a transition and will take time, but I strongly encourage your organization to start the journey!
- Posted on Jul 07, 2011 at 3:29 PM GMT by Jens Huebel
- 3,589 views
- 0 comments
- Tags: moreq, moreq2010, recordsmanagement, standards
- Categories: Records Management
The first part of this blog series gave an introduction to the new debate about RM initiated by the release of the MoReq2010 specification. This part is a bit more technical and gives an overview of the spec, its ideas and how they were realized. A blog like this can only scratch the surface of such a complex spec. For a full understanding you will have to read the spec, perhaps not all 520 pages but at least some parts: Section 1.4 is an excellent introduction. Each service chapter starts with an overview of its purpose and scope, followed by a more detailed list of requirements. The function definitions and data type definitions that follow are more reference material.
And the spec is really quite readable. There are lots of explanations; they start by explaining concepts and ideas, with illustrating diagrams, before going into the details. In this regard the committee has done an excellent job. One of the most discussed aspects of MoReq2010 is the attempt to achieve better modularity. How does this work, and what is different in MoReq2010 compared to similar specifications?
The MoReq2010 core (and this is the only part available yet) specifies a set of services and methods each compliant repository must implement. Services are the foundation for how different modules can exchange records or use functionality in other modules. This is the way to move away from having one monolithic application doing everything in a central repository. An example would be an "in-place RM" setup where the records system is outside of the business system. The business system creates and stores the data, and a separate system does the classification and disposal of those records.
To share data between systems, MoReq2010 defines an XML schema and a mandatory requirement for a system to be able to export data according to this schema. The standard has detailed requirements for which data must be exported, guaranteeing that all aspects of a record are preserved. For example, this includes the history (events) and the user identification. The import capability is an optional feature. Keep in mind that for long retention periods like 75 years, multiple export/import cycles due to system changes are very likely.
Some services will usually exist only once in a deployment and are shared between the different modules. Others may occur in multiple incarnations. User management, for example, will in many cases be a shared service. Classification, on the other hand, may be implemented in different schemes, each one being a service.
The following sections give an overview of the services MoReq2010 defines, including their key characteristics and functionality. For a complete description please refer to the specification.
1. User-Group-Service:
Manages users and groups. This is often done within, or with the help of, the corporate directory, but it enforces certain restrictions:
- Users and groups must have a unique id.
- Unique ids are never destroyed even if the user/group no longer exists.
- At any time, even after deletion, it must be possible to reproduce which groups existed and which user was in which group (provide a report for any historical point in time).
- Users and groups have a defined set of mandatory metadata.
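The restrictions above can be sketched in a few lines of Python. This is only an illustration under assumptions: MoReq2010 does not define a programming interface, so the class `UserGroupRegistry` and all of its method names are hypothetical. It shows ids that survive deletion and an append-only membership history that can answer "who was in this group on date X?".

```python
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Membership:
    user_id: str
    group_id: str
    joined: datetime
    left: Optional[datetime] = None  # None while the membership is active


class UserGroupRegistry:
    """Hypothetical in-memory registry; not an API defined by MoReq2010."""

    def __init__(self):
        self.entities = {}     # id -> {"kind", "name", "deleted"}
        self.memberships = []  # append-only history, never purged

    def create(self, kind, name):
        entity_id = str(uuid.uuid4())  # unique id that is never reused
        self.entities[entity_id] = {"kind": kind, "name": name, "deleted": None}
        return entity_id

    def join(self, user_id, group_id, when):
        self.memberships.append(Membership(user_id, group_id, when))

    def delete(self, entity_id, when):
        # Mark as deleted but keep the id and metadata for later reports.
        self.entities[entity_id]["deleted"] = when
        for m in self.memberships:
            if entity_id in (m.user_id, m.group_id) and m.left is None:
                m.left = when

    def members_at(self, group_id, when):
        """Names of the users that were in the group at a point in time."""
        return sorted(
            self.entities[m.user_id]["name"]
            for m in self.memberships
            if m.group_id == group_id
            and m.joined <= when
            and (m.left is None or when < m.left)
        )
```

A deleted user still shows up in a report for a date before the deletion, because neither the id nor the membership history is removed.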
2. Model-Role-Service
Provides the means to manage access control in a repository. Roles contain a list of functions that can be executed and are assigned to users or groups in Access Control Lists (ACLs). ACLs are attached to records or can be inherited from a parent-child relationship (an aggregation classification). Implementation is optional, but if it is not available it must be proven that an equally powerful mechanism exists. Roles are divided into administrative and non-administrative roles that have different inheritance behavior. ACLs can include or exclude inherited roles. A System of Records must be able to tell what functions a specific user can use.
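As a rough sketch of the idea, assuming a simple parent chain and made-up role and function names (none of them come from the spec), the question "what functions can this user use on this entity?" could be answered like this. The different inheritance behavior of administrative and non-administrative roles is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical roles, each listing the functions it allows.
ROLES = {
    "reader": {"read"},
    "records_manager": {"read", "classify", "apply_hold"},
}


@dataclass
class Entity:
    name: str
    parent: Optional["Entity"] = None
    acl: dict = field(default_factory=dict)  # principal -> set of role names
    inherit: bool = True                     # False excludes inherited roles


def effective_roles(entity, principal):
    """Collect roles from the entity and, if allowed, from its ancestors."""
    roles = set()
    node = entity
    while node is not None:
        roles |= node.acl.get(principal, set())
        if not node.inherit:
            break
        node = node.parent
    return roles


def allowed_functions(entity, principal):
    """Union of the functions granted by all effective roles."""
    functions = set()
    for role in effective_roles(entity, principal):
        functions |= ROLES[role]
    return functions
```

An entity with `inherit=False` cuts off everything granted further up the hierarchy, which corresponds to an ACL that excludes inherited roles.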
3. Classification Service
Classifications are used to associate a business context to a record. They are used to determine the disposal schedule. Every record has a class and only one class. A repository may support different classification schemes and then each scheme will get its own service. Classifications can be inherited. The informational model contains a more detailed description of a hierarchical classification scheme.
4. Record Service
The Record Service manages aggregations of records (known as folders in many repositories). A special feature of MoReq2010 is that an aggregation contains either other aggregations or records, but not both. Aggregations are used to manage inheritance of metadata, ACLs and classifications. The maximum number of nesting levels of aggregations can be limited, and an aggregation can be closed (meaning that no more entities can be added). Records can be duplicated in multiple aggregations, and a record can have sub-components called parts (for example an HTML page plus pictures). A repository must support moving records from one aggregation to another.
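The aggregation rules are easy to express as code. The following is a minimal sketch, not the spec's data model: the class names, the `MAX_DEPTH` value and the `move_record` helper are assumptions chosen for illustration, and the depth check only looks at the aggregation being added, not at anything nested inside it.

```python
class Record:
    def __init__(self, title):
        self.title = title


class Aggregation:
    MAX_DEPTH = 3  # hypothetical nesting limit

    def __init__(self, title):
        self.title = title
        self.parent = None
        self.children = []
        self.closed = False

    def depth(self):
        return 1 if self.parent is None else self.parent.depth() + 1

    def add(self, child):
        if self.closed:
            raise ValueError("aggregation is closed")
        child_is_aggregation = isinstance(child, Aggregation)
        # An aggregation holds either aggregations or records, never both.
        if self.children and (
            isinstance(self.children[0], Aggregation) != child_is_aggregation
        ):
            raise ValueError("cannot mix aggregations and records")
        if child_is_aggregation:
            if self.depth() + 1 > self.MAX_DEPTH:
                raise ValueError("maximum nesting level exceeded")
            child.parent = self
        self.children.append(child)


def move_record(record, source, target):
    """Move a record; the target is validated before the source changes."""
    target.add(record)
    source.children.remove(record)
```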
5. Model Metadata Service
The Model Metadata Service deals with the definition of metadata sets for records. Again, implementation is optional, but equivalent functionality must be provided if it is not implemented. MoReq2010 differentiates between system metadata (part of the standard and always mandatory) and context metadata (customer specific). Metadata definitions are grouped into templates and types. Each type has a unique type identifier and each entity in a system has exactly one type. Metadata definitions can have additional constraints (such as optional, mandatory, or a maximum number of allowed values). They can also indicate if their values must be preserved when an entity is destroyed. Datatypes indicate if they are textual and have a language identifier. The standard does not define specific data types (like Date, String, Integer, …) but refers to the XML specification instead.
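To make the constraints more tangible, here is a small sketch of a template check. The field names (`mandatory`, `max_values`, `keep_on_destroy`) are my own labels for the constraints described above, not identifiers from the standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataDefinition:
    name: str
    mandatory: bool = False
    max_values: Optional[int] = None  # None means unlimited
    keep_on_destroy: bool = False     # preserve value when entity is destroyed


def validate(template, values):
    """Return constraint violations for `values` (name -> list of values)."""
    errors = []
    known = {d.name for d in template}
    for d in template:
        supplied = values.get(d.name, [])
        if d.mandatory and not supplied:
            errors.append(f"{d.name}: mandatory value missing")
        if d.max_values is not None and len(supplied) > d.max_values:
            errors.append(f"{d.name}: at most {d.max_values} value(s) allowed")
    for name in values:
        if name not in known:
            errors.append(f"{name}: not defined in template")
    return errors


def residual_metadata(template, values):
    """Values that must survive the destruction of the entity."""
    keep = {d.name for d in template if d.keep_on_destroy}
    return {k: v for k, v in values.items() if k in keep}
```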
6. Disposal Scheduling Service
As the name indicates, this service is responsible for managing the disposition of records. MoReq2010 distinguishes between deletion and destruction. A disposal process never erases a record completely, but instead destroys it. Destruction means that certain information is still preserved (for example that the record existed, its id, when it was destroyed and why, etc.). What remains is called a residual object. When retention periods expire, disposal runs are scheduled and these trigger disposal actions. An action need not be a destroy operation; it can also be a transfer operation or some kind of approval workflow. Retention periods have a start date and an end date. Start dates are triggered by events. Such a trigger event comes in many flavors. It can be a metadata change, a change in the parent aggregation, a certain date and so on. Each record always has only one disposal schedule, but this may change over time. An aggregation is destroyed when the last active record it contains is destroyed. A repository must send alerts when disposal actions are not carried out.
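The difference between deleting and destroying can be illustrated with a short sketch. Everything here is an assumption for illustration (the `Record` fields, a retention period counted in days, a single "destroy" action): the point is only that a disposal run removes the content but leaves a residual object behind.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Record:
    record_id: str
    title: str
    content: Optional[bytes]
    retention_days: int
    retention_start: Optional[date] = None  # set by a trigger event
    destroyed_on: Optional[date] = None
    destroy_reason: Optional[str] = None

    def trigger(self, event_date):
        """A trigger event starts the retention period."""
        self.retention_start = event_date

    def due(self, today):
        if self.retention_start is None or self.destroyed_on is not None:
            return False
        end = self.retention_start + timedelta(days=self.retention_days)
        return today >= end


def disposal_run(records, today, reason):
    """Destroy all due records and return their ids."""
    destroyed = []
    for record in records:
        if record.due(today):
            record.content = None          # the content is gone ...
            record.destroyed_on = today    # ... but the residual object
            record.destroy_reason = reason # keeps id, date and reason
            destroyed.append(record.record_id)
    return destroyed
```

A record whose trigger event has not happened yet is never due, no matter how old it is.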
7. Disposal Holding Service
Holds are a well-known concept in records management. In MoReq2010, a hold is a legal or administrative order to interrupt the disposal process. A hold must always be explicitly removed ("lifted"). The Disposal Holding Service contains the methods to manage holds.
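A hold can be pictured as a filter in front of the disposal process. In this sketch the names `HoldRegistry` and `disposable` are hypothetical; a record stays protected until every hold that covers it has been lifted explicitly.

```python
class HoldRegistry:
    """Hypothetical hold bookkeeping; not an interface from the spec."""

    def __init__(self):
        self._holds = {}  # hold id -> set of record ids

    def place(self, hold_id, record_ids):
        self._holds.setdefault(hold_id, set()).update(record_ids)

    def lift(self, hold_id):
        # Holds never expire on their own; they must be lifted explicitly.
        self._holds.pop(hold_id, None)

    def is_held(self, record_id):
        return any(record_id in ids for ids in self._holds.values())


def disposable(record_ids, holds):
    """Filter out every record that is covered by at least one hold."""
    return [rid for rid in record_ids if not holds.is_held(rid)]
```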
8. Searching-Reporting Service
It is not hard to guess that this service defines the methods for reporting and searching. Searching can be done by metadata or (optionally) as text search in the content. The standard defines a lot of criteria for search and an interesting option to search all metadata text fields simultaneously.
- search by entity type, events, any(!) system or context metadata
- find records by metadata values
- combinations to do complex searches
Searches must reflect access control and search results must be paginated and returned in an ordered list. The standard does not contain a syntax for a query language.
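Since the standard does not prescribe a query language, the following is just one possible shape for such a search, with a made-up record layout (`readers`, `metadata`) and made-up parameter names. It combines metadata criteria, a text match across all textual metadata fields, an access check, ordering and pagination.

```python
def search(records, user, criteria=None, text=None,
           page=1, page_size=10, order_by="title"):
    """Return one page of the records the user may see, in a stable order."""
    criteria = criteria or {}

    def matches(record):
        metadata = record["metadata"]
        if user not in record["readers"]:      # respect access control
            return False
        if any(metadata.get(k) != v for k, v in criteria.items()):
            return False
        if text is not None and not any(
            isinstance(v, str) and text.lower() in v.lower()
            for v in metadata.values()         # all text fields at once
        ):
            return False
        return True

    hits = sorted(
        (r for r in records if matches(r)),
        key=lambda r: str(r["metadata"].get(order_by, "")),
    )
    start = (page - 1) * page_size
    return hits[start:start + page_size]
```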
A repository must be able to generate reports (detailed or summary), but the standard makes no assumption about the output format.
9. Export Service
The export service is used to transfer data out of a system into an XML file (or a set of files). Reasons for exporting data can be transfer, migration or replication of data. The XML schema for the data format is not yet specified. Some use cases may prefer or require a partial export of records, but this is optional and not specified further. The service guarantees to preserve a sufficient level of detail when moving data between systems (for example, it includes all events of a record and referred entities like users). Importing data is an optional feature. The service is used to transfer not only records but also other entities like classes, users, roles, metadata definitions and templates.
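Because the XML schema is not available yet, any concrete export can only be a guess. The sketch below uses invented element and attribute names (`record`, `events`, `users` and so on) purely to show the principle: a record travels together with its event history and the users those events refer to.

```python
import xml.etree.ElementTree as ET


def export_record(record):
    """Serialize a record with its events and referenced users.

    All element and attribute names are placeholders, not the
    MoReq2010 export schema.
    """
    root = ET.Element("record", id=record["id"])
    ET.SubElement(root, "title").text = record["title"]

    events = ET.SubElement(root, "events")
    for event in record["events"]:
        ET.SubElement(
            events, "event",
            type=event["type"],
            user=event["user"],
            timestamp=event["timestamp"],
        )

    # Export every user that the events refer to, each one only once.
    users = ET.SubElement(root, "users")
    for user_id in sorted({event["user"] for event in record["events"]}):
        ET.SubElement(users, "user", id=user_id)

    return ET.tostring(root, encoding="unicode")
```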
Conclusion
MoReq2010 is definitely more than just another records management standard. The service-based approach, the focus on modularity and the premise "keep it simple" are all new ways of specifying requirements for records management. But it is also obvious that it is not the silver bullet for everything. There are parts in the standard that make assumptions about elements deep in the core of a repository. If a specific vendor model does not fit into this approach, it will be hard to implement conformance without breaking compatibility. Other features are not trivial to implement if they have to scale to millions and billions of records in really large systems.
Some caution is also needed to avoid expectations that are unrealistic in the current state. MoReq2010 defines an abstract model with services and methods. It does not define any mechanism for how to implement the services or how to access these methods. Whether this is a Java interface, a web service or something completely different is left open. This means there won't be a test script that you can run against any repository for validation. There won't be a way to simply plug different modules together so that they can talk to each other. MoReq2010 is not like CMIS, and I wonder if this is in scope for a future extension. It does not even define a concrete list of input and output parameters for the methods. You won't get a generic client that can browse the pending disposal schedules, list the holds or do a federated query across multiple repositories. It looks a bit as if the committee stopped halfway. The XML schema is also not yet available, meaning that you can't even begin an implementation at the moment. I fear that this will lead to a lot of frustration when people start looking closer at the standard. But of course this does not mean that there is something fundamentally wrong with it. Many things can be specified on top or added later, but you have to know what you will get at the moment. The committee took an ambitious step. The specification has triggered a new debate about Records Management. Do not underestimate what has happened in the first few weeks after release.
And there are other standards that can partially fill the gap. Have you ever looked at CMIS? Do some of the concepts and terms from above sound familiar to you? Types, folders, ACLs? Is it a conflict or a perfect match? A needless replication of things defined elsewhere? I think it is at least a topic interesting enough to think about in another article.
What else can we do? One thing comes to mind that would really help adoption. RM systems are usually a big investment. It would be helpful to have an implementation to play with: to test, to see how it behaves, and to check if and how such a model can be integrated into your system. Ideally it would be something that you can adapt and change according to your needs. Not for production use but for everything else. Something that fills the gaps of the standard, not as a replacement, but as something that turns it into a runnable piece of code and provides a test system and the necessary tools. What am I talking about? Yes, an open source implementation. Ideally it would not come from a vendor but from a neutral organization. Illusion? Maybe, but also not impossible. There is a community already in ECM; perhaps MoReq2010 will get some traction…
Part I: MoReq2010 And The New RM Debate
Part III: CMIS and Moreq2010: Good Match Or Just Overlap? http://blogs.opentext.com/vca/blog/1.11.623/article/1.26.954/2011/7/26
- Posted on Jul 07, 2011 at 2:30 PM GMT by Jens Huebel
- 4,454 views
- 2 comments
- Tags: moreq, moreq2010, recordsmanagement, standards
- Categories: Records Management
Recently a new version of the MoReq specification (MoReq2010) has been published [1] by the DLM forum.
MoReq2010 is different in many aspects from other standards in the records management area, including previous versions of MoReq itself. This has led to an interesting debate about a new era in records management [2], [3], [4], [5].
Let's take a look at the specification: What is it about, and how is it different from other records management standards?
Records Management is about managing the lifecycle of corporate records, from creation at the beginning, through the flow through business processes and the preservation and archiving phase, to final disposal. Records Management standards for document management systems aim to guarantee a certain system behavior that allows you to enforce rules for how to manage your content. Those rules can have their origin in legal requirements or in company or department policies. There are many aspects of such system behavior, just to name a few:
- prevent deletion before an assigned retention period has passed
- guarantee access only to authorized users
- keep track of changes and an audit trail for each record
- reporting and query capabilities
- classification of records making them manageable on large scale
- following workflows and processes for disposition of records
RM standards are often similar on a high level, but they differ when it comes to details, or they focus on a certain application area, like US DoD 5015.2 for the military and government.
Previous attempts to standardize and certify repository behavior are often specified in text form. An example from the DoD spec 5015.2 version 3:
"RMAs shall provide the capability to define different groups of users with different access privileges. RMAs shall control access to file plan components, record folders, and records based on group membership as well as user account information."
Test cases then are concrete scenarios and look like this: Jan Rangel and Dan Martinez have access to all record categories and record folders. Users assigned the Local Records Administrator Role have permission to create folders in those categories to which they have filing access.
As you can imagine, demands and requirements have increased over time. The rule set gets more and more complex. The test cases focus on one specific use case, which is often very different from those of customers in their daily business. More complexity leads to more complex implementations, to more specializations, and to longer specs that take more time to finalize. And of course more complexity always means more risk of inconsistencies or implementation issues. Sometimes this has weird implications, like features only implemented to pass certification or configuration switches always turned off except for certification.
Customers still often insist on certification according to specific standards. This is not because it reflects their business needs, but because they want approval from a neutral body that the system follows records management rules. It is not perfect for them, but it provides peace of mind and assurance of compliance.
Most often, however, organizations exist in a world different from what the standards anticipate. RM standards traditionally define a system living in a closed world. There is one system managing all records and a records manager overseeing everything and having it all under his/her control. Reality is much more dynamic. Organizations change, and with them their structure and responsibilities. There are mergers and acquisitions bringing new RM and non-RM repositories to manage. Any larger organization will have more than one document management system in place. Some may be specialized for the needs of a specific department. Others may be replaced or updated with newer incompatible versions. Vendors may change due to shifted priorities in their product lines. Some systems may be discontinued and disappear from the market. Very large organizations must decentralize somehow to keep their systems manageable. All this heavily influences the organization of your records, but RM standards do not address these areas. Sometimes not even a full-blown RM system is required, but perhaps there is just a need to get a little bit of structure into your 100TB file shares without moving them entirely.
MoReq2010 is the first attempt to move away from such centralized and monolithic behavior. MoReq2010 tries to reduce the functionality to the required minimum. Instead of global behavior it defines a set of services, functions and data types plus their behavior. Some of them are mandatory, others are optional. Later, more specific modules with additional services or data types may be added (for example a module specialized for healthcare requirements). MoReq2010 does not insist on having everything in a single system. Classifications can be handled in a different system from the one where the content is stored. MoReq2010 defines how to export and optionally import data to get them from one system into another and to guarantee completeness and consistency. It does not necessarily assume that you have a single file plan for the entire company, but it addresses how to share one between multiple systems. Services can be tested in an automated way. There will be no debate about how to model a certification test case in your system and whether the UI is appropriate for that task. Instead you can run a piece of code that performs method calls, checks conditions and finally indicates success or failure. You can distribute responsibilities between systems by calling methods and agreeing on data structures and formats.
So as a second example here is another excerpt from MoReq2010. In addition to a behavioral description MoReq2010 defines a service dealing with user management and a function to create a user. Compare this with the example from above:
3. User and Group Service
F14.5.179 User – Create
System Identifier: 2cde7448-6c71-4cff-988a-973e0701a824
Title: User – Create
Description: Create a user
Entity Type: User (E14.2.16)
Entity metadata modified:
• System Identifier (M14.4.100)
• Created Timestamp (M14.4.9)
…
Purpose: Access control, Event generation
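To give an idea of what "a piece of code performing method calls and checking conditions" could look like for this function, here is a sketch. MoReq2010 does not define how a method is invoked, so the `create_user` method, the dictionary keys and the `InMemoryUserService` class are all assumptions; only the two checked metadata elements (System Identifier and Created Timestamp) come from the excerpt above.

```python
import uuid
from datetime import datetime, timezone


class InMemoryUserService:
    """Hypothetical system under test."""

    def create_user(self, title):
        return {
            "system_identifier": str(uuid.uuid4()),
            "created_timestamp": datetime.now(timezone.utc),
            "title": title,
        }


def _is_uuid(value):
    try:
        uuid.UUID(str(value))
        return True
    except ValueError:
        return False


def check_user_create(service):
    """Call 'User - Create' and return the names of all failed checks."""
    user = service.create_user("Test user")
    checks = {
        "system identifier is a UUID": _is_uuid(user.get("system_identifier")),
        "created timestamp is set": isinstance(
            user.get("created_timestamp"), datetime
        ),
    }
    return [name for name, passed in checks.items() if not passed]
```

An empty list means success; anything else names the conditions that failed.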
The idea is: Keep it simple where possible. Focus on key areas and do not try to address every aspect in one system. Make data shareable and exchangeable but ensure consistency and well defined behavior.
Now the big question is: Is MoReq2010 the revolution in RM? Is this a new era or just another nice but theoretical attempt? And how does it work? Let's have a closer look at the spec. Well, I guess this is another article…
Will they be successful? This is difficult to answer, and the specification has only just been released. Customers will decide if they have addressed the important areas. Vendors will decide if their ideas are implementable with reasonable effort. All this will take time, and as with every other standard there is a chicken-and-egg problem at the beginning… There are many factors that make a standard successful or not, and not only technical ones. But the debate that has started is promising.
We would love to hear some comments from you: What are your expectations from the next generation records management system? How could a standard help you? What are your main pain points today?
And you don't want to read through the 520 pages of the spec? Then wait for the next part:
Part II: MoReq2010: A Look Into The Specification
Part III: CMIS and Moreq2010: Good Match Or Just Overlap? http://blogs.opentext.com/vca/blog/1.11.623/article/1.26.954/2011/7/26
[1] http://moreq2010.eu/
[2] http://thinkingrecords.co.uk/2011/05/06/how-moreq-2010-differs-from-previous-electronic-records-management-erm-system-specifications/
[3] http://www.realstorygroup.com/Blog/2162-Moreq2010-a-DOD5015-slayer
[4] MoReq2010: Met with Enthusiasm and some Criticism http://blogs.opentext.com/vca/blog/1.11.623/article/1.26.855/2011/6/7
[5] http://ecmtalk.libsyn.com/ecm-talk-007-the-launch-of-mo-req-2010
- Posted on Jun 07, 2011 at 6:00 PM GMT by Tracy Caughell
- 3,365 views
- 0 comments
MoReq2010 – met with Enthusiasm and some Criticism.
The Modular Requirements for Records Systems, MoReq2010, Volume 1 was published June 6, 2011. It is an achievement born of the herculean efforts of its author, Jon Garde, but also, in no small part, a volunteer effort that has been met with both enthusiasm and criticism.
- Enthusiasm of the underlying concepts - the need to have standard structures in place for managing active content and content to be preserved properly for all time in all systems; criticism of ‘do we really need another standard?’, or ‘don’t we have a standard for that already?’.
- Enthusiasm for the concept of crowd-sourcing the review; criticism of not allowing enough time for review.
- Enthusiasm for the potential for feedback to be incorporated into the finished product; criticism for not including all comments.
- Enthusiasm for the idea of evolving the standard over time and allowing for versions and ‘pluggable’ modules; criticism for the confusion that will bring to a certification effort and tracking which versions of which modules and which products go with each.
I could go on and on about the highs and lows throughout this process, but one thing is clear – experts, RM Professionals, vendors, consultants and analysts around the world have taken notice. MoReq2010 is hot off the press, and it will not go quietly into the night. Even the criticisms are welcome, as they will help shape the evolution of the standard, and we are all welcome to take part in that evolution – get in the game, follow the DLM Forum, attend the sixth triennial conference in December 2011, sign up for the upcoming workshops, and keep the conversations going on Twitter with #MoReq2010.
More about my personal experience in this process later, but for now I am off to read the spec with great anticipation – Did my comments make it in? How large were the changes? Perhaps a few highs and lows and enthusiasm and criticism are in my near future.
Tracy