Transparent and defensible Auto-Classification designed for Records Managers by the industry leader in Records Management

Corporate information governance practitioners know that defensible disposal of unstructured content is a key outcome of sound information governance programs. The trouble is that those same practitioners also know that very few organizations actively dispose of content that no longer has business or legal value.

So what’s stopping them? The simple truth is that instilling a sound approach to records management as part of the organization’s information governance strategy is rife with challenges.

Explosive content volumes, difficulty accurately distinguishing business records from “transient” or non-business-related content, IT budgets eroded by mounting storage costs, and the need to incorporate content from legacy systems or merger and acquisition activity are but a few.

The Classification Paradox: The Core Challenge

For records managers and others responsible for building and enforcing classification policies, retention schedules, and other aspects of a programmatic records management plan, the problem with traditional, manual classification methods can be summed up by what is called The Classification Paradox.

At the core, the issue is that content needs to be classified or understood in order to determine why it must be retained, how long it must be retained, and when it can be disposed of. Managing the retention and disposition of information reduces litigation risk, reduces discovery and storage costs, and ensures organizations maintain regulatory compliance.

Paradoxically, classification is the last thing end-users want (or are able) to do. Users see the process of sorting records from transient content as intrusive, complex, and counterproductive. On top of this, the popularity of mobile devices and social media applications has effectively fragmented the content authoring market and has eliminated any chance of building consistent classification tools into end-user applications.

Obviously, if classification isn’t being carried out, there are serious implications when regulators or auditors ask the organization to provide reports that defend its records and retention management program.

User concerns aside, records managers also struggle to enforce policies that rely on manual, human-based approaches. Accuracy and consistency in applying classification are often inadequate when left up to users, and the costs in terms of productivity loss are high. These issues, in turn, increase business and legal risk and can quickly make the entire records management program unsustainable as it tries to scale.

Solving the Mystery: Auto-Classification for a Defensible and Transparent Records Management Program

So what is the answer? How can organizations overcome the challenges posed by The Classification Paradox? The requirement is clear: a solution that provides automatic identification, classification, retrieval and, ultimately, archival and disposal capabilities for electronic business records and transient records as policy dictates. That solution is OpenText Auto-Classification.

OpenText Auto-Classification is a next-generation solution that combines industry-leading records management with cutting-edge semantic capabilities for classification of content. It eliminates the need for business users to manually identify records and apply requisite classifications. By taking the burden of classification off the end-user, records managers can improve consistency of classification and better enforce rules and policies.

More importantly though, OpenText Auto-Classification makes it possible for records managers to easily demonstrate a defensible approach to classification based on statistically relevant sampling and quality control. Consequently, this minimizes the risk of regulatory fines and eDiscovery sanctions.
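To make the idea of statistically relevant sampling concrete, here is a minimal sketch of how a records manager might size a review sample and estimate classification accuracy from it. It is not OpenText's implementation. The function names, the 95% confidence level, and the 5% margin of error are all assumptions chosen for illustration, and the formulas are the standard sample-size calculation with a finite population correction and a normal-approximation confidence interval.

```python
import math
import random


def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Documents to review for a given margin of error (hypothetical helper).

    Uses the standard sample-size formula with a finite population
    correction. z=1.96 corresponds to roughly 95% confidence, and p=0.5
    is the most conservative guess at the true accuracy.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def draw_sample(doc_ids, n, seed=0):
    """Pick n documents at random, reproducibly, for manual review."""
    rng = random.Random(seed)
    return rng.sample(list(doc_ids), n)


def accuracy_interval(correct, reviewed, z=1.96):
    """Normal-approximation confidence interval for the observed accuracy."""
    p_hat = correct / reviewed
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / reviewed)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


if __name__ == "__main__":
    n = sample_size(100_000)               # 383 documents out of 100,000
    to_review = draw_sample(range(100_000), n)
    low, high = accuracy_interval(correct=360, reviewed=n)
    print(n, len(to_review), round(low, 3), round(high, 3))
```

Under these assumptions, reviewing a few hundred randomly chosen documents is enough to put a documented range on the accuracy of a much larger automatically classified set, which is the kind of evidence an auditor or regulator can examine.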

OpenText Auto-Classification offers a best-of-both-worlds solution to what has been a series of extremely expensive and complex issues to solve.

In short, it provides a non-intrusive solution that eliminates the need for business users to sort and classify a growing volume of low-touch content, such as email and social media, while offering records managers and the organization as a whole the ability to establish a highly defensible, completely transparent records management program as part of their broader information governance strategy.


Apply records management classifications as a consistent, programmatic component of a sound Information Governance program to:

Reduce:

  • Litigation risk
  • Storage costs
  • eDiscovery costs

Improve:

  • Compliance
  • Security
  • Responsiveness
  • User productivity and satisfaction

Address:

  • The fundamental difficulties in applying classifications to high-volume, low-touch content such as legacy content, email, and social media content
  • Records manager and compliance officer concerns about defensibility and transparency



Automated Classification: Automate the classification of content in OpenText Content Server in line with existing records management classifications

Advanced Techniques: Classification process based on a hybrid approach that combines machine learning, rules, and content analytics

Flexible Classification: Ability to define classification rules using keywords or metadata

Policy-Driven Configuration: Ability to configure and optimize the classification process with an easy “step-by-step” tuning guide

Advanced Optimization Tools: Reports make it easy to examine classification results, identify potential accuracy issues, and then fix those issues by leveraging the provided “optimization” hints

Sophisticated Relevancy and Accuracy Assurance: Automatic sampling and benchmarking with a complete set of metrics to assess the quality of the classification process

Quality Assurance Workbench: Advanced reports on a statistically relevant sample let reviewers manually review and code documents that have been automatically classified, assessing the quality of the classification results when desired
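
As an illustration of the Flexible Classification feature listed above, the following sketch shows one way that rules based on keywords or metadata could be expressed. It is a simplified, hypothetical example and not the OpenText Auto-Classification API. The `Rule` class, the `classify` function, the sample classifications, and the "Transient" default are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical rule: matches on any keyword OR on all metadata pairs."""
    classification: str
    keywords: tuple = ()
    metadata: dict = field(default_factory=dict)

    def matches(self, text, meta):
        text_lower = text.lower()
        keyword_hit = any(k.lower() in text_lower for k in self.keywords)
        # An empty metadata condition never matches on its own.
        metadata_hit = bool(self.metadata) and all(
            meta.get(key) == value for key, value in self.metadata.items()
        )
        return keyword_hit or metadata_hit


def classify(text, meta, rules, default="Transient"):
    """Return the classification of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(text, meta):
            return rule.classification
    return default


# Illustrative rules only; real classifications come from the file plan.
RULES = [
    Rule("Contracts", keywords=("agreement", "indemnify")),
    Rule("HR Records", metadata={"department": "HR"}),
]

if __name__ == "__main__":
    print(classify("This Agreement is made between the parties.", {}, RULES))
    print(classify("Lunch on Friday?", {"department": "HR"}, RULES))
    print(classify("Lunch on Friday?", {}, RULES))
```

In this sketch the first matching rule wins, so rule order acts as a simple priority. A hybrid approach like the one described under Advanced Techniques would presumably combine rules of this kind with machine learning and content analytics rather than rely on rules alone.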