OpenText™ Knowledge Discovery (IDOL) provides a data analytics platform for enterprises who need to extract maximum value from all their text, audio, video, and image data, from any repository in any file format. Providing data extraction, data enrichment, precise search and knowledge discovery, Knowledge Discovery helps organizations discover valuable information they did not know they had, whilst also identifying compliance risk associated with document contents such as personally identifiable information (PII) using entity grammars.
With an unparalleled history in artificial intelligence and machine learning, Knowledge Discovery provides a unique set of optimized models to fit any application, accelerating time to value. It connects to and gathers information from diverse locations with over 160 connector types, including public and private cloud and on-premises sources, Salesforce, Microsoft® 365, Google Workspace, and OpenText Content Solutions, and can access and retrieve information and knowledge from over 2000 file types.
Check out the latest OpenText Cloud Editions (CE) announcement to learn more about the most recent release:
December 2024: What’s new in OpenText Knowledge Discovery CE 24.4
Organizations are looking to access all their enterprise data, but in a world where 90% of existing corporate knowledge is in unstructured formats, over 50% of organizations are not tapping into this aspect of their knowledge with any form of discovery. Customers can use Knowledge Discovery with new interfaces, including natural language questioning, to reach their information securely across all users.
This OpenText Knowledge Discovery (IDOL) 24.4 release includes various functional and performance improvements, new connectors, file format support, and many other additions.
New API added
Search Abstractor REST API – Created to support conversational question answering with better context for AI-generated responses. The conversation server, which works with FAQ answers through Answer Bank and Fact Bank and now with LLMs, remembers the context of questions within the conversation service in Knowledge Discovery. This adds to the governance already provided by curated answers.
Filtering support in the Search Abstractor adds parametric fields for pre-filtering documents. Search results can provide a fine-grained view of single or multiple documents through detailed filter criteria. Searches can now use prior knowledge to focus RAG document retrieval, using parametric fields to build a subset of entities for RAG.
The Search Abstractor now works across text documents and images simultaneously. A search can take both text and images as criteria, and the results also include similar images, making multimodal search available.
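The pre-filtering idea described above can be illustrated with a minimal, product-independent sketch. It does not use the Search Abstractor API: the `Document` class, the field names, and the toy term-overlap ranking are all assumptions made for illustration. The point is only the order of operations: narrow the candidate set on parametric fields first, then rank what remains for RAG.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical document model, not a Knowledge Discovery type.
    doc_id: str
    text: str
    fields: dict = field(default_factory=dict)  # parametric field name -> value


def prefilter(docs, criteria):
    """Keep only documents whose parametric fields match every criterion."""
    return [d for d in docs if all(d.fields.get(k) == v for k, v in criteria.items())]


def rank_by_overlap(docs, query, top_k=3):
    """Toy relevance score: number of distinct query terms found in the text."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.text.lower().split())), d) for d in docs]
    scored = [(score, d) for score, d in scored if score > 0]
    # Sort on the score only; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def retrieve_for_rag(docs, query, criteria, top_k=3):
    """Pre-filter on parametric fields, then rank only the surviving subset."""
    return rank_by_overlap(prefilter(docs, criteria), query, top_k)


corpus = [
    Document("a", "quarterly revenue report for emea", {"department": "finance", "year": 2024}),
    Document("b", "revenue forecast and hiring plan", {"department": "hr", "year": 2024}),
    Document("c", "annual revenue report", {"department": "finance", "year": 2023}),
]

context = retrieve_for_rag(corpus, "revenue report", {"department": "finance", "year": 2024})
print([d.doc_id for d in context])  # ['a']
```

Filtering before ranking keeps the context passed to the generative model small and on topic, which is the benefit the release note attributes to parametric pre-filtering.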
Ingest & connectors
- Salesforce – Updated the connector to stay current with changes in Salesforce.
- Additional pre-filtering for focused selection added to:
- OpenText Core Content Management
- OpenText Content Management
- IBM FileNet P8 – Each document in FileNet can have multiple binary files, which we now interrogate.
Text analytics
- Analytics: LLM-based image matching – Image-based transformers can be used to generate vectors for the Knowledge Discovery (IDOL) Index. Customers can now search for similar images based on not just the textual description but also the actual image content.
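As a rough illustration of vector-based image matching, the sketch below ranks stored image vectors by cosine similarity to a query vector. It is not the Knowledge Discovery index or API: the three-dimensional vectors are hand-written stand-ins for the output of an image transformer, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_vec, index, top_k=2):
    """Return the names of the top_k stored images closest to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Toy "index": image name -> embedding (assumed values, for illustration only).
image_index = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "invoice.png": [0.0, 0.1, 0.9],
}

print(most_similar([1.0, 0.0, 0.0], image_index))  # ['cat_1.jpg', 'cat_2.jpg']
```

Because the comparison uses the image vectors themselves, two pictures can match even when their textual descriptions share no words.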
Media Server
- The Media Server is now available as a NiFi processor with full functionality. This enables NiFi workflows and the ability to scale and manage the Media Server app through NiFi clustering.
- A demo application that can process and analyze images, among other functions, is now available in the Media Server for ease of use.
File content extraction
- Filtering of source code comments.
- Support added for metadata output for:
- .mht files.
- QuickTime (.mov) files.
- Extended format detection, with support for 93 additional file formats.
- In HTML Export, when you use the pdf2sr reader to extract images from pages in a PDF file, you can now configure the size of the images produced.
Deployment & other improvements
- Helm charts for Kubernetes provide a comprehensive set of options to assist complex deployments. The 24.4 release brings further enhancements to our Helm charts.
- Named Entity Recognition SDK: You can load multiple grammars and then dynamically turn off those that are not needed for a particular document. This gives the flexibility of swapping grammars with less performance degradation than repeatedly loading and unloading them. This release reduces the overall impact of that process.
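The load-once, toggle-per-document pattern in the Named Entity Recognition item above can be sketched as follows. This is not the Named Entity Recognition SDK API: `GrammarSet`, its methods, and the two regular expressions standing in for grammars are hypothetical, chosen only to show why toggling is cheaper than reloading.

```python
import re


class GrammarSet:
    """Hypothetical stand-in: compile patterns once, then toggle them per document."""

    def __init__(self, grammars):
        # Compilation plays the role of the expensive "load" step and happens once.
        self._compiled = {name: re.compile(pattern) for name, pattern in grammars.items()}
        self._enabled = set(self._compiled)

    def enable_only(self, names):
        """Switch the active grammars without recompiling anything."""
        unknown = set(names) - set(self._compiled)
        if unknown:
            raise KeyError(f"unknown grammars: {sorted(unknown)}")
        self._enabled = set(names)

    def extract(self, text):
        """Return (grammar name, matched text) pairs for the enabled grammars."""
        matches = []
        for name in sorted(self._enabled):
            for match in self._compiled[name].finditer(text):
                matches.append((name, match.group()))
        return matches


grammars = GrammarSet({
    "email": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "us_phone": r"\b\d{3}-\d{3}-\d{4}\b",
})
text = "Contact jane@example.com or 555-123-4567."

print(grammars.extract(text))   # both grammars active
grammars.enable_only(["email"])  # no reload, just a different active set
print(grammars.extract(text))   # only the email match remains
```

The design choice is that `enable_only` changes a set of names, so switching grammars between documents costs almost nothing compared with a load and unload cycle.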
July 2024: What’s new in OpenText Knowledge Discovery (IDOL) CE 24.3
OpenText Knowledge Discovery CE 24.3 is a significant release for the third quarter of CY24. There is an increasing need for companies to reach a greater variety of data, which adds to the complexity of their search types. We allow all customers to achieve a suitable response in their enterprise-wide search across all levels of an organization.
Now part of Content Services, this combination of best-of-breed products will help organizations that want to access all their corporate knowledge through enterprise search. It brings additional abilities to derive deeper insights, reach actionable conclusions more quickly, and gain a new level of investigative analytics across content, teams, and projects. Document-level security is also enhanced, with extra features to report and export securely and to track data delivery end to end.
The Knowledge Discovery 24.3 release includes various functional and performance improvements, new connectors, file format support, and many other additions.
The main improvements in version 24.3 are listed below:
Ingest
- Microsoft Teams, Zoom, Cisco Webex and Google Chat connectors – New connectors allow customers to collect collaborative video meetings for retrospective analysis and/or content retrieval as part of Aviator search.
Text Analytics
- Search abstractor – The RAG step provides better context to pass on to the generative AI model, so customers are more likely to receive a correct answer.
- Specific document retrieval – Customers can ask the system to find and return a specific document that matches their unique specification.
Rich Media Analytics
- OCR improvement – Improved OCR for better support of scrolling text. Customers will experience a higher accuracy of extracted text from video imagery.
- Speaker ID improvement – Customers will experience a higher accuracy of Speaker ID from video or audio files.
KeyView
- Added Audio and Video to the Export SDK.
- Added 30 new formats to File detection.
- Metadata API improved to limit duplicated data.
- New .NET API.
- Added support for Python 3.12.
Solutions
- Knowledge Discovery Discover – Discover provides investigative analytics and an advanced UI for searching and analyzing relations between objects, with project and team collaboration along with full oversight of the analytics process.
- Knowledge Discovery for Microsoft Exchange – Knowledge Discovery can be added as a bot in Microsoft Exchange, allowing customers to access its functionality directly through simple natural language prompts emailed to a bot email address in Microsoft Exchange.
Deployment & Licensing changes
All components are now published to the Public Docker Hub repository to allow easier installation, maintenance and upgrades.
- Eduction Python EDK – Customers who prefer to program using the popular Python language can now use it to control our Eduction engine, simplifying their integration experience.
- Eduction for Windows ARM – OEM users of Eduction EDK can now deploy on Windows ARM systems.
- Additional license feature
- From CE 23.2, we are introducing an additional license feature – a “version.key” file.
- The file is available in the SLD portal for customers with an active support contract and is required for all Knowledge Discovery installations running CE 23.2.
- A new key will be issued with every future Knowledge Discovery release.
April 2024: What’s new in OpenText Knowledge Discovery (IDOL) CE 24.2
Organizations are looking for conversational access to enterprise data, but in a world where 90% of existing corporate knowledge is in unstructured data, over 50% of organizations are not tapping into this aspect of their knowledge with any form of discovery. Customers can use the new Q&A interfaces with natural language questioning to deliver information securely to all users.
This OpenText IDOL 24.2 release includes various functional and performance improvements, new connectors, file format support, and many other additions.
New solution added: Teams client
- Customers can access functionality using simple natural language prompts directly via the Teams interface.
Ingest & connectors
- Drupal Connector – Updated to support the latest API changes and allow data extraction from old and new Drupal versions.
- Google Workspace Connectors – Content extraction is now accessible from the major Google Workspace apps including Mail, Calendar and Chat using dedicated connectors.
- Web Connector 2FA – Updated to allow two-factor authentication.
- Additional new connectors – Stack Exchange, Moodle, and OpenText eDocs.
- IDOL Media Server – Can now run within NiFi, allowing processing of media streams & sources.
Text analytics
- Analytics: Search abstraction – IDOL can now automatically decide which index type to use to achieve the best response. The functionality abstracts the complexity of search types, democratizing Enterprise search, and allowing all users to achieve a suitable response, across all levels of an organization.
- Multi-document summarization – A generative summary compiled from answers sourced from multiple documents. The functionality provides a richer answer to any question asked, because the answer can combine multiple parts sourced from different documents.
- Dynamic clustering of vector results to identify grouping – Customers can use the grouping of results to select a varied set of documents, rather than returning very similar documents that add no information.
- Analytics: LLM-based image matching – Image-based transformers can be used to generate vectors for the IDOL Index. Customers can now search for similar images based on not just the textual description but also the actual image content.
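The dynamic clustering of vector results mentioned in the list above can be approximated with a simple greedy grouping. The sketch below is one possible approach under stated assumptions, not the algorithm IDOL uses: the similarity threshold, the two-dimensional vectors, and the function names are all illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_results(results, threshold=0.95):
    """Greedy clustering of (doc_id, vector) pairs given in rank order.

    A result joins the first cluster whose representative (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for doc_id, vec in results:
        for cluster in clusters:
            if cosine(vec, cluster[0][1]) >= threshold:
                cluster.append((doc_id, vec))
                break
        else:
            clusters.append([(doc_id, vec)])
    return clusters


def diverse_top(results, threshold=0.95):
    """Return one representative per cluster, so near-duplicates are collapsed."""
    return [cluster[0][0] for cluster in cluster_results(results, threshold)]


ranked = [
    ("policy_v1", [1.0, 0.0]),
    ("policy_v2", [0.99, 0.05]),
    ("faq", [0.0, 1.0]),
    ("policy_v3", [0.98, 0.1]),
]

print(diverse_top(ranked))  # ['policy_v1', 'faq']
```

Here the three near-identical policy drafts collapse into one group, so the returned set covers two distinct topics instead of three versions of the same document.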
Media Server
- The Media Server is now available as a NiFi processor with full functionality. This enables NiFi workflows and the ability to scale and manage the Media Server app through Kubernetes.
- A demo application that can process and analyze images, among other functions, is now available in the Media Server for ease of use.
KeyView
- Tabular data detection for OneNote and pipe-separated text.
- OCR is now available on macOS through the KeyView SDKs.
- Headers and footers are now configurable through the Python APIs.
- Significant performance improvements through shared memory streaming.
- Additional Formats, Security & Compatibility improvements.
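To give a sense of what tabular data detection in pipe-separated text involves, here is a small heuristic sketch. It is not how KeyView detects tables: the function name and the rule (every non-empty line must split into the same number of cells, at least two) are assumptions for illustration.

```python
def detect_pipe_table(text, min_rows=2):
    """Return rows as lists of cells if the text looks like a pipe-separated table, else None.

    Heuristic: every non-empty line must split into the same number of cells (>= 2).
    Leading and trailing pipes are treated as borders and ignored.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < min_rows:
        return None
    rows = [[cell.strip() for cell in line.strip("|").split("|")] for line in lines]
    width = len(rows[0])
    if width < 2 or any(len(row) != width for row in rows):
        return None
    return rows


table_text = "name | qty\nbolt | 4\nnut | 12"
prose_text = "This is ordinary prose.\nThere is no table here."

print(detect_pipe_table(table_text))  # [['name', 'qty'], ['bolt', '4'], ['nut', '12']]
print(detect_pipe_table(prose_text))  # None
```

A production detector would need to handle escaped pipes, ragged rows, and tables embedded in surrounding prose, which this sketch deliberately ignores.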
Deployment & other improvements
- Helm charts for Kubernetes provide a comprehensive set of options to assist complex deployments. CE 24.2 brings further enhancements to our Helm charts.
- Eduction SDK: Post-processing access to match context.