Note: This blog is by guest author Nick Patience, Founder & Research Vice President of 451 Research. Nick is the firm’s lead analyst for AI and machine learning, an area he has been researching since 2001.
Almost every company wants to be data-driven. According to 451 Research survey data, 91% of organizations claim that at least some of their strategic decisions are driven by data, but only 14% say that nearly all decisions are made in this manner. Clearly, a gap exists between how data-driven enterprises are and how data-driven they want to be.
One way enterprises can become more data-driven is by incorporating more data types into their decision workflows. And one particularly good candidate is unstructured data.
Technically speaking, unstructured data describes any data asset that is not structured, where structured data means highly organized information such as that contained within a relational database. Email, Word documents, video, and audio recordings are all examples of unstructured data.
In fact, many organizations are already aware of this type of data: 451 Research surveys show that 71% of enterprises indicate their unstructured data resources are growing more quickly than other business data. On its face, this makes a great deal of sense because unstructured data is pervasive within an organization. Imagine harnessing all the information traversing your organization’s email network to make more intelligent scheduling decisions. That’s just one example of the potential power of unleashing unstructured data.
The growth of unstructured data creates new challenges & opportunities
While analyzing unstructured data unlocks new use cases for data-driven organizations, its magnitude and ubiquity pose significant challenges.
For one, more data does not automatically translate into more insight. Information value declines over time but costs and risks do not. An outdated version of a slide deck is less useful than a newer one, but organizations must expend resources maintaining both.
To ensure data quality and relevance alongside growing numbers of data consumers, companies need to adopt better forms of information governance. Without such strategies, organizations will find themselves in a vicious cycle: high costs and increased exposure risks create a competitive disadvantage, which decreases revenue and further increases costs and risk.
Another challenge enterprises face with unstructured data assets is the emerging set of compliance regimes spurred by the EU’s General Data Protection Regulation (GDPR). These laws put the onus on corporations to give consumers greater control of their own data. Compliance requires organizations to both understand what data they have on customers – regardless of its form – and act accordingly.
Machine learning is key to next-generation data governance
The scale of analysis required to keep pace with proliferating unstructured data in the enterprise can only be achieved with machine learning technology. In the context of an unstructured asset, machine learning works by identifying and extracting key elements (such as metadata) from the document and transforming its content into a more structured form, which can then be more readily searched or analyzed.
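To make the "unstructured to structured" step concrete, here is a minimal Python sketch that turns one raw email into a searchable record. It is an illustration under stated assumptions, not how any particular product works: the function name `structure_email`, the `STOPWORDS` set, and the output field names are all hypothetical, and the simple word counting stands in for the learned extraction a real machine learning model would perform.

```python
from collections import Counter
from email import message_from_string
import re

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = {"the", "is", "a", "an", "to", "and", "of", "please", "bring"}


def structure_email(raw: str) -> dict:
    """Turn a raw, single-part email into a structured record.

    Assumption: plain word counts stand in for a trained model that
    would normally identify the key elements of the document.
    """
    msg = message_from_string(raw)
    body = msg.get_payload()  # a str for a simple, non-multipart message
    tokens = [t for t in re.findall(r"[a-z]+", body.lower())
              if t not in STOPWORDS]
    return {
        # Metadata pulled from the message headers.
        "sender": msg["From"],
        "recipient": msg["To"],
        "subject": msg["Subject"],
        "date": msg["Date"],
        # A crude content summary: the most frequent non-stopword terms.
        "top_terms": [term for term, _ in Counter(tokens).most_common(3)],
    }


SAMPLE = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Q3 budget review\n"
    "Date: Mon, 01 Jul 2019 09:00:00 +0000\n"
    "\n"
    "The budget review is moved. Please bring the budget slides.\n"
)

record = structure_email(SAMPLE)
print(record["sender"], record["subject"], record["top_terms"])
```

Once every message is reduced to a record like this, the collection can be loaded into a database or search index and queried like any other structured data set, which is the point at which governance and analytics tooling can start to work with it.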
Enterprises are already incorporating unstructured data into their machine learning projects. According to 451 Research data, 42% already do so in their workflows, and this number is expected to rise to 74% in 2020.
The inclusion of unstructured data in enterprise decision-making processes will enable new and improved use cases. But these assets also challenge organizations to adopt better data governance models to ensure value. Machine learning technology provides companies with a means of unleashing these underutilized assets while mitigating the underlying risks.
To learn more about how to realize more value from unstructured data in the enterprise, check out our on-demand webinar with Nick Patience and AI & Analytics Product Marketing Director Zachary Jarvinen, “How to increase the value of your enterprise content with AI.”