By the year 2020, global revenues from big data and big data analytics solutions are expected to grow to $203 billion, from $130 billion in 2016. In today’s hyper-competitive business landscape, more and more executives are recognizing the importance of utilizing big data to make data-driven decisions and drive strategies. In fact, according to a survey of 599 business and IT decision makers, 69% believe that the biggest benefit of utilizing big data is creating better strategic decisions.
Big data analytics is made possible by data integration. By enabling access to data stored in disparate data warehouses, mapping changes from one enterprise application to another, and delivering real-time information to intended users, data integration enables enterprises to collect and clean big data coming from various systems for analysis.
Aside from big data analytics, data integration also enables other business benefits, such as a 360-degree view of datasets, faster collaboration across the entire organization, greater flexibility in selecting enterprise applications and systems, and streamlined, automated business processes.
But what is data integration? What tools are enterprises using today to enable it? How do these tools help overcome today’s integration challenges?
What is Data Integration?
Data integration is the process of enabling users to access, deliver, and utilize data across entire organizations while maintaining its quality and integrity. It also enables changes made to data stored in one source to be reflected in other sources in real time. Gartner defines data integration as not only the process itself, but also the practices, the architecture, and the tools that enable it.
What are Some of the Tools Used Today to Enable Data Integration?
Data integration requires different tools depending on what sources need to be connected and the complexity of integration. However, selecting a data integration tool is not merely about choosing which tools offer the most features. The best approach to choosing the right integration tool is to first determine the enterprise’s integration requirements. Questions to ask include: What kind of source systems will the tool integrate, cloud-based or on-premises? What is the size of the enterprise? Does the integration require simple, lightweight solutions or does it require complex, customizable solutions?
Traditionally, data integration tools catered to different needs and submarkets. For example, one set of tools was required to integrate datasets and data warehouses, while another set was required to integrate enterprise systems and applications. Integrating datasets and data warehouses was a submarket distinct from that of integrating enterprise applications and systems. There was also a separate submarket for business-to-business (B2B) data transfers.
Today, there is a convergence among these submarkets as enterprises are recognizing the need to look at data integration holistically. Enterprises are utilizing complete and comprehensive data integration tools that are capable of integrating datasets, warehouses, enterprise applications, systems, and even business-to-business IT architectures. They differ only in the technology they utilize, the types of systems they support and connect, and their capabilities.
Comprehensive data integration tools usually fall under one of the following two categories:
Enterprise Service Bus (ESB)
- ESB technology has been in use for more than two decades. An ESB is an integration tool that acts as a central hub, enabling distinct systems and applications to communicate and connect with each other. The central hub has a separate component for each integration feature, such as transformation, routing, and security. This empowers enterprises to handle large integration tasks. ESB solutions are also highly customizable, so one ESB solution can differ significantly from another. This makes ESBs ideal for complex integrations, such as integrating on-premises and legacy systems. Aside from being configurable, each component can be hosted anywhere within an enterprise’s IT infrastructure, making ESBs well suited to expanding an enterprise’s complex internal systems and architecture.
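To make the hub idea concrete, here is a minimal sketch in Python, not a depiction of any real ESB product. The names `MiniBus`, `authorize`, `transform`, and `demo_bus`, as well as the topic and field names, are hypothetical and chosen only for illustration. It shows security, transformation, and routing as separate components that a central hub calls in turn.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Message = Dict[str, object]


@dataclass
class MiniBus:
    """Toy central hub: each integration feature is a separate, swappable component."""

    # Security component (hypothetical): may this sender publish to this topic?
    authorize: Callable[[str, str], bool]
    # Transformation component (hypothetical): convert a message to a shared format.
    transform: Callable[[Message], Message]
    # Routing table: topic name -> subscriber callbacks.
    routes: Dict[str, List[Callable[[Message], None]]] = field(default_factory=dict)

    def subscribe(self, topic: str, handler: Callable[[Message], None]) -> None:
        self.routes.setdefault(topic, []).append(handler)

    def publish(self, sender: str, topic: str, message: Message) -> int:
        """Check access, transform once, then deliver a copy to every subscriber."""
        if not self.authorize(sender, topic):
            raise PermissionError(f"{sender} may not publish to {topic}")
        canonical = self.transform(message)
        handlers = self.routes.get(topic, [])
        for handler in handlers:
            handler(dict(canonical))  # each subscriber receives its own copy
        return len(handlers)


def demo_bus() -> List[Message]:
    received: List[Message] = []
    bus = MiniBus(
        authorize=lambda sender, topic: sender == "crm" and topic == "customer.updated",
        transform=lambda m: {"customer_id": m["CustID"], "email": str(m["Email"]).lower()},
    )
    bus.subscribe("customer.updated", received.append)  # stands in for a billing system
    bus.subscribe("customer.updated", received.append)  # stands in for a warehouse loader
    bus.publish("crm", "customer.updated", {"CustID": 42, "Email": "Ann@Example.com"})
    return received
```

Because each component is passed in separately, any one of them could be replaced or hosted elsewhere without touching the others, which is the property the description above attributes to ESB solutions.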
Integration Platform-as-a-Service (iPaaS)
- iPaaS, a newer tool that is still based on older open source ESB technology ported to the cloud, was developed to help connect on-premises and cloud-based services, applications, and processes. It includes connectors, maps, business rules, and transformation tools that allow the development of integration flows. Designed in response to the prevalence of cloud-based software and solutions, iPaaS solutions are best suited for lightweight and cloud-based integrations. They rely on common lightweight web service formats and interfaces such as JSON, XML, and APIs, along with B2B protocols, enabling enterprises and developers to deliver software and applications faster. iPaaS solutions are more agile, making them ideal for horizontal scalability, that is, integration with third parties, partners, and ad hoc applications such as software-as-a-service (SaaS) solutions. The lightweight connectors and flexibility of iPaaS solutions are also suited to B2B data transfers, which require faster integration while maintaining certain data access restrictions.
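As a rough illustration of how connectors, maps, and business rules combine into an integration flow, consider the Python sketch below. It does not represent any vendor's iPaaS API. The function names `run_flow` and `demo_flow`, the field names, and the sample order data are all assumptions made up for this example.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def run_flow(
    source: Callable[[], Iterable[Record]],
    field_map: Dict[str, str],
    rule: Callable[[Record], bool],
    sink: Callable[[Record], None],
) -> int:
    """Pull records from a source connector, map fields, apply a rule, push to a sink."""
    delivered = 0
    for raw in source():
        # Map: rename source fields to the names the target system expects.
        mapped = {target: raw[src] for src, target in field_map.items() if src in raw}
        # Business rule: only records that pass the rule are delivered.
        if rule(mapped):
            sink(mapped)
            delivered += 1
    return delivered


def demo_flow() -> List[Record]:
    # Hypothetical records, as a SaaS application's connector might return them.
    saas_orders = [
        {"OrderNo": "A-1", "Total": 250.0, "Country": "DE"},
        {"OrderNo": "A-2", "Total": 15.0, "Country": "US"},
    ]
    erp_inbox: List[Record] = []
    run_flow(
        source=lambda: saas_orders,
        field_map={"OrderNo": "order_id", "Total": "amount"},
        rule=lambda r: r["amount"] >= 100,  # illustrative rule: forward large orders only
        sink=erp_inbox.append,
    )
    return erp_inbox
```

In this sketch, only the first order passes the rule, so the target receives a single record with the renamed fields `order_id` and `amount`.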
It is important to remember that ESB and iPaaS are not competing technologies, but can be utilized to complement each other.
Overcoming Emerging Data Integration Challenges
New software, applications, and technologies continue to be developed at a fast pace. Enterprises are benefiting from them as they enable new functionalities and automate key business processes. However, these new software and technologies also pose new kinds of integration challenges. Data integration tools must equip enterprises to face these emerging challenges:
Challenge #1: Rise of API-Driven Integrations
- The cloud services market is expected to grow to $383 billion in 2020 from $209 billion in 2016. Driven by cloud-based applications, mobile applications, and wearable technologies, this growth is also giving rise to integration via application programming interfaces (APIs). An API provides a set of standards and protocols that enable applications and systems to share datasets. As more enterprises adopt cloud-based solutions and gather customer data from mobile applications, they also need to use more APIs to integrate with their existing systems.
- iPaaS tools help enterprises overcome this challenge by utilizing lightweight web service protocols that can handle APIs and by enabling connections to a fast-growing number of cloud-based solutions. Both ESB and iPaaS solutions focus on offering pre-built APIs and adaptors that can be deployed by internal developer resources.
Challenge #2: Integrating Big Data Architecture
- The first requirement for enabling big data analytics is gathering enormous amounts of data for analysis. This means that enterprises need to dive deeper into their existing databases, data warehouses, and enterprise applications to collect datasets. After extracting data, enterprises need to import and load this data into big data platforms for analysis. Enterprises also have to ensure that changes made in originating sources will be reflected in the big data platform in real-time. Thus, integrating big data architecture creates a two-fold integration challenge of integrating existing systems and applications internally and integrating these internal systems with new big data analytics solutions, which are oftentimes external systems.
- Enterprises can overcome this two-fold integration challenge by utilizing both ESB and iPaaS solutions. ESB can help enterprises discover and gather data from existing systems through internal integration. iPaaS can help them connect the integrated internal data with external big data analytics solutions.
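The two steps described under this challenge, an initial load followed by keeping the big data platform in step with its sources, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular ESB, iPaaS, or analytics platform works. The names `initial_load`, `apply_changes`, and `demo_sync`, and the change-record format, are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Change = Tuple[str, int, Row]  # (operation, primary key, row data)


def initial_load(source_rows: Dict[int, Row]) -> Dict[int, Row]:
    """Copy a snapshot of the source system into the analytics store."""
    return {key: dict(row) for key, row in source_rows.items()}


def apply_changes(target: Dict[int, Row], changes: List[Change]) -> None:
    """Replay source changes, in order, against the analytics store."""
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        elif op in ("insert", "update"):
            target[key] = dict(row)
        else:
            raise ValueError(f"unknown operation: {op}")


def demo_sync() -> Dict[int, Row]:
    # Hypothetical source table keyed by primary key.
    source = {1: {"name": "Ann", "plan": "basic"}, 2: {"name": "Bo", "plan": "pro"}}
    lake = initial_load(source)
    # Hypothetical stream of changes captured from the source after the load.
    apply_changes(lake, [
        ("update", 1, {"name": "Ann", "plan": "pro"}),
        ("delete", 2, {}),
        ("insert", 3, {"name": "Cy", "plan": "basic"}),
    ])
    return lake
```

Replaying changes in the order they occurred is what keeps the analytics copy consistent with the source. A production system would also need to handle retries, ordering across sources, and schema changes, which this sketch leaves out.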
Challenge #3: The Shortage of Developer Resources to Handle Integrations
- Both ESB and iPaaS tools, with their promise of pre-built adaptors, are designed to enable do-it-yourself integrations both by developers in IT and by “citizen integrators” throughout the organization. This can enhance productivity in today’s market, where the number of data sources that must be integrated is growing rapidly. But it also risks marginalizing these tools quickly, because developers with in-demand integration skills are scarce and increasingly expensive. Just as the number of data sources will continue to increase unchecked, so too will the need for internal developer talent to build the integrations with ESB and iPaaS tools. These tools also consume the time of valuable IT and business resources that could be better directed toward gaining insights from data.
Challenge #4: Ensuring Compliance with Standards
- On May 25, 2018, enterprises and organizations that process personally identifiable information (PII) of residents in the European Union (EU) will have to comply with the General Data Protection Regulation (GDPR). The GDPR requires enterprises to review how securely they store and transmit their data. This is just one of many strict standards for data privacy and security. Others include the Payment Card Industry Data Security Standard (PCI DSS), the Health Insurance Portability and Accountability Act (HIPAA), and ISO standards.
- With DIY tools such as ESB and iPaaS, the burden to demonstrate compliant processes with all of these standards falls squarely upon the organization and its resources. While the tools themselves may offer a level of compliance, the moment a developer puts an integration into motion, vulnerabilities are exposed. Multiply this by the effect of many disparate developers and an organization’s attempts to maintain enterprise-wide compliance for stringent frameworks can be quickly overrun.
Data Platform as a Service (dPaaS) – A New Approach to Data Integration for 2017 and Beyond
OpenText’s data integration architecture, the ALLOY™ Platform, takes data integration tools to the next level in solving today’s (and tomorrow’s) true integration challenges. In a category all its own, called Data Platform as a Service (dPaaS), ALLOY handles any pattern of integration and helps enterprises connect an ever-growing number of cloud applications and data sources. dPaaS frees users from the resource constraints of having to complete all of their integrations in house by offering integration as a managed service. It also offers an inherently compliant platform and fully compliant integration and data management processes. The ALLOY Platform’s dPaaS model unifies integration and data management capabilities to provide a better-suited solution for integrating, transforming, harmonizing, managing, and securing critical business data on-premises or in the cloud.