After attending our first Enterprise World, I have just one word to define it: intense. My memory is full of incredible moments: spectacular keynotes, plenty of demos and amazing breakout sessions.
Now, as I digest all of these experiences and collect the opinions, suggestions and thoughts of the customers who visited our booth, I remember a wonderful conversation with a customer about data mining techniques, the best approaches to them and where our products can help. From the details and the way he framed his questions, it was pretty clear that I was talking to a data scientist or, at least, someone who deeply understands the amazing world of data mining, machine learning algorithms and predictive analytics.
To put it in context: data scientists usually maintain a professional skepticism about applications that offer an easy-to-use interface, with few options and knobs, for running prescriptive or predictive analytics. They love to tweak algorithms, write their own code, and access and modify every parameter of a given data mining technique, just to obtain the best model for their business challenge. They want full control of the process, and that is fully understandable. It is their comfort zone.
Data scientists push back against concepts like the democratization of predictive analytics, and they have good reasons; I agree with many of them. Most data mining techniques are complex, difficult to understand and demand a lot of statistical knowledge just to be able to say, "Okay, this looks pretty good."
Predictive models need to be maintained and revised frequently, based on your business needs and the amount of data you expect to use during the training/testing process. More often than you might imagine, models can't be reused for similar use cases. Each business challenge comes with its own data, and that data defines how the prescriptive or predictive model should be trained, tested, validated and, ultimately, applied in the business.
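To make the train/test/validate cycle concrete, here is a minimal sketch using scikit-learn on a synthetic dataset. This is a generic illustration of the workflow, not the internals of any particular product, and the dataset and parameters are assumptions for the example.

```python
# A minimal train/test/validate cycle on synthetic data (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic stand-in for business data; a real project would load its own.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Hold out a test set so the model is validated on data it has never seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")

# When new data arrives or accuracy drifts, retrain rather than reuse blindly.
```

The held-out test set is what keeps the validation honest: a model that is never checked against unseen data tells you nothing about how it will behave in the business.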
On the other hand, a business analyst or a business user without a PhD can take advantage of predictive applications that package the most common algorithms in a box (a black box) and start answering questions about the business. Moreover, their companies often can't afford the high salary of a data scientist, so they have to handle all of this themselves.
But, what can we do for you, data scientist?
The journey starts with integrating distinct sources (databases, text files, spreadsheets or even applications) into a single repository where everything is connected. Exploring and visualizing complex data models with several levels of hierarchy reflects the business better than the common one-huge-table approach. Having an analytical repository that mirrors how the business flows helps with one of the hardest parts of a data scientist's job: problem definition.
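The idea of connecting distinct sources into one repository can be sketched with pandas. The file contents and column names below are hypothetical stand-ins for two separate sources, a customer list and an order export, joined so that every record is connected.

```python
# Sketch: joining records from two distinct sources into one connected view.
import pandas as pd
from io import StringIO

# Stand-ins for two separate sources (e.g. a database export and a spreadsheet).
customers_csv = StringIO("customer_id,region\n1,EMEA\n2,AMER\n")
orders_csv = StringIO("order_id,customer_id,amount\n10,1,250.0\n11,2,99.5\n")

customers = pd.read_csv(customers_csv)
orders = pd.read_csv(orders_csv)

# A join keyed on customer_id mirrors a hierarchical repository:
# each order is connected back to the customer it belongs to.
connected = orders.merge(customers, on="customer_id", how="left")
print(connected)
```

The same principle scales up: once sources share keys, the analyst explores one connected model instead of reconciling tables by hand.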
Collecting data is just the beginning: there is a long list of tasks related to data preparation, data quality and data normalization. This is where the business analyst or the data scientist loses much of their precious time, and we are here to help, accelerating the path from raw data to value. Once they have clean data, a data scientist can begin analyzing it, finding patterns, correlations and hidden relationships.
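The preparation steps mentioned above can be sketched in a few lines of pandas: handling missing values, normalizing numeric columns, then looking for correlations. The column names and values are illustrative only.

```python
# Sketch of common preparation steps: quality, normalization, analysis.
import pandas as pd

df = pd.DataFrame({
    "revenue": [100.0, None, 300.0, 400.0],
    "visits":  [10.0,  20.0, None,  40.0],
})

# Data quality: fill gaps with each column's median.
df = df.fillna(df.median())

# Normalization: rescale every column to the [0, 1] range.
normalized = (df - df.min()) / (df.max() - df.min())

# Analysis: a correlation matrix reveals simple linear relationships.
print(normalized.corr())
```

Steps like these are mechanical but time-consuming, which is exactly why automating them frees the analyst for the interesting part: interpreting the patterns.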
OpenText Big Data Analytics provides an agile environment for all of this analysis. Moreover, calculations run fast and on all of your data, your big data, in a flexible trial-and-error environment.
So the answer to my question:
OpenText Big Data Analytics reduces the time spent on preparation and increases the time available for what really matters, analysis and decision making, even when the company is dealing with big data.