7 steps to getting started with your AI journey

Artificial intelligence (AI) and machine learning (ML) are the current buzzwords resonating across all businesses. They are considered game changers and potentially disruptive, and they are expected to make life and business better for everybody. The effects of AI adoption will be magnified in the coming decade, as manufacturing, retail, transportation, finance, health care, law, advertising, insurance, entertainment, education, and every other industry transform their core processes and business models to take advantage of machine learning systems.

As business leaders start planning and strategizing on how to make the best use of AI for potential new business models, process automation and cost efficiencies, they need to understand that the road to AI and ML is a journey and not a sprint.

To get started on the AI roadmap, every business leader should ask their team to don their thinking caps and give some thought to these 7 basic steps.

1. Defining a use case

As the saying goes ‘well begun is half done’. It is imperative that business owners and leads spend some time on clearly defining and articulating the business problem/challenge that they would like AI to solve. The more specific the goal, the better the chances of being successful.

For example, “I want to increase store sales by 10%” is not good enough as a goal. A more specific goal, such as “I want to increase store sales by 10% by monitoring the demographics of foot traffic coming in,” goes a long way toward articulating what is wanted and sets the right objectives across the board.

2. Verifying data availability

Once a use case definition is in place, the next step is to validate if the current processes and systems in place are capturing and tracking the data that would be needed to perform the required analysis.

To the surprise of many, a big portion of the overall time and effort will be spent on data ingestion and data wrangling. Ensure you are capturing the right data, applying the right transformations, and collecting sufficient data volumes with the right variables/features. Focus on data governance, as both the quality and the volume of the data are critical for a successful outcome.

A variable or feature is a piece of measurable information. For example, age, gender, ethnicity, and yearly income for a set of people would be four possible features of that set.
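As a rough illustration of verifying data availability, the sketch below reports how often each required feature is missing from a set of records. The records, the field names and the `REQUIRED_FEATURES` list are hypothetical and exist only for this example; a real check would run against your own systems of record.

```python
# Hypothetical required features for the use case (illustrative names only).
REQUIRED_FEATURES = ["age", "gender", "yearly_income"]

# Hypothetical records as they might arrive from an existing system.
records = [
    {"age": 34, "gender": "F", "yearly_income": 52000},
    {"age": None, "gender": "M", "yearly_income": 61000},  # age not recorded
    {"age": 45, "gender": "F"},                            # income never captured
]


def missing_rate(rows, feature):
    """Return the fraction of rows where the feature is absent or None."""
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def availability_report(rows, features):
    """Map each required feature to its missing-value rate."""
    return {feature: missing_rate(rows, feature) for feature in features}


report = availability_report(records, REQUIRED_FEATURES)
for feature, rate in report.items():
    print(f"{feature}: {rate:.0%} missing")
```

A feature with a high missing rate is an early sign that a process or system change is needed before any modeling can start.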

3. Performing basic data exploration

While it is easy to get carried away and jump head first into model building exercises, it is crucial to go through a quick data exploration exercise to validate your data assumptions and understanding. Ascertain whether the data is telling you the right story based on your subject matter expertise and business acumen.

Data exploration also helps in understanding what the significant variables or features should/could be, and what kind of data categorizations should be created for use as input for your potential models.
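A quick exploration pass can be as small as a few summary statistics and category counts. The sketch below assumes a made-up foot-traffic sample; the `visitors` data, the `segment` field and the age bands are illustrative choices, not recommendations.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical foot-traffic sample; the values are invented for illustration.
visitors = [
    {"age": 23, "segment": "student"},
    {"age": 35, "segment": "professional"},
    {"age": 41, "segment": "professional"},
    {"age": 67, "segment": "retired"},
    {"age": 29, "segment": "professional"},
]


def summarize_numeric(rows, feature):
    """Return basic summary statistics for a numeric feature."""
    values = [row[feature] for row in rows]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


def age_band(age):
    """Bucket an age into a coarse category (the bands are an assumption)."""
    if age < 30:
        return "under_30"
    if age < 60:
        return "30_to_59"
    return "60_plus"


age_summary = summarize_numeric(visitors, "age")
segment_counts = Counter(row["segment"] for row in visitors)
band_counts = Counter(age_band(row["age"]) for row in visitors)

print(age_summary)
print(segment_counts.most_common())
print(band_counts.most_common())
```

If the summaries contradict what your subject matter experts expect, that is worth resolving before any model is built.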

4. Defining a model building methodology

Focus on the hypothesis itself rather than on the end goal it is meant to achieve. Validate your hypothesis by running tests to see which variables or features are significant and actually improve the results.
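One simple way to test whether a feature matters is a permutation test on the difference in group means. This is only one of many possible tests, chosen here because it needs no external libraries; the sales figures and the promotion feature are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in means.

    The group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical daily sales for stores with and without a promotion.
sales_with_promo = [120, 135, 128, 140, 132, 138]
sales_without_promo = [101, 99, 110, 105, 98, 107]

p_value = permutation_p_value(sales_with_promo, sales_without_promo)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value suggests the feature is worth keeping as a candidate; it does not replace review by a domain expert.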

Involve the business and domain experts. It is critical to get continuous feedback from your business and domain experts to validate your understanding and to ensure everyone is on the same page. The success of any ML model depends on successful feature engineering. To derive better features, a subject matter expert always adds more value than a fancy algorithm.

5. Defining a model validation methodology

You will need to define performance measures to help you evaluate, compare and analyze results from multiple algorithms. In turn, this will help you further refine the specific models. For example, if working with a classification use case, classification accuracy (number of correct predictions made divided by the total number of predictions made, multiplied by 100) would be a good performance measure.
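The accuracy measure described above can be written as a short function. The labels below are hypothetical and serve only to show the arithmetic.

```python
def classification_accuracy(y_true, y_pred):
    """Return correct predictions divided by total predictions, times 100."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return 100 * correct / len(y_true)


# Hypothetical labels: did the customer buy or not?
actual = ["buy", "skip", "buy", "buy", "skip"]
predicted = ["buy", "skip", "skip", "buy", "skip"]

accuracy = classification_accuracy(actual, predicted)
print(f"Accuracy: {accuracy:.1f}%")
```

Here four of the five predictions match, so the function returns 80.0.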

You will need to divide your data into two data sets: a test set and a training set. The algorithm will be trained on the training dataset and evaluated against the test set. This may be as simple as selecting a random split of data (60% for training, 40% for testing) or may involve more complicated sampling methods.
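A simple random 60/40 split might look like the sketch below. The function name, the fixed seed and the stand-in data are assumptions made for this example; more complicated sampling methods, such as stratified sampling, would need more than this.

```python
import random


def train_test_split(rows, train_fraction=0.6, seed=42):
    """Shuffle a copy of the rows and split it into training and test sets."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)                      # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)      # seeded for repeatable splits
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


data = list(range(10))                         # stand-in for ten records
train, test = train_test_split(data)
print(len(train), "training rows,", len(test), "test rows")
```

Fixing the seed keeps the split repeatable, which makes results from different algorithms comparable on the same test set.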

Again, involve the business and domain experts to validate the findings and ensure you are on the right track.

6. Automating & production roll out

Once the model is built and validated, it’s time to roll it out into production. Start with a limited rollout, say by store or branch, and get continuous feedback from your business users on the model’s behavior and outcomes. After a dry run of a few weeks or months, roll it out to the broader audience.

Select the right platform and tools to automate the data ingestion processes and put systems in place for result dissemination to appropriate audiences. The platform should be able to provide multiple interfaces based on the end users’ know-how. For example, business analysts may want to do further data analysis based on the model results while casual business users just want to interact with data via dashboards and visualizations.

7. Updating the model periodically

Once a model is published and deployed for use, you’ll need to monitor it continuously and check that it remains valid so you can update it as needed.

There are plenty of reasons for models to become out of date: the market dynamics change, your company changes, and your business model changes (especially if the model is successful). Models are built on historical data to predict future outcomes. As market dynamics shift away from your historical ways of doing business, the model’s performance will deteriorate. Keep asking yourself, “What process will I follow to keep updating the model over time?”
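One possible monitoring process is to track accuracy over time and flag the model for an update when recent performance drops well below its validated baseline. The sketch below assumes weekly accuracy readings; the function name, the 5-point tolerance and the 3-week window are hypothetical values to be tuned for your business.

```python
def needs_retraining(accuracy_history, baseline, tolerance=5.0, window=3):
    """Flag the model when recent accuracy falls too far below the baseline.

    Returns True if the mean of the last `window` readings is more than
    `tolerance` percentage points below `baseline`.
    """
    if len(accuracy_history) < window:
        return False                           # not enough readings yet
    recent = accuracy_history[-window:]
    return sum(recent) / window < baseline - tolerance


# Hypothetical weekly accuracy readings (percent) after deployment.
weekly_accuracy = [88.0, 87.5, 86.0, 82.0, 80.5, 79.0]
baseline_accuracy = 88.0                       # accuracy measured at validation

flag = needs_retraining(weekly_accuracy, baseline_accuracy)
print("Retrain recommended" if flag else "Model still within tolerance")
```

Averaging over a window rather than reacting to a single reading avoids retraining on one unusually bad week.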

OpenText™ Professional Services can guide you and your team through these building blocks and help define a successful go-forward AI strategy for your organization via their Magellan™ Cognitive Strategy Workshop offering. Benefit from our experience in working with various organizations across many industries in gleaning data insights. The OpenText Professional Services team includes data scientists and architects with expertise in big data, machine learning, text mining and algorithms.

OpenText Magellan is a flexible artificial intelligence (AI) and analytics platform that combines open source machine learning, advanced analytics, and enterprise-grade business intelligence (BI) with the ability to acquire, merge, manage, and analyze structured and unstructured Big Data and Big Content stored in your Enterprise Information Management (EIM) systems. Magellan enables machine-assisted decision making, automation, and business optimization.

This post is part of an ongoing series on machine learning. Learn about how to leverage Apache Spark for development or programming algorithms.

OpenText

OpenText, The Information Company, enables organizations to gain insight through market-leading information management solutions, powered by OpenText Cloud Editions.
