Under the umbrella of Artificial Intelligence (AI), Natural Language Processing (NLP) has come a long way, from the symbolic AI that emerged in the mid-1950s, through statistical models such as logistic regression, to the multilayer networks we now call deep learning. Yoshua Bengio, Geoffrey Hinton and Yann LeCun, three deep learning pioneers, recently published a paper highlighting recent advances in their field. One of these advances is the Transformer, a neural network design at the heart of language models such as Google’s BERT and OpenAI’s GPT-3.
Challenges in Deep Learning
Deep learning is a technique for categorizing data using multilayer neural networks, an approach frequently compared to how the human brain operates. Raw data is fed into the network through a series of input units. This data might be images, sound samples or text. The inputs are then mapped to the output nodes, which determine which category the input data belongs to.
Deep learning models that receive a sequence of items (words, letters, image features, etc.) and output another sequence of items are known as sequence-to-sequence models. They have shown a great deal of success in tasks such as machine translation. A neural machine translation model is composed of an encoder and a decoder. The encoder goes over each word in the input sentence and combines the information into a vector called the context. After processing the entire input sentence, the encoder sends the context to the decoder, and the decoder begins constructing the output sentence word by word. Both the encoder and the decoder are usually built on recurrent neural networks, since these have internal memory.
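To make the encoder's role concrete, here is a toy sketch in Python, not a real translation model. It assumes made-up 2-dimensional word vectors, and it replaces the learned weight matrices of a real recurrent network with two scalar weights (`w_in` and `w_rec`), both hypothetical values chosen only for illustration.

```python
import math

def rnn_step(hidden, word_vector, w_in=0.5, w_rec=0.8):
    """One recurrent update, mixing the current word with the previous hidden state.

    Simplification: real recurrent networks use learned weight matrices;
    the scalar weights here are hypothetical and applied element-wise.
    """
    return [math.tanh(w_in * x + w_rec * h) for x, h in zip(word_vector, hidden)]

def encode(sentence_vectors):
    """Read the sentence word by word and return the final hidden state (the context)."""
    hidden = [0.0] * len(sentence_vectors[0])  # empty memory before the first word
    for word_vector in sentence_vectors:
        hidden = rnn_step(hidden, word_vector)
    return hidden

# Made-up 2-dimensional vectors standing in for word embeddings.
embeddings = {"stocks": [1.0, 0.0], "fall": [0.0, 1.0]}

context = encode([embeddings["stocks"], embeddings["fall"]])
reversed_context = encode([embeddings["fall"], embeddings["stocks"]])

print(context)
print(reversed_context)
```

Because each step feeds the previous hidden state back in, the two word orders produce different context vectors. That carried-over state is the internal memory referred to above. In a full model, the decoder would start from this context and emit the output sentence one word at a time.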
Although computing power and data availability have created ideal conditions for deep learning, the range of problems that deep learning systems can tackle today is still limited. According to the pioneers, these systems are skilled at specialized tasks but narrow in the scope of problems they can solve. One such specialized area is natural language processing.
Bahdanau et al. (2014) and Luong et al. (2015) developed and refined a method known as “Attention”, which significantly improved the quality of neural machine translation systems. Attention allows the model to focus on the relevant parts of the input sequence as needed.
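As a minimal sketch of the idea, the Python snippet below scores each encoder state against the current decoder state, turns the scores into weights with a softmax, and builds a weighted average. It uses dot-product scoring, one of the variants described by Luong et al.; the vectors and the function name `attend` are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attend(decoder_state, encoder_states):
    """Weight each encoder state by its similarity to the decoder state."""
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context is the weighted average of the encoder states.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Hypothetical encoder states for a three-word input sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
# Hypothetical decoder state at the current output step.
decoder_state = [0.0, 2.0]

weights, context = attend(decoder_state, encoder_states)
print(weights)
print(context)
```

The second encoder state points in the same direction as the decoder state, so it receives the largest weight. Instead of relying on a single fixed context vector, the decoder gets a fresh context at every output step that emphasizes the most relevant input words.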
The Transformer is a model that leverages Attention to speed up training and outperform other neural machine translation models on specific tasks. Its greatest advantage, though, is how well it lends itself to parallelization. Another benefit of Transformers is their ability to learn without labelled data. Unsupervised learning allows Transformers to build representations, which they can subsequently use to fill in the blanks in incomplete sentences or produce meaningful text in response to a prompt.
Sentiment Analysis using Transformers
OpenText™ Magellan™ delivers a ready-to-use Artificial Intelligence platform which includes machine learning, data discovery, text analytics, and sophisticated visualization and dashboarding capabilities. Using the Magellan Notebook, a pretrained BERT model (Bidirectional Encoder Representations from Transformers) can easily identify emotion based on textual content.
To analyze financial news articles and classify their sentiment as Positive, Negative or Neutral, we use FinBERT, a pre-trained NLP model freely available from huggingface.co/models. It was developed by fine-tuning the BERT language model for financial sentiment classification on a large corpus of financial communications.
In Magellan Notebook, the code snippet below imports the necessary Python modules as well as the pretrained model (FinBERT), tokenizes the input text, and feeds the tokenized inputs into the model, which produces final layer activations that are converted into probabilities using a softmax function.
Fig 1: Using FinBERT in Magellan Notebook. Processing a sample news article about a Facebook stock price drop, the model returns a Negative sentiment with a maximum probability of 0.9695.
Fig 2: Processing another sample news article, about a stock price increase for Ford, the model returns a Positive sentiment with a probability of 0.5562.
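The steps described above (tokenize, run the model, apply a softmax) can be sketched in plain Python as follows. This is not the notebook code from the figures. The checkpoint name `ProsusAI/finbert`, the helper names, the example logits and the label order are all assumptions made for illustration. The function `classify_with_finbert` additionally assumes that the `transformers` and `torch` packages are installed and that the checkpoint can be downloaded.

```python
import math

def softmax(logits):
    """Convert final-layer activations (logits) into probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_sentiment(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

def classify_with_finbert(text, model_name="ProsusAI/finbert"):
    """Classify one text with a FinBERT checkpoint from the Hugging Face hub.

    Assumption: `model_name` is one of several public FinBERT checkpoints;
    substitute whichever checkpoint you actually use.
    """
    # Imported lazily so the rest of the sketch runs without these packages.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    logits = model(**inputs).logits[0].detach().tolist()
    # Read the label order from the checkpoint instead of hard-coding it.
    labels = [model.config.id2label[i] for i in range(len(logits))]
    return top_sentiment(logits, labels)

# Made-up logits, only to show the softmax step without downloading a model.
example_logits = [-1.2, 2.9, -0.4]
example_labels = ["positive", "negative", "neutral"]  # hypothetical label order

label, probability = top_sentiment(example_logits, example_labels)
print(label, round(probability, 4))
```

Calling `classify_with_finbert("...")` on a news snippet returns the same kind of (label, probability) pair as the figures report. Reading the labels from `model.config.id2label` avoids guessing the class order, which can differ between checkpoints.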
Tell me more
Numerous pre-trained Transformer-based models are freely available today for natural language processing (NLP) and natural language understanding (NLU), including Google BERT and OpenAI GPT-3.5, for tasks such as understanding text, performing sentiment analysis, answering questions, summarizing reports and generating new text. To derive value from these models in machine learning applications, the work lies in refining and enhancing them to drive the desired business outcome.
OpenText Professional Services advises, guides and assists organizations with Artificial Intelligence and Transformers for NLP and NLU applications to gain insights, automate processes and optimize business workflows. Our approach is to combine Transformers with OpenText™ Magellan™ Text Mining, which provides simple semantic analysis of unstructured, text-based content. Learn more about OpenText AI & Analytics Services.
Author: Sridhar Sambarapu, Data Scientist, AI & Analytics Consulting Team