What can BERT, by Google, do for analytics?
“Wouldn’t it be nice if Google understood the meaning of your phrase, rather than just the words that are in the phrase?” – Eric Schmidt
Being human, or thinking and responding like a human, is what technology is trying its best to achieve, with bots sounding more personalised and giving human-like responses. Such is the influx that we can now successfully read the polarity of a text using these technologies. Language processing has undergone a paradigm shift: Natural Language Processing (NLP) now revolves around pre-trained models that encapsulate linguistic knowledge. Bidirectional Encoder Representations from Transformers (BERT) is a pre-trained deep learning natural language framework which has given considerable impetus to NLP tasks such as sentiment analysis, textual entailment, text classification and word-sense disambiguation.
BERT is a new method of pre-training language representations developed by Google. It is a general-purpose language understanding model. Being pre-trained on Wikipedia and BooksCorpus, it helps solve many NLP tasks, such as:
1. Sentence-level classification (a minimal sketch follows this list)
2. Question answering
3. Token-level classification
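To make the first of these tasks concrete, here is a minimal sketch of sentence-level classification using the Hugging Face transformers library. It relies on the library's default sentiment checkpoint (a distilled BERT-style model fine-tuned on SST-2); any BERT model fine-tuned for classification could be dropped in instead, so treat the specifics as illustrative assumptions.

```python
# Sentence-level classification (sentiment) with a pre-trained checkpoint.
# The pipeline downloads a small BERT-style model fine-tuned on SST-2 by default.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

print(classifier("The quarterly results exceeded every forecast."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```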
Clearing up a few terms for you:
1. Bi-directional – This refers to the direction in which the algorithm reads the text for its analysis. In other words, when the model runs, the text is analysed both forwards and backwards, which gives a deeper sense of language context and flow. Earlier models analysed text unidirectionally, either left to right or right to left, which hurt accuracy. For example, the phrase “river bank” can wrongly pick up a financial association if it is not analysed by a bi-directional model (a rough illustration follows this list).
2. Transformers – The Transformer is non-directional and analyses the contextual relationships between words: it works out how every word in a sentence relates to every other word, regardless of position. The original Transformer architecture has an encoder and a decoder; BERT uses the encoder stack, which is what helps the model gain its accuracy.
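As a rough illustration of what bi-directional context buys us, the sketch below asks a pre-trained BERT for the vector it assigns to the word “bank” in different sentences. The model name and the cosine-similarity comparison are assumptions chosen for demonstration; the point is simply that the same word gets different representations depending on the words on both sides of it.

```python
# Show that BERT's representation of "bank" depends on context from both sides.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    """Return the contextual embedding BERT produces for the token 'bank'."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (sequence length, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index("bank")]

river = bank_vector("They sat on the river bank and watched the water.")
river_again = bank_vector("The fishermen camped on the bank of the river.")
money = bank_vector("She deposited the cheque at the bank this morning.")

cosine = torch.nn.functional.cosine_similarity
print(cosine(river, river_again, dim=0))  # higher: both are the river sense
print(cosine(river, money, dim=0))        # lower: the financial sense differs
```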
BERT was also trained to guess masked words. In Masked Language Modelling, the objective is to guess the masked tokens: the model learns to fill in the missing words, whether they are polysemous or homonymous.
The Masked Language Modelling of BERT has applications in:
1. A case where the spoken text is not clear and yet the system is able to identify the words being said. This model helps address the problem of varied pronunciation.
2. A case where the audio of a discussion is being recorded and the topics need to be identified and bucketed. When a system can identify a topic change, it has a better chance of responding accurately.
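Here is a small sketch of the masked-language-modelling idea, using the standard fill-mask pipeline from the Hugging Face transformers library with the public bert-base-uncased checkpoint; the example sentence is ours, so treat the completions as illustrative only.

```python
# Masked Language Modelling: BERT guesses the [MASK] token from both-sided context.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prediction in unmasker("I withdrew some cash from the [MASK] this morning."):
    print(prediction["token_str"], round(prediction["score"], 3))
# Likely completions include "bank" and "atm", guessed purely from the surrounding words.
```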
BERT text analytics also helps with Next Sentence Prediction. This is, exactly as the name suggests, a model that learns whether one sentence logically follows another, which lets a system suggest the next logical response while ensuring contextual integrity.
Next Sentence Prediction finds its use in:
1. Chatbots that are extremely helpful because they know exactly the right thing to say at the right time. They feel lifelike and are a pleasure to interact with.
2. Finding the right results even when the user gets the spelling or the context of the keywords wrong.
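Below is a hedged sketch of Next Sentence Prediction, using the pre-trained NSP head that ships with bert-base-uncased in the transformers library. The example sentences are made up; the point is that a plausible follow-up scores much higher than an unrelated one.

```python
# Next Sentence Prediction: score whether sentence B plausibly follows sentence A.
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

first = "My card was declined at the checkout."
plausible = "Could you check whether my account has been blocked?"
unrelated = "Penguins are flightless birds found in the Southern Hemisphere."

for second in (plausible, unrelated):
    inputs = tokenizer(first, second, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Index 0 of the NSP head means "B is the next sentence", index 1 means "B is random".
    probability_next = torch.softmax(logits, dim=1)[0, 0].item()
    print(f"{second!r}: P(next sentence) = {probability_next:.2f}")
```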
Advantages of BERT
1. The BERT text analytics model can be used not only to classify but also to predict and translate, to summarise and to improve the understanding of NLP systems
2. BERT sentiment analysis models can also be used for entity recognition or part-of-speech tagging (a small sketch follows this list)
3. BERT tries to create knowledge that goes beyond reading by anticipating the words around it
4. BERT offers several generic pre-trained models which can be downloaded and calibrated to specific use cases
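To give one concrete example of the entity-recognition point above, here is a minimal sketch using the transformers ner pipeline. The dslim/bert-base-NER checkpoint is a publicly available BERT model fine-tuned on CoNLL-2003, assumed here purely for illustration.

```python
# Token-level classification: named-entity recognition with a fine-tuned BERT.
from transformers import pipeline

ner = pipeline(
    "ner",
    model="dslim/bert-base-NER",     # assumed example checkpoint
    aggregation_strategy="simple",   # merge word pieces back into whole entities
)

print(ner("Google released BERT, and Jacob Devlin presented it in Seattle."))
# e.g. Google -> ORG, Jacob Devlin -> PER, Seattle -> LOC
```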
Limitations of BERT
1. Being bidirectional, BERT is hard to train.
2. Exploring the surrounding text around every word is computationally expensive.
3. BERT’s training is expensive because of its Transformer architecture.
4. The model has to be fine-tuned to the desired task (a compressed sketch of that step follows this list).
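For completeness, here is a compressed sketch of that fine-tuning step: a generic pre-trained BERT gets a classification head and is trained on a task-specific dataset. The dataset (IMDB movie reviews), the subset sizes and the hyper-parameters are assumptions made to keep the example small, not a recommended recipe.

```python
# Fine-tuning a generic BERT checkpoint for binary sentiment classification.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Tokenise an assumed sentiment dataset (IMDB reviews, labels 0/1).
dataset = load_dataset("imdb")
encoded = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-sentiment",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=encoded["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=encoded["test"].select(range(500)),
)
trainer.train()  # expensive: this is exactly the compute cost the list above warns about
```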
BERT and many other NLP models are game changers, with the ability to be fine-tuned for tasks such as sentiment analysis. The algorithm works to improve how machines understand human language, and it has already proven its effect on SEO (Search Engine Optimisation). BERT may be Bert in name, but not in nature.
“You shall know a word by the company it keeps” – John Firth
Authors: Jerrin Thomas, Benila Jacob and Sunil Kumar