Natural language processing (NLP) is seeing widespread adoption in healthcare, call centres and social media platforms, with the NLP market expected to reach US$61.03 billion by 2027. In this article, we will look at how NLP works and what companies can do with it.
What is Natural Language Processing?
Natural Language Processing (NLP) is a branch of computer science that aims to make written and spoken language understandable to computers. The language that computers understand best is code, but unfortunately, humans do not communicate in code. Well, maybe a little ;), but people prefer to use natural language. According to Gartner, NLP is ‘the technology of natural language processing that has the ability to transform text or voice into coded, structured information based on an appropriate ontology’. In this article, we look at what Natural Language Processing is and what opportunities it offers to companies.
Machine learning is the brain behind NLP
Writing rules in code for every possible combination of words in every language is a daunting task. That is why natural language processing techniques combine computational linguistics (rules-based modelling of human language) with statistical analysis based on machine learning and deep learning models. These statistical models provide the best possible approximation of the real meaning, intention and sentiment of the speaker or writer, based on statistical assumptions.
Machine learning relies heavily on data to make these assumptions. Without data, artificial intelligence can’t learn. A corpus of text or spoken language is therefore needed to train an NLP algorithm.
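To make the role of the corpus concrete, here is a minimal sketch of a statistical model that learns from text: it counts which word follows which in a tiny, made-up corpus and uses those counts to guess the most likely next word. The corpus and the function name most_likely_next are illustrative assumptions; real NLP models are trained on far larger corpora with far richer models.

```python
from collections import Counter, defaultdict

# Hypothetical three-sentence corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]

# "Training": count how often each word is followed by each other word.
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1


def most_likely_next(word):
    """Return the word seen most often after `word`, or None if unseen."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]


print(most_likely_next("sat"))    # on
print(most_likely_next("the"))    # cat
print(most_likely_next("zebra"))  # None: without data, the model cannot guess
```

The last call shows the point made above: a word that never appeared in the corpus gives the model nothing to base an assumption on.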
Applications of Natural Language Processing
NLP powers well-known computer programmes, including translation programmes such as Google Translate or DeepL, voice assistants such as Siri, Alexa or Google Assistant, and chatbots such as Billie from bol.com or the Allerhande chatbot. But there are also less well-known applications that rely on NLP. In the healthcare sector, for example, NLP technology is used to generate insights from previous patient data, according to the NHS. Unstructured healthcare data can be organised with NLP to generate insights for patient treatment or to improve predictive analytics about patient health. Below are some of the areas in which NLP is used:
- Automatic translation from one language into another
- Summarising text (useful for extracting relevant text from large studies, for instance)
- Speech recognition, or transcribing spoken language to text (speech to text)
- Speech synthesis, or converting written text into spoken language (text to speech)
- Sentiment analysis – how positive or negative the language is
- Text classification – assigning predefined categories to text documents
- Question and answer – understanding the meaning of questions and giving answers
- Search-query analysis and content analysis – determining what a person’s intentions and needs are when interacting with a machine (chatbot, search engine, voice assistant)
- Spam detection (based on signals such as telltale words and grammatical errors)
Difference between NLP, NLU and NLG
Natural Language Processing is not a single technique but comprises several techniques, including Natural Language Understanding (NLU) and Natural Language Generation (NLG). NLP, NLU and NLG work hand in hand.
Natural Language Understanding
Whereas NLP is mainly concerned with converting unstructured language input into structured data, NLU is concerned with interpreting and understanding language. Grammar and context are also taken into account so that the speaker’s intention becomes clear. NLU uses artificial intelligence (AI) algorithms for this. These algorithms perform statistical analyses and can then recognise similarities in text that has not yet been analysed.
People say or write the same things in different ways, make spelling mistakes, and use incomplete sentences or the wrong words when searching for something in a search engine. With NLU, computer applications can deduce intent from language, even when the written or spoken language is imperfect. Put simply, NLP looks at what was said, and NLU looks at what was meant.
Natural Language Generation
‘Natural language generation (NLG) is the process of transforming data into natural language using artificial intelligence’, according to the Marketing AI Institute. In other words, NLG is the generation of text based on structured data. NLP can therefore also be used the other way around, by placing the responsibility for communication with the computer rather than with the human. For example, NLG can create content briefings and indicate which content should be covered when writing about a certain subject. This can even be done for different expertise levels or different stages of the sales funnel.
How does Natural Language Processing work: 6 phases of NLP
NLP consists of several phases. The first phases are largely focused on converting text into structured data, while the later phases are more focused on extracting meaning. This process can be divided into 6 phases:
1. Pre-processing phase
Just like plucking the feathers from a chicken and cutting it into pieces, this phase is about stripping the text of all unnecessary elements so that the algorithm can better digest it later on. This means, among other things, removing accents, HTML tags and special characters, converting capital letters to lower case, and converting written numbers to their numerical form.
Tokenisation, which involves converting text into smaller units (tokens), plays an important role in this.
The removal and filtering of stop words (generic words containing little useful information) and irrelevant tokens are also done in this phase.
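The pre-processing steps described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production pipeline: the function name preprocess and the very short stop-word list are assumptions made for the example, and converting written numbers to digits is left out for brevity.

```python
import html
import re
import unicodedata

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def preprocess(text):
    """Clean raw text and return a list of informative tokens."""
    text = re.sub(r"<[^>]+>", " ", text)        # remove HTML tags
    text = html.unescape(text)                  # decode entities such as &amp;
    text = unicodedata.normalize("NFKD", text)  # separate letters from their accents
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()                         # remove capital letters
    tokens = re.findall(r"[a-z0-9]+", text)     # tokenise; special characters drop out
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("<p>The Café is OPEN &amp; serving crème brûlée!</p>"))
# ['cafe', 'open', 'serving', 'creme', 'brulee']
```

The order of the steps matters in this sketch: tags are removed before tokenising so that tag names do not end up as tokens, and lowercasing happens before the stop-word filter so that ‘The’ and ‘the’ are treated alike.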
2. Morphological or lexical analysis
This phase focuses on the structure and construction of words. Several techniques are used, including stemming and lemmatisation. This analysis aims to reduce the number of stored tokens as much as possible. So if there is already a token for the verb ‘to cook’, rules can be created to associate the noun ‘cook’ and the conjugated form ‘cooks’ with it as well. And if a verb is conjugated, its root can be derived.
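As a rough illustration of stemming, the sketch below strips a few common English suffixes so that different forms of ‘cook’ collapse into one token. The suffix list and the minimum stem length are assumptions chosen for this example; established stemmers use many more rules, and lemmatisation relies on a vocabulary rather than on suffix stripping.

```python
# Hypothetical suffix rules, tried in this order.
SUFFIXES = ("ing", "ed", "s")


def stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


words = ["cook", "cooks", "cooked", "cooking"]
print({word: stem(word) for word in words})
# {'cook': 'cook', 'cooks': 'cook', 'cooked': 'cook', 'cooking': 'cook'}
```

Rules this crude also make mistakes: stem("this") returns "thi", which is one reason dictionary-based lemmatisation is often preferred when accuracy matters.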
3. Syntactic analysis
In this phase, sentences are parsed according to formal grammar. By indicating grammatical structures, it becomes possible to detect certain relationships in texts.
4. Semantic analysis
Semantic analysis is the process of extracting meaning from a text. With its help, computers can derive connections between words, sentences and context. For this, NLP uses a number of building blocks: entities, concepts, relations and predicates. These building blocks are automatically extracted from a text using a trained algorithm.
5. Discourse integration
Discourse integration looks at previous sentences when interpreting a sentence. For example, in ‘Céline likes dogs a lot. She has about ten.’, discourse integration assigns the word ‘she’ to ‘Céline’.
6. Pragmatic analysis
The last phase of NLP, pragmatics, interprets the relationship between an utterance and the situation in which it is made, as well as the effect the speaker or writer intends it to have. The intended effect of a sentence can sometimes be independent of its literal meaning. For example, the sentence ‘It couldn’t be better!’ can also mean things are going badly.
What companies can do with NLP
Why is NLP also useful for companies that do not offer a search engine, chatbot or translation services? Because with NLP, it is possible to classify texts into predefined categories or extract specific information from a text. Classification or data extraction can help companies extract meaningful information from unstructured data to improve their work processes and services. Here are some examples.
1. Data Extraction
Data extraction helps organisations automatically pull information out of unstructured data, for example with rule-based extraction. One example would be filtering invoices by a certain date or invoice number. Others include automatically analysing email attachments or filtering data by subject line. This can also be useful for making corrections to the extracted information.
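Rule-based extraction of this kind is often implemented with regular expressions. The sketch below pulls an invoice number and a date out of free text. The invoice format (INV-2023-0042), the ISO date format and the function name extract_invoice_fields are assumptions made for the example; real documents will need rules matched to their own formats.

```python
import re

# Hypothetical formats: invoice numbers like "INV-2023-0042" and ISO dates.
INVOICE_RE = re.compile(r"\bINV-\d{4}-\d{4}\b")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_invoice_fields(text):
    """Return the first invoice number and date found in the text, if any."""
    invoice = INVOICE_RE.search(text)
    date = DATE_RE.search(text)
    return {
        "invoice_number": invoice.group() if invoice else None,
        "date": date.group() if date else None,
    }


email = "Please find attached invoice INV-2023-0042, issued on 2023-05-17."
print(extract_invoice_fields(email))
# {'invoice_number': 'INV-2023-0042', 'date': '2023-05-17'}
```

Once the fields are structured like this, filtering invoices by date or number becomes an ordinary comparison on the extracted values.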
2. Topic Classification
Sorting text into predefined categories based on content (also called topic classification) is an application of NLP that is useful for a company’s customer service. Tickets or emails from customers are automatically classified and put into different categories like ‘price information’, ‘complaint’, and ‘technical problem’. This helps organisations to improve their workflows and provide better customer service because the customer is immediately directed to the right employee/department.
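One simple way to sketch such a ticket classifier is a naive Bayes model, which learns from labelled examples which words are typical for each category. This is only one possible technique, swapped in here for illustration; the six training tickets below are invented, and a real system would need far more data and evaluation.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical hand-labelled tickets, using the categories named above.
TRAINING = [
    ("how much does the premium plan cost", "price information"),
    ("what is the price of an upgrade", "price information"),
    ("i am unhappy with the late delivery", "complaint"),
    ("the service was terrible and i want a refund", "complaint"),
    ("the app crashes when i log in", "technical problem"),
    ("i get an error message after the update", "technical problem"),
]


def tokenise(text):
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per category and documents per category."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenise(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the category with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for word in tokenise(text):
            if word in vocab:  # words never seen in training are ignored
                # Add-one smoothing avoids a zero probability for rare words.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("the app shows an error", model))      # technical problem
print(classify("what does the upgrade cost", model))  # price information
```

The predicted category can then be used to route the ticket to the matching employee or department.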
3. Sentiment analysis
Another example of an NLP application from which companies can derive value is sentiment analysis. Sentiment analysis is used to gauge the emotional charge of a text without a person having to read it. This is useful, for example, when analysing social media posts, emails or customer reviews. Tracking customer opinions is essential for providing good service, but also for market research or for tracking the reputation or progress of a brand.
Sentiment analysis is also used for research to get an idea about how people think about a certain subject. And it makes it possible to analyse open questions in a survey more quickly.
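A very basic form of sentiment analysis scores a text against a list of positive and negative words. The sketch below does this with an invented mini-lexicon and a crude negation rule; both are assumptions for illustration, and modern sentiment models are usually trained on labelled data instead.

```python
import re

# Hypothetical mini-lexicon: positive words score above zero, negative below.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "slow": -1}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum lexicon scores, flipping a word's sign if a negation directly precedes it."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        value = LEXICON.get(token, 0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    return score


def sentiment_label(text):
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment_label("I love the product but delivery was slow"))  # positive
print(sentiment_label("The support was not good"))                  # negative
print(sentiment_label("The parcel arrived on Tuesday"))             # neutral
```

A word list like this only looks at individual words, so it misses sarcasm and context, which is where trained models earn their keep.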
4. Intent classification
This is the classification of text based on customer intent. It is possible to use this to classify customer emails or behaviours on a scale ranging from not interested to interested. This makes it possible to proactively reach customers who may want to try a product or send the right sales email at the right time.
It is clear that Natural Language Processing can have many applications for automation and data analysis. It is one of the technologies driving increasingly data-driven businesses and hyper-automation that can help companies gain a competitive advantage. In future, this technology also has the potential to be a part of our daily lives, according to Data Driven Investors.