Natural Language Processing (NLP), or Natural Language Understanding (NLU), is a technology that identifies the meaning of phrases that humans naturally speak or write.

Siri, Alexa, Google Home and similar assistants all use this technology to understand the meaning behind spoken phrases.

It’s important to understand that there is a difference between speech recognition and natural language processing. Speech recognition enables the computer to convert spoken words into written text. The text that it produces is simply a transcription of the spoken phrase.

NLP is the technology that takes this raw text and deciphers the meaning. What I mean by “meaning” here is actually known as the “intent” in the jargon of NLP. Essentially the intent is the intention behind the uttered phrase.

For example, if I say “I want to fly to Dubai”, my intent might be identified as “BOOK_FLIGHT”. That means my intention in uttering this phrase is to book a flight.

The technology does not know on its own that this is my intent; the intent is set up by the creator of the chatbot. It is actually fairly easy for a chatbot creator to learn how to use NLP.

In this example, the creator would set up an intent by naming the intent and then giving a list of phrases that would be associated with that intent.

They would define the intent as “BOOK_FLIGHT”.
They would then associate the following phrases with the intent by adding them to a list of phrases for that intent.

The list of phrases would look something like:

I want to fly to Paris
I need to book a flight
Book an international flight
Fly from New York to LA
Fly next week Wednesday

It should be clear that the above phrases are examples of phrases used to book a flight. Typically for a production system, the language processing technology would need around fifty different examples to accurately extrapolate to different phrases that people might use to book a flight.
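As a rough illustration, an intent can be thought of as a name plus a list of example phrases. The sketch below is not tied to any particular NLP product; the `Intent` class and the extra "Get me a ticket to Dubai" phrase are hypothetical and only show the shape of the data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Intent:
    """A named intent plus the example phrases supplied by the bot creator."""
    name: str
    phrases: List[str] = field(default_factory=list)


# The intent and phrases from the example above.
book_flight = Intent(
    name="BOOK_FLIGHT",
    phrases=[
        "I want to fly to Paris",
        "I need to book a flight",
        "Book an international flight",
        "Fly from New York to LA",
        "Fly next week Wednesday",
    ],
)

# More phrases can be added later without changing anything else.
# This extra phrase is invented for illustration.
book_flight.phrases.append("Get me a ticket to Dubai")

print(book_flight.name, len(book_flight.phrases))
```

A production intent would hold many more phrases than this, but the structure stays the same.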

It is important to also notice the words in these phrases that would change for every flight booking, such as the city names and the travel date. One person wants to fly to Paris, another to New York, another to Dubai.

These words are parameters that are relevant to the book flight intent. In this simple example, the bot creator may define four parameters associated with the flight booking intent such as:

From City
To City
Departure Date
Return Date

The natural language understanding not only identifies the intent, but it can also identify the parameters in the text.

For example, if the user says:

I want to book a flight to London on the 8th June

The NLU technology will identify the intent as “BOOK_FLIGHT” and will also identify the parameters as follows:

From City: (not provided)
To City: London Airport (LHR)
Departure Date: 8th June 2019
Return Date: (not provided)

With this information, a developer can easily get the bot to ask follow-up questions to get the missing values from the user and then initiate the flight booking process. This process of getting the user to provide the missing parameters is called slot filling.
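Slot filling can be sketched in a few lines. The example below assumes the NLU returns the parameters as a dictionary with `None` for anything it could not find; the slot names, the `PROMPTS` wording and the helper functions are all hypothetical, chosen only to mirror the four parameters above.

```python
from typing import Dict, List, Optional

# Follow-up question for each slot. The wording is illustrative only.
PROMPTS: Dict[str, str] = {
    "from_city": "Which city are you flying from?",
    "to_city": "Which city are you flying to?",
    "departure_date": "What date do you want to depart?",
    "return_date": "What date do you want to return?",
}


def missing_slots(params: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of slots that still have no value, in prompt order."""
    return [slot for slot in PROMPTS if not params.get(slot)]


def next_question(params: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the next follow-up question, or None when every slot is filled."""
    missing = missing_slots(params)
    return PROMPTS[missing[0]] if missing else None


# Parameters as the NLU might return them for
# "I want to book a flight to London on the 8th June".
params = {
    "from_city": None,
    "to_city": "London Airport (LHR)",
    "departure_date": "8th June 2019",
    "return_date": None,
}

print(missing_slots(params))
print(next_question(params))
```

The bot would keep asking `next_question` and storing the answers until it returns `None`, at which point the booking process can start.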

It should be noted that NLP and NLU are sometimes used interchangeably. At other times, NLP is said to mean the understanding of one-off statements, whereas NLU is said to mean a broader understanding of a number of statements within the context of an interaction.

How does Natural Language Processing work behind the scenes?

I’m not going to tell you exactly how it works, but I want to share a Slack message from my colleague, just in case it helps you.

Developer X: 3.30 am
I think having our own NLU is the way to go in the future. Out of the box NLU with no fluff.

Best of all, we can truly innovate and push the boundaries if we control the algorithm.
All these NLU engines do is:
Raw Text -> Stemming/Lemmatize -> Bag of Words -> TF-IDF -> Naive-bayes classifier
SO rudimentary

I hope that helps you, it didn't help me!
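For the curious, the pipeline named in that message can be sketched as a toy in pure Python. This is only an illustration of the stages listed (stemming, bag of words, TF-IDF, Naive Bayes), not how any particular NLU engine is actually built. The stemmer is deliberately crude, and the CHECK_WEATHER intent and its phrases are invented so the classifier has something to contrast with BOOK_FLIGHT.

```python
import math
import re
from collections import Counter, defaultdict


def stem(word):
    """Very crude stemmer: strip a few common suffixes (stand-in for a real one)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def bag_of_words(text):
    """Lowercase, tokenize, stem, and count the tokens."""
    return Counter(stem(tok) for tok in re.findall(r"[a-z']+", text.lower()))


class TinyIntentClassifier:
    """Multinomial Naive Bayes over TF-IDF weighted bags of words."""

    def fit(self, examples):
        bags = [(bag_of_words(text), intent) for text, intent in examples]
        n_docs = len(bags)
        doc_freq = Counter(term for bag, _ in bags for term in bag)
        # Smoothed inverse document frequency for every known term.
        self.idf = {t: math.log((1 + n_docs) / (1 + df)) + 1
                    for t, df in doc_freq.items()}
        self.term_weight = defaultdict(Counter)  # intent -> term -> summed TF-IDF
        self.doc_count = Counter()               # intent -> number of examples
        for bag, intent in bags:
            self.doc_count[intent] += 1
            for term, tf in bag.items():
                self.term_weight[intent][term] += tf * self.idf[term]
        return self

    def predict(self, text):
        bag = bag_of_words(text)
        vocab_size = len(self.idf)
        total_docs = sum(self.doc_count.values())
        best_intent, best_score = None, -math.inf
        for intent, weights in self.term_weight.items():
            total = sum(weights.values())
            score = math.log(self.doc_count[intent] / total_docs)  # log prior
            for term, tf in bag.items():
                if term not in self.idf:
                    continue  # ignore words never seen in training
                # Laplace-smoothed likelihood of the term given the intent.
                likelihood = (weights[term] + 1) / (total + vocab_size)
                score += tf * self.idf[term] * math.log(likelihood)
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent


examples = [
    ("I want to fly to Paris", "BOOK_FLIGHT"),
    ("I need to book a flight", "BOOK_FLIGHT"),
    ("Book an international flight", "BOOK_FLIGHT"),
    ("Fly from New York to LA", "BOOK_FLIGHT"),
    ("Fly next week Wednesday", "BOOK_FLIGHT"),
    # Hypothetical second intent, invented for contrast.
    ("What is the weather today", "CHECK_WEATHER"),
    ("Will it rain tomorrow", "CHECK_WEATHER"),
    ("Is it sunny in Paris", "CHECK_WEATHER"),
    ("Weather forecast for the weekend", "CHECK_WEATHER"),
    ("Do I need an umbrella", "CHECK_WEATHER"),
]

clf = TinyIntentClassifier().fit(examples)
print(clf.predict("I want to fly to Dubai"))
```

With only five phrases per intent this is far too small for real use, which is why the number of example phrases matters so much in practice.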

What NLP technology is not

NLP technology is not human-level understanding of natural language. There have been tremendous breakthroughs in AI due mostly to the recent abundance of data and computing power. AI, however, is far from approaching generalized intelligence.

NLU is an important technology that works extremely well for one-off commands (in the case of the voice assistants) and very simple conversations where the domain is well understood. The longer the conversation goes on, however, and the more general it is, the more likely it is that the AI is going to get lost.

While it can be impressive that bots understand spoken and written phrases, developers need to be highly cognizant of the context in which these bots are used. They should only deploy bots in controlled environments where the questions or commands the bot is likely to get have been programmed into the bot. The bot is only really good at understanding the intents behind single phrases; it is not good at following the meaning through a conversation. The exception is a very narrow conversation about a specific topic, such as a flight booking “conversation”, where hopefully all the possible paths are known upfront.

NLP Use Cases

The fact that NLP is limited in its use does not mean that it won’t change (or isn’t already changing) the way software is used.

We have written extensively in the past about the flaws of chatbots, which stem from the limited interface (either voice or chat) on which they are used and the limits of the AI’s own understanding. The use cases where chatbots work really well are limited right now, and discovery of new functionality is hard.

While it may take a long time for AI to “understand” conversations at a human level, the applications of natural language understanding technologies are extremely broad and important.

Voice interfaces are going to become more and more important in future as speech recognition technology improves, as the performance of phones and networks improves, as devices are built with voice at the forefront, and as behaviour patterns change to favour voice.

All this new voice technology will be built on the power of speech recognition and natural language processing. There don’t need to be huge improvements in either of these technologies for the voice component of software to become ubiquitous.

Natural language processing will become the foundation for a new era in software, and it is always a critical component of any bot framework.

It is important now to understand where NLP fits into the process of interacting with a chatbot. Let's use a voice bot as an example, but note that the only difference between communicating via text and via voice is that voice needs a speech recognition step at the beginning and a voice synthesis step at the end (to convert the text into words spoken by the device).

  • The first step is speech recognition as we discussed previously.
  • The second step is the NLU converting the text transcribed by the speech recognition software into an intent.
  • The third step is applying some kind of logic to the intent to determine how to respond. The component that applies the logic is called the Dialog Manager in the jargon of chatbots. The logic in the Dialog Manager can be programmed explicitly by a programmer or can be applied by some sort of AI algorithm. The holy grail, of course, is that an AI algorithm could analyze all similar conversations and then know how to respond automatically. While the voice assistants, on the whole, are pretty good at this step and respond automatically, they are, as we have discussed at length, very limited outside their specific domain of expertise.
  • Once the incoming message has been processed, the Dialog Manager provides the next component in the chain, the natural language generator, with a response. The response here is similar to the intent in that it is a label for a response that then needs to be translated into natural language that can be easily understood by humans.
  • The natural language generator converts the response into natural language in the form of text.
  • The final step in the process is speech synthesis. The natural language text is passed to the speech synthesis component which then speaks the text out loud.
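The chain of components described in these steps can be sketched as a series of function calls. Every function below is a hypothetical stub: real speech recognition and synthesis work on audio, and a real NLU uses a trained model rather than a keyword check. The response labels `ASK_DESTINATION` and `FALLBACK` are also invented for illustration.

```python
def recognize_speech(audio: bytes) -> str:
    """Stub speech recognition: pretend the audio bytes are UTF-8 text."""
    return audio.decode("utf-8")


def understand(text: str) -> dict:
    """Stub NLU: a keyword check standing in for a trained intent model."""
    lowered = text.lower()
    if "fly" in lowered or "flight" in lowered:
        return {"intent": "BOOK_FLIGHT"}
    return {"intent": "UNKNOWN"}


def dialog_manager(nlu_result: dict) -> str:
    """Explicitly programmed logic mapping an intent to a response label."""
    responses = {"BOOK_FLIGHT": "ASK_DESTINATION"}
    return responses.get(nlu_result["intent"], "FALLBACK")


def generate_language(response_label: str) -> str:
    """Stub natural language generator: turn a response label into text."""
    templates = {
        "ASK_DESTINATION": "Where would you like to fly to?",
        "FALLBACK": "Sorry, I didn't understand that.",
    }
    return templates[response_label]


def synthesize_speech(text: str) -> bytes:
    """Stub speech synthesis: return bytes instead of playing audio."""
    return text.encode("utf-8")


def handle_utterance(audio: bytes) -> bytes:
    """Run one spoken request through the whole chain."""
    text = recognize_speech(audio)
    nlu_result = understand(text)
    label = dialog_manager(nlu_result)
    reply = generate_language(label)
    return synthesize_speech(reply)


print(handle_utterance(b"I need to book a flight"))
```

For a text chatbot, the first and last functions simply drop out and the middle three remain unchanged.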

This is the classic process for the voice and virtual assistants. It should be noted, however, that voice is often a very inefficient means of delivering information compared to graphical interfaces. This means that, increasingly, the dialog manager’s response will take the form of an action, and often that action will be to show something rather than say something.

In our view, showing something rather than saying something is going to be the future of communication with chatbots. However, as you can see, whatever happens, NLP will remain the key enabling technology.

Finally, I want to mention that NLP is very easy to set up. You don’t need to be a developer to set up NLP and, in fact, it is not the best use of a developer’s time. It is true that a developer is normally needed to program the action that takes place after an intent is recognized, but anyone can set up the NLP, especially the phrases associated with each intent. You need to agree at the start which intents will be set up and what naming convention you will use, but once these are in place, anyone can add phrases to the intents.