Have you ever used a smart assistant (think something like Siri or Alexa) to answer questions for you? The answer is more than likely “yes”, which means that you are, on some level, already familiar with what’s known as natural language processing (NLP).
NLP is the combination of methods, drawn from different disciplines, that smart assistants like Siri and Alexa use to make sense of the questions we ask them. It combines fields such as artificial intelligence and computer science to make it easier for human beings to talk with computers the way we would with another person. This idea of holding a facsimile of a human conversation with a machine goes back to a groundbreaking 1950 paper by Alan Turing, "Computing Machinery and Intelligence," which formed the basis for the NLP technology we use today.
Two components underpin the way NLP takes sets of unstructured data and turns them into structured, machine-readable formats.
Specifically, these components are natural language understanding (NLU) and natural language generation (NLG). This article aims to quickly cover the similarities and differences between NLP, NLU, and NLG, and to talk about what the future holds for NLP.
Data scientists and artificial intelligence experts can use NLP to turn sets of unstructured data into formats that computers can convert to speech and text. They can even go so far as to create responses that are contextually relevant to a question you ask (think back again to virtual assistants like Siri and Alexa). But how, exactly, do NLU and NLG fit into NLP?
One similarity that all three of these disciplines share is that they work with natural language, even though they all play separate roles. So what’s the difference between all three?
Think of it this way: whereas NLU seeks to understand the language that we as humans speak, NLP tracks down the most important bits of data and structures them into formats such as numbers and text; it can even help detect malicious encrypted traffic. NLG, meanwhile, takes sets of structured data and creates narratives that we can understand as meaningful.
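As a toy illustration of that structuring step, the sketch below uses nothing but Python's standard library (the function name and extraction rules are our own invention, not any particular NLP library's API) to pull the numeric and textual pieces out of a free-form sentence:

```python
import re

def structure_text(sentence: str) -> dict:
    """Toy 'structuring' pass: split free-form text into numeric
    values and word tokens. A production NLP pipeline would use
    trained tokenizers, taggers, and parsers instead of regexes."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", sentence)]
    words = re.findall(r"[A-Za-z']+", sentence)
    return {"numbers": numbers, "words": words}

result = structure_text("Order 3 lattes at 4.50 dollars each")
print(result["numbers"])  # [3.0, 4.5]
```

The point is the shape of the transformation: unstructured prose goes in, and data a program can compute with comes out.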
Natural language understanding relies on artificial intelligence to make sense of the information it ingests from speech or text, creating something meaningful out of written words. Once data scientists use speech recognition to turn spoken words into written words, NLU parses out the understandable meaning from the text, regardless of whether that text includes mistakes and mispronunciations.
NLU is important to data scientists because, without it, they would have no way to parse out meaning from sources such as speech transcripts and chatbot conversations. Conversation comes naturally to us as humans; machines, however, don't have that luxury. On top of this, NLU can identify sentiment and obscenities in speech, just like you can. This means that with the power of NLU, data scientists can categorize text and meaningfully analyze content in many different formats.
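To make that sentiment-identification idea concrete, here is a deliberately simple lexicon-based sketch (the word lists and scoring rule are our own toy assumptions; production systems use trained models, not word lists):

```python
# Toy sentiment lexicon for illustration only.
POSITIVE = {"great", "love", "excellent", "happy", "good"}
NEGATIVE = {"bad", "hate", "terrible", "awful", "sad"}

def sentiment(text: str) -> str:
    """Label text positive/negative/neutral by counting
    lexicon hits. A real NLU system would account for negation,
    sarcasm, and context, which this sketch ignores."""
    words = text.lower().replace(".", " ").replace("!", " ").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this product, it is great!"))  # positive
```

The same counting pattern extends naturally to obscenity filtering: swap in a list of disallowed terms and flag any text that matches.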
Whereas natural language understanding seeks to parse through and make sense of unstructured information to turn it into usable data, NLG does quite the opposite. To that end, let’s define NLG next and understand the ways data scientists apply it to real-world use cases.
When data scientists provide an NLG system with data, it analyzes those data sets and produces meaningful narratives in conversational form. Essentially, NLG turns sets of data into natural language that both you and I can understand.
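A minimal sketch of that data-to-narrative step, assuming a made-up sales-metrics dictionary (the field names and wording template are ours), is simple template-based generation; modern NLG uses large language models, but template filling of this kind is still common in reporting tools:

```python
def generate_report(metrics: dict) -> str:
    """Toy template-based NLG: turn a dict of sales figures
    into a readable English sentence."""
    change = metrics["current"] - metrics["previous"]
    direction = "rose" if change >= 0 else "fell"
    return (f"{metrics['region']} sales {direction} by "
            f"{abs(change):,} units to {metrics['current']:,} this quarter.")

print(generate_report({"region": "EMEA", "previous": 11500, "current": 13250}))
# EMEA sales rose by 1,750 units to 13,250 this quarter.
```

Structured numbers go in; a sentence a human can read comes out, which is NLG in miniature.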
An NLG system is designed to write the way an experienced person would, so that its output reads as well-researched and as accurate as possible. This goal traces back to the Turing paper we mentioned earlier, and it's essential to getting people to believe that a machine is holding a believable, natural conversation with them, regardless of the topic under discussion.
Organizations can use NLG to create conversational narratives that anyone across the organization can make use of. For example, NLG can be a huge boon to experts working in departments such as marketing, human resources, sales, and information technology. It is most commonly applied to business intelligence dashboards, automated content creation, and more efficient data analysis.
Although NLP has plenty of modern-day business applications, many organizations have struggled to embrace it widely. This is due in large part to a few key challenges: information overload, for one, plagues businesses on a regular basis, making it difficult to tell which data sets are important amidst a seemingly endless sea of information.
Additionally, businesses often need specific techniques and tools to parse useful information out of their data before they can use NLP at all. And finally, NLP requires advanced machines to process and maintain data sets drawn from many different sources.
Although challenges are preventing the majority of organizations from adopting NLP, it seems inevitable that these same organizations will eventually adopt NLP, NLU, and NLG to allow their machines to maintain believable, human-like interactions and conversations. There is, therefore, a significant amount of investment occurring in NLP sub-fields of study like semantics and syntax.
To sum up everything we've covered in this article: NLU reads and makes sense of natural language, while NLG creates and outputs it. NLU assigns meaning to speech and text; NLG produces language with the help of machines. NLU extracts facts from language, while NLG takes the insights NLU extracts and turns them into natural language.
Be on the lookout for huge influencers in IT such as Apple and Google to keep investing in NLP so that they can create human-like systems. The worldwide market for NLP is set to eclipse $22 billion by 2025, so it’s only a matter of time before these tech giants transform how humans interact with technology.
Nahla Davies is a software developer and tech writer. Before devoting her work full time to technical writing, she managed — among other intriguing things — to serve as a lead programmer at an Inc. 5,000 experiential branding organization whose clients include Samsung, Time Warner, Netflix, and Sony.