What is NLP?

I’m often asked, “What is it that you do?” When I mention NLP, the next question is frequently “What is that?” So here is a blog post with both the short and the long versions of my answer.

Here is the short answer. NLP stands for Natural Language Processing. The “Natural” in Natural Language Processing refers to human languages, as opposed to computer languages. “Processing” means automated processing by a computer. In short, NLP covers computer programs that try to understand or generate pieces of human language. NLP often uses AI techniques.

Examples

Here are some simple applications of NLP in our everyday lives:

  • Spellcheckers: programs that check spelling and grammar, identify mistakes and then offer suggestions.
  • Suggested words in text messaging: in order to predict the next word for your message, the program processes lots of text and computes probabilities of the words that come after each word you type.
  • Alexa, Siri and other voice assistants: these use complex NLP techniques to understand what is being said and then perform the requested action.
  • Online search: NLP is used in search in many ways. For example, it helps ensure that pages mentioning “Avengers showtimes” still show up in the results when we search for “Avengers show time”.
  • Google Translate and other machine translation apps: in order to translate from one language to another, the translation tools build neural network models from large collections of texts paired with their translations.
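
To make the suggested-words example a bit more concrete, here is a minimal sketch of the idea in Python: count which words follow each word in some text, then turn the counts into probabilities. The tiny corpus and the function names train_bigrams and suggest_next are made up for illustration; real keyboards use far more text and more sophisticated models.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers

def suggest_next(followers, word, k=3):
    """Return up to k likely next words with their estimated probabilities."""
    counts = followers.get(word.lower())
    if not counts:
        return []  # the word never appeared in the training text
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(k)]

# Hypothetical training text, standing in for "lots of text".
corpus = [
    "see you tomorrow",
    "see you soon",
    "see you tomorrow morning",
    "thank you very much",
]
model = train_bigrams(corpus)
print(suggest_next(model, "you"))
# → [('tomorrow', 0.5), ('soon', 0.25), ('very', 0.25)]
```

Here “tomorrow” follows “you” in two of the four cases, so it gets a probability of 0.5 and is suggested first.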

What else can NLP do?

Wherever there is a large collection of text, NLP can likely be applied to it. There are many ways to do so; here are a few:

  • Extracting information from documents. Many companies deal with large sets of documents that they use in their daily processes. For example, recruiting companies collect resumes and then extract information from them, such as name, years of experience, education, contact information, etc. Using NLP, it is possible to build a system that will parse the resume, extract the needed information and record it in a database.
  • Extracting features and topics, for example in customer reviews and social media mentions. Customers often talk about different aspects of a product or service they are reviewing in a single review; for example, price, delivery, customer service, quality, durability, etc. With natural language processing it is possible to build a system that will collect reviews about a product or service and extract the different features that are being talked about in the reviews.
  • Analyzing text for positive and negative feelings. There are NLP tools that let us build systems that determine whether a text is positive, negative or neutral, or that detect a more nuanced set of emotions. For example, the features extracted from customer reviews can be analyzed for positive or negative sentiment to reveal how customers feel about different features of a service or product.
  • Classifying documents. Often there is a need to classify a set of documents into different subgroups. For example, companies may rely on news as their source of information, and with NLP it is possible to create a program that will divide the incoming news stream into relevant and irrelevant news pieces, and further divide the relevant ones by topic, such as technology, economy, politics, etc.
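
As a rough illustration of the review example above, the sketch below finds which features a review mentions and attaches a positive, negative or neutral label to each one. The word lists and the function name analyze_review are hypothetical and hand-picked for this example; a production system would rely on much larger lexicons or a trained model.

```python
import re

# Hypothetical, hand-picked word lists for illustration only.
ASPECT_KEYWORDS = {
    "price": {"price", "cost", "expensive", "cheap"},
    "delivery": {"delivery", "shipping", "arrived"},
    "quality": {"quality", "durable", "broke", "sturdy"},
}
POSITIVE = {"great", "fast", "good", "cheap", "sturdy", "durable"}
NEGATIVE = {"slow", "bad", "expensive", "broke", "late"}

def analyze_review(review):
    """Map each feature mentioned in the review to a sentiment label."""
    results = {}
    # Treat each sentence separately so sentiment stays close to its feature.
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = set(re.findall(r"[a-z']+", sentence))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:
                # If a feature appears in several sentences, the last one wins.
                results[aspect] = label
    return results

review = "The price was great. Delivery was slow and it arrived late! The case feels sturdy."
print(analyze_review(review))
# → {'price': 'positive', 'delivery': 'negative', 'quality': 'positive'}
```

Counting words like this misses negation (“not great”) and sarcasm, which is one reason real sentiment systems are usually trained on labeled examples instead.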

As you can see, NLP is usually applied to specific tasks rather than to understanding and creating language the way humans do. Human-like understanding and generation of language are both very hard to achieve.

How does NLP relate to AI?

AI, or Artificial Intelligence, as the term is used here, is a set of techniques in which the computer uses existing data to learn a function and then applies that function to new data to produce answers to various problems. AI can be applied to many different domains, such as facial recognition in images, automatic transcription of audio into text, and financial fraud detection. In that sense, AI is used in NLP just as in any other domain. NLP also uses techniques outside of AI. NLP programs can be built with three different approaches:

  • Rules. For a given NLP task we can write a set of rules that will produce a result close to what is desired. For example, if we are tasked with separating an incoming news stream into different topics, we can write a set of keywords for each topic and use the keywords found in an article as a predictor of its topic. While this is the simplest approach, it is very labor intensive, and it is usually hard to cover all the cases that come up.
  • Machine learning. With machine learning we can devise an algorithm that will learn on its own which news texts should be classified as a particular topic. In order for the algorithm to work, the incoming items first need to be represented as a set of features; in this case, these could be the list of all the words in the article, the title, bolded words, numbers, proper nouns mentioned, etc. We also need a representative sample of documents, labeled with the correct topics, for the algorithm to learn from. This is one of the most used techniques nowadays. For example, the task of dividing a news stream into different topics can be handled well using this technique.
  • Deep learning. When the representative set of documents is large enough (usually a lot larger than what’s needed for feature-based machine learning), deep learning can be applied. Here a neural network (so named because it is loosely inspired by the way neurons in the brain work) learns the function needed to perform the task. The advantage of deep learning is that engineers building the system do not need to spend too much time figuring out the features to represent the incoming items. However, the amount of data and the computational power required are very large. For example, the most accurate results in machine translation (such as Google Translate) are achieved using deep learning.

The last two approaches, machine learning and deep learning, are considered part of Artificial Intelligence.
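
To show the difference between the first two approaches, here is a small sketch of the news-topic example done both ways: once with hand-written keyword rules, and once with a simple machine learning method that learns word statistics from labeled examples. I picked Naive Bayes with word counts as features only because it fits in a few lines; the keywords, the six training headlines and all function names are made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())

# Approach 1: rules. Hypothetical hand-written keywords for each topic.
TOPIC_KEYWORDS = {
    "technology": {"software", "chip", "smartphone", "app"},
    "economy": {"inflation", "market", "bank", "rates"},
    "politics": {"election", "parliament", "senate", "vote"},
}

def classify_by_rules(text):
    """Pick the topic with the most keyword hits, or 'irrelevant' if none."""
    words = tokenize(text)
    hits = {topic: sum(w in kws for w in words) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "irrelevant"

# Approach 2: machine learning (Naive Bayes over word counts).
def train_naive_bayes(labeled_docs):
    """Learn per-topic word counts from (text, topic) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, topic in labeled_docs:
        doc_counts[topic] += 1
        word_counts[topic].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def classify_by_model(model, text):
    """Return the topic with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for topic in doc_counts:
        total_words = sum(word_counts[topic].values())
        score = math.log(doc_counts[topic] / total_docs)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                # Add-one smoothing avoids taking the log of zero.
                score += math.log((word_counts[topic][w] + 1) / (total_words + len(vocab)))
        scores[topic] = score
    return max(scores, key=scores.get)

# Hypothetical labeled sample; a real one would need many more documents.
training = [
    ("New smartphone chip speeds up every app", "technology"),
    ("Software update fixes a security bug", "technology"),
    ("Central bank raises interest rates again", "economy"),
    ("Inflation cools as the market rallies", "economy"),
    ("Senate schedules a vote on the budget", "politics"),
    ("Parliament dissolved ahead of the election", "politics"),
]
model = train_naive_bayes(training)
headline = "Bank warns that inflation may push rates higher"
print(classify_by_rules(headline))         # → economy
print(classify_by_model(model, headline))  # → economy
```

Both versions agree on this headline, but they are extended differently: the rule-based version improves only when someone adds keywords by hand, while the learned version improves when it is given more labeled examples.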

Did this article help you understand the main aspects of Natural Language Processing? Let me know in the comments.

If you are a business with a large collection of text documents and you would like to automate the manual process in place, feel free to contact me at zhenya@practicallinguistics.com.
