Text Analytics: Entity Extraction and Resolution
Entity extraction is a core capability of text analytics, so I thought I’d step through an example of how it works.
Note: Learn about all the different text analytics capabilities in my last post, Text Analytics: A Spectrum of Enrichment Techniques.
Consider the following text, which talks about Apple and Berkshire Hathaway. Entity extraction lets you pick out the entities in the text (companies, persons, locations) and then helps you understand how they are related to each other.
In the example above, the text analytics process pulled out several entities: Apple (a company), Steve Jobs (a person), iPhone (a product), Berkshire Hathaway (a company), and Warren Buffett (a person). It was able to determine that Apple, Steve Jobs, and the iPhone are related to each other, as are Berkshire Hathaway and Buffett.
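To make the idea concrete, here is a minimal sketch of extraction and resolution in Python. It is not how a production text analytics engine works: it assumes a small hand-built lookup table (GAZETTEER), a made-up sample passage (SAMPLE), and the simplifying rule that entities mentioned in the same sentence are related. All of those names and the sample text are my own illustrations.

```python
import re
from itertools import combinations

# Hypothetical gazetteer: surface form -> (canonical name, entity type).
# Real systems use trained models; a lookup table keeps the sketch small.
GAZETTEER = {
    "Apple": ("Apple", "company"),
    "Steve Jobs": ("Steve Jobs", "person"),
    "iPhone": ("iPhone", "product"),
    "Berkshire Hathaway": ("Berkshire Hathaway", "company"),
    "Warren Buffett": ("Warren Buffett", "person"),
    "Buffett": ("Warren Buffett", "person"),  # alias resolved to the same entity
}


def extract_entities(sentence):
    """Return the set of (canonical name, type) pairs mentioned in a sentence."""
    found = set()
    for surface, entity in GAZETTEER.items():
        # Whole-word match so "Apple" does not fire inside a longer word.
        if re.search(r"\b" + re.escape(surface) + r"\b", sentence):
            found.add(entity)
    return found


def related_pairs(text):
    """Treat entities that share a sentence as related (a crude assumption)."""
    pairs = set()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        names = sorted(name for name, _ in extract_entities(sentence))
        pairs.update(combinations(names, 2))
    return pairs


# Made-up sample passage, used only to exercise the functions above.
SAMPLE = (
    "Steve Jobs introduced the iPhone at Apple. "
    "Warren Buffett runs Berkshire Hathaway."
)

if __name__ == "__main__":
    for pair in sorted(related_pairs(SAMPLE)):
        print(pair)
```

Notice that the sketch links Apple to Steve Jobs and the iPhone, and Berkshire Hathaway to Warren Buffett, but finds nothing connecting the two companies, because they never share a sentence in the sample.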
When you apply downstream analytics to this, you can further identify the relationship between Apple and Berkshire Hathaway (Berkshire Hathaway is an investor in Apple). Without text analytics, the system couldn’t make this connection.
The better the entity extraction and sentiment analysis, the better the signal that goes into these downstream analysis tasks.
Unifying Content Through Text Analytics
Text analytics is a fundamental capability of cognitive search. It lets you connect disparate data sources, structured or unstructured, and derive insights from them. Without text analytics, you couldn’t bring unstructured content into the search.
Think about it this way. You want to know what customers are saying about your products, especially the negative comments. With the help of text analytics, you can examine customer emails, looking for negative mentions of your products. Then you connect that information to the actual products you own (stored in a structured repository). This example takes a product-centric view of the problem. You could easily switch it around and connect the customer sentiment to the products the customer owns, taking a customer-centric view of the problem.
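Here is a rough sketch of that scenario in Python. Everything in it is an assumption made for illustration: the product catalog (PRODUCTS), the emails (EMAILS), and the tiny word list standing in for real sentiment analysis (NEGATIVE_WORDS). The customer-centric view is also simplified: it just pivots the same links by customer rather than looking up what each customer owns.

```python
import re

# Hypothetical structured repository: product id -> product record.
PRODUCTS = {
    "P100": {"name": "Acme Router"},
    "P200": {"name": "Acme Camera"},
}

# Hypothetical unstructured content: customer emails.
EMAILS = [
    {"customer": "c1", "body": "My Acme Router keeps failing. Terrible."},
    {"customer": "c2", "body": "I love the Acme Camera, great pictures."},
    {"customer": "c3", "body": "The Acme Camera battery is awful."},
]

# Toy lexicon standing in for a real sentiment model.
NEGATIVE_WORDS = {"terrible", "awful", "broken", "failing"}


def is_negative(text):
    """Flag text that contains at least one word from the toy lexicon."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & NEGATIVE_WORDS)


def negative_mentions_by_product(emails, products):
    """Product-centric view: product id -> customers who complained about it."""
    view = {pid: [] for pid in products}
    for email in emails:
        if not is_negative(email["body"]):
            continue
        body = email["body"].lower()
        for pid, product in products.items():
            # Link the unstructured mention to the structured record by name.
            if product["name"].lower() in body:
                view[pid].append(email["customer"])
    return view


def pivot_by_customer(product_view):
    """Customer-centric view: customer -> products they complained about."""
    view = {}
    for pid, customers in product_view.items():
        for customer in customers:
            view.setdefault(customer, []).append(pid)
    return view


if __name__ == "__main__":
    by_product = negative_mentions_by_product(EMAILS, PRODUCTS)
    print(by_product)
    print(pivot_by_customer(by_product))
```

The part that matters is the join: the email text only becomes useful once a mention is tied back to a record in the structured catalog.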
Without the ability to connect structured and unstructured content, you’re missing out on the power of cognitive search.