Machine Learning-based Relevancy
Search engines index millions of pieces of information, structured and unstructured. But simply indexing information isn’t enough to give a user the results they need when they perform a search.
The Need for Relevancy
The goal of relevancy tuning is to help users get the best results for the queries they run. Tuning tells the search engine how to sort the information in its index so that search results match search queries as closely as possible. It’s the process of bringing the most relevant information to the top of the result list.
Most search engines let you tweak and tune the relevancy model. For example, you can tell the search engine to award a document a set number of points if the search term appears in its title, a different number if the term appears in the body, more points if the document is newer, and so on. Weighting these parameters ultimately determines which documents rise to the top of the list.
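To make the points idea concrete, here is a minimal sketch of hand-weighted scoring. The weight values, the field names (title, body, published), and the recency formula are all assumptions chosen for illustration; they are not the parameters of any particular search engine.

```python
from datetime import date

# Hypothetical hand-picked weights; a real engine exposes its own parameters.
WEIGHTS = {"title_match": 5.0, "body_match": 1.0, "recency": 2.0}

def score(doc, term, today):
    """Return a weighted relevancy score for one document."""
    term = term.lower()
    title_match = 1.0 if term in doc["title"].lower() else 0.0
    body_match = 1.0 if term in doc["body"].lower() else 0.0
    age_days = (today - doc["published"]).days
    recency = 1.0 / (1.0 + age_days / 365.0)  # shrinks toward 0 as a document ages
    return (WEIGHTS["title_match"] * title_match
            + WEIGHTS["body_match"] * body_match
            + WEIGHTS["recency"] * recency)

def rank(docs, term, today):
    """Sort documents so the highest-scoring one comes first."""
    return sorted(docs, key=lambda d: score(d, term, today), reverse=True)

DOCS = [
    {"id": "a", "title": "Expense policy",
     "body": "How to file travel expenses.", "published": date(2015, 1, 1)},
    {"id": "b", "title": "Travel booking guide",
     "body": "Book flights and hotels.", "published": date(2017, 6, 1)},
]
TODAY = date(2018, 1, 1)
ranked_ids = [d["id"] for d in rank(DOCS, "travel", TODAY)]
print(ranked_ids)
```

With these made-up weights, the newer document with the term in its title outranks the older one that only mentions it in the body. Changing a single weight can reorder the list, which is why choosing the weights matters so much.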
For knowledge management, relevancy gets difficult. Consider the hundreds or even thousands of different parameters you could have. There are documented cases of sites and companies with thousands of variables that go into, or could go into, their relevancy calculation.
There's simply no way for a human to look at all that information and work out how to weight all of the different inputs to achieve the desired result. Imagine a business user going into the search system and saying, "Here are the 5,000 queries I care about the most, and here are the results I expect to come back for each of them." Who actually has this kind of list?
You could instead infer the expected results from what people click on, what they purchase, or what documents they read, tracking their behavior and using it as a proxy for some inherent business value. But you essentially end up in the same situation: for this query, this is the best result; for that query, these are the best results, numbered one through five. Someone still has to work out which weights reproduce those results.
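As a rough illustration of using behavior as a proxy, the sketch below turns a click log into a "best results per query" list by ranking documents on click-through rate. The log format (query, document id, clicked flag) and the document ids are hypothetical, and click-through rate is only one of many possible proxies.

```python
from collections import defaultdict

def judgments_from_clicks(log, top_n=5):
    """Map each query to its top_n documents, ordered by click-through rate."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for query, doc_id, was_clicked in log:
        shown[(query, doc_id)] += 1
        clicked[(query, doc_id)] += int(was_clicked)

    by_query = defaultdict(list)
    for (query, doc_id), impressions in shown.items():
        ctr = clicked[(query, doc_id)] / impressions
        by_query[query].append((ctr, doc_id))

    # Highest click-through rate first; ties are broken by document id.
    return {
        query: [doc_id for _ctr, doc_id in
                sorted(pairs, key=lambda p: (-p[0], p[1]))[:top_n]]
        for query, pairs in by_query.items()
    }

# Hypothetical log rows: (query, document shown, whether it was clicked).
LOG = [
    ("vacation policy", "hr-12", True),
    ("vacation policy", "hr-12", True),
    ("vacation policy", "hr-40", False),
    ("vacation policy", "hr-40", True),
    ("vacation policy", "blog-7", False),
]
judgments = judgments_from_clicks(LOG)
print(judgments)
```

The output is exactly the kind of query-to-results list described above. It says what should rank first, but not how to weight the inputs to make that happen.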
Tuning relevancy manually is simply impossible in most cases.
Machine Learning-based Relevancy
All search engines have a relevancy model, but most are simplistic. Attivio's Cognitive Search Platform is different in that it provides a broad range of options for tuning the relevancy model.
Using machine learning relevancy tuning, Attivio can analyze and correlate hundreds of parameters and variables and reverse engineer the algorithm used to score the documents. It also reports on the accuracy of the model and can apply user feedback (implicit and explicit) to improve the model.
Machine learning-based relevancy tuning reduces the amount of human effort required for tuning and reduces the number of errors that often happen from manual tuning.
The machine learning model takes the expected results along with the values recorded for each of those documents (how old a document is, how big it is, where the query matched, and so on) and essentially reverse engineers the algorithm that you would use to score the documents to achieve those results. It also reports on its accuracy.
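The idea of reverse engineering a scoring function from examples can be sketched with a small logistic regression trained by gradient descent. This is a generic textbook technique used here for illustration only, not Attivio's algorithm. The three features (title match, body match, recency) and the labeled examples are invented.

```python
import math

def train(examples, epochs=500, lr=0.5):
    """Fit logistic-regression weights with stochastic gradient descent.

    examples: list of (feature_vector, label), label 1 = relevant, 0 = not.
    """
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            z = bias + sum(w * x for w, x in zip(weights, features))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = predicted - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def predict(weights, bias, features):
    """Classify a document as relevant (1) or not relevant (0)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if z > 0 else 0

def accuracy(weights, bias, examples):
    """Fraction of examples the learned weights classify correctly."""
    correct = sum(predict(weights, bias, f) == y for f, y in examples)
    return correct / len(examples)

# Invented training data: [title_match, body_match, recency] -> relevant?
EXAMPLES = [
    ([1.0, 1.0, 0.9], 1),
    ([0.0, 1.0, 0.8], 0),
    ([1.0, 0.0, 0.2], 1),
    ([0.0, 1.0, 0.1], 0),
    ([1.0, 1.0, 0.5], 1),
    ([0.0, 0.0, 0.6], 0),
]
weights, bias = train(EXAMPLES)
print("learned weights:", weights)
print("training accuracy:", accuracy(weights, bias, EXAMPLES))
```

In this toy data the title match decides relevance, so the learned weight for that feature ends up dominant without anyone setting it by hand. A production system would measure accuracy on held-out queries rather than on the training examples.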
What’s more, a business user can tweak the results, and if you have ongoing click tracking or similar signals, you can feed more and more information back into the model to affect relevancy downstream.
The nice part about machine learning-based relevancy is, again, you don't have anyone looking at spreadsheets with thousands of columns. Relevancy gets better for the user, users are happier, and that promotes engagement, sales, or whatever you're looking to drive from a business standpoint.
But There’s Even More to Cognitive Search
Attivio provides the fundamental options to tune relevancy, but what separates it from the rest is the amount of signal it can generate for the model builder using additional input data from other core capabilities such as natural language processing (NLP), entity extraction and text analytics.
For example, think about a news story. It might be more relevant based on the number of politicians mentioned in the news story or the number of economic indicators. Those are outputs from natural language processing that we would capture and make available to the algorithm builder.
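A minimal sketch of how such NLP outputs might become model inputs: count the extracted entities of each type and expose the counts as numeric features. The entity list is assumed to come from an upstream extraction step, and the type labels and feature names are hypothetical.

```python
from collections import Counter

def entity_features(entities, types=("politician", "economic_indicator")):
    """Count extracted entities by type and return them as named features."""
    counts = Counter(entity_type for _text, entity_type in entities)
    return {f"num_{t}": counts.get(t, 0) for t in types}

# Hypothetical output of an entity extractor for one news story.
story_entities = [
    ("Senator A", "politician"),
    ("Governor B", "politician"),
    ("unemployment rate", "economic_indicator"),
    ("Springfield", "location"),
]
features = entity_features(story_entities)
print(features)
```

Each count becomes one more column the model builder can weigh alongside fields such as title match or document age.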
Attivio isn’t limited to a single relevancy model either. It supports multiple types of applications, such as HR and Sales, each with its own model and its own perspective on relevancy.
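One simple way to picture per-application models is a separate set of weights for each application, applied to the same document features. The profile names, feature names, and weight values below are invented for illustration and do not reflect how Attivio stores or selects its models.

```python
# Hypothetical weight profiles, one per application.
PROFILES = {
    "hr": {"title_match": 3.0, "recency": 1.0, "num_policy_terms": 2.0},
    "sales": {"title_match": 2.0, "recency": 4.0, "num_policy_terms": 0.0},
}

def score(features, application):
    """Score one document using the weights of the given application."""
    weights = PROFILES[application]
    return sum(weights[name] * features.get(name, 0.0) for name in weights)

doc_features = {"title_match": 1.0, "recency": 0.5, "num_policy_terms": 2.0}
hr_score = score(doc_features, "hr")
sales_score = score(doc_features, "sales")
print(hr_score, sales_score)
```

The same document scores differently in each application because each profile values different signals.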
For more about Attivio's approach to machine learning, read the 5-Minute Guide to Machine Learning.