All About Machine Learning in Cognitive Search
Recent research shows that over 66% of employees are dependent on search in their daily work. But there’s a problem. Forty-one percent are frustrated with their existing search application.
Many enterprise search platforms offer task-based search: a simple search, analyze, decide, and start-over cycle that carries no context from one of an individual's searches to the next. Attivio takes a different approach. Attivio Cognitive Search and Insights takes search beyond purely indexing data by incorporating innovative technologies such as machine learning, natural language processing, and content analytics to derive better insights and knowledge.
Attivio's Cognitive Search empowers users through a continuous learning approach that analyzes and learns as it’s used. The result is a stateful search experience that takes the user’s context into consideration, improving relevancy and ensuring the user finds the right information.
So how does Attivio do it?
One key component of Attivio’s cognitive search solution is its machine learning capabilities. Machine learning enables the enrichment of data that improves relevancy without human intervention.
Data Enrichment Through Machine Learning
Attivio processes raw data from enterprise systems, both structured and unstructured (databases, file shares, websites, and others), extracting standard metadata such as names, places, companies, and language.
Once the data and information are correlated across all systems and repositories in the enterprise, Attivio Cognitive Search moves to the next step: data enrichment through machine learning. Data enrichment is the processing of raw data to deduce additional insights and metadata, such as the type of document, classification and outlier detection, and predictive data (characteristics that are inferred from the raw data).
Think of it this way: you have a lot of data to process, but only some of it is tagged and categorized, and possibly not properly. There's too much data to have someone make the required changes manually. Not only might the data not get tagged appropriately, but the work would also take a long time.
What if you were to allow a machine to learn from the data automatically? What if you didn’t have to manually program it with a set of actions to take or a set of patterns to look for? Data enrichment through machine learning is the best approach you can take to ensure metadata is applied properly, consistently, and far more quickly than is humanly possible.
Attivio provides two types of data enrichment through machine learning: batch processing and online machine learning.
With batch processing, the system is trained on a set of data (you point it to folders containing the data) and analyzes that data to understand what the items have in common and what distinguishes them from other data. The system then builds a model and cross-validates that model against a subset of the data.
Batch processing takes the work of classifying data away from users, reducing the burden of classification and tagging during content creation. This process increases user productivity on the search side and improves the relevancy of search results.
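To make the train-then-cross-validate idea concrete, here is a minimal sketch in Python. It is not Attivio's implementation: the tiny Naive Bayes classifier, the function names, and the sample documents are all assumptions chosen for illustration, and a real system would read documents from folders rather than from an in-memory list.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled documents standing in for folders of already-tagged content.
SAMPLES = [
    ("invoice total amount due payment", "invoice"),
    ("agreement between parties term clause", "contract"),
    ("payment due invoice number amount", "invoice"),
    ("contract clause parties signature term", "contract"),
    ("amount payable invoice date due", "invoice"),
    ("parties agree clause termination contract", "contract"),
]

def tokenize(text):
    return text.lower().split()

def train(samples):
    """Build a multinomial Naive Bayes model from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the most likely label, using add-one smoothing for unseen words."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validate(samples, k=3):
    """Hold out each of k folds in turn and return the overall accuracy."""
    correct = 0
    for i in range(k):
        held_out = samples[i::k]
        training = [s for j, s in enumerate(samples) if j % k != i]
        model = train(training)
        correct += sum(predict(model, text) == label for text, label in held_out)
    return correct / len(samples)

if __name__ == "__main__":
    print("cross-validated accuracy:", cross_validate(SAMPLES))
    print(predict(train(SAMPLES), "invoice amount due"))
```

Each document is scored by a model that never saw it during training, which is the point of cross-validation: it estimates how well the model will tag new content rather than how well it memorized the samples.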
Online Machine Learning
Online machine learning is a learn-as-you-go approach to machine learning. With this approach, you don’t have a corpus of samples for the system to learn from, so the system learns as it’s used, detecting patterns and building the model on the fly.
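The learn-as-you-go idea can be sketched with a simple online perceptron that updates its weights one example at a time, with no stored training set. This is an illustrative assumption, not a description of Attivio's algorithm; the class name, the feature lists, and the "relevant" labels are hypothetical.

```python
class OnlinePerceptron:
    """A binary classifier that learns from one example at a time."""

    def __init__(self):
        self.weights = {}  # feature name -> weight

    def predict(self, features):
        score = sum(self.weights.get(f, 0.0) for f in features)
        return 1 if score > 0 else 0

    def learn(self, features, label):
        """Predict first, then adjust weights only if the prediction was wrong."""
        guess = self.predict(features)
        if guess != label:
            step = 1.0 if label == 1 else -1.0
            for f in features:
                self.weights[f] = self.weights.get(f, 0.0) + step
        return guess

# Hypothetical stream of events: (document keywords, 1 if the user found it relevant).
STREAM = [
    (["finance", "report"], 1),
    (["lunch", "menu"], 0),
    (["finance", "audit"], 1),
    (["menu", "party"], 0),
]

model = OnlinePerceptron()
for features, label in STREAM:
    model.learn(features, label)

if __name__ == "__main__":
    print(model.predict(["finance"]))  # uses what was learned from the stream so far
    print(model.predict(["menu"]))
```

The model is usable after every event and never revisits old examples, which is what separates this style from batch training.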
One type of online machine learning is outlier detection. Outliers are instances where something is different enough to raise attention - use of certain keywords, department activity, time of day, and so on. Outliers support risk and compliance applications by detecting suspicious or fraudulent behaviors.
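As a rough illustration of outlier detection in an online setting, the sketch below keeps a running mean and variance (Welford's method) and flags any value that lies more than a chosen number of standard deviations from what has been seen so far. The class, the threshold, the warm-up length, and the hourly activity counts are all assumptions made for this example; the source does not say which technique Attivio uses.

```python
import math

class RunningOutlierDetector:
    """Flags values far from the running mean, updating statistics as data arrives."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # how many standard deviations count as unusual
        self.warmup = warmup        # observations required before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def observe(self, x):
        """Return True if x is an outlier relative to earlier values, then record x."""
        is_outlier = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford's incremental update of the mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_outlier

# Hypothetical counts of document accesses per hour for one department.
ACTIVITY = [10, 12, 11, 9, 10, 11, 95, 10]

detector = RunningOutlierDetector()
flags = [detector.observe(count) for count in ACTIVITY]

if __name__ == "__main__":
    for count, flagged in zip(ACTIVITY, flags):
        print(count, "OUTLIER" if flagged else "")
```

In this sketch a flagged value is still folded into the running statistics; a risk or compliance application might prefer to exclude flagged values so that one spike does not widen what counts as normal.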
Building the Data Enrichment Model
Your organization handles a lot of data, and you need a way to organize it, index it and empower your employees to use it to make decisions.
Data enrichment is about letting the machine do the work that your employees would otherwise need to do. Maybe you have people whose job it is to tag every document produced, or maybe you require anyone who produces documents to tag their content with keywords and taxonomies.
This human-centered approach to classification and data enrichment may work for a little while. But your employees will complain about the time it takes, and they will complain about how some content is tagged incorrectly. After a while, things get lax, and employees aren’t as focused as they should be in their classification efforts. Tagging hygiene falls apart.
That doesn’t happen when you let machines do the work for you. Find out more about Attivio's approach to machine learning in this 5-Minute Guide to Machine Learning.