Search engines index millions of pieces of information, structured and unstructured. But simply indexing information isn’t enough to give a user the results they need when they perform a search.
The Need for Relevancy
The goal of relevancy tuning is to help a user get the best results for a given query they are trying to run. Relevance is telling the search engine how to best sort the information in its index to ensure search results match search queries as closely as possible. It’s the process of bringing the most relevant information to the top of the result list.
In the mid twentieth century, author (and librarian) Jorge Luis Borge published a short story called The Library of Babel. The story describes a huge library of many hexagonal rooms, each of which is filled with books. The problem was the books seemed to be nothing but pages of random characters. Readers could make no sense of it.
Many large organizations find themselves in a similar predicament with their digital archives. They have plenty of valuable information. But how to unlock its meaning, relevance, and insight?
Good question. What is the point? The point is to create measurable business value from enterprise data. Of course, before measurable business value comes insight. The Modern Data Architecture (MDA) recognizes that insight can lie hidden in data of all types—structured or unstructured, messy or modeled, historical or realtime.
The other day I Googled, “the problem with a modern data architecture.” Of course, at Attivio we’re big evangelists for an MDA, but it’s always interesting to see what the contrarians have to say. There were over three million returns, but none on the first two pages said a word about problems. Lots of articles about how to develop an MDA or how to optimize an MDA or why you had to have an MDA. You get the picture.
As I’ve mentioned in prior blogs, the biggest use cases we see in Hadoop these days come from the risk and compliance functions of large banks. Initially, many banks and other financial services institutions (FSIs) adopted Hadoop out of sheer necessity despite its early immaturity on the governance front.