From Marriott to the Democratic National Committee to Yahoo!, significant data breaches have become practically normalized. This is, of course, nonsense: privacy is a fundamental right, and the fact that major organizations cannot guarantee your digital privacy is an enormous problem. Compromised email addresses and passwords are one thing, but the recent breach of data and analytics company Ascension’s Elasticsearch-based database spilled more than 24 million banking and financial documents onto the web for an all-you-can-steal buffet. For companies, this should be a giant flashing red light that says, “Ensure your security is up to snuff.”
Some time ago, people looking for answers to business problems realized that the information they sought resided in different places. It could be in a file system, on an intranet, on the web, or in a proprietary database tied to a specific line-of-business application. What could be done to make sure employees and customers could search once and get answers back from every source? The initial answer was federated search, which submits the query to multiple repositories on the user’s behalf and returns the results in a list, sometimes consolidated, often not.
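The federated search pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the repositories are hypothetical in-memory lists, and the names `REPOSITORIES`, `search_repository`, and `federated_search` are invented for this example rather than taken from any product.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical stand-ins for real repositories (file system, intranet,
# line-of-business database). Each one holds a few document titles.
REPOSITORIES: Dict[str, List[str]] = {
    "file_system": ["Q3 budget.xlsx", "travel policy.docx"],
    "intranet": ["Travel policy FAQ", "Holiday calendar"],
    "crm_database": ["Acme Corp travel contract"],
}


def search_repository(name: str, query: str) -> List[dict]:
    """Return the documents in one repository whose title contains the query."""
    q = query.lower()
    return [
        {"source": name, "title": title}
        for title in REPOSITORIES[name]
        if q in title.lower()
    ]


def federated_search(query: str, consolidate: bool = True):
    """Submit the same query to every repository on the user's behalf."""
    names = list(REPOSITORIES)
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of `names`.
        per_source = list(pool.map(lambda n: search_repository(n, query), names))
    if consolidate:
        # One flat list, in repository order.
        return [hit for hits in per_source for hit in hits]
    # Otherwise keep a separate result list per repository.
    return dict(zip(names, per_source))


results = federated_search("travel")
for hit in results:
    print(hit["source"], "->", hit["title"])
```

Note that the consolidated list is simply concatenated in repository order. Each source ranks its own results, so there is no shared relevance score to sort by, which is one reason federated results are often left unconsolidated.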
One of the biggest changes in the upcoming Sherlock release of the Attivio Platform is the move to a Hadoop-based architecture for big data applications. Hadoop provides us with a number of low-level capabilities that we no longer have to manage ourselves at this scale, such as resource management (YARN), coordination (ZooKeeper), storage (HDFS), and bookkeeping (HBase). One of the side benefits of YARN-based resource allocation is the ability to scale your system’s footprint up or down with a few simple YARN commands. At Attivio, we’ve taken this ability and used it to scale our index up or down, for total flexibility when it comes to performance, cost, or index management.
One of the foundational technology differentiators of the Attivio Platform is the ability to perform Query Time Joins across both structured and unstructured data. Last year, we received our latest patent on an extension of that technology called a “Composite Join,” and it has enabled us to deliver some awesome solutions for our customers.
The Query Time Join
Before we get into composite join, let’s take a step back. The concept of a join between two tables is well understood in the realm of databases. For example:
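As a minimal sketch of a conventional join between two tables, the snippet below uses Python’s built-in `sqlite3` module. The `customers` and `orders` tables and their columns are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

# In-memory database with two hypothetical tables related by customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# Join each order to the customer who placed it.
rows = conn.execute("""
    SELECT c.name, o.id, o.total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

for name, order_id, total in rows:
    print(name, order_id, total)

conn.close()
```

Each row in the result combines columns from both tables, matched on the shared key.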
Policies are rarely something that gets people excited, but when it comes to the enterprise, they are the foundation for every risk and compliance solution. More importantly, regulators around the world and in every industry rely on, and in many cases require, corporations to maintain and enforce policies. Policies are what keep your data private, ensure a fair playing field, and generally keep the world a safe place.
Attivio has been at the forefront of secure search-based applications for the last eight years. Using our patented query-time join capabilities, we are able to store security information in the index separately from the content and use it to apply security filters automatically, ensuring that end users see only the most relevant content they are allowed to see. This allows us to automatically preserve security permissions from sources such as SharePoint, Jive, Confluence, and other content repositories.
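The general idea of keeping security records apart from content and joining them at query time can be illustrated with a small Python sketch. This is not Attivio’s implementation. The data, the `CONTENT` and `ACLS` structures, and the `secure_search` function are all assumptions made up for this example.

```python
from typing import Dict, List, Set

# Content records and security records are stored separately and related
# only by document id (all values here are hypothetical).
CONTENT: List[dict] = [
    {"id": "d1", "title": "Quarterly sales report"},
    {"id": "d2", "title": "Sales team offsite agenda"},
    {"id": "d3", "title": "Sales compensation plan"},
]
ACLS: Dict[str, Set[str]] = {
    "d1": {"group:finance", "group:sales"},
    "d2": {"group:sales"},
    "d3": {"group:hr"},
}


def secure_search(query: str, principals: Set[str]) -> List[dict]:
    """Match documents, then keep only those the caller is allowed to see."""
    q = query.lower()
    hits = [doc for doc in CONTENT if q in doc["title"].lower()]
    # Join each hit to its ACL record at query time. A document with no
    # ACL record is treated as not visible to anyone.
    return [doc for doc in hits if ACLS.get(doc["id"], set()) & principals]


visible = secure_search("sales", {"user:dana", "group:sales"})
for doc in visible:
    print(doc["id"], doc["title"])
```

Because the permissions live in their own records, a change to who may see a document only touches the ACL entry; the content record does not need to be re-indexed.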
Over at least the last decade, we’ve seen a steady rise in the demand for self-service BI and analytical tools. More and more organizations realize the business value of their data for growing revenue, acquiring and retaining customers, streamlining operations, and lowering costs.
Attivio’s latest patent covers one of the most interesting features in the Semantic Data Catalog and the one that always gets wows when we demo it – the automatic join finder. Under the hood, the technology replaces manual processes that could take hours or days with a quick, easy process that takes minutes.