Semantic Data Catalog

Find, understand, and unify disparate data across all enterprise silos

Find and Understand Your Data

There’s more information scattered across today’s enterprise than any one person can grasp. With Attivio, you can effortlessly search all of your information, regardless of where it lives – in a data warehouse, a data lake, a file share, on third-party servers, or elsewhere. The Attivio Semantic Data Catalog provides immediate visibility into the right information with:

  • An intuitive, business-friendly natural language and keyword search
  • An easy eCommerce-like shopping cart for data
  • Recommendations of the best data for your context

Attivio searches a universal catalog of all enterprise information and exposes all the hidden relationships buried within, enabling you to instantly discover what’s most relevant.

Unify Disparate Data Sources

There are countless ways to combine data and a seemingly infinite number of relationships to detect. Attivio aggregates and correlates disparate data sources – structured data or unstructured content – and creates a unified, shareable data model.

The Attivio Semantic Data Catalog automates manual, time-consuming data unification processes by:

  • Inferring connections across disparate data sources
  • Automatically generating data models
  • Allowing the user to adapt the model for their context

Attivio gives you the power to understand, correlate, and model all structured, semi-structured, and unstructured sources to speed data discovery and accelerate analysis.

Take Action

There are different types of data consumers, with different requirements and priorities for data models. Attivio provisions the data you need, the way you need it, while maintaining security and control. With the Attivio Semantic Data Catalog, you can consume data models according to your function:

  • Data Analysis: Provision the data model directly to a BI or analytics tool in one click -- no need to write the SQL query.
  • Stewardship: Curate and share virtual data marts with the users and groups of users that need them.
  • Data Management: Gather and select the data for hot, medium and cold storage.

Attivio delivers data, quickly and easily, to the users who need it.

The Power of Natural Language Search

When data scientists, analysts, or line-of-business knowledge workers search for data, they may not know exactly what they’re looking for or the extent of the data that exists. With the Semantic Data Catalog, they can use plain language to search for data sources. Attivio searches the titles, descriptions, values, and metadata generated from the organization’s data sources.

  • Recommendations: uses relationships between data sets to automatically identify the best data sources for an analysis.
  • Semantic search: uses a variety of signals to understand the user’s intent and handle ambiguity. For example, semantic search understands that a user who searches for “profit” would also want to find data sets that reference “net income.” Similarly, a search for “NJ” would also return entries for “New Jersey.”
  • Autocomplete: displays relevant and popular search suggestions as users type, saving time and frustration. Attivio bases suggestions on user activity and data analysis, making the autocomplete suggestions highly targeted.
  • Spelling correction: increases the speed and efficiency of search. When an exact match isn’t available, Attivio identifies the closest logical alternative.
  • Lemmatization: matches variations on words such as plurals, tenses, genders, and hyphenated forms. For example, a search for “running” would also return matches for “runs” or “ran.”
  • Faceted search: groups items returned from a query into the most relevant subcategories. Users can refine their search by drilling down into a particular group or facet.
  • Advanced syntax: increases precision through techniques such as phrase search, fielded search, Boolean matching, and proximity search.
  • Fuzzy matching: increases recall by allowing looser matching. Substring and approximate matches let users find data sources when they have only partial or even incorrect information.
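
To make two of the search features above concrete, here is a minimal Python sketch of synonym expansion (one ingredient of semantic search) and fuzzy matching. It is an illustration only, not Attivio's implementation: the synonym table, the catalog entries, and the function names are all hypothetical, and the fuzzy matcher is the standard library's difflib rather than anything the product is known to use.

```python
from difflib import get_close_matches

# Hypothetical synonym table; a real system would derive these from many signals.
SYNONYMS = {
    "profit": ["net income"],
    "nj": ["new jersey"],
}

# Hypothetical catalog: data source title -> description text.
CATALOG = {
    "quarterly_financials": "net income and revenue by quarter",
    "store_locations": "store addresses in new jersey and new york",
    "customer_orders": "orders placed by customers",
}


def expand_query(query):
    """Return the normalized query term plus any known synonyms."""
    term = query.lower().strip()
    return [term] + SYNONYMS.get(term, [])


def search(query):
    """Return titles whose description contains the query or one of its synonyms."""
    terms = expand_query(query)
    return sorted(
        title
        for title, text in CATALOG.items()
        if any(term in text for term in terms)
    )


def suggest_title(partial, cutoff=0.6):
    """Fuzzy-match a possibly misspelled title against the catalog titles."""
    return get_close_matches(partial, list(CATALOG), n=1, cutoff=cutoff)
```

With this toy catalog, search("profit") finds the financial data set through its "net income" description, and suggest_title("custmer_orders") recovers the correctly spelled title.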

Graph Analysis

Attivio builds a graph of the fields and columns in data sets and evaluates how they are related based on factors including overlapping values, similarities in column type and naming, and co-occurrence of related entities.
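
As a rough illustration of the "overlapping values" signal, the sketch below scores every pair of columns by the Jaccard similarity of their values and keeps the pairs above a threshold as graph edges. The column names, the threshold, and the function names are assumptions made for the example; the other signals mentioned above (column type, naming, and entity co-occurrence) are left out.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two value collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_column_graph(columns, threshold=0.5):
    """Return edges (col1, col2, score) for column pairs with enough overlap.

    columns maps a hypothetical 'table.column' name to its list of values.
    """
    edges = []
    for (name1, vals1), (name2, vals2) in combinations(sorted(columns.items()), 2):
        score = jaccard(vals1, vals2)
        if score >= threshold:
            edges.append((name1, name2, score))
    return edges


# Hypothetical sample data.
columns = {
    "customers.id": [1, 2, 3, 4],
    "orders.customer_id": [1, 2, 2, 3],
    "orders.amount": [10.0, 25.5, 40.0, 12.0],
}
edges = build_column_graph(columns)
```

Here the only edge produced links customers.id to orders.customer_id, because three of the four distinct values are shared.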

Attivio produces the graph using probabilistic data structures and infers foreign and primary keys by analyzing the features of the columns and looking at intersections, column types, and data cardinality. Attivio graphs relationships across databases and data types. For example, Attivio can find references to locations and customers in unstructured content and graph their relationship to structured data sources such as customer orders.
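
The key-inference idea can be sketched in a few lines of Python. This is a simplified stand-in, not Attivio's algorithm: it uses exact sets where the product is described as using probabilistic data structures, and the table layout, the containment threshold, and all function names are hypothetical. A column is treated as a primary key candidate when its values are unique and non-null (cardinality), and as a foreign key candidate when its type matches and its values are contained in such a key (intersection).

```python
def is_primary_key_candidate(values):
    """True if the column is non-empty, has no nulls, and every value is unique."""
    return len(values) > 0 and None not in values and len(set(values)) == len(values)


def column_type(values):
    """Return the single Python type shared by all non-null values, else None."""
    types = {type(v) for v in values if v is not None}
    return types.pop() if len(types) == 1 else None


def infer_foreign_keys(tables, min_containment=0.9):
    """Return (foreign_key, primary_key, containment) links across tables.

    tables maps table name -> {column name -> list of values}.
    """
    pk_candidates = [
        (table, col, vals)
        for table, cols in tables.items()
        for col, vals in cols.items()
        if is_primary_key_candidate(vals)
    ]
    links = []
    for table, cols in tables.items():
        for col, vals in cols.items():
            non_null = {v for v in vals if v is not None}
            if not non_null:
                continue
            for pk_table, pk_col, pk_vals in pk_candidates:
                if pk_table == table:
                    continue  # this sketch only looks across tables
                if column_type(vals) != column_type(pk_vals):
                    continue  # incompatible column types
                containment = len(non_null & set(pk_vals)) / len(non_null)
                if containment >= min_containment:
                    links.append((f"{table}.{col}", f"{pk_table}.{pk_col}", containment))
    return links


# Hypothetical sample tables.
tables = {
    "customers": {"id": [1, 2, 3, 4], "name": ["Ann", "Bo", "Cy", "Di"]},
    "orders": {"order_id": [100, 101, 102], "customer_id": [1, 2, 2]},
}
links = infer_foreign_keys(tables)
```

On this sample, the only inferred link is from orders.customer_id to customers.id, since every order's customer value appears in the unique id column.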

Model Management

While Attivio’s graph analysis and dynamic modeling produce an optimal model, the user may have other ideas about how the data should be put together. For this reason, Attivio provides a user interface that makes it simple to edit the model. The user interface exposes all connections between data elements and allows even non-technical users to change the model by editing and removing links and defining new ones.


The final step provisions data for data discovery and advanced analytic tools, or for search-based unified information applications. Attivio’s SQL interface can provision data in a flattened data structure or as relational tables, whichever is required. Nearly all analytic tools “speak” SQL. It’s the lingua franca for interrogating, exchanging, and understanding data. Attivio generates the necessary SQL statements, with no coding required.
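
To give a sense of what generating SQL from a data model can look like, here is a small Python sketch that turns a list of model links into a JOIN query and runs it against an in-memory SQLite database. The function names, the link format, and the sample tables are hypothetical and do not represent Attivio's SQL interface. Identifiers are interpolated directly into the statement, so the sketch assumes table and column names come from a trusted model rather than from user input.

```python
import sqlite3


def build_join_sql(base_table, links, columns):
    """Build a SELECT with one JOIN per model link.

    links is a list of (left_table, left_col, right_table, right_col) tuples.
    Assumes all identifiers are trusted; nothing is escaped here.
    """
    lines = [f"SELECT {', '.join(columns)}", f"FROM {base_table}"]
    for left_table, left_col, right_table, right_col in links:
        lines.append(
            f"JOIN {right_table} ON {left_table}.{left_col} = {right_table}.{right_col}"
        )
    return "\n".join(lines)


def demo():
    """Run the generated query against hypothetical sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bo');
        INSERT INTO orders VALUES (100, 1, 25.0), (101, 2, 40.0), (102, 1, 10.0);
        """
    )
    sql = build_join_sql(
        "orders",
        [("orders", "customer_id", "customers", "id")],
        ["customers.name", "orders.amount"],
    )
    rows = conn.execute(sql + "\nORDER BY orders.order_id").fetchall()
    conn.close()
    return rows
```

The result is a flattened view: one row per order, with the customer name pulled in through the link.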

Because Attivio puts structure around unstructured data, any analytics tool that looks at an Attivio data model will see a database, even if the model contains a combination of structured, semi-structured, and unstructured data. For data not indexed in the Attivio engine, Attivio can federate the query to the source systems.

Dynamic Modeling

Once data sources are selected, Attivio evaluates the relationships between them and identifies missing links through sources not currently included in the set. Functioning like a GPS for data, Attivio finds the shortest and best path to connect all the data in a group. Attivio selects subsets of the graph and generates the optimal relationship tree for the sources in the graph.
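
The GPS analogy can be illustrated with a breadth-first search over a relationship graph. The sketch below is a simple heuristic under stated assumptions, not Attivio's method: it connects each selected source to the first one by a shortest path and reports any bridging sources that were not in the original selection. The graph, the source names, and the function names are hypothetical, and the union of shortest paths is not guaranteed to be the optimal tree.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor in seen:
                continue
            if neighbor == goal:
                return path + [neighbor]
            seen.add(neighbor)
            queue.append(path + [neighbor])
    return None


def connect_sources(graph, selected):
    """Connect the selected sources; return (edges, bridging sources).

    Bridging sources are nodes needed for the connection that the user
    did not select.
    """
    root, *others = selected
    edges = set()
    for target in others:
        path = shortest_path(graph, root, target)
        if path is None:
            raise ValueError(f"no path between {root} and {target}")
        edges.update(tuple(sorted(pair)) for pair in zip(path, path[1:]))
    nodes = {node for edge in edges for node in edge}
    return sorted(edges), sorted(nodes - set(selected))


# Hypothetical relationship graph as an undirected adjacency mapping.
graph = {
    "store_sales": {"stores"},
    "stores": {"store_sales", "regions"},
    "regions": {"stores", "marketing_spend"},
    "marketing_spend": {"regions"},
}
edges, bridging = connect_sources(graph, ["store_sales", "regions"])
```

In this example the two selected sources have no direct relationship, so the sketch reports "stores" as the missing link that connects them.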

Security and Governance

When crawling the data sources to build the catalog, Attivio maintains all existing security and access controls. For example, when a user searches the data catalog for “Customer X,” only the results that are accessible to that user will be visible. If that user does not have access to the ERP system, those results will not appear in the list.
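
The behavior in this example is often called security trimming. A minimal Python sketch of the idea follows, assuming each catalog entry carries the set of groups allowed to see it; the CatalogEntry type, the group names, and the function name are hypothetical and are not taken from Attivio's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """A hypothetical search result with the access list inherited from its source."""
    title: str
    source_system: str
    allowed_groups: frozenset


def visible_results(results, user_groups):
    """Keep only the entries that share at least one group with the user."""
    user_groups = set(user_groups)
    return [entry for entry in results if entry.allowed_groups & user_groups]


# Hypothetical results for a "Customer X" search.
results = [
    CatalogEntry("Customer X orders", "ERP", frozenset({"finance"})),
    CatalogEntry("Customer X support tickets", "CRM", frozenset({"support", "sales"})),
]
sales_view = visible_results(results, ["sales"])
```

A user who belongs only to the sales group sees the CRM entry, while the ERP entry is filtered out before the list is displayed.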

With a federated virtual catalog that enables both control and collaboration, Attivio adapts to governance processes. Automated and manual tagging of data sources and data marts in the catalog provides a flexible means for stewardship and controlled collaboration.

Case Study: Unifying Data for Agile Analytics

Global Partners LP is a Fortune 500 midstream logistics company and one of the largest independent owners, suppliers and operators of gasoline stations and convenience stores in the Northeast.

With the Attivio Semantic Data Catalog, Global unifies a range of disparate data sources, including store sales, accounting and other internal and external data sources.

With this integrated view, marketing and other departments have easy access to the data they need for building insights and uncovering revenue opportunities.

Try Attivio's Semantic Data Catalog Solution Free for 30 Days

Install it in your own environment and use your own data (or use our hosted environment) to see how we can help you break the bottleneck between data sources and data consumers. Installation requires a Windows or Linux server with 16 GB of memory.
