Have You Thanked Your Data Steward Today?
The other day I Googled “the problem with a modern data architecture.” Of course, at Attivio we’re big evangelists for the modern data architecture (MDA), but it’s always interesting to see what the contrarians have to say. There were over three million results, but none on the first two pages said a word about problems. Instead, there were lots of articles about how to develop an MDA, how to optimize an MDA, or why you have to have an MDA. You get the picture.
But I did find something that relates very much to the ecosystem of the modern data architecture and the role of Attivio’s Semantic Data Catalog in navigating it. An August post on Cloudera’s vision blog suggested six principles for the modern data architecture. Here’s number five:
Information Through Data Stewardship: Time and time again I’ve seen enterprises that have invested in a Hadoop data lake start to suffer when they allow self-serve data access to the raw data stored in these clusters. Without the proper data curation modeling of important relationships, cleansing of raw data, curation of key dimensions, and measures, end users can have a frustrating experience – vastly reducing the perceived and realized value of the underlying data.
Many CDOs will tell you that a complete data democracy is a pipe dream, a.k.a. data anarchy. IT will always play some intermediary (data steward) role between the data and business users. That role will vary, of course, depending on factors such as whether the company operates in a highly regulated industry.
A data catalog makes it much easier for IT to function in that intermediary capacity. With a data catalog, IT can quickly identify which data is worth curating and create data marts to be shared and open to self-service inquiry. In other words, a data catalog can enable data stewards to avoid becoming data dictators. For more on this topic, see our white paper: A Semantic Data Catalog Drives Agility in the Modern Data Architecture.