Once Chief Data Officers have identified, confronted, and hopefully overcome the challenges at the bottom and in the middle of the Big Data stack, what's next? As Andrew Brust of Datameer notes, "At the top of the stack, there are seemingly endless choices. Whether Enterprise BI stalwarts, BI 2.0 challengers, or big data analytics players, the number of vendors and their similar positioning makes it really hard for customers."
If your organization is going to win on analytics, it needs to view all of its information as a strategic enterprise asset. This includes not just the 10% you know about, but the 90% of dark data that hides in information silos. There are big challenges on the path to surfacing all of your enterprise information for business intelligence. The biggest challenge is not in storing data, or even in analyzing it, but in actually finding the right data. Why is it so hard? Here are the top three reasons:
Forrester just released its latest Wave report. Unlike many Wave reports on more mature technologies, this report on native Hadoop BI platforms included only six vendors, of which Attivio was one. In Forrester's estimation, there are no leaders in the market yet, just contenders and strong performers.
As Dan Woods points out in a recent article for Forbes, technology marketplaces cycle through predictable stages as they mature. He applies this insight to the component versus platform decision that organizations face when adopting new technologies.
As any developer knows, perfect software doesn't just happen; it, pardon the pun, "develops" over time. Developers engage in a seemingly everlasting iterative process of bug fixes and changes that can last for the lifetime of an application. But writing the software is only half the battle; it must then be deployed.
For big data companies like ours that run software across distributed networks, this is no small task. The cycle is continuous: a developer makes changes, runs tests, identifies errors or processing improvements to address, and then makes more changes.
I recently attended Hadoop Summit 2016, where, not surprisingly, there was a lot of conversation about topics other than Hadoop itself, for example, the importance of ecosystem partners to any Big Data solution.
It was a great conversation. Carey pointed out that although data scientists do spend a lot of time on analytics, they also spend just as much or more time "wrangling" their data environments and trying to find data and move it where they need it. And that's why EMC turned to Attivio and Zaloni. Check out the rest of the discussion.
When it comes to gathering the right data and finding the relationships that make that data more meaningful, there's one role that knows how to do it best: the data steward. That's why data stewards are often referred to as data detectives.
The Trustee of an Organization’s Data
Data stewardship is an important role in any organization. The data steward is a trustee of the organization's data. They don't own the data, but with so many internal and external data sources available for use, the data steward's responsibility is to understand what is available and how it connects to provide real value.
Business analysts and line-of-business (LOB) data users have plenty of robust, self-service BI tools at their disposal. What they often lack is a way to get all the most relevant data into those tools. In a TDWI Checklist Report, Dave Stodder, Director of TDWI Research for Business Intelligence, lists seven best practices for executing a successful data science strategy. Number five: Give Data Science Teams Access to All the Data.