Welcome to Attivio's Unified Information Access Blog. Join us for discussions on topics ranging from enterprise search solutions, information access insights, Agile software development methodology to programming with Java. We hope you'll find the articles informative and participate in the discussions by leaving a comment.
If you believe that, then I have a bridge to sell you
The media and blogosphere have been consumed by the troubled rollout of the Healthcare.gov health exchange website, with a long list of management and technology failures adding fuel to the political fires. The latest finger-pointing in this seemingly endless saga blames the selection of MarkLogic's NoSQL database as a key factor in the site's ongoing problems.
According to GigaOm, "...the issue seems to be that it's harder to find database admins and other techies that know NoSQL databases inside and out whereas there's a ton of existing expertise in SQL databases from Oracle, Microsoft and IBM."
The NY Times put it even more bluntly: "Another sore point was the Medicare agency's decision to use database software, from a company called MarkLogic, that managed the data differently from systems by companies like IBM, Microsoft and Oracle."
The implication here is that the site's problems would have somehow been avoided, if only a more traditional database had been used instead.
While I have great respect for both of these publications, does anyone really believe that selecting IBM, Microsoft or Oracle would have been a better solution for a project that involves many disparate sources of information, complex logic and a dynamic interface, and that must be built in a very short timeframe? The idea that legacy mega-vendors have the agility required for a project of this scope is absurd, as the states of Oregon and Pennsylvania and the US Air Force have all recently learned the hard way.
Let's take a look at the real issues at play here.
Selecting a NoSQL database like MarkLogic, or more precisely in this case, an XML database, means that all of the Healthcare.gov data sources would have to be converted to XML. Of course that's a monumental task, but it's no more difficult or time-consuming than the arduous extract, transform and load (ETL) processes that traditional relational databases require because of their fixed schemas. The enormous time and cost associated with ETL are precisely why new technologies are emerging.
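To make the XML conversion point concrete, here is a minimal sketch in Python using only the standard library. The record fields and the `record_to_xml` helper are hypothetical illustrations, not anything taken from Healthcare.gov or MarkLogic. The sketch shows only that two sources with different fields can each become an XML document without first being forced into one shared table schema.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record, root_tag="record"):
    """Serialize a flat dict as an XML string, one child element per field."""
    root = ET.Element(root_tag)
    for field, value in record.items():
        child = ET.SubElement(root, field)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Two hypothetical sources with different fields (assumed for illustration).
applicant = {"name": "Pat Example", "state": "OR"}
plan = {"plan_id": "P-100", "tier": "silver", "premium": 312.5}

applicant_xml = record_to_xml(applicant, "applicant")
plan_xml = record_to_xml(plan, "plan")

print(applicant_xml)
print(plan_xml)
```

A real conversion would also have to handle nested structures, character encoding and validation, which is where most of the effort described above would go.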
The only real advantage the legacy vendors hold over XML technology is that almost every developer already knows SQL, whereas the XQuery interface used with XML is known for its steep learning curve. But there are newer, more agile options available that fully support SQL without requiring the months or sometimes even years of manual effort it takes to complete complex legacy ETL processes.
The bottom line is that choosing the "same old, same old" legacy relational database technology is by no means the safest path. Success can only be realized by deeply analyzing the particular problem you are trying to solve, understanding the nature of all the information that is needed to solve it, evaluating the skills of the people involved, and exploring newer, more innovative approaches that might offer a more direct path. Then – and only then – can you be sure to select the technology that best meets your requirements.
Every organization today has the potential to achieve new business success using breakthrough Big Data insights. This golden opportunity was actually predicted back in 1998* by Peter Drucker, the father of modern business management.
Peter Drucker correctly foresaw that the next business revolution would redefine the very meaning and purpose of information. And while he did not refer to "Big Data" by that name, he also clearly envisioned the power of big data variety, stating that leveraging diverse information sources would be critical to achieving new business success.
We are in the midst of the Big Data information revolution today, but how did we get here? Back in 1950, the computer was the new "miracle" that Drucker and his colleagues expected would revolutionize business strategy and decision-making for top management. "We could not have been more wrong," Drucker later wrote. Instead, computers and information technology revolutionized operational tasks – particularly accounting – with "near-zero impact on the management of business itself."
Managing operational tasks has remained the primary focus of IT in the five decades since. As a result, said Drucker, modern-day IT can efficiently manage a company's worldwide manufacturing and service operations, but has had far less impact on such decisions as what markets to enter or what existing or new products to offer. Peter Drucker foresaw that this was all about to change:
For top management, information technology [since 1950] has been a producer of data [for operational tasks]... Business success is based on something totally different: the creation of value and wealth.
This requires risk-taking decisions... on business strategy, on abandoning the old and innovating the new... the balance between the short term and the long term... between immediate profitability and market share. These decisions are the true top management tasks.
And in one enterprise after another, [business leaders] have begun to ask, "What information concepts do we need for our tasks?"
Drucker went on to suggest some answers to this key question, which sound remarkably similar to today's growing calls to focus Big Data initiatives on enabling Big Data variety:
[The most important] task in developing an effective information system for top management: the collection and organization of outside-focused information... customers and non-customers; competitors and non-competitors; markets... demographics... the share of income that customers spend [within] their industry... The more inside information top management gets, the more it will need to balance it with outside information – and that does not exist yet. Within the next 10 to 15 years, developing this data is going to be the next information frontier.
Thankfully, the enabling technology Drucker envisioned now exists: unified information access technology, led by Attivio's Active Intelligence Engine (AIE). By freely integrating and joining insights drawn from a wide variety of diverse data and text-based information, organizations are transforming silos of information into what Drucker called "a producer of new and different questions and new and different strategies."
[We have been] working with Attivio since 2008. [Attivio's] unstructured JOIN capability opens amazing possibilities. We are now building an information warehouse for an organization with many thousands of dedicated information researchers. This warehouse will integrate more than ten structured and unstructured systems into a single query screen that can combine any information type. This system will contain hundreds of millions of documents and structured records.
The relational SQL paradigm is so deep in our thinking, that we "see" information in terms of tables and columns. Once I stopped thinking in tables and started thinking in flexible documents and [Attivio's ad hoc JOIN] capability, information silos [simply became] integrated.
And then there will be organizations – hopefully yours – that successfully seize today's Big Data golden opportunity as Drucker envisioned. If you are eager to put all of your information sources to work to achieve game-changing business insights, unified information access is ready to get you there.
*) All direct quotes from The Next Information Revolution by Peter Drucker, published in Forbes ASAP (24 August 1998; emphasis added).
To successfully achieve the promised rewards of Big Data, you must rethink the meaning and purpose of information, as Peter Drucker called for. You must apply a wide variety of information, inside and outside your organization, to enable new insights focused on new business value creation. This white paper will brief you on the new technology essential to realizing that value: unified information access.
Big Data can transform business thinking, if the business transforms how it thinks about Big Data.
That might sound like a Zen koan, but it's the key to gaining breakthrough insights: You must look beyond the limitation of what you think can be accomplished – and instead think about and ask for what you wish you could derive from the data you have available.
And yet, many organizations surprisingly fail to apply such fresh thinking to their Big Data initiatives – and end up suffering serious project failures.
There are three primary areas of misguided thinking – "Big Data blunders", if you will – which you must dispel in order to make new business insights a reality. Left unchallenged, these blunders will lead directly to ill-advised initiatives that will fail to deliver meaningful business value:
Blunder #1: Reacting from a "FOMO" perspective. Fueled by a Fear Of Missing Out, many organizations dived headfirst into Big Data infrastructure projects so they wouldn't "fall behind." One survey reported in MIT Sloan Management Review noted the soaring popularity of Big Data led some executive committees at large companies to issue mandates to managers along the lines of, "we don't know what this big data thing is, but we better be attacking it immediately."
Such knee-jerk reactions have resulted in boil-the-ocean projects like blindly building out Hadoop clusters vaguely targeted to take 12 to 24 months – with NO thought invested in actual use cases explaining how that will help boost revenue, cost savings or competitiveness! FOMO-driven decision making is clearly a one-way ticket to Big Data failure.
Blunder #2: Focusing primarily on volume. My colleague Randy McLaughlin recently observed that the term "Big Data" has so many competing definitions that they limit its usefulness. Early definitions, for example, equated "Big" with "volume." This definition, incomplete at best, still persists; many still mistakenly think of Big Data as synonymous with Hadoop.
That's a problem, because focusing heavily on volume will lead to "big mistakes." That's the warning from a recent Harvard Business Review blog article: Does Bigger Data Lead to Better Decisions? The authors cite long-standing research that shows decision makers will often selectively use and interpret information for self-enhancement or to confirm existing beliefs. Existing sacred cows of conventional corporate wisdom are unlikely to be challenged by merely pumping up data volume.
Perhaps that's the primary reason why, to quote BusinessWeek, "of the countless companies trying to leverage vast amounts of data, only a few have been truly successful" (emphasis added). The solution to this issue is not to "re-engineer decision making processes" as that article suggested, but rather, re-engineer the organization's strategy – away from volume as the primary technology focus and towards managing variety!
Blunder #3: Failing to focus primarily on the variety of information. The HBR article authors also noted that "big volume" is actually old; financial services firms have had big volume for decades. What's really new today is the variety of information sources, which enables new business insights.
The article points out that diverse business teams are more creative than homogeneous groups; diverse data merged together confers similar benefits. "So perhaps we shouldn't be talking about Big [volume] making decisions better, but about Diverse Data connecting the dots using new technologies, processes, and skills." And connecting those dots is most rapidly accomplished through a unified information access platform.
Imagine, for example, integrating, correlating and analyzing transactional databases together with customer likes and dislikes expressed on social media, websites, email, IM chats and call center notes. The result: a true 360-degree view of the customer that delivers a new level of actionable customer insights to reduce customer churn while maximizing customer service, loyalty, and successful up-selling and cross-selling. That's the business-transforming power of Big Data variety.
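As a toy illustration of that kind of correlation, the Python sketch below merges structured order rows with free-text notes by customer ID. Everything in it is an assumption made for the example: the field names, the sample data, the `customer_view` helper and the crude "cancel" keyword check. It does not represent Attivio's AIE or any real product API.

```python
from collections import defaultdict

# Hypothetical structured rows, as they might come from a transactional database.
orders = [
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "c1", "amount": 35.5},
    {"customer_id": "c2", "amount": 60.0},
]

# Hypothetical unstructured text from call center notes and social media.
notes = [
    {"customer_id": "c1", "source": "call_center", "text": "Asked how to cancel the plan."},
    {"customer_id": "c2", "source": "social", "text": "Love the new mobile app!"},
]


def customer_view(orders, notes):
    """Combine spend totals and text mentions into one record per customer."""
    view = defaultdict(lambda: {"total_spend": 0.0, "mentions": [], "churn_risk": False})
    for order in orders:
        view[order["customer_id"]]["total_spend"] += order["amount"]
    for note in notes:
        entry = view[note["customer_id"]]
        entry["mentions"].append((note["source"], note["text"]))
        # Crude keyword check standing in for real text analytics.
        if "cancel" in note["text"].lower():
            entry["churn_risk"] = True
    return dict(view)


view = customer_view(orders, notes)
for customer_id, record in sorted(view.items()):
    print(customer_id, record["total_spend"], record["churn_risk"])
```

The point of the sketch is only that the combined record answers a question neither source answers alone, such as which high-spending customers are talking about leaving.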
It's important to note that evidence is mounting that organizations are starting to "get" that the real game-changing payoffs will be realized through successfully managing information variety. For example, the Big Data survey I mentioned earlier also found the large corporations surveyed were all about "managing the variety of data and… all about integrating information from diverse sources… Across the board, that was really the primary focus of how firms wanted to use big data, and that included incorporating unstructured data."
So, if your organization is not yet exploring managing variety as your primary Big Data business value driver and primary technology focus, make it a priority to do so now – before your competition does.
For your organization to successfully utilize Big Data to drive new business success, your Big Data strategy must focus on achieving well-defined business goals. Doing so will reveal that the top priority for your Big Data technology infrastructure is managing Big Data variety – which is best enabled using a true unified information access platform.