I chose to start with W. Edwards Deming’s famous quote since this post is all about data: why we need it and why we don’t have enough of it. That may seem surprising, since all anyone talks about these days is the bigdata we are swimming in. However, there is data and there is data.
To steal a second famous quote, this time from Thomas Edison, “Genius is 1% inspiration and 99% perspiration” accurately describes bigdata analytics: a whole lot of messing with data, interspersed with a bit of data science. In fact we could honestly say “Bigdata analytics is 1% data science and 99% data wrangling”. Wrangling? According to Oxford, to wrangle is “to have a long and complicated dispute” or “to round up, herd, or take charge of”, and for anyone in the data analytics space this perfectly describes our love-hate relationship with data. Before we can even start with the fun data science stuff we argue about its semantics, its accuracy, and its provenance; we gather it and try (often unsuccessfully) to take charge of it. It’s a perfect analogy, although with data it’s more like herding cats than cattle.
But I digress. The point I’m trying to make, albeit circuitously, is that making data accessible and usable for analytics should be the #1 priority for any application that wants to join the bigdata revolution. If we believe that “Data is the new Oil” then we need to do a much better job of getting it out of the ground so we can refine it and start using it to run our businesses.
If we look at traditional business intelligence, which analyzes and reports on systems of record (structured or transactional data), application developers have gotten really good at ensuring their solutions collect, persist, and secure this data and provide comprehensive data APIs for reporting purposes. They’ve also made fairly decent inroads at ensuring data interoperability so that data from different applications can be integrated and analyzed, for example by connecting product inventory, sales contracts, financial transactions, and warranty records. They get that their data APIs are as important a feature as anything an end user sees on the screen.
However, when it comes to systems of engagement (social, collaboration, or communication systems) it’s the wild wild west. The #1 question customers ask me is “How do I measure the impact of engagement on business outcome?”, to which I ask “Do you capture any engagement data associated with your business outcomes?”, to which they invariably answer “No”. Yet companies are using dozens of applications to facilitate engagement between their employees and with their customers, and those applications must be storing this engagement data in some fashion. So there is something amiss. In most cases it’s a lack of published Data APIs that make it easy for users to extract engagement data from these systems. Even social networking platforms, which are designed for bigdata analytics, are often sloppy about the Data APIs they provide; sometimes because they don’t want to share the data, and other times because they don’t recognize the valuable metadata they are sitting on. There has been an overfocus on Content APIs over the last few years, with everyone gone mad on Sentiment Analysis, and Data APIs have suffered as a result.
Since this is becoming a bit of a ramble, I’m going to get to the punchline and end with a Challenge to application developers.
Engagement data is a totally untapped goldmine that will allow businesses to answer a whole set of previously unasked questions about how they operate (how vs. what?). However, users, application developers, and data scientists need simple ways to get access to this data. Any application that collects engagement data of any description (it may be telephone calls, email exchanges, instant messages, blog comments, status updates, file shares, etc.) has the potential to contribute to this new generation of analytics applications.
You could be the coolest kids in school. Just show us what you’ve got!