If I’ve learnt anything about analytics over the last decade, it’s that your analysis is only as good as the data that feeds it, and this has never been truer than with people analytics. If you want to understand how your business works, you need to understand how your employees (or customers) interact with each other and with the business. The business doesn’t just happen; people make it happen! To build such a people analytics system you need to be able to capture interaction information from the variety of systems your company uses, whether that is your enterprise social network, e-mail, calendaring, or business processes and applications.
Over the last year I’ve been actively working on building an engagement analytics solution on top of a generalized and extensible Enterprise Graph. As I’ve worked on integrating interaction information from different types of systems of engagement, it’s become increasingly evident that most application vendors are locking up key interaction data (intentionally or otherwise) and not making it easily available for business analytics.
What exactly do I mean? For this post I’m going to focus specifically on social networks, as they seem to be the biggest culprits; more traditional enterprise systems are used to having to provide access to their underlying database tables.
Most of the social networks offer streaming APIs that give application developers access to their data. However, most (if not all) of the firehoses are content-centric and NOT people-centric. When I am building an Enterprise Graph, the person has to be the first-class citizen. I need a complete view of what the person has done and what has been done to them. I want to know about every interaction they make in the system(s) so I can build out a complete and accurate graph.
Using Twitter as an example, I want a real-time stream that tells me when and what Marie is tweeting or deleting, retweeting, replying, favoriting, following, unfollowing, listing, editing, etc. I want a complete transaction report. Instead, what I get from Twitter is a firehose of content creation records with metadata. So I may see a tweet with Marie listed in the contributors field and 10 in the favorites field, but I then have to jump through hoops to find out who the favoriters are or to get regular updates when new people favorite the tweet. And I have to use a whole other set of APIs to capture the rest of the verbs I care about. It’s essentially a pain in the posterior. And this isn’t me picking on Twitter, as it’s not alone. This content centricity seems to be standard practice with most, if not all, of the social networks, public and enterprise.
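To make the idea of a complete transaction report a bit more concrete, here is a minimal Python sketch of what a people-centric record could look like. Everything in it is an assumption for illustration: the `Interaction` type, its field names, the verb list and the sample data are hypothetical and do not correspond to Twitter's or any other vendor's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical verb vocabulary, taken from the list in the paragraph above.
VERBS = {"tweet", "delete", "retweet", "reply", "favorite",
         "follow", "unfollow", "list", "edit"}


@dataclass(frozen=True)
class Interaction:
    """One person-centric event: who did what, when, and to whom or what."""
    actor: str
    verb: str
    timestamp: datetime
    target_person: Optional[str] = None   # person on the receiving end, if any
    target_content: Optional[str] = None  # content item involved, if any

    def __post_init__(self):
        if self.verb not in VERBS:
            raise ValueError(f"unknown verb: {self.verb}")


def transaction_report(events, person):
    """Everything the person did, plus everything done to them, in time order."""
    involved = [e for e in events if person in (e.actor, e.target_person)]
    return sorted(involved, key=lambda e: e.timestamp)


# Made-up sample data, purely for illustration.
events = [
    Interaction("marie", "tweet",
                datetime(2014, 1, 1, 9, 0, tzinfo=timezone.utc),
                target_content="tweet-1"),
    Interaction("sam", "favorite",
                datetime(2014, 1, 1, 9, 5, tzinfo=timezone.utc),
                target_person="marie", target_content="tweet-1"),
    Interaction("sam", "follow",
                datetime(2014, 1, 1, 9, 1, tzinfo=timezone.utc),
                target_person="lee"),
]

report = transaction_report(events, "marie")
for e in report:
    print(e.timestamp.isoformat(), e.actor, e.verb, e.target_person, e.target_content)
```

The point of the sketch is that the favorite arrives as its own event with Sam as the actor and Marie as the target, rather than as a counter buried in the tweet's metadata.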
Why is this such a disaster? Because it makes it pretty much impossible to get a complete picture of all CRUD (create, read, update, delete) events for any user. This means that you struggle to build a complete and accurate Enterprise Graph. And, back to my original point, analytics is only as good as the data that feeds it, so incomplete data means incomplete analysis and inaccurate insight.
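A toy example shows how quickly the graph degrades. The sketch below is only an illustration under stated assumptions: the event log, the names and the `build_graph` helper are all hypothetical, and a real Enterprise Graph would carry far more than edge counts. It builds a person-to-person graph twice, once from the full interaction log and once from only the content creation events a content-centric feed would deliver.

```python
from collections import Counter

# Hypothetical interaction log: (actor, verb, target person).
# Names and verbs are illustrative only.
EVENTS = [
    ("marie", "tweet", None),
    ("sam", "favorite", "marie"),
    ("sam", "retweet", "marie"),
    ("lee", "follow", "marie"),
    ("marie", "reply", "lee"),
]


def build_graph(events):
    """Count directed person-to-person interactions as weighted edges."""
    edges = Counter()
    for actor, verb, target in events:
        if target is not None:
            edges[(actor, target)] += 1
    return edges


# Graph built from every interaction in the log.
full = build_graph(EVENTS)

# Graph built from content creation events only (tweets and replies here),
# which is roughly what a content-centric feed gives you.
content_only = build_graph([e for e in EVENTS if e[1] in {"tweet", "reply"}])

# Relationships that exist in reality but never make it into the graph.
missing = set(full) - set(content_only)
print(sorted(missing))
```

In this made-up log, Sam's and Lee's relationships to Marie disappear entirely from the content-only graph, so any analysis run on it would see Marie as far less connected than she really is.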
Now, whether this is intentional on the part of the social networks I can’t say, although I do suspect there may be an element of “the data is our value proposition” with the public social networks and “we want you to come to us for the analysis” on the part of the enterprise vendors, who frequently provide their own analytics solutions at a premium. Whatever the reason, it’s totally unacceptable, especially for companies relying heavily on public social networks or paying big money for their enterprise network solutions.
So my word of advice… if you are planning to invest in an enterprise social network, make sure the vendor gives you access to 100% of the interactions that happen in the system. No exceptions! They may have some great analytics out of the box and tell you that you don’t need to get the data out. Don’t listen! While today you may not want to apply your own flavor of enterprise-wide people analytics or blend it with other enterprise data, you don’t want to be three years down the line and then realize that your social vendor is holding your data for ransom.