Why Social Media Analysis is a Big Data challenge, irrespective of how narrow your query.

I recently had a discussion with a colleague on whether or not social media analysis is really a #bigdata challenge. While we can all agree that the Twitter firehose is unquestionably a bigdata problem, most companies don’t really care about the entire firehose and only want the subset of the stream that addresses their particular query. Which leads them to ask “do I really need a bigdata solution?”, to which my answer is a resounding YES!

{ and if you are using a SaaS solution, as opposed to on-prem, then you most definitely need to make sure that your provider is not just narrowly harvesting the subset of the stream you are querying }

So why am I selling a bigdata approach? Am I just another techno geek looking to push more technology at a customer who is already overwhelmed with what they have, or do I have a genuine business reason for recommending that they think about this problem from a bigdata perspective? I would like to think I am the latter, and my reason for advising a more comprehensive view of social analysis is…

The Hathaway Effect! Ok, that’s a bit tongue in cheek :-), but … Last year and again earlier this year, the supposed correlation between Anne Hathaway’s media presence and Berkshire Hathaway’s share price hit the news. Whether or not such a correlation actually exists, the scenario (in the abstract) does highlight a brittleness in current social media analysis, namely…

=>  Just because it looks the same doesn’t make it the same …
       … and local context is not always enough to disambiguate!

=>  Not all opinions are created equal …
       … and no opinion can be evaluated in isolation!

=>  Don’t make a business decision on the misunderstood word of random strangers!

Today’s practice of avoiding the bigdata challenge by just syphoning off a subset of the social media stream is like overhearing part of a conversation “out of context”, without having any idea who is talking or who is listening. It’s kind of barmy when you think about it.

Therefore, even if we don’t want to grab the entire firehose (or the equivalent for web content), we minimally need to re-harvest content based on the syphoned stream, so that we can piece together the entire stream of consciousness (related both to the conversation in question and to the broader conversations of the people involved) and start to put some context around the discussion. This “ripple harvest” gives us a more complete perspective (“fills the gaps”) and lets us build the social network we need to measure influence, expertise, impact, reach, agenda, affiliations, supporters, detractors, etc. for the people creating the conversation. These are, after all, the people on whose opinions we are making business decisions!
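
To make the “ripple harvest” idea a little more concrete, here is a minimal sketch in Python. It is purely illustrative and not any particular product’s implementation: the Post record, the ripple_harvest and reply_edges functions, the max_hops parameter and the toy data are all my own assumptions, and an in-memory dictionary stands in for whatever API or archive you would really re-harvest from. Starting from the posts that matched the narrow query, it ripples outwards to the same authors’ other posts and to the surrounding replies, then counts who replies to whom as a first crude step towards a social network.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Post:
    """Hypothetical, simplified record for one social media post."""
    id: str
    author: str
    text: str
    reply_to: Optional[str] = None  # id of the post this one replies to


def ripple_harvest(seed_ids, posts_by_id, max_hops=1):
    """Breadth-first expansion from the query matches (the seeds).

    From each post we ripple out to (a) the author's other posts,
    (b) the replies it received and (c) the post it replies to.
    posts_by_id is a stand-in for a real re-harvesting service.
    """
    by_author = defaultdict(list)
    replies = defaultdict(list)
    for p in posts_by_id.values():
        by_author[p.author].append(p.id)
        if p.reply_to is not None:
            replies[p.reply_to].append(p.id)

    harvested = set()
    queue = deque((pid, 0) for pid in seed_ids if pid in posts_by_id)
    while queue:
        pid, hops = queue.popleft()
        if pid in harvested:
            continue
        harvested.add(pid)
        if hops >= max_hops:
            continue  # edge of the ripple: keep the post, do not expand it
        post = posts_by_id[pid]
        neighbours = list(by_author[post.author]) + replies[pid]
        if post.reply_to in posts_by_id:
            neighbours.append(post.reply_to)
        for n in neighbours:
            if n not in harvested:
                queue.append((n, hops + 1))
    return harvested


def reply_edges(harvested, posts_by_id):
    """Count (replier, replied-to) author pairs within the harvest."""
    edges = defaultdict(int)
    for pid in harvested:
        p = posts_by_id[pid]
        if p.reply_to in harvested:
            parent = posts_by_id[p.reply_to]
            if parent.author != p.author:
                edges[(p.author, parent.author)] += 1
    return dict(edges)


# Toy data, invented for illustration only.
POSTS = {p.id: p for p in [
    Post("1", "alice", "Hathaway is up today"),
    Post("2", "bob", "Which one, the actress or the stock?", reply_to="1"),
    Post("3", "alice", "The stock, I meant Berkshire", reply_to="2"),
    Post("4", "alice", "Reading the quarterly filings"),
    Post("5", "carol", "Lovely weather"),
    Post("6", "bob", "Watching a film tonight"),
]}

# Only post "1" would match a narrow keyword query for "Hathaway".
harvest = ripple_harvest({"1"}, POSTS, max_hops=1)
print(sorted(harvest))
print(reply_edges(harvest, POSTS))
```

In this toy run the narrow query only sees post 1, which on its own could be about the actress or the company. One hop out, the harvest also pulls in the reply thread and alice’s other posts, which is exactly the context that disambiguates it. Raising max_hops widens the ripple, and it is that re-harvesting step that turns even a narrow query into a much larger data problem.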
