Data on the cloud, a double-edged sword!

Last week I was chatting with some folks about the ongoing shift of data to the cloud, specifically about the many challenges this presents for search. Then a couple of days later I was reading a Forbes article from Joe McKendrick (@joemckendrick) entitled “Why Owning Software or Data ‘No Longer Makes Sense’” which talked about the shift away from local data to the cloud-hosted kind, with the implication that people no longer need local data. While I generally agree with the principle, I don’t believe that this transition is going to happen without significant pain, trade-offs for consumers, and a number of missteps by cloud providers. I anticipate one of those missteps is going to be in the area of personal search.

Personal Search is the process of searching within one’s personal space of digital information, and was traditionally referred to as Desktop Search in the good old days of the PC. It was able to search not just every file you created, but essentially every single file you even looked at, since everything you read gets copied to your desktop thanks to the browser cache. So the desktop became your personal memory, and desktop search allowed you to root around in your head and find stuff. It wasn’t perfect, but it was the best we had. And compared to today… it wasn’t half bad.

Today what I find increasingly curious is that personal search seems to have completely disappeared as a topic of discussion. Before writing this blog post I was rooting around the Internet to see what people are saying on the subject. Answer: Nothing! The only references were to Google, which doesn’t have personal search; it has social search. It’s as though the trend of moving your data to the cloud somehow makes personal search irrelevant, whereas that couldn’t be further from the truth. If you think it was hard to find stuff when it was all located on your own PC, just imagine how much fun it’s going to be when your data is scattered across dozens of cloud services across the Internet or corporate Intranet.

When people are looking for information, they are frequently looking for something they have already touched before. The greatest complaint I hear from people across both consumer and enterprise domains is NOT that they can’t find content in general (although that is also a problem in many cases, particularly across the Intranet), but that they can’t find THEIR content. And I don’t mean the content they’ve created; I mean the content they’ve touched, no matter how light the touch (read, saved for later, bookmarked, …). Let us be very clear: Personal Search is NOT Internet or Intranet Search. Until such time as these web-based search engines can index every single thing I touch, public or private, across the Internet and (in the case of the Intranet search engine) the Intranet, they won’t be able to compete with traditional desktop search.

And until such time as that issue is addressed, moving all my content to the cloud will be extremely unattractive from an information findability perspective.

5 Comments to “Data on the cloud, a double-edged sword!”

  1. Great thoughts. Absolutely true. Certainly a personal driver to post “stuff” in as few places as possible (and choose places with good search and long retention).

    I particularly like the thought about finding things we touch once. It resonates with my obsession about the importance of tagging. If I tag everything I touch in a small number of places (e.g. one inside and one outside the firewall) then searching is centralised to those places. Actually, a standard for searching federated tag clouds might be easier to get adopted than generic federated search.

    It’s not a complete solution, and requires behaviour change, but it might be part of an answer to what will certainly come to be seen as a problem.


    • Good point! And something as simple as a browser plugin could do the tagging automatically so when you open / cache a page it dynamically generates a tag for whenever you are too lazy :)


      • Well, showing a list of potential tags with one click to add them. Not sure about automatic tagging. Some pages you would rather forget. To avoid results being swamped with rubbish, you need to set a bar for how interesting something is before you tag it. I am currently at 3,666 on Connections (i.e. work related) and 2,274 on Delicious (private).

        On second thoughts, automatically tagging pages where I upload or complete a form would be a good idea. That is where I have stored content, which fits with the personal search issue.


  2. The problem is that you never know what you might want to find 3 weeks from now. How often have you thought “Heck, I vaguely remember seeing something somewhere that mentioned Blah”? So I wasn’t suggesting that the “autotag” would need to be a visible tag per se, but it could be a tag that’s used simply to capture a personal list of everything you touched, relying on ranking to use your other social actions (tag, share, favorite, …) to sort out the order.

    This addresses the 100% recall challenge while at the same time giving you precision. So I could specifically look for all the files I tagged with analytics last week OR I could just search on any page that I may have touched last week that included the word analytics and rely on the ranking to sort out the order.
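
    As a rough illustration of the idea above (a minimal sketch, not something anyone in this thread has built): every touched page is kept in a personal list, and explicit actions only influence the ordering of results. The names TouchedItem, ACTION_WEIGHTS and search_touched, and the weight values themselves, are hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical weights: an explicit social action counts for more than a passive view.
ACTION_WEIGHTS = {"view": 1.0, "tag": 3.0, "share": 4.0, "favorite": 5.0}


@dataclass
class TouchedItem:
    """One entry in the personal list of everything the user touched."""
    url: str
    text: str
    touched_at: datetime
    actions: set = field(default_factory=lambda: {"view"})  # the implicit "autotag"
    tags: set = field(default_factory=set)                  # explicit, user-chosen tags


def score(item):
    """Rank an item by summing the weights of the actions taken on it."""
    return sum(ACTION_WEIGHTS.get(action, 0.0) for action in item.actions)


def search_touched(items, term, since, tag=None):
    """Return touched items containing `term` since `since`, best-ranked first.

    Recall: every touched item is a candidate, tagged or not.
    Precision: pass `tag` to narrow to explicit tags; ranking orders the rest.
    """
    term = term.lower()
    hits = [
        item for item in items
        if item.touched_at >= since
        and term in item.text.lower()
        and (tag is None or tag in item.tags)
    ]
    return sorted(hits, key=score, reverse=True)
```

    Calling search_touched(items, "analytics", since=one_week_ago) covers the "anything I may have touched last week" case, while adding tag="analytics" narrows it to the pages that were tagged explicitly.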


    • Marie, makes sense. I guess it becomes another category of tag, alongside things you tagged (explicitly) and things other people (in your company, network or just in general) have tagged. Controlling the domain of the search across those categories would help filter results (e.g. depending on whether you are looking for things you tagged or just things you saw).

