11/26/12

Why You Can't Find Anything At Work

Back in the wild cyberspace days of the mid-1990s, the metaphors we used to describe our online tools were thrilling. We used web browsers called Navigator or Explorer, we found our way in the real world using MapQuest, and we searched for content along the information superhighway using engines called Magellan, AltaVista, or Northern Light. During this time it was not uncommon to spend hours browsing the internet. But most people didn't browse for enjoyment. They set out to find the phone number for their local pizza joint and ended up reading about the life cycle of the African tsetse fly. We browsed because there was no easy way to find exactly what we were looking for. Favorites, or Bookmarks, weren't simply reminders of something interesting to read later but electronic insurance that once we found something useful we'd be able to get back to it again.

This bizarre, illogical, exciting, spontaneous, and often frustrating world began to change in 1996 when two Stanford students had a brilliant insight. They surmised that the number of links pointing to a particular site might be a good indicator of how popular that site was, and that the popularity of the site might in turn be a good indicator of how valuable its information would be to a searcher. They gave their mashup of textual search and link popularity the name of a ridiculously large number (misspelled) and poof! gone were the great adventurous metaphors. You no longer explored, navigated, or even searched or browsed the internet; you no longer spent time or worked to find what you needed. You just Googled it. And if Google didn't find it, it probably didn't exist.
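For the curious, here is a toy sketch of that insight in Python. It is not Google's actual code. The four-page "web", the function names, and the damping value of 0.85 are all assumptions chosen for illustration. The first function simply counts inbound links. The second is a simplified PageRank-style calculation in which a link from a popular page counts for more than a link from an obscure one.

```python
def inbound_link_counts(links):
    """Count how many distinct pages link to each page."""
    counts = {}
    for source, targets in links.items():
        counts.setdefault(source, 0)
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank evenly to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "pizza", which links to "maps".
toy_web = {
    "news": ["pizza"],
    "maps": ["pizza"],
    "tsetse": ["pizza"],
    "pizza": ["maps"],
}

counts = inbound_link_counts(toy_web)
ranks = pagerank(toy_web)
most_linked = max(counts, key=counts.get)
top_ranked = max(ranks, key=ranks.get)
print(most_linked, top_ranked)  # "pizza" comes out on top in both measures
```

Notice that "maps" has only one inbound link, yet it ends up ranked well above "news" and "tsetse" because that one link comes from the most popular page. That is the part a raw link count misses.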

Coincidentally, the rise of Google coincided with the electronification of the workplace. Files, documents, memos, and forms were all replaced by their electronic counterparts. Rather than creating everything on paper in triplicate with copies sent to catalogers and archivists, we began to store electronic versions in document management systems, shared network drives, keychain thumb drives, email inboxes, intranets, extranets, collaboration portals, and knowledge bases. We stored at least as many different document formats as we had locations. We had faith in the promise of Google, or its in-house equivalent, to sort it all out later.

In truth, Google had it relatively easy. They had one type of document (HTML) and one ranking metric (popularity, as calculated by PageRank). That alone was enough to catapult them beyond their search competition. But of course they didn't stop there. Google hired the best engineers, built the best infrastructure, and greatly expanded their computing power. Then they built or bought internet tools that made your online life easier and they gave them away for "free" in return for your personal information. With Gmail, Google+, Google Drive, YouTube, Blogger, and many more, Google now knows darn near everything about you. They incorporate all of that information into your own personalized search algorithm, which returns unique results tailored just to you. You just type in a few moderately specific words, Google does its magic, and chances are pretty good you'll find what you're looking for.

Users have come to expect - even demand - Google simplicity, Google speed, and Google quality. Which brings us to the current state of Enterprise Search. Enterprise Search tools have more in common with pre-Google search engines than they do with Google. They don't know very much about you and they don't tailor your results accordingly. They rely on your own infrastructure, computing power, and IT technicians to support their product. Most importantly, they don't have a single, simple metric, like PageRank, with which they can easily filter the results for relevance. Instead, they rely on algorithms that weigh the prevalence and proximity of search words in the indexed content to determine relevant results. This is roughly the equivalent of determining the most powerful family in town by the number of entries in the phone book with the same surname. To be fair, despite their own Enterprise Search Appliance, Google hasn't made huge inroads in the enterprise either. I suspect that's because it's actually much harder (and less lucrative) to do Enterprise Search well than it is to index the entire internet.
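To make the phone book problem concrete, here is a minimal sketch of prevalence-and-proximity scoring in Python. It is not how any particular Enterprise Search product works. The scoring formula, the function names, and both sample documents are assumptions invented for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(text, query):
    """Score a document by how often, and how close together, query words appear."""
    words = tokenize(text)
    terms = set(tokenize(query))
    positions = {t: [i for i, w in enumerate(words) if w == t] for t in terms}

    # Prevalence: total number of times any query word appears.
    prevalence = sum(len(found) for found in positions.values())

    # Proximity: bonus based on the smallest gap between two different query words.
    proximity = 0.0
    present = sorted(t for t in terms if positions[t])
    if len(present) >= 2:
        smallest_gap = min(
            abs(i - j)
            for a in present
            for b in present
            if a < b
            for i in positions[a]
            for j in positions[b]
        )
        proximity = 1.0 / smallest_gap
    return prevalence + proximity


# Hypothetical documents: a party invitation and the contract itself.
lunch_memo = (
    "Reminder: merger agreement celebration lunch. "
    "The merger agreement team picks the menu. "
    "Merger agreement signers eat free."
)
signed_contract = "Executed merger agreement between the two companies."

memo_score = keyword_score(lunch_memo, "merger agreement")
contract_score = keyword_score(signed_contract, "merger agreement")
print(memo_score > contract_score)  # the lunch memo outranks the contract
```

The memo wins simply because it repeats the search words more often. Nothing in the score knows which document actually matters to the business.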

Still, we need to help our users find relevant content within the enterprise, so what can we do?

First, we need to start by managing the expectations of the users.  "We're not Google. You'll have to be much more specific about what you're looking for if you hope to find it. You may even need to learn (gasp!) how to perform Boolean operations, or at least take the time to use the check-box filters we've provided."

Second, we need to admit that despite the remarkable success of Google, search is not obviously the best way to find all content. We could probably learn a lot from those forms we used to fill out in triplicate. Habitually sending copies of content that needs to be indexed, cataloged, and archived to people whose job it is to help us find it later isn't a bad idea; it just got superseded by our affair with technology. Maybe instead of search we should focus more on new ways to use technology to help those people do their jobs.

And finally, I want to challenge any potential young Larry Pages out there to come up with a simple idea like PageRank for enterprise content. It will probably seem extremely obvious in retrospect and I promise it will make you fabulously wealthy.

Unless I think of it first.


5 comments:

David Hobbie said...

Ryan--

Great point about the importance of setting appropriate expectations about search in the enterprise.

Google knows a lot about who is searching because it can usually track your searches from a specific address, and also, as you note, because of all the great free stuff it gives you with which you can email, blog, and collaborate.

The enterprise also knows a lot about you. It knows attorneys' practice area, staff role (2nd year associate / equity partner), matters billed to, clients, documents you've authored, and many other kinds of data. I've seen enterprise search leverage this when directed at people as a content set, that is, in expertise location. I haven't seen the kind of customization by role, office, practice area, and expertise, perhaps because Google-like economies of scale are not available and perhaps because obtaining decent legal relevancy is really hard all on its own.

John Hightower said...

You are right on point. To start an argument--or not, this is EXACTLY what the main problem is with Westlaw Next and Lexis Advance.

We're sold that each of the search engines will find "EVERYTHING," when they don't.

For those who have not heard this before, sometimes a classic Westlaw or Lexis natural-language search is better at locating something.

DavidJ said...

Our Firm uses HP/Autonomy's WorkSite with Express & Miner Search capabilities enabled. Despite all of the bad financial news, I have found the weighted [apparently based on personalized search patterns] Google-style search capabilities of Express Search, combined with the ability to group by Author [or other profile content], to be very useful. While this isn't enterprise search (search is only against the DMS), it does cover most of the content our Legal users need. Since Express Search was implemented, demand for Enterprise search across other enterprise data resources has decreased. We've looked at IUS and Lexis Search Advantage but as of yet have found that they were either too pricey for the net benefit or didn't meet our performance expectations (security, speed, quality results).

Tony Chan said...

Since Google cannot anticipate who's searching what in any given moment, it literally has to index everything for searching. Enterprise search does not necessarily have to do that. As long as the users can find relevant work-related items in context, we should be in pretty good shape. That said, I think an effective enterprise search system should be based on WORKFLOW-- the system knows who the user is (user's role within the organization) based on his/her internal network login and adjusts the searchable content / data set accordingly. I believe this strategy would yield more precise / relevant search results, in contrast to indexing and making vast data sets universally searchable for every user.

Josh Liu said...

Very interesting point. Just like what Ryan said, I believe that "people" could be the search central point, and all the contents should be context around "people" in the enterprise environment.

We are working on a startup, helping professionals share, find and capture knowledge. Perhaps we could get your feedback on what we are working on?

Josh

 

© 2014, All Rights Reserved.