Back in the wild cyberspace days of the early 1990s, the metaphors we used to describe our online tools were thrilling. We used web browsers called Navigator or Explorer, we found our way in the real world using MapQuest, and we searched for content along the information superhighway using engines called Magellan, AltaVista, or Northern Light. During this time it was not uncommon to spend hours browsing the internet. But most people didn’t browse for enjoyment; they set out to find the phone number for their local pizza joint and ended up reading about the life cycle of the African tsetse fly. We browsed because there was no easy way to find exactly what we were looking for. Favorites, or Bookmarks, weren’t simply reminders of something interesting to read later but electronic insurance that once we found something useful we’d be able to get back to it again.

This bizarre, illogical, exciting, spontaneous, and often frustrating world began to change in 1996 when two Stanford students had a brilliant insight. They surmised that the number of links pointing to a particular site might be a good indicator of how popular that site was, and that the popularity of the site might be a good indicator of how valuable the information contained therein would be to a searcher. They gave their mashup of textual search and link popularity the name of a ridiculously large number (misspelled) and, poof! gone were the great adventurous metaphors. You no longer explored, navigated, or even searched or browsed the internet; you no longer spent time or worked to find what you needed; you just Googled it. And if Google didn’t find it, it probably didn’t exist.
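
If you want to see that insight in miniature, here is a toy sketch of the link-popularity calculation the two students formalized as PageRank. The four-page “web,” the damping factor, and the iteration count are illustrative only; real PageRank involves vastly more engineering, but the core loop really is this small:

```python
# A toy illustration of the link-popularity idea behind PageRank:
# a page's score is fed by the scores of the pages linking to it.
# Minimal power-iteration sketch over a hypothetical four-page web.

links = {                       # page -> pages it links to
    "pizza-joint": ["yellow-pages"],
    "tsetse-fly": ["yellow-pages", "pizza-joint"],
    "yellow-pages": ["pizza-joint"],
    "homepage": ["pizza-joint", "tsetse-fly", "yellow-pages"],
}

damping = 0.85
rank = {page: 1.0 / len(links) for page in links}

for _ in range(50):             # iterate until the scores settle
    new_rank = {}
    for page in links:
        # Sum contributions from every page that links here.
        inbound = sum(rank[src] / len(outs)
                      for src, outs in links.items() if page in outs)
        new_rank[page] = (1 - damping) / len(links) + damping * inbound
    rank = new_rank

print(sorted(rank.items(), key=lambda kv: -kv[1]))
# Pages with more (and better-connected) inbound links float to the top.
```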

Coincidentally, the rise of Google corresponded with the electronification of the workplace. Files, documents, memos, and forms were all replaced by their electronic counterparts. Rather than creating everything on paper in triplicate with copies sent to catalogers and archivists, we began to store electronic versions in document management systems, shared network drives, keychain thumb drives, email inboxes, intranets, extranets, collaboration portals, and knowledge bases. We stored at least as many different document formats as we had locations. We had faith in the promise of Google, or their in-house equivalent, to sort it all out later.

In truth, Google had it relatively easy. They had one type of document (HTML) and one ranking metric (popularity, as calculated by PageRank). That alone was enough to catapult them beyond their search competition. But of course they didn’t stop there. Google hired the best engineers, built the best infrastructure, and greatly expanded their computing power. Then they built or bought internet tools that made your online life easier and gave them away for “free” in return for your personal information. With Gmail, Google+, Google Drive, YouTube, Blogger, and many more, Google now knows darn near everything about you. They incorporate all of that information into your own personalized search algorithm, which returns unique results tailored just to you. You just type in a few moderately specific words, Google does its magic, and chances are pretty good you’ll find what you’re looking for.

Users have come to expect – even demand – Google simplicity, Google speed, and Google quality. Which brings us to the current state of Enterprise Search. Enterprise Search tools have more in common with pre-Google search engines than they do with Google. They don’t know very much about you, and they don’t tailor your results accordingly. They rely on your own infrastructure, computing power, and IT technicians to support their product. Most importantly, they don’t have a single, simple metric, like PageRank, with which they can easily filter the results for relevance. Instead, they rely on algorithms that weigh the prevalence and proximity of search words in the indexed content to determine relevant results. This is roughly the equivalent of determining the most powerful family in town by counting the entries in the phone book with the same surname. To be fair, despite its own Enterprise Search Appliance, Google hasn’t made huge inroads in the enterprise either. I suspect that’s because it’s actually much harder (and less lucrative) to do Enterprise Search well than it is to index the entire internet.
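
To make the phone-book analogy concrete, here is roughly what prevalence-and-proximity ranking looks like in code. This is a deliberately crude sketch with made-up documents and an ad hoc scoring formula, not any vendor’s actual algorithm:

```python
# A crude sketch of pre-PageRank relevance ranking: score a document by how
# often the query terms appear (prevalence) and how tightly they cluster
# (proximity). Documents and scoring weights here are hypothetical.

def relevance(query_terms, text):
    words = text.lower().split()
    positions = [i for i, w in enumerate(words) if w in query_terms]
    if not positions:
        return 0.0
    prevalence = len(positions) / len(words)
    # Tighter clustering of hits -> higher proximity bonus.
    span = max(positions) - min(positions) + 1
    proximity = len(positions) / span
    return prevalence + proximity

docs = {
    "memo.doc": "pizza budget memo pizza lunch order pizza",
    "policy.doc": "travel policy lunch reimbursement pizza appears once here",
}
q = {"pizza", "lunch"}
print(sorted(docs, key=lambda d: -relevance(q, docs[d])))
# memo.doc ranks first: more hits, clustered together.
```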

Still, we need to help our users find relevant content within the enterprise, so what can we do?

First, we need to start by managing the expectations of the users.  “We’re not Google. You’ll have to be much more specific about what you’re looking for if you hope to find it. You may even need to learn (gasp!) how to perform Boolean operations, or at least take the time to use the check-box filters we’ve provided.”
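
For anyone wondering what those Boolean operations actually buy you, here is a tiny sketch. The documents and keywords are hypothetical; the point is the precise include/exclude logic that a loose bag of search words cannot express:

```python
# What "learning Boolean operations" buys you: exact include/exclude logic.
# Document keyword sets below are hypothetical.

docs = {
    "brief-2009.doc": {"merger", "antitrust", "draft"},
    "memo-2010.doc": {"merger", "tax"},
    "letter-2011.doc": {"antitrust", "tax"},
}

# Query: merger AND antitrust NOT draft
hits = [name for name, kw in docs.items()
        if "merger" in kw and "antitrust" in kw and "draft" not in kw]
print(hits)  # [] : the only merger+antitrust document is still a draft
```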

Second, we need to admit that, despite the remarkable success of Google, search is not necessarily the best way to find all content. We could probably learn a lot from those forms we used to fill out in triplicate. Habitually sending copies of content that needs to be indexed, cataloged, and archived to people whose job it is to help us find it later isn’t a bad idea; it just got superseded by our affair with technology. Maybe instead of search we should focus more on new ways to use technology to help those people do their jobs.

And finally, I want to challenge any potential young Larry Pages out there to come up with a simple idea like PageRank for enterprise content. It will probably seem extremely obvious in retrospect and I promise it will make you fabulously wealthy.

Unless I think of it first.

Image [cc] nengard

On Monday, ReadWriteWeb contributor Richard MacManus wrote an article called Why Topic Pages Are The Next Big Thing. The article starts out with something that might look very familiar to a librarian… especially one experienced in cataloging structure.

Chronological and real-time consumption of content just doesn’t work anymore. It’s time for topic pages to add a layer of organization on top.

MacManus argues that the way products like Facebook, Twitter and even blogs are consumed in a Last In, First Out method doesn’t match the overall needs of those consuming these products. “The time for topic pages has come,” writes MacManus. Most librarians would probably agree, and say that the time for topic pages has come, again.

The examples MacManus uses to show that the trend is once again moving toward topical rather than chronological organization are Medium and Pinterest. Both of these products are ‘visual’ aggregators of information, and I’m afraid that the underlying message of why topical organization is important might be lost because of these examples. Topical organization based on visual cues is very easy on the eye, but this type of organization isn’t limited to visual stimuli. While I was reading this article, I began thinking of KM projects that could benefit from the kind of topical structure that projects like Medium and Pinterest are attempting to build. Come to think of it… maybe KM could adopt some of the visualization found in these ideas, but that’s a topic for another post.

The other key ingredient mentioned in the article will also make catalogers smile. MacManus specifically points out that the flaw in the current topical organization of products like Twitter, Flickr, and Delicious is that they are “freeform” topic-generating products. Products like Medium are attempting to control the topics, much in the way that AACR2 attempts to structure information resources with specific cataloging rules. In other words, the topics are generated from an organized list rather than made up as new tags every time someone uploads a new piece of information. It seems that this narrowing of subject headings is where MacManus thinks we should be going.

I was not the only person to make the link between what MacManus is advocating in his article and the traditions found in library subject headings. Luc Gauvreau commented that this is nothing new at all:

And organizing information by topic is really not new; libraries have done that for centuries. And cultures around the world always find a way to categorize, classify, and order their information: it’s the only way to understand something, to give meaning to the world. Dewey and Library of Congress classification, and traditional library science, are old, almost obsolete, but their [objective] is the right one. A date, a chronology in itself, is nothing; it must be related to topics and space (places) to mean something.

Of course, the same old problem we run into here is that subject classification isn’t an easy process and cannot be automated in a way that doesn’t cause more problems than it solves. If it were easy, or if there were a way to automate the process, then we could apply Library of Congress classifications to the Internet and voilà: problem solved, Internet organized! Perhaps there is some happy (forgive the pun) Medium here that the new Topic Pages products will find. I’m sure that if they need help getting there, the creators of these products can contact their local library cataloging departments for guidance.

Image [cc] yuan2003

Let’s face it: social networks work fine when you’re sharing information with your friends, or even with peers within your industry subset. Social networks at your place of work, however, usually don’t work very well at all. There are probably a thousand reasons why, but I think one of the biggest is that people don’t really want to expose what they are doing at work to their colleagues. I know that on its face that sounds ridiculous, but it seems to be true. Most likely, they don’t want to feel like they have to update their work status because it might come back to bite them later in an employee review. The whole act of covering your backside creates an environment where communications conduits such as workplace social networks are viewed as counter-productive when, quite frankly, these types of communications tools would actually increase productivity. So how do you build an environment that takes advantage of the daily activities of workers in a social-network-like structure? HP Labs has one idea: build an automation process that updates employees’ statuses automatically and creates a social network that simply builds itself.

Mashable reported on HP Labs’ “Collective Project” this morning, and it made me wonder how, or if, this type of automated social network could work within a law firm. Here’s the basic structure of the Collective Project; all of the processes appear to be automatically created and adapted over time based upon the project’s internal algorithm and taxonomy structure:

  • Personal profiles are created
  • Preferences and expertise are inferred automatically
  • Documents are profiled
  • Employees are connected to those files
  • Employees with similar interests can be identified
  • Document permissions can be customized to prevent unauthorized access

The idea behind this is to identify connections based upon “inferred expertise,” according to HP Labs Israel director Ruth Bergman. Bergman has used the Collective Project to identify co-workers with similar experiences and interests, and to seek them out at conferences they are both attending.
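
HP has not published the internals, but the inference step might look something like the sketch below: build a term profile for each employee from the documents they touch, then rank colleagues by profile similarity. The names, documents, and scoring are all hypothetical:

```python
# A sketch of how "inferred expertise" might be computed, assuming (as the
# Collective Project description suggests) that employees can be tied to the
# documents they author or work on. Everything here is hypothetical.

from collections import Counter
from math import sqrt

# Employee -> words drawn from documents they authored or worked on.
profiles = {
    "ruth": Counter("search ranking algorithm taxonomy search".split()),
    "alice": Counter("benefits payroll onboarding policy".split()),
    "bob": Counter("taxonomy metadata search indexing".split()),
}

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency profiles."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Who shares interests with Ruth? Rank colleagues by profile similarity.
for name in ("alice", "bob"):
    print(name, round(cosine(profiles["ruth"], profiles[name]), 2))
# bob scores higher: the overlapping "search" and "taxonomy" vocabulary
# infers shared expertise, the seed of a self-building social network.
```
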
There are a lot of firms looking at and implementing Enterprise Search tools right now. Could the idea of an “inferred expertise” system like the Collective Project be duplicated in these enterprise search systems? Could a de facto social network be created within a firm? How would attorneys and staff view such a system: helpful, or Big Brother?

There may be a handful of firms out there that have thriving internal social networks, but there aren’t very many. Is the idea of having some type of automated social network something that would benefit the law firm environment? Now that I think about it, you’d probably have to call it something more generic like “inferred expertise database” to quell the paranoia that surrounds the term “social network.” There seems to be potential in creating something similar to the Collective Project within an enterprise search resource, but would the culture of the firm accept it? I’d like to say yes, but my gut’s saying no.

When it comes to search techniques, you’ll never find a more anal group than librarians. So when search tools start mucking around with the search shortcuts, we tend to shake our heads and mumble something like “it worked just fine the way it was… why do you keep changing it??” In fact, there are many law librarians out there who still think that “Dot Commands” in Lexis are truly the only way to conduct legal research queries. Now Google has changed one of the basic search commands we’ve used for years and replaced it with what amounts to a two-step process in place of a one-step command.

A couple of weeks ago, Google removed the “+” search command, which allowed you to search for a specific spelling of a word or words (without Google correcting your spelling or stemming the word). According to Google’s research, two-thirds of the queries that used the “+” designator were using it incorrectly, so Google replaced the “+” with either double quotes or a click on the new advanced tool called “Verbatim.” (I’m actually thinking that someone in Google’s marketing department mentioned that they needed to get rid of the “+” command because someone might confuse it with the name of their social media site, Google+… but that’s probably just me being paranoid.)

I’m not one who argues against change; in fact, I’m usually among the first to embrace new features. But I have to say I’m a little disappointed that Google dropped the one-step “+” command and replaced it with a two-step “double-quote” command (you have to close that quote!) and the three-step “Verbatim” tool (you have to run the search, then click “Show Search Tools” on the left-hand side of the results, and then click “Verbatim” to get the results).
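
If the one-step command really mattered to you, the rewrite is mechanical enough that a script could do it. A purely illustrative Python sketch that converts the old “+word” syntax into the quoted form Google now wants:

```python
import re

def modernize_query(query: str) -> str:
    """Rewrite legacy '+word' operators into the quoted form Google now expects."""
    # '+dollyrots brand new key' -> '"dollyrots" brand new key'
    return re.sub(r'\+(\w+)', r'"\1"', query)

print(modernize_query('+dollyrots brand new key'))  # "dollyrots" brand new key
```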

Rest in peace, my old friend “+”, and long live “double-quote” and Verbatim… at least until some future round of enhancements finds that two-thirds of people are not using you correctly and devises a four-step process to make it easier.

A couple of weeks ago, I wrote about the differences between information and knowledge and the categorical mistake that even many KMers make by conflating the two. However, knowledge is often further sub-categorized into two realms, tacit knowledge and explicit knowledge, and these can also get kind of confusing. Tacit knowledge exists only in the minds of the knowledgeable. It includes memories, ideas, concepts, and understandings. Explicit knowledge gets tricky because it is also a type of information: it’s a record of tacit knowledge which can be stored and retrieved just like any other type of information. Confusing, right? I’ll illustrate my point with the following scenario.

You’ve racked up a large set of expenses on a recent business trip and you want to be reimbursed by your company. So you call Ted, the Accounting Manager at your firm, and ask him what steps you need to take to be reimbursed by the firm. Ted explains that you need to fill out a reimbursement form, get it signed by a manager, and send the form, along with a copy of your receipts, to the accounting department to be processed. You should receive your check in 4 to 6 weeks. “Oh, and by the way,” Ted says, “you can always go to http://reimburseme.myfirm.com to see these steps again.” You thank Ted, fill out your form, and in 8 weeks you get your check. In this anecdote we have clear examples of tacit knowledge, explicit knowledge, and information.

  • Ted’s knowledge of the necessary steps to get reimbursed constitutes tacit knowledge. It exists only in Ted’s memory and is only retrievable by speaking directly with Ted (or someone else with the same tacit knowledge).
  • Ted’s tacit knowledge has been transformed into explicit knowledge by recording the steps on the website. That explicit knowledge is information that is available and retrievable by anyone in the firm at any time.
  • The filled-out reimbursement form that you send along to accounting is not itself knowledge; it doesn’t describe a process and isn’t in any way actionable. It is simply information.

In my earlier post I described the DMS and Enterprise Search as primarily information management tools. They allow you to store and retrieve information across the firm. Since explicit knowledge can take the form of recorded information, it can also be stored in the DMS or on a webpage and retrieved with Enterprise Search tools. No one questions the business value of a document management system, and most firms have some form of enterprise search in place to find information and explicit knowledge. But the vast majority of the knowledge that exists in any firm is tacit. It lives only in the minds of your knowledgeable employees. Often, they don’t have the time, the inclination, or the incentive to transform their tacit knowledge into explicit knowledge, and consequently that knowledge is available only to them and their immediate circle of coworkers.

Enter Enterprise Social Networking (ESN) tools. ESN tools turn tacit knowledge into explicit knowledge that is storable, retrievable, and searchable. If Ted in Accounting is keeping a blog of the goings-on in his department, then a simple search can indicate to Alice in HR that accounting has dealt with an issue similar to the one currently vexing her department. The ESN tools have made knowledge that would have otherwise remained tacit, explicit. Alice talks to Ted, learns from his experience, and solves her problem faster. Time, money, and resources saved. Bigger bonuses for everybody.

But here is what I believe to be the definitive business case for ESN. These tools not only constitute a modern communications infrastructure and near-magically turn tacit knowledge into explicit knowledge, they are also the equivalent of direct enterprise search for the tacit knowledge locked in the minds of your employees. An in-house micro-blogging solution with moderate participation allows employees to mine the tacit knowledge of their co-workers across the enterprise. Even if Ted in Accounting isn’t keeping a blog record of his department’s activities, the micro-blog allows Alice in HR to find Ted’s tacit knowledge by asking simple questions: “Has anyone had a problem like this? How did you deal with it?” Even if Ted hasn’t jumped on the micro-blog bandwagon, someone in his circle of co-workers may see Alice’s question and point her to Ted. Alice has in effect searched the tacit knowledge of the firm, and by doing so has created a bit of explicit knowledge: that Ted in Accounting is knowledgeable on a particular subject. If Ted jumps on board and answers Alice’s inquiry on the micro-blog network, or writes a full blog entry, or creates a wiki page, then Ted’s tacit knowledge is now explicit and available to the entire firm.

The ability to search the tacit knowledge of your staff and to simultaneously turn that tacit knowledge into explicit knowledge for future use. How’s that for a business case?

I know, I know.
What’s with the Google +1 ad to the right of the 3Geek blog?
As I so screechingly lamented in my 3/31 post, “So Anyone Out There ‘Liking’ Google’s ‘+1’?”, Google sent out emails today announcing the launch of the +1 button.
I’m talking out of both sides of my mouth.
So sue me.
I’m a lawyer. And a marketer. What do you expect?
So. Do you like us? Really, really like us? Then +1 us!

Google just announced its answer to Facebook’s “Like” button: the “+1” button.
The “+1” button will display on your search results as long as you are logged in to your Google account and have opted to participate in the Google +1 Experiment.
I’m not so “like”-ing it. While investigating, I found out that I had to publicize my private Google account. And that all private Google accounts would be deleted by 7/31/2011. 🙁
As a suspicious, neurotic female, I am loath to publicize my location information. Call me crazy, but even frumpy old me has had my share of quasi-stalking incidents: old boyfriends looming from the past, penal residents. You get my drift.
(As a side note: as a new female lawyer, I got my share of handwritten appeals mailed to my home address from Texas prisoners begging me to represent them. I quickly learned to unlist myself. As a newbie attorney, though, it was quite a shocker.)
But I needed to check out this whole +1 craze for my job. I’m diligent that way. So I go check out my profile to make sure everything is copasetic. That’s when I realize that Google is going to make me go public.
Google states, “If you currently have a private profile but you do not wish to make your profile public, you can delete your profile. Or, you can simply do nothing. All private profiles will be deleted after July 31, 2011.”
WHAT?!?
I gotta come out of my nice, cozy, private world in order to play with +1? Heck, even Facebook lets me lock down my account.
So you know what I did, right? Fake data. I’m ageless, virtual and able to not just bilocate but multilocate.
Some tips to stay quasi-private:
  1. Don’t use your real name
  2. Use a generic phrase to describe your business rather than giving business or school names
  3. Limit who can see your e-mail, home and business addresses. I set mine to contacts and family.
  4. Don’t display your birth year. You can limit its display to only family and contacts.
  5. Don’t specify your gender.
  6. Lastly, and perhaps most importantly, do not display your customized URL. It shows your e-mail address.
OMG. I like Google, in a generic, gotta-have-it kind of way. And I know their mission is “do good”. But I just kinda have this itch in the back of my cranium that says in 100 years, Google is gonna be HAL.
You know, HAL from 2001: A Space Odyssey?
Ugh.

Anyone who has ever tried a “Natural Language” search… whether using something as generic as Google, or searching a more focused database like Westlaw or Lexis… knows that it is a hit-or-miss search strategy. The nuances of the English language make a sentence like “Last night I shot an elephant in my pajamas” nearly unintelligible to computers. (How did that elephant get in your pajamas, anyway??) Legal research providers have always dreamed of an algorithm that can take a normal sentence that a human can interpret fairly easily… understanding that the person was wearing the pajamas, not the elephant… using experience, knowledge, and intuition to grasp exactly what the sentence means, and then give an appropriate response. It is this ability – this insight – that humans have and that computers simply have not been able to match so far.

Enter IBM’s “Next Grand Challenge,” in which the scientists at IBM take up exactly this problem and attempt to create a computer system that can not only handle natural language but can understand the nuances found in the game show Jeopardy!

The IBM Jeopardy! Challenge poses a specific question with very real business implications: Can a system be designed that applies advanced data management and analytics to natural language in order to uncover a single, reliable insight – in a fraction of a second?

The IBMers are calling the project “Watson,” named after the company’s founder, Thomas J. Watson (not Sherlock Holmes’ sidekick, as I initially assumed). Sam Palmisano, IBM Chairman and CEO, says that, like its big brothers Deep Blue (the chess-playing computer) and Blue Gene (the protein-folding supercomputer), Watson is attempting to do something that many people believe is impossible for technology to accomplish: “the ability of a computer to do something that’s far more challenging than chess: to understand natural human speech about a limitless range of topics, and to make informed judgments about them.”

Here are some videos that explain the Jeopardy! challenge and the glitches and accomplishments Watson has shown so far. If you are a legal researcher, you should watch these videos from that angle, think about the possibilities that could come from applying the techniques IBM is using to answer the scope of questions presented on the game show, and start wondering how that could apply to the narrower set of legal topics and questions we face on a day-to-day basis.

The Next Grand Challenge

The part on “Open Question Answering” is worth special attention: around the 2:00 mark, Dr. Katharine Frase discusses the issues and differences between “searching” and “keywords,” and the challenge of understanding and interacting in “the way normal humans communicate.”

What is Watson? Why Jeopardy?

Because you have to really understand the complexity of the English language, not just pieces of information, the game of Jeopardy! presents a very good challenge for Watson: not only to extract knowledge, but to interpret that knowledge. Watson has to understand the nuances of the “answer” presented by the Jeopardy! host, answer it quickly and accurately, and also understand when not to answer if it is likely to be wrong (risk factors). That’s a very complex idea, and one that made for some funny answers at first, but over time Watson started getting the “questions” right… 15% of the time, then 50%, then 60% (average player level), then 70% (average Champion level), 80% (3x Champion level), then 90% (Grand Champion level). What was a little scary was the speed at which the increase occurred: in less than a year, it went from 15% accuracy to over 80%.
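
That “knowing when not to answer” behavior is, at its heart, a confidence threshold. A toy sketch, with made-up numbers that are in no way IBM’s actual model:

```python
# Watson's "knowing when not to answer" is, at heart, a confidence threshold:
# buzz in only when the expected value of answering beats staying silent.
# The values and the decision rule here are illustrative, not IBM's.

def should_buzz(confidence: float, clue_value: int, penalty: int) -> bool:
    # Expected gain from answering vs. staying silent (worth 0).
    expected = confidence * clue_value - (1 - confidence) * penalty
    return expected > 0

print(should_buzz(0.90, 800, 800))  # True: very sure, so answer
print(should_buzz(0.30, 800, 800))  # False: too risky, stay quiet
```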

In 2011, IBM’s Watson is scheduled to compete on an actual episode of Jeopardy! It will be interesting to see how the technology behind “Open Question Answering” works not only for answering game show hosts, but also how this kind of advance in natural language processing can improve the way we conduct what we call “search” today.

One interesting issue: in another video that goes into more depth on what Watson can do, one of the first questions Watson was asked appeared to be answered incorrectly (according to a comment, and the answer I got from WolframAlpha). Watson answered that “ln((12546798*pi)^2)/34567.46” was 0.00885, while the answer according to other sources is actually 0.001011917. Will one of you with a degree in Mathematics (or at least a good calculator) double-check that, please? If Watson answered this incorrectly, then IBM may want to look at Watson’s math algorithms one more time before going on to face the Jeopardy! Challenge.

[Note: seems that Watson wasn’t wrong after all… see the comment below that explains the issues with the parens placement.]
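
For anyone who wants to check the parenthesization question themselves, both readings of the expression take a few lines of Python:

```python
# The discrepancy is just parenthesization. Both readings, checked:
import math

x = 12546798 * math.pi

# ln((12546798*pi)^2) / 34567.46 : as written; matches WolframAlpha
print(math.log(x ** 2) / 34567.46)   # ~0.001011917

# (ln(12546798*pi))^2 / 34567.46 : square the log instead; matches Watson
print(math.log(x) ** 2 / 34567.46)   # ~0.00885
```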

[Images: Watson’s question, Watson’s answer, and WolframAlpha’s answer]

While I was attempting to find an article yesterday, I stumbled upon an interesting-looking search tool called Interceder (apparently a Spanish word meaning “to intercede”). This dashboard approach to search results definitely caught my eye because it had so much information strategically placed in what looks very similar to SharePoint widgets. When I ran a search for Thomson Reuters, it produced a number of widgets ranging from basic information on the company, to the latest tweets, to a breakdown of the latest news placed on a timeline. I also found that you could display the results in either a basic “list” format (news timeline, company info, people mentioned) or an expanded “dashboard” format (the list view plus spotlight, video, Twitter, blog, and related-search information).

[Images: Interceder’s List View and Dashboard View]

It looks like Interceder uses a number of search tools and brings the results back from individual resources into its own widget boxes. Company information is pulled from Freebase; videos from YouTube; blog and news results from Google News and Google Blog Search [I got an email from Interceder developer Michael Bade, and he let me know that they actually pull news from the Yahoo BOSS service rather than Google News]; semantic search and related-people results from Open Calais; and tweets from Twitter Search. The process seems pretty simple, but the results look very professional. The search itself is very basic; no advanced search functionality, such as phrase searching, date restrictions, or Boolean-style searches, appears to be available for more complex searching. In fact, if you try any Boolean searching, it will simply take the first word and ignore the rest.
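
For the technically curious, the overall pattern appears to be a simple fan-out-and-assemble: send the query to each service and drop each result set into its own widget. The fetchers below are hypothetical stand-ins, not the real Freebase, YouTube, Yahoo BOSS, Open Calais, or Twitter Search APIs:

```python
# The dashboard pattern Interceder appears to use: fan the query out to
# several services, then render each result set in its own widget box.
# These fetchers are hypothetical stand-ins for the real services.

from concurrent.futures import ThreadPoolExecutor

def fetch_company_info(q): return {"widget": "company", "q": q}   # stand-in
def fetch_videos(q):       return {"widget": "video", "q": q}     # stand-in
def fetch_news(q):         return {"widget": "news", "q": q}      # stand-in
def fetch_tweets(q):       return {"widget": "twitter", "q": q}   # stand-in

def build_dashboard(query):
    sources = [fetch_company_info, fetch_videos, fetch_news, fetch_tweets]
    # Query every source in parallel so the dashboard assembles quickly.
    with ThreadPoolExecutor() as pool:
        widgets = list(pool.map(lambda f: f(query), sources))
    return widgets

print(build_dashboard("Thomson Reuters"))
```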

It looks like the domain for Interceder was set up in 2008, but for some reason I’ve never run across it before (nor have any of my research friends that I showed it to yesterday). I’ve put in an email to the Interceder contact address, but it looks like it might be an Australian website (so I may not get a response until tomorrow).

Even with the simple search capability, Interceder is a fun search tool to use, and I think you’ll find the way it displays the results to be fascinating.  So, go do what I did… search the name of your company, your college, and your kid’s school to see how Interceder structures the search results. If you’re like me, you’ll find Interceder to be a fun way to search the Internet.

When I got my iPad earlier this year, the first app I actually ponied up real money for was a song-recognition app called SoundHound. It is no secret to my friends and family that I love punk bands with female lead singers. So, when I run across a new-to-me band that I like, I use the SoundHound app to help me learn more about the band, their music, tour dates, and other information, like bands that may sound similar. Now, stay with me here, because I’m going to talk about how I use SoundHound, then bring it back to law firms and how Knowledge Management could think about delivering content and information back to our clients, and even bring in how LexisNexis for Microsoft Office may already be on the right track.

Let’s say I’m listening to The Dollyrots’ awesome remake of Melanie’s 1971 novelty hit “Brand New Key” and I want to learn more. Using SoundHound, I simply tap the “What’s That Song” button, and it magically comes back with the information I need. Actually, it isn’t magic at all; it is called “Sound2Sound” (S2S) technology… it just seems like magic to those of us who benefit from it. Without going too deep into the S2S tech, here’s what SoundHound has to say about it:

SoundHound’s breakthrough Sound2Sound technology searches sound against sound, bypassing traditional sound to text conversion techniques even when searching text databases. Sound2Sound has resulted in numerous breakthroughs including the world’s fastest music recognition, the world’s only viable singing and humming search, and instant-response large scale speech recognition systems.

Yes, you read that right: it also allows you to sing or hum a tune and it will use that to attempt to match the song you’re looking for. Okay… putting all the ‘coolness’ of what it does to the side for a moment, let me show you the practical aspects of what I get from the SoundHound results.

  1. Picture of Album Cover
  2. Name of Song
  3. Name of Band
  4. Name of Album
  5. If this song exists on my iPod (it does!)
  6. Ability to bookmark the song
  7. Share this info (via email, twitter, or Facebook)
  8. Buy the song on iTunes
  9. See the lyrics (so I can sing along!!)
  10. See any videos from YouTube of the band
  11. See other songs for sale on iTunes from the band
  12. Launch the Pandora Station of this band
  13. View the tour dates
  14. Find similar artists
  15. Find other fans of this band

In addition to this, you can also get to information such as:

  16. Biography of the band
  17. See other albums by this band
  18. See other songs on this album
This is a lot of information available from my initial search. From a Knowledge Management point of view, this gets me information that I didn’t know I needed, such as the fact that The Dollyrots were touring Texas with Bowling for Soup. As a result, I bought tickets and took the whole family to the concert. And we even got to hang out and get autographs from the bands.
[Note to self… next time, leave the girls at home. A three-hour concert is a little too much for pre-teens — and that can cause 40-something-year-old parents to become exhausted!!]
Hopefully you’ve stuck with me here because I’m about to bring this back to Knowledge Management (KM).
Do any of our KM products give us these kinds of rich information results when we search them? Is it possible to create a “Document2Document” product that emulates what I can do with SoundHound’s Sound2Sound technology? Could I be in a document, or a web page, or an email, click a “What’s This Document?” button, and get a list of related content? Could our compiled knowledge stored in Document Management Systems, internal and external databases, Client Relationship Management systems, intranets, extranets, etc., give us the rich content results that would help us find useful information? As an end-user, I could benefit from this type of integrated search technology that helps me find similar documents without a lot of effort, and without having to exit the current program to do it.
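
To make the idea concrete, here is a back-of-the-envelope sketch of a “What’s This Document?” button: treat the open document as the query and rank the rest of the repository by term-frequency similarity. The repository and scoring are hypothetical; a real system would reach into the DMS, CRM systems, intranets, and the rest:

```python
# A back-of-the-envelope "What's This Document?" button: treat the open
# document as the query and rank everything else in the repository by
# cosine similarity of term frequencies. Documents here are hypothetical.

from collections import Counter
from math import sqrt

repository = {
    "engagement-letter.doc": "merger engagement client fees merger scope",
    "closing-checklist.doc": "merger closing filings client signatures",
    "holiday-memo.doc": "office closed holiday schedule parking",
}

def vector(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def whats_this_document(open_doc_text: str):
    query = vector(open_doc_text)
    ranked = sorted(repository.items(),
                    key=lambda kv: -cosine(query, vector(kv[1])))
    return [name for name, _ in ranked]

# Sitting in a merger document, one click surfaces the related matter files.
print(whats_this_document("draft merger agreement for client"))
```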

As I was thinking about this, it occurred to me that LexisNexis is already nibbling around the edges of this type of Document2Document technology with its LexisNexis for Microsoft Office (L-MO). Think of L-MO the way I used SoundHound above: I’m in a document, I click a button, and I can learn more about what’s in the document as well as related information from external resources (LexisNexis, Martindale-Hubbell); and if I have Lexis’ Total Search product, I can also discover related content within the firm’s internal information.

Read through L-MO’s description of what the product does… but think of how it is similar to the SoundHound idea of search and search results.

“Search” – A single search box for one-click access to the vast collection of legal content from LexisNexis, the open Web and the user’s desktop files. Results are displayed in a window next to the active document for easy review and management – virtually eliminating the time and effort to switch between the desktop and search sites to conduct research.

“Background” – This function scans an Outlook message or Word document for “entities” such as people, companies, organizations and cases and provides hyperlinks to relevant documents, full text case law, news and other types of information within internal, LexisNexis and Web resources. Upon clicking a hyperlink, information is displayed in a side pane within the Microsoft Office applications. The Background feature also displays Shepard’s® reports and applies Shepard’s® Signal™ indicators directly to the cases cited within the text of the document. These features benefit users by minimizing the need to go back and forth between multiple databases to collect information – speeding the process and reducing the risk of missing information when transposing it from one place to another.
“Suggest” – By manually highlighting any text in an Outlook or Word file, the user can prompt a search that will pull up relevant information from internal, LexisNexis and Web resources. The content is displayed in a side pane within the application for use and management for more productivity and efficient work.

From a user’s perspective, “Search,” “Background,” and “Suggest” seem to map to the “What’s That Song?” button, the band/album/song information, and the “Similar Artists” results you get from SoundHound.
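
And if you squint, the “Background” feature boils down to entity spotting plus a link-out. Below is a toy, dictionary-based sketch; the entity list and the lookup URL are hypothetical illustrations, not LexisNexis endpoints:

```python
# A toy version of "scan a document for entities and hyperlink them."
# The entity dictionary and lookup URL are hypothetical stand-ins.

known_entities = {
    "Smith v. Jones": "case-law",
    "Acme Corp": "company",
    "Jane Doe": "person",
}

def scan_for_entities(document_text: str):
    """Return (entity, type) pairs found in the document, ready to hyperlink."""
    return [(name, kind) for name, kind in known_entities.items()
            if name in document_text]

memo = "Per Smith v. Jones, Acme Corp must disclose the filings."
for name, kind in scan_for_entities(memo):
    print(f"{name} [{kind}] -> https://example.com/lookup?q={name}")
```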

I think Toby saw this potential a few months ago (so call me a little slow…). I’m thinking that L-MO or some other player is going to catch on to this type of Document2Document search technology and help KM help the attorneys discover the information and knowledge that they didn’t know they needed to know.