7/28/15

Watson Graduates Law School, Returns to America



We all know this coming of age story. A boy leaves home to study abroad, sows his wild oats, and returns home a grown man, wiser and ready to take on the world. Except this coming of age story has a bit of a twist. The boy is actually a computer. And that computer's name is Watson.
ROSS Intelligence, which is making headlines for its novel application of the IBM Watson machine learning platform to legal research, has been hard at work training the system to understand law. The team originally worked with Canadian legal content and lawyers, teaching Watson what "good" results looked like. But yesterday, the ROSS team announced that they are bringing Watson back to the States to tackle US case law. They also announced support and funding from a powerful investor: Silicon Valley's Y Combinator. ROSS is starting small with bankruptcy law and, as with its original work north of the border, has partnered with a number of pilot law firms. But make no mistake: this first small step is likely to create tremendous ripples in the legal profession as the program expands.
I sat down recently with two of the co-founders of ROSS Intelligence, Andrew Arruda (CEO) and Jimoh Ovbiagele (CTO), to learn a bit more about ROSS and their experience with Watson.
One of the first topics was whether ROSS complemented or replaced the likes of LexisNexis and Westlaw. Arruda's perspective was that it complemented traditional legal research for now, but the goal is ultimately to replace them. In reality, it is a bit of an apples and oranges comparison. Traditional legal research vendors generally provide data and a search box, leaving much of the heavy lifting to lawyers. This approach was well-suited to the "leave no stone unturned" philosophy that guided legal research in the golden age of law. ROSS, on the other hand, serves up insights based on a more natural dialogue between the lawyer and its Watson-based system. This approach fits better in a post-recession world where clients are cost-conscious and expect efficiency in their law firms.
For now, ROSS is still relatively targeted in its scope and utility. LexisNexis and Westlaw have massive stores of content they either own or license, and they have spent decades gathering and curating this content. Matching their breadth and depth of content will be a daunting task, to say the least. But the big vendors would be foolhardy to ignore this threat. Anyone familiar with Clayton Christensen's The Innovator's Dilemma and the concept of disruptive innovation knows that incumbents are often unseated when entrants perfect their technology downstream then move to compete directly. As Arruda says, "Think of us like the Netflix of legal research; we are going to keep adding capabilities and original content until lawyers no longer have a reason to stay with their traditional providers and can cut the cord."
ROSS's pilot approach is consistent with this notion. They start by turning associates loose, using ROSS just as they would other research tools (yes, Google and Wikipedia, you're included in that list). As ROSS returns results, associates can provide feedback on whether ROSS's answer was helpful. If it was not, the result is dropped and the next most relevant one is shown. This user feedback loop helps ROSS understand what is relevant for a particular topic.
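That accept-or-reject loop can be sketched in a few lines of Python. To be clear, this is a hypothetical illustration of the general pattern; ROSS has not published its implementation, and all names here are invented:

```python
# Minimal sketch of the feedback loop described above: show ranked results
# one at a time, drop rejected ones, and log the accept/reject signal that
# a relevance model could later learn from. All names are hypothetical.

def present_results(ranked_results, is_helpful):
    """Walk the ranked list; stop at the first result the user accepts."""
    feedback = []  # (result, accepted) pairs for retraining the ranker
    for result in ranked_results:
        accepted = is_helpful(result)
        feedback.append((result, accepted))
        if accepted:
            return result, feedback  # the associate accepts this answer
        # otherwise fall through to the next most relevant result
    return None, feedback

# Usage: simulate an associate who rejects the first answer.
answers = ["case A", "case B", "case C"]
accepted, log = present_results(answers, lambda r: r != "case A")
```

The key design point is that the feedback is a side effect of normal use, not a separate survey step.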
As ROSS gets a more sophisticated understanding of an area of law, the pilot then moves upstream to senior associates and, ultimately, counsel and partners. This incremental approach to learning is a recurring theme in the world of deep learning, where AI systems learn in much the same way as children. In this instance, the ROSS team took Watson to law school and is now guiding ROSS through its first years at a law firm.
I asked Arruda and Ovbiagele about some of the challenges they faced adapting Watson to the legal profession. I have some familiarity with this space, having built several AI systems for LexisNexis back in the early 2000s. One of the key issues is the structure of the typical legal document. If you break it down, much of the text in a brief or agreement is not really that important. It's filler text, scaffolding on which the real meat of the argument is hung. Take the heading of a court filing, for instance. It may say "In the 2nd district court of appeals," but really all that matters is "2nd" and "appeals." All that extra text, like "in the court of," or the "by and between parties" in an agreement, doesn't mean much. But to a system trying to extract and make sense of concepts, the extra text is a real problem.
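To make the problem concrete, here is a toy sketch of the kind of filtering involved. The stopword list is purely illustrative, and real systems use far more sophisticated methods than a hand-written word list:

```python
import re

# Toy illustration of the preprocessing problem described above: the
# boilerplate words in a legal heading carry no signal, so a concept
# extractor must discard them before matching. The filler list below is
# invented for this example only.

FILLER = {"in", "the", "district", "court", "of"}

def key_tokens(heading):
    """Keep only the tokens that distinguish one heading from another."""
    tokens = re.findall(r"\w+", heading.lower())
    return [t for t in tokens if t not in FILLER]

# "In the 2nd District Court of Appeals" reduces to ["2nd", "appeals"]
```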
Arruda and Ovbiagele confirmed they experienced the same issue. Much of their work has been tailoring and building an infrastructure around Watson to make legal text understandable. While some may cry foul at this level of intervention, that is the reality of where we are with AI. There is currently no "silver bullet" general purpose AI that is fully automated. But that does not stop the creation of targeted, specific-purpose AI like ROSS. And as has been shown in many other domains, that level of targeted AI is usually sufficient to disrupt an industry.
We also discussed how Watson is designed for a very specific type of question/answer interaction. Developers are constrained to a very specific formula of content ingestion, topic extraction, and tuning of relevant answers to questions. There are many other machine learning techniques out there - clustering, classification, prediction - that Watson does not do. ROSS, like many other Watson applications, layers its own special sauce on top of Watson to make results even more relevant and meaningful. "ROSS is a composite of AI technologies with Watson at its center, but we have a dedication to using the best methods available for this grand challenge," explained Ovbiagele.
So what's next for ROSS? With their move to the largest legal market in the world, it is clear they are setting their sights on broader application, both in terms of practice areas and law firm customers. But much remains to be seen. How fast will this occur? What will the business model and cost ultimately look like? How will other legal research providers like LexisNexis and Westlaw and intelligence system providers like Kira Systems react? And perhaps most importantly, will lawyers embrace help from a computer as it becomes more human-like?
One thing is clear. Disruption is coming to legal, as it has to so many other industries, and this time there is a feeling of inevitability. Lawyers and firms will have a choice: adapt, or perish.

++++++++++++++++++++++++++

Matt Coatney is an AI expert, data scientist, software developer, technology executive, author, and speaker. His mission is to improve how we interact with smart machines by making software smarter and teaching people how to work (and cope) with advanced technology. Great things happen when smart people and smart machines work together toward a common goal.
Follow Matt on LinkedIn and on Twitter @mattdcoatney. Follow the conversation at #BridgingTheAIGap.


5 comments:

Kate Simpson said...

Nice summary - calm and measured, just what the Dr ordered.
Still a little bothered by the fact that Watson's first teachers are associates rather than librarians and retrieval experts, but am looking forward to seeing this graduate all grown up.

Greg Lambert said...

I might be a bit jaded from my previous experiences with "crowdsourcing" and lawyers, but this part might be the weak link of ROSS' plan:
As ROSS returns results, associates can provide feedback on whether ROSS’ answer was helpful.
Relying upon lawyers to give you feedback (at least actively giving feedback) is a lost cause in my opinion. It is just not in their nature to interrupt their process and check a box of "good" or "bad" results. Maybe ROSS is the exception to the rule, but I seriously doubt it.

David Hobbie said...

Greg--I don't disagree with your skepticism about obtaining feedback from lawyers that isn't integrated with their workflow. From the description, however, it sounds like the feedback request might be built into the way that lawyers retrieve information from the tool. Perhaps they get a limited set of results or a small number, and if they reject / "dislike" some of them, they get more or the "page two" results.

Greg Lambert said...

David - if they can build it into the process, and not make the feedback a separate function, then that would work much better.

Matt Coatney said...

Greg and David, great conversation. My understanding is it currently behaves a bit more as David describes, with the feedback mechanism built into the natural workflow. There's an interesting psychological component here too. A recent study published in HBR found that people generally trust their gut more than a machine's, even when it is shown their judgment is POORER. But, the same study found that giving people the opportunity to influence the algorithm actually led to better acceptance. Watson found the same thing in its medical application with doctors. The real test will be whether lawyers, who are skeptical and hunt for problems by nature of their training, are willing to accept less relevant results as a natural part of a machine's learning.

 

© 2014, All Rights Reserved.