This episode of The Geek In Review has it all. We talk with Kyle Doviken, Senior Director at Lex Machina, about their legal analytics tool and about Kyle's passion for helping the Austin community through substantial pro bono efforts. (17:05)

Greg disturbs a recent third-time father, Noah Waisberg, CEO of Kira Systems, to see how the acquisition of $50 million in minority funding will help Kira expand its reach into the legal market and, according to Waisberg, well beyond it. (5:35)

We are adding a new (and hopefully recurring) installment of updates on government actions, public policy, and other developments affecting the legal information profession. Emily Feltren, Director of Government Relations at the American Association of Law Libraries, fills us in on potential actions coming before the midterm elections and on AALL's push to fill the Privacy and Civil Liberties Oversight Board. (11:10)

One of the best features that Lex Machina provides for Intellectual Property attorneys is the increased accuracy of the information it pulls from PACER. The improvements Lex Machina has made to the Cause-of-Action (CoA) and Nature-of-Suit (NoS) codes entered into PACER make it an invaluable resource for clearly identifying relevant matters and weeding out irrelevant cases. By improving the data, Lex Machina reduces the "garbage in, garbage out" effect that exists in the PACER database.
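To make the "garbage in, garbage out" point concrete, here is a minimal, purely hypothetical sketch of the kind of cleanup involved: reclassifying a case whose hand-entered Nature-of-Suit code disagrees with what the complaint actually says. The heuristic, function names, and sample text below are illustrative assumptions, not Lex Machina's actual method.

```python
# Toy illustration of correcting a mis-entered Nature-of-Suit (NoS) code.
# The reclassification heuristic here is a made-up stand-in, not Lex Machina's.

PATENT_NOS = "830"  # PACER's Nature-of-Suit code for patent cases


def looks_like_patent_case(complaint_text: str) -> bool:
    """Crude text check standing in for real reclassification logic."""
    text = complaint_text.lower()
    return "35 u.s.c." in text or "infringement of u.s. patent" in text


def corrected_nos(entered_nos: str, complaint_text: str) -> str:
    """Prefer what the filing actually says over the hand-entered code."""
    if entered_nos != PATENT_NOS and looks_like_patent_case(complaint_text):
        return PATENT_NOS
    return entered_nos


# A complaint mis-coded as 890 ("Other Statutory Actions") gets pulled back
# into the patent bucket instead of being weeded out as irrelevant.
print(corrected_nos("890", "Count I: Infringement of U.S. Patent No. 9,999,999"))  # 830
```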

Now Lex Machina has turned its focus to cleaning up another annoyance found in PACER data, and in many of the other platforms that pull data from PACER. The Attorney Data Engine analyzes the PACER information and identifies the attorneys who are actually associated with a case, even if those attorneys do not appear on PACER's attorney list.

I talked with Karl Harris, Vice President of Products at Lex Machina, a couple of weeks ago, and he gave me some insights into the new Attorney Data Engine and how it increases the accuracy of identifying the attorneys and law firms actually working on cases filed through PACER. Karl mentioned that in New Jersey and Delaware, two very important states when it comes to Intellectual Property cases, only about 54% of the attorneys who work on the cases actually show up in the PACER information. That means nearly half of the attorneys are missing from the metadata produced by PACER. When accuracy is important, missing nearly half of the attorney names is quite a problem.

For those of us who have ever demoed docket information for an attorney, we know that one of the first questions the attorney asks is "can you find 'X' case, in which I represented client 'Y'?" If you cannot find that information, the demo may as well end right there. Attorneys are issue spotters. If you cannot give them accurate information, they will not trust that the product actually works.

With the new Lex Machina Attorney Data Engine, you should be able to find the attorney information, even if PACER missed it.

Here is an overview of the three components of the Attorney Data Engine (a rough sketch of how these signals might be combined follows the list):

  1. The PACER metadata itself: Every time Lex Machina crawls PACER, it keeps a historical record, so it can identify when attorneys are added to or removed from a case over time. That alone improves on the raw PACER data.
  2. Pro Hac Vice Extractor: Docket entries mention when attorneys are admitted pro hac vice to a case, and Lex Machina extracts those admissions. It also keeps a record, over time, of which attorneys are associated with which law firms.
  3. Signature Block Analyzer: Lex Machina analyzes the documents attached to docket entries and identifies each attorney's signature block. Even if an attorney's name never shows up in a docket entry, a signature block on a filing ties that attorney to the case.
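To make that three-part approach concrete, here is a rough Python sketch of how the signals described above might be unioned for a single case. This is a hypothetical illustration only: the data shapes, regular expressions, and function names are my assumptions, not Lex Machina's implementation.

```python
import re

# Docket entries often note admissions like "Pro Hac Vice Admission of John A. Smith".
# This loose pattern is illustrative only.
PRO_HAC_VICE = re.compile(
    r"(?i:pro hac vice)[^.]{0,60}?\b(?:of|for)\s+"
    r"(?P<name>[A-Z][a-z]+(?: [A-Z]\.)? [A-Z][a-z]+)"
)

# Electronic filings commonly carry "/s/ Jane Smith" style signature lines.
SIGNATURE_LINE = re.compile(r"/s/\s*(?P<name>[A-Z][a-z]+(?: [A-Z]\.)? [A-Z][a-z]+)")


def attorneys_for_case(pacer_snapshots, docket_entries, filing_texts):
    """Union the three signals into a single attorney list for one case.

    pacer_snapshots: lists of attorney names from successive PACER crawls
    docket_entries:  docket-entry text strings
    filing_texts:    extracted text of documents attached to the entries
    """
    attorneys = set()

    # 1. PACER metadata over time: anyone who appears in any crawl counts,
    #    even if a later crawl dropped them from the case.
    for snapshot in pacer_snapshots:
        attorneys.update(snapshot)

    # 2. Pro hac vice extractor: catch admissions mentioned only in docket text.
    for entry in docket_entries:
        for match in PRO_HAC_VICE.finditer(entry):
            attorneys.add(match.group("name"))

    # 3. Signature block analyzer: catch names that only sign the filings.
    for text in filing_texts:
        for match in SIGNATURE_LINE.finditer(text):
            attorneys.add(match.group("name"))

    return sorted(attorneys)
```

The point is simply that the case's attorney list becomes the union of every signal, so an attorney who only ever appears in a signature block still gets credited with the case.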
Karl Harris states that the Attorney Data Engine makes Lex Machina "the best source for reliably figuring out which attorneys are involved in which cases."

It will be interesting to watch Lex Machina grow over the next couple of years, and to see how its new parent company, Lexis, helps advance that progress through access to additional data. It is not a far jump to see how the Attorney Data Engine processes could be turned into a Company Data Engine using Lexis' company information databases. Lexis has the content, and Lex Machina has the analytical resources to make that content better. It should make for some interesting results as the two companies learn how to adapt these processes to different products.

Over the past few years I have been less than impressed with the types of new research tools that have entered the legal market, especially from the major players. In the past five years, all of the major vendors have revamped their flagship products or merged with other companies, updating both the interface and the back end. This makes for a slicker look and feel and some enhancements to the user experience, but when you really break it down, it's the same concepts with a few new features and (hopefully) better functionality. When I worked for the Oklahoma Supreme Court Network, back between 1999 and 2002, I felt like the legal technology field was on the cusp of something really great. Thirteen years later, I feel like I'm still waiting for that greatness to actually arrive. It's been over a decade of technologies not quite reaching that threshold, but maybe my wait is finally over.

In the past week I've talked with a number of people who have come out of CodeX, The Stanford Center for Legal Informatics. It may be the first time in a decade that I've gotten excited enough about legal information technology to think I needed to quit my job immediately and find a way to get involved in these startups coming out of California. The ideas coming out of CodeX are genuinely novel concepts, rather than what we've seen for years of simply repackaging old ideas into cheaper, better, easier, or more accessible platforms. CodeX is holding its FutureLaw Conference this week, and I'm sorry that I won't be there to see firsthand the latest technology being incubated there.

I want to touch on three products, not as full product reviews, but simply to look at how each of them approaches things differently. All three got their start through the Stanford program, and all have some truly unique and original concepts for pulling relevant information from legal documents.

First up, Lex Machina.

Lex Machina isn't new on our radar; we did a bit of a review of it last year. The idea comes from what they call "Legal Analytics": parsing large amounts of information about judges, lawyers, and other data points in IP litigation, then analyzing that data to help "predict the behaviors and outcomes that different legal strategies will produce." The most impressive review of Lex Machina came from an attorney who told me he was tired of getting beaten by opposing counsel because they had this product. That is perhaps the best quote you can hear from your attorneys when you are contemplating buying a new product. It's hard to argue against.

Second is Ravel Law.

Jean O'Grady has reviewed and talked about Ravel Law, so there's no need for me to rehash that here. As with many law librarians, sometimes we have to see a new product with our own eyes before we actually "get it." I have to admit that happened to me with Ravel Law. I saw Ravel Law's co-founder and CEO, Daniel Lewis, present alongside Fastcase's Ed Walters at the ARK Group's law library conference back in February, and it was at that moment that the light went on in my head: we were looking at a different approach. Information is laid out in a readable and effective way, with visual representations that let a researcher quickly spot relevant material and move in a non-linear fashion toward additional information. The Judge Analytics feature is one of the most interesting ideas I've seen in a while. It was pretty amazing to watch it all unfold and to realize that they were definitely on to something with this product.

Finally, there is Casetext.

Just as with Ravel Law, I didn't immediately "get it" when it came to Casetext. However, after a two-and-a-half-hour call with Pablo Arredondo last week, I became a fan. As with the other products, the information is compiled and displayed differently than we typical researchers are used to seeing. Heat maps, summaries, context, and innovative citation methods create a visually engaging and logical organization of the information, all within the visible screen area. Add to that the ability for users to contribute relevant information, upload briefs, and join communities, and you can see the potential of this platform: a truly novel approach to leveraging a community of legal researchers and practitioners.

Are We Seeing the First Steps Away From Keywords?

This is something I think I will come back to in later posts, but I wanted to touch on it here. It is my belief that in the next five to ten years we will no longer look at keywords as the primary way to research legal information, and I think we are seeing the genesis of that shift with these three products. In a way, we are looking at a high-level compilation of documents, information, topics, and insights through advanced algorithms and crowdsourced trends and actions. Think of it as the traditional digest system, only automated and always morphing as new information is added or as the actions of individual researchers change throughout the research process. It is a fascinating idea to contemplate, and I really think we are on the edge of a monumental change in how we typically "find the law" in legal research. A purely illustrative sketch of the idea follows.
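As a toy stand-in for that "automated digest" concept, the snippet below groups a handful of opinions by the language they share rather than by a researcher's keywords, using scikit-learn. None of the three products above disclose their methods, so treat this as an assumption-laden illustration with made-up sample text, not anyone's actual pipeline.

```python
# Toy "automated digest": cluster opinions by shared language instead of
# matching a researcher's keywords. Sample text and parameters are invented.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

opinions = [
    "Defendant moves to dismiss for lack of personal jurisdiction over the forum.",
    "Claim construction of the asserted patent turns on the term 'coupled to'.",
    "Plaintiff alleges willful infringement of the patent and seeks enhanced damages.",
    "Minimum contacts with the forum state were not established by the plaintiff.",
]

# Represent each opinion by weighted term frequencies, then cluster. Re-running
# this as new opinions arrive regroups the "digest" automatically.
vectors = TfidfVectorizer(stop_words="english").fit_transform(opinions)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for topic, text in zip(labels, opinions):
    print(f"topic {topic}: {text[:60]}")
```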

Content Is Still King

What I'm seeing with these products is that we are simply scratching the surface of what is coming next. Lex Machina is taking a tiny slice of the legal information world with its IP litigation docket process. Ravel Law and Casetext are doing great things with a core set of case law. Imagine what would happen if these and other products started parsing larger amounts of data. No one seems to be touching statutes and regulatory information. Dockets are a logistical mess, but the potential is huge. News, law reviews, blogs, internal documents, and state, federal, foreign, and international information are all ripe for exploitation by these new thinkers. It will be interesting to see whether these powerhouses of idea generation can team up with the mega information holders, whether governments or private companies, and really test the limits of how we conduct legal research in the future. I, for one, am excited to see what's next.