Lex Machina Releases the Attorney Data Engine

By Greg Lambert on February 10, 2016

One of the best features that Lex Machina provides for Intellectual Property attorneys is the increased accuracy of the information it pulls from PACER. The improvements that Lex Machina has made to the Cause-of-Action (CoA) and Nature-of-Suit (NoS) codes entered into PACER make it an invaluable resource for clearly identifying relevant matters and weeding out irrelevant cases. By improving the data, Lex Machina reduces the “garbage in – garbage out” effect that exists in the PACER database.

Now Lex Machina has turned its focus to cleaning up another annoyance found in PACER data, and in many of the other platforms that pull data from PACER. The Attorney Data Engine analyzes the PACER information and identifies the attorneys who are actually associated with the case, even if those attorneys do not show up in PACER's attorney list.

I talked with Karl Harris, Vice-President of Products at Lex Machina, a couple of weeks ago, and he gave me some insights on the new Attorney Data Engine and how it increases the accuracy of identifying the attorneys and law firms that are actually working on the cases filed through PACER. Karl mentioned that in New Jersey and Delaware, two very important states when it comes to Intellectual Property cases, only about 54% of the attorneys who work on the cases actually show up in the PACER information. That means that nearly half of the attorneys are missing from the metadata produced by PACER. When accuracy is important, missing nearly half of the attorney names can cause quite a problem.

For those of us who have ever put on a demo of docket information for an attorney, we know that one of the first questions the attorney asks is “can you find ‘X’ case, in which I represented ‘Y’ client?” If you cannot find that information, the demo may as well end right there. Attorneys are issue spotters. If you cannot get accurate information, they will not trust that the product actually works.

With the new Lex Machina Attorney Data Engine, you should be able to find the attorney information, even if PACER missed it.

Here is an overview of the three components of the Attorney Data Engine:

The PACER metadata itself: Every time Lex Machina crawls PACER data, they keep a historical record and can identify when attorneys are added to or removed from a case over time. That history by itself improves on the raw PACER data.
Pro Hac Vice Extractor: Docket entries will mention when attorneys are added Pro Hac Vice to a case. Lex Machina also keeps a record of which attorneys are associated with which law firms over time.
Signature Block Analyzer: Lex Machina analyzes the documents attached to the docket entries and identifies the signature blocks for each attorney. An attorney who has a signature block is associated with the case, even if the attorney’s name doesn’t show up in the docket entry.
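
To make the three components a bit more concrete, here is a rough Python sketch of how attorney names from the three sources might be merged for a single case. This is not Lex Machina's code, and nothing in it comes from the company. The function name, the source labels, and both regular expressions are hypothetical, and real dockets and signature blocks vary far more than these simple patterns assume.

```python
import re

# Hypothetical pattern: a docket entry that mentions a pro hac vice
# admission "of" or "for" a capitalized attorney name.
PHV_PATTERN = re.compile(
    r"(?i:pro hac vice) (?:of|for) ([A-Z][A-Za-z.'-]*(?: [A-Z][A-Za-z.'-]*)+)"
)

# Hypothetical pattern: an electronic signature line such as "/s/ John Smith".
SIGNATURE_PATTERN = re.compile(
    r"/s/ ?([A-Z][A-Za-z.'-]*(?: [A-Z][A-Za-z.'-]*)+)"
)


def attorneys_for_case(metadata_snapshots, docket_entries, documents):
    """Map each attorney name to the set of sources that mention it."""
    found = {}

    # 1. PACER metadata: keep anyone listed in any historical crawl,
    #    even if a later crawl no longer lists them.
    for snapshot in metadata_snapshots:
        for name in snapshot:
            found.setdefault(name, set()).add("pacer_metadata")

    # 2. Pro hac vice mentions in the text of docket entries.
    for entry in docket_entries:
        for name in PHV_PATTERN.findall(entry):
            found.setdefault(name, set()).add("pro_hac_vice")

    # 3. Signature blocks in the documents attached to docket entries.
    for text in documents:
        for name in SIGNATURE_PATTERN.findall(text):
            found.setdefault(name, set()).add("signature_block")

    return found


# Made-up example data for one case.
snapshots = [{"Ann Lee"}, {"Ann Lee", "Bob Cruz"}, {"Bob Cruz"}]
entries = ["MOTION for Admission Pro Hac Vice of Jane Q. Roe filed by Acme Corp."]
docs = ["Respectfully submitted,\n/s/ John Smith\nSmith & Partners LLP"]

for name, sources in sorted(attorneys_for_case(snapshots, entries, docs).items()):
    print(name, sorted(sources))
```

In this made-up example, Jane Q. Roe and John Smith never appear in the PACER attorney list, yet both end up associated with the case. That is the gap the Attorney Data Engine is described as closing.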

Karl Harris states that the Attorney Data Engine makes Lex Machina “the best source for reliably figuring out which attorneys are involved in which cases.”

It will be interesting to watch Lex Machina grow over the next couple of years, and to see how its new parent company, Lexis, helps it advance through access to additional data points. It is not a big leap to see how the Attorney Data Engine processes could be turned into a Company Data Engine using Lexis’ company information databases. Lexis has the content, and Lex Machina has the analytical resources to make that content better. It should make for some interesting results as the two companies learn how to adapt processes to the different products.