[Ed. Note: Please welcome guest blogger, Ravi Soni, data scientist from Casetext. I was introduced to Ravi by Casetext’s Vice-President, Pablo Arredondo, and asked to publish Ravi’s discussion on how he uses analytics at Casetext to determine if “the holding in a case is more procedural or more substantive,” and how to leverage that information to potentially predict outcomes. – GL]
One of the biggest constraints on innovation in legal research is how hard it is to classify and quantify information at scale without significant human intervention. At Casetext we’ve made real progress using advanced analytics to better leverage the wealth of content within the law to predict certain outcomes with more precision. The applications range from practice management to case strategy to, in my case, legal research. One challenge I’m particularly interested in is how to determine quantitatively whether the holding in a case is more procedural or more substantive.
I started with a collection of 47,464 briefs written by top law firms in the country. Each brief has a nature of suit (NOS) code and a set of citations associated with it. For each case, I counted the unique NOS codes among the briefs that cited to it and assigned that number as the case’s polytopic score, a measure of how “polytopic” the case is. Ultimately, my goal was to use polytopicness as a proxy to measure proceduralness.
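The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not Casetext’s actual pipeline: the record layout (`nos_code`, `citations`), the function name `polytopic_scores`, and the toy briefs with placeholder NOS labels and case names are all assumptions made for the example.

```python
from collections import defaultdict


def polytopic_scores(briefs):
    """Return {case: (citation_count, polytopic_score)}.

    citation_count  = number of briefs that cite the case
    polytopic_score = number of unique NOS codes among those briefs
    """
    nos_by_case = defaultdict(set)      # case -> set of NOS codes seen
    briefs_by_case = defaultdict(int)   # case -> number of citing briefs

    for brief in briefs:
        # De-duplicate so a brief citing a case twice counts once.
        for case in set(brief["citations"]):
            nos_by_case[case].add(brief["nos_code"])
            briefs_by_case[case] += 1

    return {
        case: (briefs_by_case[case], len(nos_by_case[case]))
        for case in nos_by_case
    }


# Hypothetical toy data: four briefs across three practice areas.
briefs = [
    {"nos_code": "NOS-A", "citations": ["Twombly", "Case X"]},
    {"nos_code": "NOS-B", "citations": ["Twombly"]},
    {"nos_code": "NOS-C", "citations": ["Twombly", "Case Y"]},
    {"nos_code": "NOS-A", "citations": ["Case X"]},
]

scores = polytopic_scores(briefs)
for case, (cited, polytopic) in sorted(scores.items()):
    print(f"{case}: cited by {cited} briefs, polytopic score {polytopic}")
```

In this toy data, "Case X" and "Twombly" are each cited more than once, but only "Twombly" is cited across several NOS codes, which is the distinction the polytopic score is meant to capture.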
The idea behind using polytopicness to measure proceduralness comes from a simple concept. Let’s say we have a lawyer at an AmLaw 50 firm working on a massive M&A deal, a public defender in a small county appealing a death penalty verdict, and a boutique immigration firm working on a deportation case, and they all cite to the same case. What does this case have that all three of these attorneys found useful? The short answer is probably nothing substantive. More likely, they are all citing to it because it is a foundational case that sets the framework for some common motion that transcends practice area.
Let’s look at a concrete example. If I asked a roomful of lawyers whether they knew about A to Z Maintenance Corp. v. Dole, 710 F. Supp. 853 (D.D.C. 1989), it’s quite unlikely that any of them would be able to tell me much, if anything at all. If I asked about a case like Bell Atl. Corp. v. Twombly, 550 U.S. 544 (2007), any attorney in the room should be able to tell me how it changed the standards for dismissal. Figure 1 shows the difference in citation count and polytopic score between these two procedurally distinct cases.
ASHCROFT V. IQBAL 556 U.S. 662 (2009)
BELL ATL. CORP. V. TWOMBLY 550 U.S. 544 (2007)
CELOTEX CORP. V. CATRETT 477 U.S. 317 (1986)
ANDERSON V. LIBERTY LOBBY, INC 477 U.S. 242 (1986)
MATSUSHITA ELEC. INDUSTRIAL CO. V. ZENITH RADIO 475 U.S. 574 (1986)
LUJAN V. DEFENDERS OF WILDLIFE 504 U.S. 555 (1992)
CONLEY V. GIBSON 355 U.S. 41 (1957)
DAUBERT V. MERRELL DOW PHARMACEUTICALS, INC 509 U.S. 579 (1993)
KOKKONEN V. GUARDIAN LIFE INS. CO. OF AMER 511 U.S. 375 (1994)
FOMAN V. DAVIS 371 U.S. 178 (1962)