[Note: Please welcome guest bloggers Jennifer Wondracek, Director of the Law Library, Professor of Legal Research & Writing at Capital University Law School, and Rebecca Rich, Assistant Dean for the Law Library and Technology Services, and Assistant Teaching Professor at Drexel University Thomas R. Kline School of Law. – GL]
AI content generation tools, such as ChatGPT, have been in the news a lot lately. It’s the new cool tool for doing everything from coding to graphic art to writing legal briefs. It was even, briefly, used for a robot lawyer that was going to argue in court. And Greg Lambert wrote about it a few weeks ago on this very blog in What a Law Librarian Does with AI Tools like ChatGPT – Organize and Summarize. This post continues Greg’s discussion on ChatGPT use.
AI content generation tools are also the new education bogeyman. A myriad of headlines in the last two months have declared ChatGPT the death of the essay and the multiple-choice exam. It's the newest in a line of digital tools (starting with the Internet and Wikipedia) that students might use to cheat in legal education. But we think this is a bit of an overreaction. ChatGPT and similar AI content generation tools can be, and absolutely have been, misused; but we found that even with expert prompt crafting and a high level of subject-matter expertise, ChatGPT et al. are not yet capable of producing work that is indistinguishable from real student work. Not all share this belief. Several professors at the University of Minnesota Law School ran some exams through ChatGPT, using a closed universe of cases provided to the program. ChatGPT earned an average grade of C+ across the four exams, which were graded blindly. But they noted a few important takeaways, including:
[W]hen ChatGPT’s essay questions were incorrect, they were dramatically incorrect, often garnering the worst scores in the class. Perhaps not surprisingly, this outcome was particularly likely when essay questions required students to assess or draw upon specific cases, theories, or doctrines that were covered in class.
In writing essays, ChatGPT displayed a strong grasp of basic legal rules and consistently solid organization and composition. However, compared to real law students, it struggled to identify relevant issues and often applied rules to facts only superficially.
The authors of this blog post have also done some experiments with ChatGPT. Jenny was curious about the kind of legal work that ChatGPT thought it could perform. When asked what types of legal tasks it could do, ChatGPT listed seven options, ranging from summarizing laws to drafting legal documents. The option that caught Jenny’s eye was “Helping with legal research, by providing the most relevant cases and statutes.” Challenge accepted.
Using a current problem her Legal Research and Writing students were working on, Jenny asked ChatGPT, “What are the most relevant cases and statutes to determine if someone is a recreational user land entrant under Ohio law?” A few seconds later, ChatGPT gave her two statutes and three cases with brief summaries of each. While it had the general premise correct (a landowner is not liable for injury to a recreational user, assuming all of the requirements are met), it provided incorrect definitions, and every statute and case it cited was wrong. In another sentence, it even contradicted itself about the duty of care owed to the recreational user. Neither statute provided led to R.C. 1533.18 or 1533.181, the Ohio statutes governing this doctrine. When asked for more citations for the three cases listed, Jenny received both regional reporter and Ohio citations that were readable, if not quite properly Bluebooked. Investigation determined that none of the three cases existed by name, and each of the six reporter citations led to a different real case, none of which was remotely responsive to the question. In the end, ChatGPT gave Jenny a partially correct answer with two incorrect statutes, three made-up cases, and six incorrect citations. Not a good day for accurate legal research!
Becka experimented with the law-review-style research and policy paper prompts she uses for her Education Law and AI and the Law classes and had a similar experience. Even when prompted to write longer papers, ChatGPT produced short, generically written papers with minimal or no citations (often made-up ones!) and no analysis. A five-page paper would average two footnotes per page even when prompted to add more. Becka shared the results with one of her students, who commented that even she could tell this was an F paper. Becka also experimented with having ChatGPT create a class policy presentation. Again, even after several refining prompts, the result was, at best, a C- presentation.
Given the steep legal-writing learning curve, many students' limited experience with longer-form writing, and their documented increases in stress and declines in mental health, it is understandable that instructors are nonetheless concerned about the use of AI content generation tools.
As with plagiarism, there is now a profitable market for using AI to detect AI-generated content. At least two startups are developing tools: AICheatCheck and CrossPlag, both of which have usable demos. GLTR was developed by a collaboration between an MIT professor and a Harvard professor, and GPTZero by a Princeton University student (for more about these tools and a comparison of how they work, take a look at this RIPS-SIS post). Our friendly neighborhood plagiarism detection companies, Turnitin and Copyleaks, are also in the process of adding AI-content detection to their products. OpenAI (ChatGPT's maker) has developed a tool to help detect writing produced by ChatGPT and is in the process of adding watermarking to ChatGPT-generated content.
None of these detection tools is 100% effective, so it may also help to add ChatGPT-detection criteria to your paper grading rubric. Some options:
- ChatGPT-generated text is formulaic: it generally follows the five-paragraph essay structure, with a stereotypical topic sentence at the top of each paragraph.
- Sentence length does not vary as much as it does in human writing (a rough way to check this mechanically appears after this list).
- ChatGPT-generated text is light on analysis and on applying facts to the issues.
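The sentence-length point can even be checked mechanically. Below is a crude sketch in Python, not a validated detector: the sentence splitter is naive, and the idea of flagging a low spread is our assumption, so treat a low number as a reason to look closer, never as proof.

```python
# Crude heuristic: how much does sentence length vary in a submission?
# Human writing usually shows a wider spread than ChatGPT output.
import re
import statistics

def sentence_length_spread(text: str) -> float:
    # Naive split on ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    # Standard deviation of words-per-sentence; 0.0 if too short to judge.
    return statistics.stdev(lengths) if len(lengths) > 1 else 0.0

with open("essay.txt") as f:  # hypothetical student submission
    print(sentence_length_spread(f.read()))
```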
Also remember that ChatGPT isn't good at citation and, for now, contains no information from after 2021. Producing well-written text that is indistinguishable from human writing is a hard enough problem that no one has solved it yet (though an Israeli start-up is trying).
Lastly, we recommend considering teaching students about ChatGPT rather than banning it. There are so many AI-assisted drafting tools available for lawyers now (e.g., ClearBrief, Clawdia, and Docugami) that we'd be doing students a disservice otherwise. The Sentient Syllabus Project has three great principles for doing so:
- AI cannot pass this class,
- AI contributions must be limited and true, and
- AI use should be open and documented.
On to the next experiment!
This is going to be something that all of you will find “interesting,” but maybe not something that you will like. Last week on 3 Geeks, I posted about how to use AI to generate summaries of legal articles. This week, I wanted to expand on that project a little and see if I could turn the summaries into a podcast. The goal was to get it completely automated and completely AI-generated. Well… as you can see from the title of this episode, it was almost completely automated and AI-generated. But not 100%. Here's the workflow:
1. An RSS feed that tracks new BigLaw podcast episodes.
2. A Python script that pulls the episode information.
3. GPT to create a description of each episode (a minimal sketch of steps 2 and 3 appears below).
4. Descript to turn the text summaries into voice output. (I did lightly edit these with an intro and outro, and tweaked the transitions between each review.)
5. Soundraw to create intro/outro music.
6. Audacity to combine it all.
7. Output as an mp3.
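For the curious, here is roughly what steps 2 and 3 look like in code. Treat it as a minimal sketch rather than the actual script: the feed URL is a placeholder, the prompt wording is illustrative, and the openai calls assume the current version of OpenAI's Python library.

```python
# Steps 2-3: pull new episodes from the RSS feed and ask GPT for a
# short, spoken-style description of each one.
import feedparser
import openai

openai.api_key = "YOUR_API_KEY"                       # your OpenAI key
FEED_URL = "https://example.com/biglaw-podcasts.rss"  # placeholder feed

feed = feedparser.parse(FEED_URL)
for entry in feed.entries[:5]:  # just the newest few episodes
    prompt = (
        "Write a short, conversational description of this podcast episode, "
        "suitable for reading aloud.\n\n"
        f"Title: {entry.title}\n"
        f"Show notes: {entry.get('summary', '')}"
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    print(entry.title)
    print(response.choices[0].message["content"])
```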
There is obviously a ton of hype and buzz right now around ChatGPT and other AI tools, including on this week's Geek in Review podcast. I wanted to see if there was something practical I could do with GPT in my job as a law librarian, and I think I've found something that fits the bill: summarizing text.
Law librarians are great at finding good information and getting it quickly into the hands of lawyers, legal professionals, judges, pro se representatives, etc. However, we don't always have a lot of time to read all of that information and create a summary for the person we are working with. It's not uncommon for a firm to have 100–300 attorneys for each librarian, so any tool that helps librarians synthesize information in a useful way is welcome. I put GPT 3.5 (the paid version) to the test to see whether it could be that electronic assistant, summarizing information quickly.
It is early in my experiment, but I’m impressed with what I’ve seen so far.
The Current Process
I started with something I had already set up for myself that is “good” but not “great”: tracking BigLaw podcasts as they come out. What I have now is an RSS feed (yes, that is still a thing!) that follows AmLaw 100/200 firms' websites and lets me know when a new episode comes out. I have that feed set up in my MS Outlook folders, using LexisNexis' NewsDesk.
This works fine, but it really doesn't give me much information about each podcast. I'd like to see more of a summary before deciding whether to click through and listen.
I've got the basic information from the RSS feed, but now I want to expand on it. I'm a former programmer from “back in the day,” though I haven't done any serious programming in a long time. I know Python is a great tool for processing text, so my top-of-the-head idea was to have Python look at my RSS output and see if it could get me more information. Actually, I wanted to see if Python could summarize the RSS information directly, and this is where the ChatGPT tool came in handy.
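Here is a rough sketch of that idea, assuming a NewsDesk-style RSS feed: read the feed, fetch the page each item links to, and have GPT summarize it. The URL and prompt are illustrative, real pages need per-site HTML cleanup, and the openai calls again assume the current OpenAI Python library.

```python
# Read the RSS output, fetch each linked page, and summarize it with GPT.
import feedparser
import openai
import requests
from bs4 import BeautifulSoup

openai.api_key = "YOUR_API_KEY"
FEED_URL = "https://example.com/newsdesk-feed.rss"  # placeholder feed URL

for entry in feedparser.parse(FEED_URL).entries:
    page = requests.get(entry.link, timeout=30)
    # Strip the HTML down to plain text before sending it to the model.
    text = BeautifulSoup(page.text, "html.parser").get_text(" ", strip=True)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Summarize this podcast episode page in 3-4 "
                       "sentences for a law firm audience:\n\n" + text[:6000],
        }],
    )
    print(entry.title, "-", response.choices[0].message["content"])
```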
Bob Ambrogi’s LawNext Interview of Daniel Martin Katz and Michael Bommarito on GPT 3.5’s Bar Results
Emily Rushing, Director of Competitive Intelligence, Haynes and Boone, LLP
Pam Noyd, Information Resources Manager at Foley & Lardner LLP
Erik Adams, Manager of Library Digital Initiatives at Sidley Austin LLP, and Chief of Technology at Golden Arrow Publishing LLC
Keli Whitnell, Director of Firm Intelligence at Troutman Pepper
Christopher O’Connor, Senior Director, Product Management at LexisNexis
We give you the true “3 Geeks” experience on this week's show, as we are joined by an OG (original geek), Toby Brown. Toby, Marlene, and Greg talk with Litify's President and COO, Ari Treuhaft, and Pam Wickersham, VP of Product and Engineering at Litify. One of Litify's taglines is that they #BreakLegalSilos. Treuhaft and Wickersham explain what that means and how they focus on providing an operating system, built on Salesforce, that creates transparency between corporate counsel and their law firms.
Both Ari and Pam got their start in financial and professional services, so they come at these business problems from a different angle. With her engineering background and experience at Google, Pam brings a unique perspective on building the technology through the lens of the customer. Ari's experience with the financial services industry's move to the cloud over a decade ago also positions him to better understand the naysayers in the legal industry who are still resistant to placing data in the cloud.
It's a great conversation. We want to thank the great folks at City Acre Brewery in Houston, Texas for letting us record this episode there, and for not laughing too hard as Greg destroyed his brand-new laptop by spilling an entire Maple Porter into it. We hope this becomes a semi-regular event! (Recording at City Acre… not pouring beer into laptops!!)
Listen on mobile platforms: Apple Podcasts | Spotify
Crystal Ball Question
Toby Brown takes on our question this week, noting that attorneys are resistant to changing behaviors not because they are unwilling to adapt to new technology, but because this is a very reputation-based industry.
- Platform overview
- Salesforce Advantage
- Specs on Litify for Corporate Legal
- Recent blog by Ari about collaboration
- Recent blog about creating a culture of retention
At present, the most universal priority for law departments is “controlling outside counsel costs,” per 85% of respondents to the most recent Thomson Reuters Legal Department Operations Index.
I understand. I also doubt the marginal utility of simply pressing harder on the traditional levers of cost control (discounts, panels, RFPs, outside counsel guidelines, AFAs). My sometimes solicited, alternative advice:
- Package work. Identify opportunities to enter portfolio arrangements, including integrated law relationships with New Law offerings.
- Move work. Right source, including greater use of legal marketplaces to find the right talent at the right price.
- Re-examine costs on autopilot. Major advances in ediscovery, ADR, court reporting, staffing, etc. present substantial, immediate spend-optimization opportunities.
- Don’t stop investing in compliance by design. Embedding legal knowledge in business processes is the only viable, long-term approach to meeting the evolving legal needs of business in an increasingly complex operating environment.
If you want to discuss, call me, maybe.
Herein, however, I am not focused on being better. Rather, we will continue our exploration of avoiding worse. The unpalatable message remains that even when something must be done, doing nothing is superior to doing the wrong thing. Running in the wrong direction cannot be course-corrected merely by redoubling our efforts.