Not too long ago, a statement like this spoken in the hushed, hallowed hallways of the Harvard Law School library would have been considered heresy: “I think for court decisions, law books are becoming obsolete and even to some degree a hindrance.”
That’s Adam Ziegler, and he’s no heretic. He’s the managing director of the Library Innovation Lab at Harvard. Ziegler is leading a team of legal scholars and digital data workers in the lab’s Caselaw Access Project.
“We want the law, as expressed in court decisions, to be as widely distributed and as available as possible online to promote access to justice by means of access to legal information,” Ziegler said. “But also to spur innovation, to drive new insights from the law that we’ve never been able to do when the law was relegated to paper.”
Historically, libraries have been collections — books, multimedia materials and artwork. But increasingly they’re about connections, linking digital data in new and different ways. The Caselaw Access Project is a state-of-the-art example of that shift.
“So what’s going to result from this project is a huge database of electronic, digital court decisions,” Ziegler explained. “And the world of law has never seen that before.”
‘Unbinding The Law’
Harvard Law’s collection, second only to the one kept by the Library of Congress, includes the civil and criminal case law decisions from every state and federal court.
Ziegler and his team estimate that the 43,000 case law books in the collection average about 921 pages each. That’s nearly 40 million pages that need to be digitized.
The law school has so many books that the majority are stored in a vast vault in a hidden hilltop repository in Southborough, out of sight and not very accessible to students and scholars.
Ziegler says the oldest decision in Harvard’s case law collection dates back to Rhode Island’s Court of Trials circa 1647. He wants to extend its future forever.
“We’re all bound by the law,” he said. “We’re all bound by the decisions that judges issue, we ought to be able to read them, and we ought not have to pay to read them.”
The goal of the Caselaw Access Project is to liberate law books, making the contents available to anyone with an internet connection.
“We are literally, and sort of metaphorically, unbinding the law and making it available online for free, which is exciting,” Ziegler says.
The books that are set to be digitized are first shipped from the Southborough storage facility to the law school library. The physical unbinding happens in a prep room, where Zach Bodnar, a digitization specialist, uses an X-Acto knife to carefully cut case law books from their bindings. Then a machine slices neatly through the spines of the books.
“The machine itself does chomp with more force than a great white shark,” Bodnar said. “A fun tidbit.”
From there, the books are sent to another room, where a high-speed scanner takes four different images of each page, at a rate of 100,000 pages a day.
“They also apply metadata that give structure to the resulting file — so we know the name of the case, name of judges, the name of the court, the date on which the decision was issued,” Ziegler explained.
After being scanned, the unbound books are hermetically sealed in plastic, along with their original bindings, by a device of the kind meat packers use.
From there the books are sent to a limestone cave in Kentucky.
“It’s important to have it, just in case,” Ziegler said. “If we need to reboot our democracy for some reason then we’ll have all these books in Louisville. But also if we do our job right then that book will be a backup that’s only needed if something goes wrong.”
Mining The Data
The digital transformation turns the case law books into files that can be data-mined, with the extracted information used for profit.
Harvard has granted Ravel Law an eight-year exclusive contract to use the case law information. The law school has an equity interest in the California-based company, which plans to use the data in new and innovative ways.
Daniel Lewis, CEO of Ravel, says it has applications that can detect trends and patterns in the law, even tracking bias among judges, presenting data in a visual way that discloses relationships never seen before in the law.
“So you have this raw case that’s now digital, and then what we can do is add machine learning on top of that. And by adding all these extra pieces of information we make it more possible to sift through millions and millions of documents to find exactly what you want,” Lewis said. “And we do that by combining legal expertise with software engineering.”
Ravel’s applications turn case law data into legal narratives in a way that, the company says, word-search databases such as LexisNexis and Westlaw do not. And while accessing the raw data is free, the analysis is going to cost you.
The Caselaw Access Project should be complete by next March, in time for Harvard Law School’s bicentennial and a new chapter in the law school’s future.