A critical step in parsing court opinions is knowing which court produced the opinion. Unfortunately, courts change their names over time and so we encounter opinions from the “Supreme Court of Massachusetts”, the “Supreme Judicial Court of Massachusetts”, and the “Massachusetts Supreme Judicial Court”, among others. These are all names for the very same court of last resort in Massachusetts, so we created a tool that recognizes all these varied names.
We call our tool the Free Law Project Courts-DB.
Using Courts-DB, you can easily look up the name of nearly any American court with published cases going back to the 1600s. We have used this functionality to parse nearly 16 million court names, and our accuracy at parsing them stands at 99.998%. (The remaining 0.002% generally requires a human to understand.)
The Numbers
Tested against 16M court names
17,887 lines of code
718 court identifiers
361 court websites
2,100 regular expressions
Courts-DB consists of over 17,000 lines of code and has data about American courts from the 1600s until modern times. Generally, if a court ever had a published case, and often even if it did not, that court is available in Courts-DB. This includes special and limited jurisdiction courts, tribal courts, and even a couple of United States courts that sat in other countries (looking at you, United States Court for Berlin).
Courts-DB uses over 2,100 regular expressions to match court names, has over 300 court websites available for lookup, and provides thousands of examples, variations, typos, and other court metadata.
Finally, the DB contains identifiers for all of these courts. Identifiers are an important part of building any software system, and their absence from the legal industry has been an ongoing challenge to innovation and interoperation. Many of our identifiers are already adopted by the SALI Alliance and we hope to soon incorporate the rest into their standards. If you are developing any sort of legal software, we hope you will consider using these identifiers.
Starting now, Courts-DB is available as open code, as a Python package, or as an extremely long JSON file.
For the techies: To give you a quick taste of what the code looks like, here is one entry in the data for “Massachusetts Supreme Judicial Court”.
[
  {
    "regex": [
      "${sjc} Ma(ss(achusetts)?)?(\\b|$)?",
      "${ma} ${sjc}",
      "Supreme Court Of ${ma}",
      "State Of ${ma} Supreme Court"
    ],
    "name_abbreviation": "Mass. Sup. Jud. Ct.",
    "dates": [
      {
        "start": "1692-01-01",
        "end": null
      }
    ],
    "name": "Massachusetts Supreme Judicial Court",
    "level": "colr",
    "case_types": ["All"],
    "system": "state",
    "examples": [
      "Supreme Court Of Massachusetts",
      "Supreme Judicial Court Of Massachusetts",
      "Massachusetts Supreme Judicial Court"
    ],
    "court_url": "http://www.mass.gov/courts/sjc/",
    "type": "appellate",
    "id": "mass",
    "location": "Massachusetts"
  }
]
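To make the entry above a little more concrete, here is a minimal Python sketch of how placeholders such as ${sjc} and ${ma} might be expanded into ordinary regular expressions and then matched against a court name. This is not the project's actual implementation: the VARIABLES values, the function names, the case-insensitive matching, and the use of full-string matching are all assumptions made for illustration. Only the regex templates and the "mass" identifier come from the entry shown above.

```python
import re

# Hypothetical placeholder definitions, chosen only for this sketch.
# The real project defines its own values for ${sjc} and ${ma}.
VARIABLES = {
    "sjc": r"Supreme Judicial Court(?: Of)?",
    "ma": r"Massachusetts",
}

# Trimmed copy of the entry shown above (templates and id only).
ENTRY = {
    "id": "mass",
    "regex": [
        r"${sjc} Ma(ss(achusetts)?)?(\b|$)?",
        r"${ma} ${sjc}",
        r"Supreme Court Of ${ma}",
        r"State Of ${ma} Supreme Court",
    ],
}


def expand(template, variables):
    """Replace each ${name} placeholder with its pattern, wrapped in a group."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: "(?:" + variables[m.group(1)] + ")",
        template,
    )


def match_court(name, entry, variables):
    """Return the entry's id if any expanded template matches the whole name."""
    for template in entry["regex"]:
        pattern = re.compile(expand(template, variables), re.IGNORECASE)
        if pattern.fullmatch(name.strip()):
            return entry["id"]
    return None


if __name__ == "__main__":
    for example in (
        "Supreme Court Of Massachusetts",
        "Supreme Judicial Court Of Massachusetts",
        "Massachusetts Supreme Judicial Court",
        "Supreme Court of Ohio",
    ):
        print(example, "->", match_court(example, ENTRY, VARIABLES))
```

Under these assumed placeholder values, the three names in the entry's "examples" list resolve to "mass", while an unrelated name returns None. Wrapping each placeholder in a non-capturing group keeps any alternation inside a variable from leaking into the rest of the template.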
Courts-DB is part of larger initiatives at Free Law Project to organize and provide free and open access to every US court opinion in history. We encourage and invite users to join in, explore, and test our code. In particular, we are looking for help adding court start and end dates to Courts-DB. If you’re interested in lending a hand, please get in touch.
To learn more about the project, the data, and how to use it, please visit Courts-DB on GitHub.