I wrote here in November about plans by Public.Resource.Org to publish 1.8 million pages of public-domain federal case law sometime this year and its goal of eventually creating a public-domain repository of all federal and state case law. More recently, in an article of mine published on Law.com, I singled out Public.Resource.Org and a similar project, AltLaw, as among the five most notable legal sites of 2007.
Now, a parallel project aims to bring to this growing body of public-domain law a sophisticated search engine comparable to those of commercial legal databases. In fact, the developers of this experimental legal search engine, called PreCYdent, say that in their tests it outperforms “by a wide margin” Westlaw's natural language search, not to mention Lexis and other commercial databases.
The site is up and running in an alpha version containing about 340,000 cases, with a beta launch planned for the end of February. It came about through the work of University of San Diego School of Law Professor Thomas A. Smith, who serves as its CEO. Smith and the other members of the PreCYdent team say they base their work on two fundamental beliefs: that judicial opinions and statutes must be in the public domain, and that everyone — lawyers, students and the public — should have access to state-of-the-art legal research technology. “The site is free and will stay that way,” Smith wrote me in an e-mail. The service will rely on ads to generate revenue.
The power of PreCYdent’s search engine comes from its ranking of results by “authority,” using a proprietary algorithm that analyzes connections within networks of data, similar in concept to Google's PageRank technology. Here is how the Web site explains it:
“PreCYdent search technology ranks results by ‘authority’, using mathematical techniques, such as eigenvector centrality, similar to those used by advanced Web search engines, as well as proprietary techniques we have developed that are specialized to the legal domain. PreCYdent search technology is able to mine the information latent in the ‘Web of Law’, the network of citations among legal authorities. This means it is also able to retrieve legally relevant authorities, even if the search terms do not actually occur or occur frequently in the retrieved document.”
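To give a rough sense of what ranking by "authority" over a citation network can look like, here is a minimal sketch of a PageRank-style power iteration, a damped cousin of eigenvector centrality. It is not PreCYdent's algorithm, which is proprietary. The case names, the `authority_scores` function, and the damping value of 0.85 are all assumptions made up for illustration.

```python
from typing import Dict, List


def authority_scores(
    citations: Dict[str, List[str]],
    damping: float = 0.85,   # assumed value, borrowed from common PageRank usage
    iterations: int = 100,
) -> Dict[str, float]:
    """Score each case by how much authority flows to it through citations.

    `citations` maps a citing case to the list of cases it cites.
    """
    # Collect every case that either cites or is cited.
    cases = sorted(set(citations) | {c for cited in citations.values() for c in cited})
    n = len(cases)
    scores = {case: 1.0 / n for case in cases}

    for _ in range(iterations):
        # Cases that cite nothing spread their score evenly over all cases,
        # so the total score always stays at 1.
        dangling = sum(scores[c] for c in cases if not citations.get(c))
        new_scores = {c: (1.0 - damping) / n + damping * dangling / n for c in cases}

        # Each citing case passes an equal share of its score to every case it cites.
        for citing, cited_list in citations.items():
            if not cited_list:
                continue
            share = damping * scores[citing] / len(cited_list)
            for cited in cited_list:
                new_scores[cited] += share

        scores = new_scores

    return scores


# Hypothetical toy "Web of Law": a district case cites two circuit cases and
# a Supreme Court case; both circuit cases cite the Supreme Court case.
toy_citations = {
    "District D": ["Circuit B", "Circuit C", "Supreme A"],
    "Circuit B": ["Supreme A"],
    "Circuit C": ["Supreme A"],
    "Supreme A": [],
}

scores = authority_scores(toy_citations)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0])  # the most-cited case in this toy network ranks first
```

In this toy network the heavily cited case rises to the top because authority flows along citations, not because of any keyword match. A real engine would presumably combine a score like this with text relevance and the domain-specific techniques the quote mentions.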
Smith describes the development and mechanics of PreCYdent in greater detail in an interview with Joe Hodnicki published yesterday at Law Librarian Blog. The initial research that gave rise to the project is described in Smith’s 2005 article, The Web of Law.
PreCYdent’s developers are incorporating a number of Web 2.0 features. The site already allows users to add commentary, recommendations and ratings to cases. Smith writes:
“Coming soon is a social network platform that will interface seamlessly (or pretty seamlessly) with the law library and search. This will enable people to find lawyers and lawyers and laypeople to share knowledge and experience. An upload feature will allow users to upload all kinds of documents, such as briefs, memos, videos, audio, and so on. All of this will be parsed by us, put into our network, and be searchable and ranked by our engine.”
I performed various top-of-the-mind searches today — nothing too sophisticated — and was impressed by the relevance and ranking of results. The default ranking is by authority, so if there are relevant Supreme Court cases, they tend to appear at the top. Advanced search options allow you to modify the ranking to be chronological or “traditional” and to limit your search by date, court and judge. Cases include citations and page numbers. Once Public.Resource.Org publishes the cases mentioned at the outset of this post, PreCYdent will add them to its database.