No other court in the world has had more influence on the use of machine learning in litigation than the U.S. District Court for the Southern District of New York (the “SDNY”) and that tradition continues with the court’s new ruling in Winfield v. City of New York.
In short, the order adds to the body of case law supporting predictive coding—a machine learning process that mimics the behavior of attorneys to classify documents by their likely relevance—while taking a balanced approach to the level of transparency expected when using AI in litigation.
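To make the underlying mechanics concrete: at its core, predictive coding learns from a set of attorney-coded example documents and then ranks the rest of the collection by predicted relevance. The sketch below is a from-scratch naive Bayes ranker on hypothetical documents — it illustrates the general technique, not the City’s actual tool, whose internals are not public.

```python
# Minimal sketch of a predictive-coding-style relevance ranker.
# Toy data only; a real TAR workflow trains on thousands of
# attorney-coded documents, not four.
import math
from collections import Counter

def train(docs, labels):
    """Count word frequencies per class from attorney-coded examples."""
    counts = {0: Counter(), 1: Counter()}
    priors = Counter(labels)
    for doc, label in zip(docs, labels):
        counts[label].update(doc.lower().split())
    return counts, priors

def relevance_score(doc, counts, priors):
    """Log-odds that a document is relevant, with add-one smoothing."""
    score = math.log(priors[1] / priors[0])
    vocab = set(counts[0]) | set(counts[1])
    for word in doc.lower().split():
        p_rel = (counts[1][word] + 1) / (sum(counts[1].values()) + len(vocab))
        p_not = (counts[0][word] + 1) / (sum(counts[0].values()) + len(vocab))
        score += math.log(p_rel / p_not)
    return score

# Hypothetical attorney-coded training set: 1 = relevant, 0 = not.
seed_docs = [
    "affordable housing lottery community preference policy",
    "housing application outreach to applicants",
    "cafeteria menu for the staff holiday party",
    "parking permit renewal instructions",
]
seed_labels = [1, 1, 0, 0]
counts, priors = train(seed_docs, seed_labels)

# Rank unreviewed documents by score; reviewers see the top of the list first.
corpus = ["memo on housing lottery preferences", "office parking update"]
ranked = sorted(corpus, key=lambda d: -relevance_score(d, counts, priors))
```

The ranking, not any individual score, is what matters in practice: documents the model scores highly are routed to human reviewers first.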
The underlying case in Winfield concerns New York City’s affordable housing program and policies that the Plaintiffs claim have a disparate impact on racial minorities. An elaborate discovery process ensued as the City was tasked with producing relevant documents from 50 data custodians. To expedite discovery, the Court directed the City to use predictive coding (also sometimes referred to as technology assisted review, or simply “TAR”).
However, the Plaintiffs later raised challenges to the predictive coding training process, arguing that the City’s pattern of under-designating documents as relevant during training, among other issues, yielded an incomplete production. As evidence, they pointed to a small group of admittedly relevant documents the City had “inadvertently produced” to Plaintiffs by failing to properly cleanse the production database of their text.
In evaluating these challenges, Magistrate Judge Katharine Parker gave strong weight to the body of case law providing near-universal support for the use of predictive coding. The SDNY itself has issued several high-profile opinions, including Rio Tinto v. Vale in which Judge Andrew Peck famously declared: “it is now black letter law that where the producing party wants to utilize TAR for document review, courts will permit it.”
After observing that “courts are split as to the degree of transparency required” in the predictive coding process, Judge Parker held fast to the doctrine that attorneys conducting e-discovery are not held to standards of perfection: “Reasonableness and proportionality, not perfect and scorched-earth, must be their guiding principles.” Due to the City’s numerous perceived missteps, however, the Court saw fit to provide the Plaintiffs some measure of satisfaction.
Rather than ordering the release of all “seed set” training documents (which some courts have indicated is beyond a judge’s authority) and other comparatively draconian remedies sought by Plaintiffs, the Court directed its focus away from process transparency and toward the validation of results. The Court ordered the City to provide random samples of supposedly non-relevant, non-privileged documents from its data sets for the Plaintiffs to review. Plaintiffs will either be able to find further evidence of production deficiencies or, alternatively, satisfy themselves that the City’s production was reasonable and proportional.
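The validation approach the Court ordered is, in substance, what e-discovery practitioners call an elusion test: draw a random sample from the documents coded non-relevant and see how many relevant documents slipped through. A minimal sketch, with entirely hypothetical pile sizes and sample parameters:

```python
# Sketch of results-based validation via an "elusion" sample: review a
# random draw from the supposedly non-relevant pile and project how many
# relevant documents were missed. All numbers here are hypothetical.
import random

random.seed(7)  # fixed seed so the sketch is reproducible

# Simulated discard pile: True marks a relevant document the process missed.
discard_pile = [False] * 9_900 + [True] * 100   # a 1% true elusion rate
random.shuffle(discard_pile)

sample = random.sample(discard_pile, 500)       # reviewers code this sample
found_relevant = sum(sample)
elusion_rate = found_relevant / len(sample)

# Projection over the whole discard pile.
projected_missed = elusion_rate * len(discard_pile)
```

A low elusion rate supports the argument that the production was reasonable and proportional; a high one gives the requesting party concrete evidence of deficiencies.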
The path forward
As Adam Kuhn points out, the challenges raised to the City’s process may be linked to the particular predictive coding model employed by the City, a protocol-heavy approach some refer to as “TAR 1.0.” Had the City leveraged a flexible, continuous machine learning model, many of these challenges could have been obviated altogether.
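The distinction is structural: a TAR 1.0 protocol trains on a fixed, negotiable “seed set” and then freezes, while a continuous (“TAR 2.0” or continuous active learning) approach retrains after every batch of attorney decisions, so there is no discrete seed set to fight over. The loop below sketches that idea with a toy keyword model — the names and scoring method are illustrative, not any vendor’s API.

```python
# Sketch of a continuous active learning ("TAR 2.0") loop: the model is
# updated after every attorney decision, and the next document reviewed
# is always the current highest-scoring uncoded one. Toy scoring model.
from collections import Counter

corpus = {
    1: "housing lottery policy memo",
    2: "staff parking update",
    3: "community preference housing analysis",
    4: "holiday party menu",
}

relevant_words = Counter()   # the "model": words seen in relevant documents
coded = {}                   # attorney decisions accumulated so far

def score(text):
    return sum(relevant_words[w] for w in text.split())

def attorney_codes(text):
    # Simulated reviewer: relevant if the document mentions housing.
    return "housing" in text

while len(coded) < len(corpus):
    # Route the highest-scoring uncoded document to review next.
    doc_id = max((d for d in corpus if d not in coded),
                 key=lambda d: score(corpus[d]))
    decision = attorney_codes(corpus[doc_id])
    coded[doc_id] = decision
    if decision:             # retrain immediately on the new judgment
        relevant_words.update(corpus[doc_id].split())
```

Because training never stops, an early mis-coded document is continually diluted by later decisions, which is why the dispute in Winfield over the integrity of a fixed training set would largely not arise under this model.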
Regardless, Winfield represents another gentle turn in the long and winding road of legal industry adoption of AI, as courts seek to reconcile attorneys’ well-established discretion to leverage beneficial discovery technology with their own persistent desire for transparency and cooperation in a highly adversarial context.
OpenText™ Discovery is a leading provider of machine learning for litigation and investigations.
Learn more today: