
What is agentic eDiscovery? A taxonomy for 2026

Updated 21 April 2026 | Independent reference | Not legal advice

The phrase “AI eDiscovery” is a marketing envelope, not a technology. Inside the envelope sit four distinct technologies -- TAR 1.0, CAL, GenAI scoring, and agentic review -- each with its own defensibility profile, procurement conversation, and cost curve. This page defines each one precisely, alongside the pre-AI keyword baseline they are measured against.

Key distinction

What most vendors call “AI review” or “predictive coding” in 2026 is Continuous Active Learning (CAL), sometimes layered with an LLM relevance scorer. Genuinely agentic eDiscovery -- where LLM agents reason across custodians, issues, and document timelines with auditable reasoning traces -- is real but is found in a narrow set of platform features, not across the market generally.

Tier 1: Keyword and Boolean search

The baseline. Terms-and-connectors search uses exact keyword matches, proximity operators, and Boolean logic. It predates AI entirely and remains in active use because it is fast, transparent, and easy to document. Approximately 40% of matters still rely primarily on keyword search for initial culling, particularly for early case assessment and litigation hold scoping.
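
For readers who want the mechanics made concrete, here is a minimal sketch of terms-and-connectors logic in Python -- plain keyword matching, Boolean connectors, and a w/n proximity operator. It illustrates the concept only; production platforms add stemming, noise-word handling, and inverted indexes.

```python
import re

def tokenize(text: str) -> list[str]:
    # Lowercase word tokens; real platforms add stemming and noise-word lists.
    return re.findall(r"[a-z0-9']+", text.lower())

def contains(tokens: list[str], term: str) -> bool:
    return term in tokens

def within(tokens: list[str], a: str, b: str, n: int) -> bool:
    # Proximity connector: a w/n b -- the terms occur within n tokens.
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)

tokens = tokenize("The merger agreement was signed before the board reviewed it.")

# ("merger" OR "acquisition") AND ("agreement" w/5 "signed")
hit = (contains(tokens, "merger") or contains(tokens, "acquisition")) \
      and within(tokens, "agreement", "signed", 5)
print(hit)  # True
```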

Keyword search fails when the relevant documents do not use the expected terminology, when reviewers lack domain knowledge to anticipate all relevant terms, or when the corpus spans multiple languages or includes audio and video files. FRCP 26(g) requires the certifying attorney to conduct a “reasonable inquiry” before signing off on a search -- keyword design is not inherently defensible simply because it is transparent.

Tier 2: TAR 1.0 -- Predictive coding with a fixed seed set

TAR 1.0, commonly called predictive coding, was the first generation of machine-learning-based document review. Attorneys code a fixed seed set of documents (responsive / not responsive), a classifier trains on those coded documents, and the trained model then predicts relevance scores for the remaining corpus. The process runs once; the classifier is applied to the full population.
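
A minimal sketch of the TAR 1.0 workflow, with scikit-learn standing in for a platform's classifier (the seed documents and labels here are invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Attorney-coded seed set: 1 = responsive, 0 = not responsive.
seed_docs = [
    "Email discussing the pricing agreement with the distributor",
    "Quarterly cafeteria menu and parking announcements",
    "Draft term sheet for the distributor pricing arrangement",
    "Holiday party RSVP reminder",
]
seed_labels = [1, 0, 1, 0]

# Train once on the fixed seed set -- the defining trait of TAR 1.0.
vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(seed_docs), seed_labels)

# Score the remaining corpus in a single pass; no retraining occurs.
corpus = ["Distributor pricing follow-up", "Office chair order confirmation"]
scores = model.predict_proba(vectorizer.transform(corpus))[:, 1]
for doc, score in sorted(zip(corpus, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```

The single fit call is the point: nothing retrains as review proceeds, which is exactly why the model goes stale on rolling productions.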

“The Court approves the use of predictive coding for this litigation... predictive coding is an acceptable way to search for relevant ESI in appropriate cases.”

Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182, 193 (S.D.N.Y. 2012) -- Judge Andrew J. Peck

Judge Peck's opinion was the first judicial approval of TAR, establishing the core principle that process transparency, not document disclosure, is the defensibility standard. TAR 1.0 nonetheless fell out of favour on large matters: the fixed seed set went stale as rolling productions changed the corpus, forcing retraining and driving up cost.

Tier 3: TAR 2.0 -- Continuous Active Learning (CAL)

Continuous Active Learning continuously retrains the classifier as reviewers code documents: at each iteration the model re-ranks the unreviewed corpus and feeds reviewers the documents most likely to be responsive, accelerating convergence. Grossman and Cormack's landmark 2011 Richmond Journal of Law and Technology study (“Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review”) established that TAR can outperform exhaustive manual review; their 2014 SIGIR paper introduced and benchmarked the CAL protocol itself.
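
A compact sketch of the CAL loop, again with scikit-learn standing in for a platform classifier and an invented “ground truth” standing in for reviewer coding decisions:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy corpus; documents mentioning "rebate" play the responsive set, and
# `truth` plays the human reviewer's coding decisions.
topics = ["rebate", "parking", "rebate audit", "catering", "rebate schedule",
          "travel", "rebate caps", "printers", "rebate tiers", "recycling"]
docs = [f"memo about {t} policy" for t in topics]
truth = np.array([1 if "rebate" in d else 0 for d in docs])

X = TfidfVectorizer().fit_transform(docs)
coded = {0: 1, 1: 0}  # initial judgmental seeds, one per class

while len(coded) < len(docs):
    ids = sorted(coded)
    # Retrain on everything coded so far -- the "continuous" in CAL.
    model = LogisticRegression().fit(X[ids], [coded[i] for i in ids])
    uncoded = [i for i in range(len(docs)) if i not in coded]
    probs = model.predict_proba(X[uncoded])[:, 1]
    # Relevance feedback: route the most-likely-responsive document to review.
    pick = uncoded[int(np.argmax(probs))]
    coded[pick] = int(truth[pick])  # the reviewer codes it
    found = sum(coded.values())
    print(f"coded {len(coded):2d}/{len(docs)}  responsive found {found}/{truth.sum()}")
```

The design choice worth noticing is the ranking step: CAL feeds reviewers likely-responsive documents (relevance feedback) rather than the most uncertain ones, so responsive documents surface early and the stopping decision becomes a validation question.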

TAR 2.0 is the current industry default. Rio Tinto PLC v. Vale S.A. (306 F.R.D. 125, S.D.N.Y. 2015) reinforced its defensibility framework, with Judge Peck holding that TAR -- including CAL -- could proceed without disclosure of the seed set, provided the producing party documented its process adequately. In re Biomet (2013 WL 6405156, N.D. Ind. 2013) extended the proportionality analysis to justify cost-burden shifting.

Validation requirements for TAR 2.0 are well-established: elusion testing (a random sample of documents predicted non-responsive is reviewed to measure the elusion rate), precision and recall measurement, and agreement on a recall target before production. Validation samples are typically sized for a 95% confidence level with a plus-or-minus 5% (or tighter) margin of error, though the specific recall threshold and sampling parameters should be stipulated in the discovery protocol.
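
The sampling arithmetic behind these validation steps is standard statistics, not anything vendor-specific. A quick sketch (sample sizes follow the usual normal-approximation formula for a proportion; the elusion figures are invented):

```python
import math

def sample_size(z: float = 1.96, margin: float = 0.05, p: float = 0.5) -> int:
    """Simple-random-sample size for estimating a proportion
    (infinite-population approximation; z=1.96 is 95% confidence,
    p=0.5 the conservative worst case)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

print(sample_size())              # 385 documents for 95% / +/-5%
print(sample_size(margin=0.02))   # 2401 documents for 95% / +/-2%

# Elusion test: review a random sample drawn from the documents the model
# predicted non-responsive; the rate of responsive docs found is the elusion.
sample_n, responsive_in_sample = 385, 6   # invented numbers
print(f"elusion point estimate: {responsive_in_sample / sample_n:.1%}")  # 1.6%
```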

Tier 4: GenAI review -- LLM-scored relevance

Between 2023 and 2026, the major platforms added an LLM relevance scorer alongside, or in place of, the classical CAL classifier. The attorney writes a natural-language issue description; the LLM scores documents for relevance against that description. Relativity's aiR for Review, EverlawAI Assistant's Single Document Review, DISCO Cecilia's narrative intelligence, and Reveal Ask all implement this architecture.

The legal defensibility profile is substantially the same as TAR 2.0 under the existing case law framework: documented process, statistical sampling, and stipulated protocol. In EEOC v. Tesla (N.D. Cal. 2024-2025), the first public-record matter involving GenAI document review, the workflow was accepted subject to the same validation requirements. No new case-law category was created; existing defensibility doctrine applied.

What GenAI scoring adds over classical CAL: better handling of conceptual relevance (documents that are responsive but do not use the expected terms), faster convergence on complex multi-issue review sets, and natural-language prompt flexibility that allows issue-by-issue scoring in parallel. What it does not add: automatic per-document explainability (most platforms return a score, not a reasoning trace), reduced human reviewer requirement, or reduced validation obligation.
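
A skeletal illustration of the scoring pattern described above. Everything here is hypothetical -- `call_llm`, the issue prompts, and the JSON contract are invented for this sketch and do not depict any named platform's API:

```python
import json

# Placeholder for whatever completion API a given platform exposes.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for a real LLM completion call")

ISSUES = {
    "pricing":   "Does this document discuss distributor pricing or rebates?",
    "privilege": "Does this document request or convey legal advice?",
}

def score_document(text: str) -> dict[str, float]:
    # One natural-language issue prompt per score. Because each issue is an
    # independent prompt, issues can be scored in parallel.
    scores = {}
    for issue, question in ISSUES.items():
        prompt = (f"{question}\n\nDocument:\n{text[:4000]}\n\n"
                  'Reply with JSON only: {"score": <0.0-1.0>}')
        scores[issue] = float(json.loads(call_llm(prompt))["score"])
    return scores
```

Note what the sketch returns: a score per issue, not a reasoning trace -- which is precisely the explainability gap flagged above.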

Tier 5: Agentic review -- LLM agents reasoning across the corpus

Genuinely agentic eDiscovery is narrow in 2026. It describes LLM agent workflows that perform multi-step reasoning across a document corpus: retrieving related documents, scoring relevance and privilege in chained steps, identifying cross-custodian communication patterns, reconstructing narrative timelines, and producing per-document reasoning traces that are exportable for attorney review.
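
To make “chained steps with an exportable trace” concrete, here is an invented miniature. The step names, chain, and trace schema are illustrative only and do not depict any vendor's internals:

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Trace:
    doc_id: str
    steps: list[dict] = field(default_factory=list)

    def log(self, step: str, finding: str) -> None:
        self.steps.append({"step": step, "finding": finding})

def review_document(doc_id: str, retrieve, score, check_privilege) -> str:
    trace = Trace(doc_id)
    related = retrieve(doc_id)                    # step 1: pull thread / near-dupes
    trace.log("retrieve", f"{len(related)} related documents")
    relevance = score(doc_id, related)            # step 2: score in context
    trace.log("score", f"relevance={relevance:.2f}")
    privileged = check_privilege(doc_id, related) # step 3: privilege pass
    trace.log("privilege", f"flag={privileged}")
    print(json.dumps(asdict(trace), indent=2))    # exportable per-document trace
    if privileged:
        return "withhold"
    return "produce" if relevance >= 0.5 else "non-responsive"

decision = review_document(
    "DOC-001",
    retrieve=lambda d: ["DOC-002", "DOC-003"],
    score=lambda d, rel: 0.82,
    check_privilege=lambda d, rel: False,
)
print(decision)  # produce
```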

The closest current examples are Lighthouse AI's agentic retrieval workflows, the experimental Relativity aiR agent chain for case strategy, and Nuix Neo's agentic investigation workflows. Early OSS legal stacks built on LangChain and LlamaIndex are also in use in specialist shops. Full agentic review is not yet a mainstream procurement category; it is a feature set that distinguishes the advanced tier of the leading platforms.

The defensibility question for agentic review is not yet settled. The same Sedona Principle 6 / FRCP 26(g) framework applies, but the requirement for auditable reasoning traces is significantly more important when an agent, rather than a human reviewer, is making or influencing privilege and responsiveness determinations. Firms using agentic features should document the agent chain as thoroughly as the review protocol.

The defensibility framework for all five tiers

Sedona Conference Principle 6 is the foundational statement: the responding party is best situated to evaluate the search and review methods appropriate to its circumstances. Courts following Rio Tinto and Biomet apply this principle by evaluating the process, not the outcome -- the question is not whether every responsive document was found, but whether the producing party used a documented, validated, proportionate process.

FRCP 26(g)(1)(A): By signing, an attorney or party certifies that to the best of the person's knowledge, information, and belief formed after a reasonable inquiry... with respect to a disclosure, it is complete and correct as of the time it is made...

Practical defensibility requirements for any AI-assisted review: (1) a written review protocol describing the methodology, agreed with opposing counsel or approved by the court in advance; (2) statistical sampling to validate recall and elusion rates; (3) documented seed set or issue prompt; (4) quality control coding and log; (5) a Rule 502(d) order covering inadvertent privilege disclosures. See the full case-law reference and the Sedona Principle 6 text at /case-law.
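
As an illustration of what “documented” can mean in practice, the five elements above can be captured as a structured record per matter. The schema below is invented for this page, not a standard:

```python
from dataclasses import dataclass

@dataclass
class ReviewProtocolRecord:
    methodology: str          # (1) written protocol, agreed or court-approved
    recall_estimate: float    # (2) validated by statistical sampling
    elusion_rate: float       # (2) measured on the elusion sample
    issue_prompt: str         # (3) documented seed set or issue prompt
    qc_log_path: str          # (4) quality-control coding log
    rule_502d_entered: bool   # (5) Rule 502(d) order in place

record = ReviewProtocolRecord(
    methodology="CAL with GenAI relevance scoring; protocol stipulated in advance",
    recall_estimate=0.81,
    elusion_rate=0.016,
    issue_prompt="Distributor pricing and rebate communications, 2022-2024",
    qc_log_path="qc/overturn_log.csv",
    rule_502d_entered=True,
)
```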

Which tier do you actually need?

Matter profile | Recommended tier | Key reason
Single issue, under 50 GB, 3 reviewers | Keyword + Boolean | Proportionate; CAL overhead not justified
500 GB, defined issue set, 8 reviewers | TAR 2.0 / CAL | Industry default; established defensibility
5 TB, complex multi-issue, 25 reviewers | GenAI scoring on CAL | LLM handles conceptual relevance; faster convergence
50 TB, regulatory, cross-custodian patterns | GenAI + agentic features | Agent chains for cross-custodian reasoning; reasoning-trace export required

Last verified April 2026

Frequently asked questions

What is the difference between TAR and CAL?

TAR (Technology Assisted Review) is the umbrella term. TAR 1.0 (predictive coding) trains on a fixed seed set. CAL (Continuous Active Learning), also called TAR 2.0, continuously retrains as reviewers code documents, making it significantly more accurate on large and rolling productions. Most vendors now use CAL and call it predictive coding or AI review.

Is GenAI review the same as agentic review?

No. GenAI review layers an LLM relevance scorer on top of CAL. Agentic review goes further: LLM agents reason across the corpus in multi-step chains, identify cross-custodian patterns, and produce per-document reasoning traces. Agentic review is narrow in 2026 -- most platforms claiming 'agentic AI' are GenAI review with enhanced analytics.

Which tier of AI review do I need?

For most commercial matters under 5 TB, TAR 2.0 (CAL) is the defensible and cost-effective default. GenAI scoring adds value on complex issue sets where natural-language prompts outperform seed sets. Full agentic review is appropriate for very large, multi-issue matters where cross-custodian reasoning and reasoning-trace exports are procurement requirements.

How is agentic eDiscovery different from standard AI document review?

Standard AI document review (GenAI scoring on CAL) scores individual documents for relevance. Agentic review chains multiple reasoning steps: retrieving related documents, comparing custodian timelines, flagging cross-document privilege clusters, and generating structured reasoning traces. The output is richer but the attorney validation burden is also higher.
