How the Legal Research Copilot is sequenced: the evidence behind the bet, how Therese Steiner's direction maps to phases, where each phase sits in Mark Koussa's lifecycle, the integration it depends on, and what we need from you to unlock each step. The plan is gated, not dated — phases graduate on evidence and unlocked dependencies, not a calendar.
The evidence behind the bet
The first funnel below is our corpus (Dataset A). For the legal-research core, a prior June 2025 analysis (separate dataset) found that demand concentrates: the top 15 of 35 clusters cover ≈ 59% of graded questions. It is shown second; the two are not stacked into one funnel.
From our corpus · 5,732 chat + 325 voice
6,057 CS interactions in our corpus (5,732 chat + 325 voice)
~83% are legal-research in nature: the volume the copilot targets
~63% of chats resolve with a CSR-crafted Lexis permalink: the gold-standard move
From a prior clustering analysis · separate, older dataset · 606 graded docs (Jun 2025)
59% of graded questions fall in the top 15 of 35 clusters: concentrated, MVP-friendly
29.5% of graded docs were transferred to another team: leakage the copilot can reduce
60.6% of graded docs carry deadline / urgency language
Concentration (prior analysis): the top 5 clusters alone (Federal Civ Pro, Civ Pro motions, citation/source location, contracts, FRE depositions) cover ~27% of graded questions, a tight Phase-1 target. Competitive wedge: LexisNexis already runs reference attorneys; Westlaw has them but doesn't productise them; Harvey has no human layer.
Two distinct datasets — not blended. The volume figures (top row, green) come from our chat/voice corpus — 5,732 chat + 325 voice (6,057), the recent pull from Therese (working belief: ~March 2026; date being confirmed). The cluster, transfer, and urgency figures (amber) come from a separate, older clustering analysis — 606 graded docs from June 2025, run by a prior DS team, with no file overlap with our corpus. All figures are a directional hypothesis anchor, not proven scale; sufficient to test the bet, not to train against (≈200k–250k interactions estimated for training). No annualized volume or rate is claimed.
Alpha scope boundary: what we're building to first. The alpha targets one interaction: a legal-research chat that ends in an actual Lexis link delivered to the customer, i.e. the ~63% of chats that resolve with a CSR-crafted permalink. This is the gold-standard move because it gives a clean, measurable outcome and a clear accept / edit / reject training signal. Chats that resolve without a delivered link are out of current alpha scope, not out of the concept. As coverage widens at beta, the scoping line shifts from link-presence to a broader resolution / outcome signal (source guidance, Shepardize, cite fixes, read from AgentWeb dispositions), and those interactions re-enter. Link-presence is the deliberate alpha gate (confirmed with Therese, 2026-06-08).
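As a minimal sketch of the link-presence gate described above, the Python below assumes a transcript is a list of speaker/text turns and that a delivered Lexis link can be recognized by a URL pattern. The `Turn` type, the speaker labels, the function name, and the URL pattern are all illustrative assumptions; the real permalink format is not specified here and would need to be confirmed.

```python
import re
from typing import NamedTuple


class Turn(NamedTuple):
    speaker: str  # assumed labels: "csr" or "customer"
    text: str


# Hypothetical pattern: any URL whose host is lexis.com or a subdomain of it.
# This is an assumption for illustration, not the confirmed permalink format.
LEXIS_LINK = re.compile(r"https?://(?:[\w-]+\.)*lexis\.com/\S+", re.IGNORECASE)


def in_alpha_scope(turns: list[Turn]) -> bool:
    """True when the CSR sent at least one Lexis-style link in the chat."""
    return any(
        t.speaker == "csr" and LEXIS_LINK.search(t.text) is not None
        for t in turns
    )
```

Only CSR turns count, so a link pasted by the customer does not pull a chat into scope. Widening the gate at beta would mean replacing this predicate with one that reads the outcome signal instead.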
Therese's direction → how it maps to phases
From the 26 Mar Embedded Innovation collaboration · Therese's framing
Control
Who decides what reaches the customer: the CSR vets every suggestion before any customer sees it, and nothing auto-sends. This is the safety spine of the near-term build.
Build Alpha + Co-pilot — core
Placement & adoption
Where the suggestion appears: it has to be unmissable in the CSR's flow, or it won't get used. That is exactly the tension in the surface decision: a standalone EI app beside AgentWeb ships fast and is EI-owned but lives in a separate window (a real adoption risk); AgentWeb integration is genuinely in-flow but slower and cross-team.
Surface open — standalone or AgentWeb
Direct-to-customer
Long-term, the same capability could serve customers directly for routine research, freeing CS for complex work. Vehicle undecided (in-product Lexis Advance/Lexis+, or a customer-facing chatbot via NICE) — a downstream decision owned by another team.
In-product — horizon
Why "shadow" is not a phase that starts today. "Shadow mode" implies a deployed model running alongside CSRs while we watch — which presumes a built, integrated tool. There isn't one yet. Observation of CSRs using the copilot happens inside Build Alpha, after the tool is built and the real-time ingestion path is solved. Treating it as a day-one activity is what made the original phasing read as further along than it is.
Phased plan — gated, not dated
The axis below is relative, not calendar: time runs left to right so you can read the shape and pace, but it is anchored to T0 (when access lands), not to specific months. The Phase 1 feasibility eval runs now; everything after it waits on access we don't yet have, shown as a labeled unknown-duration zone, not an assumed date. Each phase graduates only by clearing its gate; a slipped dependency holds position and pushes everything downstream.
◆ gate: must clear to graduate · ▒ unknown-duration zone, gated on access · the axis is elapsed-from-start, not a calendar. T0 = CX1 raw access + AgentWeb bridge + real-time ingestion path in place. The NOW→T0 gap has no committed duration until those land.
1
Concept Build · Mark Stage 3 · now
Feasibility eval
~weeks · scope-dependent
Starts when
Now — no blocking dependency. Uses the sample already in hand.
Does
Offline: model produces source + Boolean + Ask-prompt on held-out questions; senior CSRs grade. No live tool.
Also
Re-cluster our corpus under a documented method to define the target taxonomy + eval set. The prior 35-cluster analysis is a year old, uses different data, and has an unknown method, so it scopes nothing.
Gate → Qualitative lift demonstrated on the top clusters of the re-run taxonomy, before any infrastructure ask.
2
Build Alpha · Mark Stage 4 · suggest-only
CSR Copilot (HITL)
pace set by ingestion build
Starts when
CX1 raw access + AgentWeb bridge land, and the real-time ingestion path is solved.
Does
Suggest-only, on a side-by-side surface next to LN's stack — minimal integration to reach CSRs fast. Top clusters, small CSR group. Accept/edit/reject logged.
3
Co-pilot · expanded
Does
Expand to more / all clusters, still in chat, with jurisdiction-aware tuning, Shepardize / secondary / forms. Expansion is cluster coverage, not new channels.
Gate → accept-as-is holds as coverage widens · citation-hallucination below ceiling (proposed · not SME-confirmed)
4
Direct-to-customer · horizon
Does
Same capability reaches customers directly, with sampled human review as a quality service. Vehicle undecided (in-product Lexis Advance/Lexis+ or a NICE chatbot): a downstream decision owned by another team. Content-nudge for stickiness lives here.
Gate → production readiness — the direct-to-customer threshold, deliberately conservative.
Each ▸ gate must clear before the next phase starts · a missed dependency holds position; it does not graduate the surface
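The gating rule above can be sketched as a small state check. This is an illustrative model only: the `Phase` type, the dependency keys, and the `gate_cleared` flag are assumptions made for the example, not a tracked project schema. The three dependency names mirror the T0 definition (CX1 raw access, AgentWeb bridge, real-time ingestion path).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Phase:
    name: str
    needs: set = field(default_factory=set)  # dependencies that must land first
    gate_cleared: bool = False               # set True only on evidence


def current_phase(phases: list, landed: set) -> Optional[Phase]:
    """Return the furthest phase that is allowed to run right now.

    A phase may run only if every earlier phase cleared its gate and all of
    its own dependencies have landed; otherwise the plan holds position.
    """
    active = None
    for phase in phases:
        if not phase.needs <= landed:
            break  # dependency missing: hold position, nothing downstream starts
        active = phase
        if not phase.gate_cleared:
            break  # gate not cleared yet: stay in this phase
    return active
```

Note that there is no date anywhere in the model: the only inputs are which dependencies have landed and which gates have cleared, which is what "gated, not dated" means in practice.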
How the CSR's day changes
| In one research request | Today (no copilot) | Build Alpha (suggest-only) | Co-pilot (expanded) |
| --- | --- | --- | --- |
| Find the source | From memory / individual skill; varies CSR to CSR. | Copilot drafts the Boolean + a Lexis+ Ask prompt; CSR edits. | Connector tuning per jurisdiction. |
| Clarify intent | Ad hoc, if the CSR thinks to ask. | Copilot suggests the clarifying question. | Same, refined from feedback. |
| Deliver | CSR pastes the permalink to the customer. | Unchanged: CSR still sends; nothing auto-sends. | Unchanged. |
| Feedback loop | None: nothing is captured. | Accept / edit / reject logged as the training signal. | Continuous fine-tune from that signal. |
| Channel | Chat, and voice (phone + emailed result), both manual. | Chat. | Still chat, with more clusters. Voice is a later channel. |
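One way to capture the accept / edit / reject signal is a small event record plus an accept-as-is rate per cluster. The field names, the `Action` enum, and the function below are assumptions for illustration; this document does not define a logging schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"  # CSR sent the suggestion unchanged
    EDIT = "edit"      # CSR modified it before sending
    REJECT = "reject"  # CSR discarded it


@dataclass(frozen=True)
class FeedbackEvent:
    transcript_id: str
    cluster: str
    action: Action


def accept_as_is_rate(events: list) -> dict:
    """Share of suggestions accepted unchanged, keyed by cluster."""
    totals = Counter()
    accepted = Counter()
    for event in events:
        totals[event.cluster] += 1
        if event.action is Action.ACCEPT:
            accepted[event.cluster] += 1
    return {cluster: accepted[cluster] / totals[cluster] for cluster in totals}
```

Keeping the rate per cluster rather than as one global number makes it possible to check whether accept-as-is holds as coverage widens, instead of letting strong early clusters mask weak new ones.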
Integration architecture — what the copilot plugs into
Architecture split (working direction, not yet a built contract). Embedded Innovation builds the backend AI capability and calls the existing LexisNexis search API (managed by Jim Presto) to ground and produce the suggestion; both are handed to LN's team to fold into their stack, at alpha or later at beta. For the alpha, the CSR-facing experience runs side-by-side with LN's existing stack rather than integrated into it: a standalone surface that minimizes integration overhead, so we can put it in front of CSRs and gather early feedback fast before committing to deeper in-stack integration. Two items remain unresolved and sit on the critical path to Build Alpha: the conversational platform (Amelia, the current platform, vs. Cognigy, proposed for its native HITL) and the real-time ingestion path, which consumes the live chat feed turn by turn.
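To make the split concrete, here is a rough sketch of the backend boundary: the capability takes a customer question and a drafted query, calls a search client to ground it, and returns a suggestion for the CSR to vet. Everything named here (`SearchClient`, `Suggestion`, `suggest`, `StubSearch`) is hypothetical. The actual LexisNexis search API contract is not described in this document, so it is deliberately hidden behind an interface rather than guessed at.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Suggestion:
    question: str
    boolean_query: str
    sources: list                        # grounded results from the search client
    status: str = "pending_csr_review"   # suggest-only: nothing auto-sends


class SearchClient(Protocol):
    """Stand-in for the existing search API; the real contract is unknown here."""

    def search(self, query: str) -> list: ...


def suggest(question: str, draft_query: str, client: SearchClient) -> Suggestion:
    """Ground a drafted query against search before showing it to the CSR."""
    hits = client.search(draft_query)
    return Suggestion(question=question, boolean_query=draft_query, sources=hits)


class StubSearch:
    """In-memory stub so the sketch runs without any network access."""

    def search(self, query: str) -> list:
        return [f"stub-result-for:{query}"]
```

Because the surface only ever receives a `Suggestion` in `pending_csr_review` status, the same backend could sit behind a side-by-side app at alpha and an in-stack integration later without changing the control model.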
Integration & dependencies — what each phase needs
What every phase needs, who provides it, and what's blocking, in one place. Required must land before that phase starts; Blocking is the hard gate that, if unmet, stops the phase; inherited = carried from the prior phase; — = not needed yet. This replaces a separate "asks" list: the Owner column is what we need from you. Voice is a later channel (live voice-to-text HITL + emailed delivery), not one of these near-term phases; its dependencies are scoped when chat clusters are covered.
| Integration / dependency | P1 | P2 | P3 | P4 | Owner & why |
| --- | --- | --- | --- | --- | --- |
| Real-time ingestion (live conversation feed) | — | Blocking | in use | in use | EI + LN (NICE). The #1 unknown: consume the chat live and stay ahead of the CSR. Test cheaply and early before integration spend. The later voice channel adds live voice-to-text on top of this. |
| CSR delivery surface (side-by-side for alpha; AgentWeb later) | — | side-by-side | in use | Decide | EI's call. Alpha runs a side-by-side standalone surface next to LN's stack: fast, EI-owned, minimal integration to reach CSRs. Deeper AgentWeb integration is the later / beta decision. |
| CX1 raw transcript access (rolling window, source-truth) | sample in hand | Required | inherited | inherited | Therese → LN Murthy. The sample tests the bet; training-grade fidelity needs a real rolling window (~200k–250k). |
| AgentWeb disposition feed (outcome labels) | — | Required | inherited | inherited | Therese (CS-IT). Pick-list dispositions + auto-summary joined to transcript ID → resolved/escalated/repeat labels. Read-only. |
| Knowledge base inventory (index + per-article metadata) | — | Required | inherited | inherited | Therese + KB owner. Lets the copilot ground its rationale in known-good content, not free-generate it. |
| Conversational platform decision (Amelia vs Cognigy) | — | Needed | inherited | inherited | Architecture (Ollie / Lakshmi). HITL-native platform; migration cost / lock-in / learning curve. Architecture split: EI builds the API, LN owns the NICE integration. |
| Voice channel | | | | | Later channel, out of the near-term phases. Same suggest-only HITL as chat, but harder: live voice-to-text feeds a suggestion the CSR reviews on the call, then emails the result (email is voice's delivery step, not a separate channel). Instrumenting it = capturing that emailed deliverable + tying it to the call. Scoped when chat clusters are covered. Owner: Therese & Mark. |
| CS-side metrics (AHT, CSat, FCR) | | | | | Therese (CS Ops). Baselines already exist: AHT 28→21 min target (−25%, the ROI lever), CSat ~78%, FCR 71% (2026-05-29). Join to the pilot cohort to measure impact. AHT delta is provable only under steady CSR adoption, so it reads in live use, not at offline eval. |
| Customer engagement with the delivered result | — | — | Required | inherited | Owner depends on where it lives. Did the customer click the permalink and use what we sent? A chat transcript shows the link was sent, not clicked, so this needs click/engagement telemetry from somewhere we don't own: the chat platform (NICE/AgentWeb), an instrumented-link layer, or Lexis+ analytics for in-product use. Where it lives is open. Cross-team either way; the richest outcome signal, lands after the CS-side metrics above. |
| Customer delivery surface (in-product vs NICE) | — | — | — | TBD | Downstream team (not EI). In-product (Lexis Advance/Lexis+) vs. customer-facing chatbot via NICE; depends on feasibility + existing architecture. |
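The disposition join described in the table (pick-list dispositions keyed to transcript ID, yielding resolved / escalated / repeat labels) could look roughly like the sketch below. The record shapes and the disposition-to-label mapping are assumptions; the real pick-list values would come from the AgentWeb feed and are not listed in this document.

```python
# Hypothetical mapping from pick-list disposition to outcome label.
# The actual AgentWeb pick-list values are not known here.
LABELS = {
    "Resolved": "resolved",
    "Transferred": "escalated",
    "Repeat contact": "repeat",
}


def label_transcripts(transcripts: dict, dispositions: dict) -> dict:
    """Left-join dispositions onto transcripts by transcript ID.

    Transcripts with no disposition, or with an unmapped one, are labeled
    "unlabeled" rather than dropped, so coverage gaps stay visible.
    """
    return {
        transcript_id: LABELS.get(dispositions.get(transcript_id, ""), "unlabeled")
        for transcript_id in transcripts
    }
```

The join only reads from the feed, which matches the read-only access noted for it. Keeping unmatched transcripts in the output makes it easy to measure how much of the corpus actually carries an outcome label before relying on it as a training or evaluation signal.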