LexisNexis Customer Operations Legal Research Copilot — Delivery Overview Embedded Innovation · Customer Ops Discovery
LexisNexis Embedded Innovation · Customer Ops Discovery

Legal Research Copilot — Delivery Overview

How the Legal Research Copilot is sequenced: the evidence behind the bet, how Therese Steiner's direction maps to phases, where each phase sits in Mark Koussa's lifecycle, the integration it depends on, and what we need from you to unlock each step. The plan is gated, not dated — phases graduate on evidence and unlocked dependencies, not a calendar.

The evidence behind the bet
6,057: all CS interactions in our corpus
~83%: legal-research in nature
~66%: case-law / motions / source-finding core
The funnel above is our corpus (Dataset A). For that core, the prior June 2025 analysis (a separate dataset) found that demand concentrates: the top 15 of 35 clusters cover ≈ 59% of graded questions, as shown below. The two datasets are not stacked into one funnel.
From our corpus · 5,732 chat + 325 voice
6,057 CS interactions in our corpus (5,732 chat + 325 voice)
~83% are legal-research in nature: the volume the copilot targets
~63% of chats resolve with a CSR-crafted Lexis permalink: the gold-standard move
From a prior clustering analysis · separate, older dataset · 606 graded docs (Jun 2025)
59% of graded questions fall in the top 15 of 35 clusters: concentrated, MVP-friendly
29.5% of graded docs were transferred to another team: leakage the copilot can reduce
60.6% of graded docs carry deadline / urgency language
Concentration (prior analysis): the top 5 clusters alone (Federal Civ Pro, Civ Pro motions, citation/source location, contracts, FRE depositions) cover ~27% of graded questions, a tight Phase-1 target. Competitive wedge: LexisNexis already runs reference attorneys; Westlaw has them but doesn't productise them, and Harvey has no human layer.
Two distinct datasets — not blended. The volume figures (top row, green) come from our chat/voice corpus — 5,732 chat + 325 voice (6,057), the recent pull from Therese (working belief: ~March 2026; date being confirmed). The cluster, transfer, and urgency figures (amber) come from a separate, older clustering analysis — 606 graded docs from June 2025, run by a prior DS team, with no file overlap with our corpus. All figures are a directional hypothesis anchor, not proven scale; sufficient to test the bet, not to train against (≈200k–250k interactions estimated for training). No annualized volume or rate is claimed.
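The headline figures can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the counts and the 200k to 250k training estimate come from the text above, and the derived legal-research count is only approximate because the source share is itself rounded (~83%).

```python
# Figures quoted in the text above.
chat, voice = 5_732, 325
total = chat + voice                       # corpus size

legal_research_share = 0.83                # "~83%", approximate in the source
approx_legal_research = round(total * legal_research_share)

# Training-scale estimate from the text: roughly 200k to 250k interactions.
train_low, train_high = 200_000, 250_000
coverage_low = total / train_high          # corpus as a share of the high estimate
coverage_high = total / train_low          # corpus as a share of the low estimate

print(total)                                           # 6057
print(approx_legal_research)                           # 5027
print(f"{coverage_low:.1%} to {coverage_high:.1%}")    # 2.4% to 3.0%
```

On these numbers the corpus is only a few percent of the estimated training volume, which is consistent with treating it as enough to test the bet but not to train against.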
Alpha scope boundary: what we're building to first. The alpha targets one interaction: a legal-research chat that ends in an actual Lexis link delivered to the customer (the ~63% of chats that resolve with a CSR-crafted permalink). That is the gold-standard move because it gives a clean, measurable outcome and a clear accept / edit / reject training signal. Chats that resolve without a delivered link are out of current alpha scope, not out of the concept. As coverage widens at beta, the scoping line shifts from link-presence to a broader resolution / outcome signal (source guidance, Shepardize, cite fixes, read from AgentWeb dispositions), and those interactions re-enter. Link-presence is the deliberate alpha gate (confirmed with Therese, 2026-06-08).
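The link-presence gate can be pictured as a simple filter over chat transcripts. The sketch below is illustrative only: the real permalink format is not given here, so the URL pattern, the `Turn` structure, and the speaker labels are all assumptions.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: placeholder pattern that matches any URL on a lexis.com host.
# The actual CSR permalink format would need to be confirmed.
LEXIS_LINK = re.compile(r"https?://(?:[\w-]+\.)*lexis\.com/\S+", re.IGNORECASE)

@dataclass
class Turn:
    speaker: str   # "csr" or "customer" (hypothetical labels)
    text: str

def in_alpha_scope(turns: list[Turn]) -> bool:
    """True if any CSR turn delivers a Lexis link (the link-presence gate)."""
    return any(
        t.speaker == "csr" and LEXIS_LINK.search(t.text) is not None
        for t in turns
    )

chat = [
    Turn("customer", "I need cases on summary judgment standards."),
    Turn("csr", "Here you go: https://advance.lexis.com/api/permalink/EXAMPLE"),
]
print(in_alpha_scope(chat))  # True
```

Only links sent by the CSR count, so a customer pasting a Lexis URL does not pull a chat into scope. At beta, a filter like this would give way to the broader disposition-based outcome signal.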
Therese's direction → how it maps to phases
From the 26 Mar Embedded Innovation collaboration · Therese's framing
Control
Who decides what reaches the customer: the CSR vets every suggestion before any customer sees it, and nothing auto-sends. This is the safety spine of the near-term build.
Build Alpha + Co-pilot — core
Placement & adoption
Where it appears: the suggestion has to be unmissable in the CSR's flow, or it won't get used. That is exactly the tension in the surface decision: a standalone EI app beside AgentWeb ships fast and is EI-owned but sits in a separate window (a real adoption risk), while AgentWeb integration is genuinely in-flow but slower and cross-team.
Surface open — standalone or AgentWeb
Direct-to-customer
Long-term, the same capability could serve customers directly for routine research, freeing CS for complex work. Vehicle undecided (in-product Lexis Advance/Lexis+, or a customer-facing chatbot via NICE) — a downstream decision owned by another team.
In-product — horizon
Why "shadow" is not a phase that starts today. "Shadow mode" implies a deployed model running alongside CSRs while we watch — which presumes a built, integrated tool. There isn't one yet. Observation of CSRs using the copilot happens inside Build Alpha, after the tool is built and the real-time ingestion path is solved. Treating it as a day-one activity is what made the original phasing read as further along than it is.
Phased plan — gated, not dated
The axis below is relative, not calendar — time runs left to right so you can read the shape and pace, but it is anchored to T0 (when access lands), not to specific months. Phase 0 eval runs now; everything after it waits on access we don't yet have — shown as a labeled unknown-duration zone, not an assumed date. Each phase graduates only by clearing its gate; a slipped dependency holds position and pushes everything downstream.
[Timeline figure · relative axis: NOW · T0 (access lands) · +~1 quarter · +~2 quarters · horizon]
Feasibility eval (runs now) → G0 → Build Alpha · suggest-only · CSR Copilot (HITL) (gated · duration unknown) → G1 → Expanded co-pilot (more clusters · chat) → G2 → In-product (Beta+) · customer-facing agent
Legend: G = gate, must clear to graduate · ▒ = unknown-duration zone, gated on access · the axis is elapsed-from-start, not a calendar.
T0 = CX1 raw access + Agent Web bridge + real-time ingestion path in place. The NOW→T0 gap has no committed duration until those land.
1 · Concept Build · Mark Stage 3 · now
Feasibility eval · ~weeks · scope-dependent
Starts when: Now, with no blocking dependency. Uses the sample already in hand.
Does: Offline. The model produces source + Boolean + Ask-prompt on held-out questions; senior CSRs grade. No live tool.
Also: Re-cluster our corpus under a documented method to define the target taxonomy + eval set. The prior 35-cluster analysis is a year old, uses different data, and has an unknown method, so it scopes nothing.
Gate → Qualitative lift demonstrated on the top clusters of the re-run taxonomy, before any infrastructure ask.
2 · Build Alpha · Mark Stage 4 · suggest-only
CSR Copilot (HITL) · pace set by ingestion build
Starts when: CX1 raw access + Agent Web bridge land, and the real-time ingestion path is solved.
Does: Suggest-only, on a side-by-side surface next to LN's stack, with minimal integration to reach CSRs fast. Top clusters, small CSR group. Accept/edit/reject logged.
Gate → accept-as-is clears threshold · "wrong source" below ceiling (proposed · not SME-confirmed)
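The Build Alpha gate above reduces to two rates computed from the accept / edit / reject log. The sketch below shows one way that check could look. The threshold and ceiling values are placeholders (the gate values are still proposed and not SME-confirmed), and the event field names are hypothetical.

```python
from collections import Counter

# ASSUMPTION: placeholder values; the real gate numbers are not yet confirmed.
ACCEPT_THRESHOLD = 0.60       # hypothetical minimum accept-as-is rate
WRONG_SOURCE_CEILING = 0.15   # hypothetical maximum "wrong source" rate

def gate1_passes(events: list[dict]) -> tuple[bool, float, float]:
    """events: one dict per suggestion, with 'action' in {accept, edit, reject}
    and an optional boolean 'wrong_source' flag set by the reviewing CSR.
    Returns (passed, accept_as_is_rate, wrong_source_rate)."""
    if not events:
        return False, 0.0, 0.0
    actions = Counter(e["action"] for e in events)
    accept_rate = actions["accept"] / len(events)
    wrong_rate = sum(bool(e.get("wrong_source")) for e in events) / len(events)
    passed = accept_rate >= ACCEPT_THRESHOLD and wrong_rate < WRONG_SOURCE_CEILING
    return passed, accept_rate, wrong_rate

log = (
    [{"action": "accept"}] * 8
    + [{"action": "edit"}]
    + [{"action": "reject", "wrong_source": True}]
)
print(gate1_passes(log))  # (True, 0.8, 0.1)
```

An empty log fails the gate by design, so a phase cannot graduate without evidence.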
3 · Alpha Testing · Mark Stage 5 · co-pilot
Expanded co-pilot · ~a quarter once started
Starts when: Gate 1 passes (top-cluster accuracy cleared).
Does: Expand to more / all clusters, still in chat, with jurisdiction-aware tuning, Shepardize / secondary / forms. Expansion is cluster coverage, not new channels.
Gate → accept-as-is holds as coverage widens · citation-hallucination below ceiling (proposed · not SME-confirmed)
4 · Beta+ · Mark Stage 7 · customer-facing
Customer-facing agent · horizon
Starts when: Gate 2 passes + Lexis+ product sponsorship + production-grade commitments (SLA, SRE, security).
Does: Same capability reaches customers directly; sampled human review as a quality service. Vehicle undecided (in-product Lexis Advance/Lexis+ or a NICE chatbot), a downstream decision owned by another team. Content-nudge for stickiness lives here.
Gate → production readiness: the direct-to-customer threshold, deliberately conservative.
Each ▸ gate must clear before the next phase starts · a missed dependency holds position; it does not graduate the surface.
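The "gated, not dated" rule can be expressed as a small ordered check: a phase starts only when its dependencies are unlocked, and the plan holds position until the active phase clears its gate. The sketch below is one possible encoding. The `Phase` structure, the `plan_status` helper, and the dependency strings are illustrative names, not an agreed schema.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    name: str
    dependencies: set[str]       # must all be unlocked before the phase starts
    gate_cleared: bool = False   # set once the phase's exit gate is met

def plan_status(phases: list[Phase], unlocked: set[str]):
    """Return (active_phase, blockers): the last phase that has started, and the
    dependencies still missing for the next phase (empty if none are missing)."""
    active = None
    for phase in phases:
        missing = phase.dependencies - unlocked
        if missing:
            return active, sorted(missing)   # held: next phase cannot start
        active = phase.name
        if not phase.gate_cleared:
            return active, []                # in progress: gate not yet cleared
    return active, []

phases = [
    Phase("Feasibility eval", set(), gate_cleared=True),
    Phase("Build Alpha", {"CX1 raw access", "Agent Web bridge", "real-time ingestion"}),
    Phase("Expanded co-pilot", set()),
    Phase("Customer-facing agent", {"Lexis+ sponsorship", "production commitments"}),
]
print(plan_status(phases, {"CX1 raw access"}))
# ('Feasibility eval', ['Agent Web bridge', 'real-time ingestion'])
```

No dates appear anywhere in the structure: position is a function of evidence (gates) and access (dependencies) only.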
How the CSR's day changes
In one research request | Today (no copilot) | Build Alpha (suggest-only) | Co-pilot (expanded)
Find the source | From memory / individual skill; varies CSR to CSR. | Copilot proposes the source + a one-line "why." | + jurisdiction-aware; Shepardize, secondary sources, forms.
Build the search | Hand-built Boolean with Lexis connectors. | Copilot drafts the Boolean + a Lexis+ Ask prompt; CSR edits. | Connector tuning per jurisdiction.
Clarify intent | Ad hoc, if the CSR thinks to ask. | Copilot suggests the clarifying question. | Same, refined from feedback.
Deliver | CSR pastes the permalink to the customer. | Unchanged: CSR still sends; nothing auto-sends. | Unchanged.
Feedback loop | None; nothing is captured. | Accept / edit / reject logged as the training signal. | Continuous fine-tune from that signal.
Channel | Chat, and voice (phone + emailed result), both manual. | Chat. | Still chat, with more clusters. Voice is a later channel.
Integration architecture — what the copilot plugs into
[Architecture diagram]
Live conversation: chat (Agent Web) · voice later via NICE · platform: Amelia (now) vs Cognigy?
→ Real-time ingestion: turn-by-turn feed of the live conversation · biggest unknown · live path unproven
→ EI Copilot backend API (Embedded Innovation builds): suggests source · Boolean · Lexis+ Ask prompt + rationale · grounded in KB inventory · calls Lexis search API (Jim Presto)
→ CSR copilot surface: side-by-side surface (alpha) · copy / edit / reject · suggest-only · never auto-sends
→ Lexis Advance / Lexis+: CSR sends permalink to the customer · customer-facing (Beta+)
Ownership: LN owns NICE integration
Feedback: outcome labels ← Agent Web dispositions · telemetry ← Lexis+
Architecture split (working direction — not yet a built contract). Embedded Innovation builds the backend AI capability and calls the existing LexisNexis search API (managed by Jim Presto) to ground and produce the suggestion; both are handed to LN's team to fold into their stack — at alpha, or at beta later. For the alpha, the CSR-facing experience runs side-by-side with LN's existing stack rather than integrated into it — a standalone surface that minimizes integration overhead, so we can put it in front of CSRs and gather early feedback fast, before committing to deeper in-stack integration. The conversational platform (Amelia, current, vs Cognigy, proposed with native HITL) and the real-time ingestion path — consuming the live chat feed turn-by-turn — remain unresolved and sit on the critical path to Build Alpha.
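To make the suggest-only contract concrete, the sketch below shows one possible shape for what the backend returns and how a CSR decision resolves it. Nothing here is a built contract: the `Suggestion` and `Review` types, their field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Suggestion:
    source: str                       # proposed source
    boolean_query: str                # drafted Boolean search
    ask_prompt: str                   # drafted Lexis+ Ask prompt
    rationale: str                    # one-line "why"
    kb_article_ids: tuple[str, ...]   # KB items the rationale is grounded in

@dataclass(frozen=True)
class Review:
    action: str                       # "accept", "edit", or "reject"
    edited_query: Optional[str] = None

def resolve(suggestion: Suggestion, review: Review) -> Optional[str]:
    """Return the Boolean query the CSR chose to use, or None if rejected.
    Nothing is sent to the customer here; delivery stays a manual CSR step."""
    if review.action == "accept":
        return suggestion.boolean_query
    if review.action == "edit":
        if not review.edited_query:
            raise ValueError("an edit must carry the edited query")
        return review.edited_query
    if review.action == "reject":
        return None
    raise ValueError(f"unknown action: {review.action}")

s = Suggestion(
    source="Federal Rules of Civil Procedure, Rule 56",
    boolean_query='"summary judgment" /s standard',
    ask_prompt="What is the standard for summary judgment in federal court?",
    rationale="Question asks for the governing standard on a Rule 56 motion.",
    kb_article_ids=("KB-EXAMPLE-1",),   # placeholder ID
)
print(resolve(s, Review("accept")))  # "summary judgment" /s standard
```

Keeping the review action explicit in the data model means the same record doubles as the accept / edit / reject training signal.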
Integration & dependencies — what each phase needs
What every phase needs, who provides it, and what's blocking — in one place. Required must land before that phase starts; Blocking is the hard gate that, if unmet, stops the phase; inherited = carried from the prior phase; — = not needed yet. This replaces a separate "asks" list — the Owner column is what we need from you. Voice is a later channel (live voice-to-text HITL + emailed delivery), not one of these near-term phases; its dependencies are scoped when chat clusters are covered.
Integration / dependency | P1 | P2 | P3 | P4 | Owner & why
Real-time ingestion (live conversation feed) | — | Blocking | in use | in use | EI + LN (NICE). The #1 unknown: consume the chat live, stay ahead of the CSR. Test cheaply and early before integration spend. The later voice channel adds live voice-to-text on top of this.
CSR delivery surface (side-by-side for alpha; AgentWeb later) | — | side-by-side | in use | Decide | EI's call. Alpha runs a side-by-side standalone surface next to LN's stack: fast, EI-owned, minimal integration to reach CSRs. Deeper AgentWeb integration is the later / beta decision.
CX1 raw transcript access (rolling window, source-truth) | sample in hand | Required | inherited | inherited | Therese → LN Murthy. The sample tests the bet; training-grade fidelity needs a real rolling window (~200k–250k).
AgentWeb disposition feed (outcome labels) | — | Required | inherited | inherited | Therese (CS-IT). Pick-list dispositions + auto-summary joined to transcript ID → resolved/escalated/repeat labels. Read-only.
Knowledge base inventory (index + per-article metadata) | — | Required | inherited | inherited | Therese + KB owner. Lets the copilot ground its rationale in known-good content, not free-generate it.
Conversational platform decision (Amelia vs Cognigy) | — | Needed | inherited | inherited | Architecture (Ollie / Lakshmi). HITL-native platform; migration cost / lock-in / learning curve. Architecture split: EI builds the API, LN owns the NICE integration.
Voice channel (live voice-to-text HITL + emailed delivery) | Later channel, out of the near-term phases (P1 to P4). Same suggest-only HITL as chat, but harder: live voice-to-text feeds a suggestion the CSR reviews on the call, then emails the result (email is voice's delivery step, not a separate channel). Instrumenting it = capturing that emailed deliverable + tying it to the call. Scoped when chat clusters are covered. Owner: Therese & Mark.
CS Ops success metrics (AHT · CSat · FCR · hold-time) | — | Required | inherited | inherited | Therese (CS Ops). Baselines already exist: AHT 28→21 min target (−25%, the ROI lever), CSat ~78%, FCR 71% (2026-05-29). Join to the pilot cohort to measure impact. AHT delta is provable only under steady CSR adoption, so it reads in live use, not at offline eval.
Customer engagement with the delivered result | — | — | Required | inherited | Owner depends on where it lives. Did the customer click the permalink and use what we sent? A chat transcript shows the link was sent, not clicked, so this needs click/engagement telemetry from somewhere we don't own: the chat platform (NICE/AgentWeb), an instrumented-link layer, or Lexis+ analytics for in-product use. Where it lives is open. Cross-team either way; the richest outcome signal, lands after the CS-side metrics above.
Customer delivery surface (in-product vs NICE) | — | — | — | TBD | Downstream team (not EI). In-product (Lexis Advance/Lexis+) vs. customer-facing chatbot via NICE; depends on feasibility + existing architecture.
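The disposition-feed row describes joining pick-list dispositions to transcripts by transcript ID to produce resolved / escalated / repeat labels. A minimal sketch of that join follows. The field names and pick-list values are hypothetical, since the actual AgentWeb schema is not described here.

```python
# ASSUMPTION: pick-list values and field names are invented for illustration.
DISPOSITION_TO_LABEL = {
    "Resolved - link sent": "resolved",
    "Transferred": "escalated",
    "Repeat contact": "repeat",
}

def label_transcripts(transcripts: list[dict], dispositions: list[dict]) -> list[dict]:
    """Left-join dispositions onto transcripts by transcript_id.
    Transcripts with no matching or unmapped disposition get label None."""
    by_id = {d["transcript_id"]: d for d in dispositions}
    labeled = []
    for t in transcripts:
        d = by_id.get(t["transcript_id"])
        label = DISPOSITION_TO_LABEL.get(d["disposition"]) if d else None
        labeled.append({**t, "label": label})
    return labeled

transcripts = [{"transcript_id": "T1"}, {"transcript_id": "T2"}, {"transcript_id": "T3"}]
dispositions = [
    {"transcript_id": "T1", "disposition": "Resolved - link sent"},
    {"transcript_id": "T2", "disposition": "Transferred"},
]
print([r["label"] for r in label_transcripts(transcripts, dispositions)])
# ['resolved', 'escalated', None]
```

The join only reads from both inputs and returns new records, in keeping with the read-only access the feed calls for.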