CASE STUDY
Novique Health
Private clinical LLM pipeline, evaluation-gated before launch

THE SITUATION
A thousand pages of internal guidelines, manually searched
Novique Health's clinical team had accumulated internal protocols, care pathways, and decision frameworks across a decade of practice. All of it existed as PDFs and internal documents. When a clinician needed to reference a specific guideline mid-shift, the answer was a keyword search across a network drive.
Off-the-shelf AI assistants were not an option: the data could not leave their environment, and the answers had to be defensible.
BEFORE
A decade of clinical guidelines, searched by keyword across a network drive.
AFTER
Grounded, evaluated answers on their own infrastructure, with the source passage on every reply.
WHAT I BUILT
A private retrieval pipeline, evaluated before it shipped
I built a retrieval-augmented system over the guideline corpus, deployed behind a private API. Hybrid retrieval, re-ranking tuned on a curated evaluation set, and a prompt pipeline that refuses to answer when retrieval does not reach a confidence threshold.
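The confidence gate can be sketched in a few lines. This is a minimal illustration, not the production code: the score blend, threshold, and names (`hybrid_score`, `answer_or_refuse`) are assumptions for the sketch, and the real system runs retrieval against pgvector.

```python
# Sketch of the confidence-gated retrieval step. Scores, weights, and the
# threshold value are illustrative, not the tuned production values.

REFUSAL = "No sufficiently supported passage found. Please consult the source guideline directly."

def hybrid_score(vector_score: float, keyword_score: float, alpha: float = 0.7) -> float:
    """Blend dense (vector) and sparse (keyword) relevance into one score.

    alpha is a tunable weight; here it favors the vector score.
    """
    return alpha * vector_score + (1 - alpha) * keyword_score

def answer_or_refuse(ranked_passages: list[tuple[str, float]], threshold: float = 0.55) -> dict:
    """Return grounding passages for the prompt pipeline, or refuse outright.

    ranked_passages: (passage, blended_score) pairs, sorted best-first.
    If the top score is below the threshold, the system refuses rather
    than letting the model answer without support.
    """
    if not ranked_passages or ranked_passages[0][1] < threshold:
        return {"refused": True, "message": REFUSAL, "sources": []}
    return {
        "refused": False,
        "sources": [passage for passage, score in ranked_passages if score >= threshold],
    }
```

The key design choice is that the gate sits before the model: a weak retrieval result never reaches the prompt, so there is nothing for the model to confabulate around.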
Before the clinical team used it, I wrote an eval suite against questions their own lead clinician had asked in the past year. I did not ship until recall on those questions was where it needed to be.
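The release gate itself reduces to a recall@k check over that curated set. A minimal sketch, assuming each question has one known gold passage; the function name and data shapes are illustrative:

```python
# Sketch of the per-release eval gate: what fraction of the clinical lead's
# questions retrieve their gold passage in the top k results.

def recall_at_k(gold: dict[str, str], retrieved: dict[str, list[str]], k: int = 5) -> float:
    """gold maps question -> gold passage id; retrieved maps question -> ranked ids."""
    hits = sum(
        1 for question, gold_id in gold.items()
        if gold_id in retrieved.get(question, [])[:k]
    )
    return hits / len(gold)

def release_gate(gold: dict[str, str], retrieved: dict[str, list[str]],
                 k: int = 5, minimum: float = 0.9) -> bool:
    """A release ships only if recall@k on the curated set clears the bar."""
    return recall_at_k(gold, retrieved, k) >= minimum
```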
Private deployment
Models hosted in their environment. Nothing about a patient question leaves their network.
Evaluations before launch
Curated question set from the clinical lead, recall tracked per release. No vibes-based rollouts.
Grounded answers only
Refusal when retrieval is weak. Clinicians see the source passages behind every answer.
THE OUTCOMES
Shipped, measured, trusted
100%
of inference stays on their infrastructure
Evaluated
releases gated on a curated clinical question set, not vibes
Live
with the clinical team during active shifts
STACK
- Python
- FastAPI
- pgvector
- Anthropic
- Railway
- Evals
