CASE STUDY
Novique Health
Clinical-grade LLM pipeline for internal guideline retrieval

THE SITUATION
A thousand pages of internal guidelines, searched by hand
Novique Health's clinical team had accumulated internal protocols, care pathways, and decision frameworks across a decade of practice. All of it existed as PDFs and internal documents. When a clinician needed to reference a specific guideline mid-shift, the answer was a keyword search across a network drive.
Off-the-shelf AI assistants were not an option: the data could never leave their environment, and the answers had to be defensible.
BEFORE
A decade of clinical guidelines, searched by keyword across a network drive.
AFTER
Grounded, evaluated answers on their own infrastructure, with the source passage on every reply.
WHAT WE BUILT
A private retrieval pipeline, evaluated before it shipped
We built a retrieval-augmented system over the guideline corpus, deployed behind a private API: hybrid retrieval, re-ranking tuned on a curated evaluation set, and a prompt pipeline that refuses to answer when retrieval does not clear a confidence threshold.
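The "hybrid" in hybrid retrieval means combining a keyword ranking with a vector-similarity ranking before re-ranking. One standard way to fuse the two lists is reciprocal rank fusion; the sketch below is illustrative, not Novique's production code, and the constant k=60 is the conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of passage ids (e.g. one from full-text
    search, one from pgvector similarity) into a single ranking.

    Each list contributes 1 / (k + rank) to a passage's fused score, so a
    passage that appears high in both lists rises to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] = scores.get(passage_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

The fused list then goes to the re-ranker, which only has to order a short candidate set rather than the whole corpus.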
Before we let the clinical team use it, we built an eval suite from questions their own lead clinician had asked in the past year. We did not ship until recall on those questions was where it needed to be.
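Concretely, the release gate measures recall@k: the fraction of eval questions for which the expected guideline passage appears in the top k retrieved results. The sketch below shows the shape of such a gate; the `retrieve` callable and the 0.90 bar are illustrative assumptions, not Novique's actual thresholds.

```python
from typing import Callable

def recall_at_k(eval_set: list[tuple[str, str]],
                retrieve: Callable[[str], list[str]],
                k: int = 5) -> float:
    """eval_set: (question, expected_passage_id) pairs from the clinical lead.
    retrieve(question) returns passage ids, best first."""
    hits = sum(1 for question, expected in eval_set
               if expected in retrieve(question)[:k])
    return hits / len(eval_set)

def gate_release(eval_set: list[tuple[str, str]],
                 retrieve: Callable[[str], list[str]],
                 bar: float = 0.90) -> float:
    """Refuse to ship unless retrieval recall meets the bar."""
    recall = recall_at_k(eval_set, retrieve)
    if recall < bar:
        raise RuntimeError(f"recall@5 = {recall:.2f} is below {bar}; not shipping")
    return recall
```

Running this in CI per release is what replaces a vibes-based rollout: the number either clears the bar or the deploy stops.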
Private deployment
Models hosted in their environment. Nothing about a patient question leaves their network.
Evaluations before launch
Curated question set from the clinical lead, recall tracked per release. No vibes-based rollouts.
Grounded answers only
Refusal when retrieval is weak. Clinicians see the source passages behind every answer.
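The refusal behaviour above reduces to a simple gate in the pipeline: if no retrieved passage clears the confidence threshold, the system returns an explicit refusal instead of generating anything. The names and the 0.6 threshold below are illustrative; the real threshold is tuned against the eval set.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    source: str   # citation shown to the clinician, e.g. "cardiac-pathway.pdf, p. 12"
    score: float  # re-ranker confidence in [0, 1]

CONFIDENCE_THRESHOLD = 0.6  # assumed value; tuned on the eval suite in practice

def answer_or_refuse(passages: list[Passage]) -> dict:
    """Return a grounded answer payload, or an explicit refusal when
    retrieval is too weak to be defensible."""
    if not passages or max(p.score for p in passages) < CONFIDENCE_THRESHOLD:
        return {"grounded": False,
                "refusal": "No guideline passage matched this question "
                           "with sufficient confidence."}
    best = max(passages, key=lambda p: p.score)
    # The grounded context and its citation travel together; the LLM call
    # that drafts the final wording from `best.text` is omitted here.
    return {"grounded": True, "context": best.text, "source": best.source}
```

Because the source citation is part of the same payload as the context, there is no code path that shows a clinician an answer without the passage behind it.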
THE OUTCOMES
Shipped, measured, trusted
100%
of inference stays on their infrastructure
Eval-gated
releases, gated on the clinical eval suite
In use
by clinical team during active shifts
STACK
- Python
- FastAPI
- pgvector
- Open-weights LLM
- Railway
- Evals
