Show HN: LettuceDetect – Lightweight hallucination detector for RAG pipelines
Hallucinations are still a major blocker for deploying reliable retrieval-augmented generation (RAG) systems, especially in high-stakes domains like medicine and law.
Most existing hallucination detectors rely on full LLM inference (expensive, slow), or struggle with long-context inputs.
I built LettuceDetect, an open-source, encoder-only framework that detects hallucinated spans in LLM-generated answers against the retrieved context. No LLM calls needed, and it runs much more efficiently.
Highlights:
- Token-level hallucination detection (unsupported spans flagged based on retrieved evidence)
- Built on ModernBERT — handles up to 4K token contexts
- 79.22% F1 on the RAGTruth benchmark (beats previous encoder models, competitive with LLMs)
- MIT licensed
- Includes a Python package, pretrained models, and a Hugging Face demo
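
Basic usage is a few lines of Python. The snippet below is a rough sketch of the README example (class name, import path, and model id are from memory, so check the repo for the exact API):

    from lettucedetect.models.inference import HallucinationDetector

    # Encoder-only ModernBERT checkpoint; no LLM calls involved.
    detector = HallucinationDetector(
        method="transformer",
        model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1",  # assumed model id, verify on HF
    )

    contexts = [
        "France is a country in Europe. The capital of France is Paris. "
        "The population of France is 67 million."
    ]
    question = "What is the capital of France? What is the population of France?"
    answer = "The capital of France is Paris. The population of France is 69 million."

    # Returns spans of the answer that are unsupported by the retrieved context,
    # each with a confidence score (token-level predictions merged into spans).
    spans = detector.predict(
        context=contexts,
        question=question,
        answer=answer,
        output_format="spans",
    )
    print(spans)
    # Illustrative output shape (not actual numbers):
    # [{"start": 31, "end": 71, "confidence": 0.99,
    #   "text": " The population of France is 69 million."}]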
GitHub: https://github.com/KRLabsOrg/LettuceDetect
Blog: https://huggingface.co/blog/adaamko/lettucedetect
Preprint: https://arxiv.org/abs/2502.17125
Models/Demo: https://huggingface.co/KRLabsOrg
Would love feedback from anyone working on RAG, hallucination detection, or efficient LLM evaluation. Also exploring real-time hallucination detection (vs. just post-gen) — open to thoughts/collab there.
If you have the knowledge to detect your own hallucinations, then you have the knowledge to not hallucinate in the first place.
The fact that we keep seeing "hallucination detectors" means the system is hopelessly broken. And products like these are usually snake oil, imo.