Here's a research project that aims to reduce hallucinations in a plug-and-play way. Worth exploring for a Pickaxe implementation?

Hallucination Risk Calculator & Prompt Re-engineering Toolkit (OpenAI-only)

Post-hoc calibration for large language models, with no retraining. This toolkit turns a raw prompt into:

  1. a bounded hallucination risk using the Expectation-level Decompression Law (EDFL), and

  2. a decision to ANSWER or REFUSE under a target SLA, with transparent math (in nats).
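To make the two outputs concrete, here is a minimal sketch of what an EDFL-style decision could look like. This is my own illustration, not the toolkit's actual API: the function names are hypothetical, and I assume the bound takes the common form "the hallucination probability p can deviate from a prior q only as far as an information budget Δ̄ (in nats) allows, via KL(Ber(p) ‖ Ber(q)) ≤ Δ̄".

```python
import math

def bernoulli_kl(p: float, q: float) -> float:
    """KL divergence KL(Ber(p) || Ber(q)) in nats, with clamping for stability."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def edfl_risk_bound(delta_nats: float, prior_q: float) -> float:
    """Largest risk p >= prior_q consistent with the information budget,
    i.e. the largest p with KL(Ber(p) || Ber(prior_q)) <= delta_nats.
    Found by binary search (the KL is increasing in p on [q, 1])."""
    lo, hi = prior_q, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if bernoulli_kl(mid, prior_q) <= delta_nats:
            lo = mid
        else:
            hi = mid
    return lo

def decide(delta_nats: float, prior_q: float, sla: float) -> str:
    """ANSWER only if the bounded risk stays within the target SLA."""
    return "ANSWER" if edfl_risk_bound(delta_nats, prior_q) <= sla else "REFUSE"
```

With a zero budget the bound collapses to the prior, so a low-prior-risk prompt is answered; a large budget lets the worst-case risk climb toward 1 and triggers a refusal.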

It supports two deployment modes:

  • Evidence-based: prompts include evidence/context; rolling priors are built by erasing that evidence.

  • Closed-book: prompts have no evidence; rolling priors are built by semantic masking of entities/numbers/titles.
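For the closed-book mode, "semantic masking of entities/numbers/titles" could look like the heuristic below. This is a hedged sketch of the idea, not the toolkit's implementation: the regexes and placeholder tokens are my assumptions.

```python
import re

def mask_semantics(prompt: str) -> str:
    """Build a 'skeleton' prompt for a closed-book rolling prior by masking
    content-bearing spans (heuristic: quoted titles, numbers, and
    capitalized entity-like spans become placeholders)."""
    s = re.sub(r'"[^"]*"', '"[TITLE]"', prompt)                    # quoted titles
    s = re.sub(r'\b\d[\d,.]*\b', '[NUM]', s)                       # numbers
    s = re.sub(r'\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b', '[ENT]', s)  # capitalized spans
    return s
```

Scoring the model on the masked skeleton versus the original prompt is what lets the toolkit estimate a prior without any external evidence.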

All scoring relies only on the OpenAI Chat Completions API. No retraining required.

@nathanielmhld - check this out