aletheia (ἀλήθεια)

ἀλήθεια: unconcealment, the unforgotten. Bringing the hidden truth of a RAG answer to light.

Benchmark platform for inference-time hallucination detection methods on Vietnamese RAG. Target: Qwen3-4B-class quantized models (MLX on Mac, vLLM on Linux). Each method is ported from its paper, evaluated against the same Vietnamese hard split, and ranked by a single rubric so we know what actually works.

Status

Method	Source	Latency	AUROC (hard)	Verdict
PCC τ (certainty)	arXiv:2601.02574	~0.2s	0.807	champion
PCC γ (consistency, K=2)	same	~6s	0.74	opt-in escalation
NLL baseline	—	~0s	0.71	free signal
WEPR	arXiv:2509.04492	~0.08ms (LR)	0.69	SKIP (overfit, n_train=60)
BSE K=1	arXiv:2504.03579	~0.3s	0.65	SKIP (CSD collapse)
Latent_Audit `d` (v5 probe)	arXiv:2604.05358	~free (hook)	0.49–0.76	SKIP (intrinsic ≈chance)

Tested on ragbench MISA admin-procedures (Vietnamese, Qwen3-4B-4bit). Hard split = subtle 1-fact corruptions in retrieved context.

Pattern observed: every logprob-only method we ported (BSE, WEPR) lost to plain NLL, by 0.02 to 0.06 AUROC (0.69 and 0.65 vs. 0.71). On Qwen3-4B-4bit and the Vietnamese admin domain, the logprob signal appears to carry little information beyond NLL. PCC's verdict-prompt approach remains the only inference-time method here that materially beats NLL.
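
For readers unfamiliar with the NLL baseline, the following is a minimal sketch of one common form of it: the length-normalized negative log-likelihood of the generated answer. It assumes per-token logprobs are already available from generation; the function name `mean_nll` and the choice to normalize by length are illustrative assumptions, not necessarily what the repo's runners do.

```python
import math

def mean_nll(token_logprobs):
    """Length-normalized negative log-likelihood of a generated answer.

    token_logprobs: natural-log probabilities of the tokens the model
    actually emitted. Higher return value = less confident generation.
    (Hypothetical helper; normalizing by length is an assumption.)
    """
    if not token_logprobs:
        raise ValueError("need at least one token logprob")
    return -sum(token_logprobs) / len(token_logprobs)

# Toy inputs: a confidently generated answer vs. a hesitant one.
confident = [math.log(0.9), math.log(0.8), math.log(0.95)]
hesitant = [math.log(0.4), math.log(0.2), math.log(0.5)]

print("confident:", round(mean_nll(confident), 3))
print("hesitant: ", round(mean_nll(hesitant), 3))
```

Because the logprobs come with generation anyway, a score like this adds essentially no latency, which is consistent with the "free signal" verdict in the table.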

Per-method docs: docs/methods/. Detailed results & decisions: notes/.

Layout

.
├── lib/                    importable modules
│   ├── pcc.py              PCC certainty + consistency  (arXiv:2601.02574)
│   ├── bse.py              Bayesian Semantic Entropy   (arXiv:2504.03579)
│   ├── wepr.py             Weighted Entropy Production Rate (arXiv:2509.04492)
│   ├── audit.py            Latent_Audit Mahalanobis    (arXiv:2604.05358)
│   ├── latentaudit_v5.py   v5 probe (paper-faithful, leak-free)
│   ├── clustering.py       Qwen-as-judge entailment clustering
│   ├── fusion_eval.py      multi-signal evaluator
│   └── util.py             shared generation helpers (top-K logprobs, CACHE)
│
├── data/                   datasets + source adapter
│   ├── examples.py         100 VN hand-crafted (faithful + hallucinated)
│   ├── adapter.py          source dispatcher: data | ragbench
│   ├── realistic_hallu.json
│   └── generate_real_hallu.py
│
├── scripts/                runners (entrypoints — invoke from project root)
│   ├── run_pcc.py, run_bse.py, run_wepr.py
│   ├── bootstrap_bse.py    multi-sample cal for BSE CSD
│   ├── run_audit.py, run_v5.py, run_ragbench.py, run_fusion.py
│   └── analyze.py, diagnostic.py, threshold_tune.py
│
├── docs/methods/           per-method user-facing docs
│   ├── pcc.md, bse.md, wepr.md, latent_audit.md
│   └── README.md (index)
│
├── notes/                  design docs + per-method results
│   ├── bse-integration-plan.md, bse-results-2026-06-01.md
│   ├── wepr-results-2026-06-02.md
│   ├── partial-support-bench-plan.md
│   └── TODO-*.md
│
├── linux_deploy/           vLLM + Docker plugin (production handoff)
├── experiments/            one-off probes
├── bse_cache/              runtime artifacts (pickles, eval JSON)
├── quickstart.py           MLX building blocks demo
└── pyproject.toml, uv.lock, ...

Run

uv sync
uv run quickstart.py                        # smoke: model loads, top-K logprobs work
uv run scripts/run_pcc.py                   # PCC champion baseline
uv run scripts/run_bse.py --csd bse_cache/csd_*.pkl    # BSE K=1
uv run scripts/run_wepr.py                  # WEPR train+eval

Requires an Apple Silicon (M-series) Mac and Python ≥ 3.11. The model downloads to ~/.cache/huggingface/.

Methodology

All methods are evaluated under the same rubric and reported in notes/{method}-results-*.md:

1. Generate Qwen's answer for each (context, question) pair in the test split.
2. Score the generated answer with the detection method (returns a scalar uncertainty / hallucination probability).
3. Label is_hallucination = NOT (Qwen-as-judge confirms the answer matches gold).
4. Compute AUROC over scores and labels; this is the main ranking metric.
5. Apply the decision rubric from the method's plan doc: < 0.75 SKIP, 0.75–0.85 COMPLEMENT (fuse with PCC τ), > 0.85 REPLACE champion.
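
The ranking and decision steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the repo's evaluator: `auroc` and `verdict` are hypothetical names, the scores and labels are toy values, and the handling of the exact boundary values (0.75 and 0.85 both map to COMPLEMENT) is one reading of the rubric.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random hallucinated answer
    gets a higher score than a random faithful one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]      # hallucinated
    neg = [s for s, y in zip(scores, labels) if not y]  # faithful
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def verdict(value):
    """Map a hard-split AUROC to the rubric's decision."""
    if value < 0.75:
        return "SKIP"
    if value <= 0.85:
        return "COMPLEMENT"  # fuse with PCC tau
    return "REPLACE"

# Toy data: detector scores and is_hallucination labels (1 = hallucinated).
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.7]
labels = [1, 1, 0, 1, 0, 0]

a = auroc(scores, labels)
print(f"AUROC={a:.3f} verdict={verdict(a)}")
```

The pairwise definition is quadratic in the number of examples, which is harmless at the test-set sizes reported here; a rank-based implementation would be the usual choice for larger splits.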

Methods are evaluated on real-world Vietnamese data (ragbench MISA admin procedures). The content is novel to Qwen3-4B, so the model cannot recall answers from training and has to ground them in the retrieved context.

Design notes

- Inference-time only, no fine-tuning: methods that need offline probe training on labeled activations are disqualified upfront. Latent_Audit's own v5 probe is documented as such; we keep it as a study, but it is not the ship target.
- Black-box preferred: logprob-API methods are more portable across models than white-box hidden-state methods.
- Sample-size honesty: every report notes n_test (typically 18–57) and flags small-sample noise.
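
To make the small-sample caveat concrete, here is a hedged sketch of a percentile bootstrap confidence interval for AUROC at n_test = 18. It is not taken from the repo: `bootstrap_ci`, the toy scores, and the parameter defaults are all illustrative assumptions, and the percentile bootstrap is only one of several reasonable interval methods.

```python
import random

def auroc(scores, labels):
    """Pairwise AUROC; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (hypothetical helper)."""
    if all(labels) or not any(labels):
        raise ValueError("need at least one example of each class")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if all(ys) or not any(ys):
            continue  # resample lost one class; AUROC undefined, redraw
        stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[max(0, int(alpha / 2 * n_boot))]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

# Toy test set with n_test = 18 (9 hallucinated, 9 faithful).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.55, 0.4, 0.85, 0.3,
          0.2, 0.1, 0.35, 0.45, 0.65, 0.15, 0.25, 0.5, 0.05]
labels = [1] * 9 + [0] * 9

lo, hi = bootstrap_ci(scores, labels)
print(f"AUROC={auroc(scores, labels):.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

At this sample size the interval is typically wide enough to span more than one rubric band, which is why a gap of a few hundredths of AUROC between two methods should be read with caution.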

Citation

@software{hert4_aletheia_2026,
  author = {Hert4},
  title  = {aletheia: benchmark platform for inference-time hallucination
            detection on Vietnamese RAG},
  year   = {2026},
  url    = {https://github.com/Hert4/aletheia}
}

If you use a method's algorithm, please also cite its source paper (linked in the status table above and in notes/).

License

Licensed under the PolyForm Noncommercial License 1.0.0 — see LICENSE.

✅ Free for noncommercial use — research, education, personal study, non-profit/academic work.
🔒 Commercial / business use is not granted by default — contact the author (ductransa01@gmail.com) for written permission / a commercial license.
📌 Citation required — any use that feeds a publication, model, product, or other public artifact must cite this repo (see Citation above).
