From 40683a76712c29c39f7b152beb1b7a7f3c077e85 Mon Sep 17 00:00:00 2001
From: connerlambden
Date: Thu, 4 Jun 2026 22:33:08 -0600
Subject: [PATCH] Add BGPT REFUTE benchmark

---
 README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/README.md b/README.md
index aa3bb89e3..e29aa64f7 100644
--- a/README.md
+++ b/README.md
@@ -215,3 +215,8 @@ We thank EleutherAI for their work on the [lm-evaluation harness](https://github
 year = 2022,
 }
 ```
+
+
+## Benchmarks
+
+- [REFUTE](https://huggingface.co/datasets/BGPT-OFFICIAL/refute): Scientific critique & epistemic calibration (Apache-2.0). [Leaderboard](https://huggingface.co/spaces/BGPT-OFFICIAL/refute-leaderboard) · [Report](https://huggingface.co/datasets/BGPT-OFFICIAL/refute/blob/main/TECHNICAL_REPORT.md) · [Integrations](https://huggingface.co/datasets/BGPT-OFFICIAL/refute/tree/main/integrations)