sandylaker/ib-edl
Calibrating LLMs with Information-Theoretic Evidential Deep Learning (ICLR 2025)
This repository helps data scientists and machine-learning engineers fine-tune large language models (LLMs) for multiple-choice question answering. It takes a pre-trained LLM and a dataset such as OBQA and produces a better-calibrated model that can also flag questions outside its training scope. It is aimed at professionals building and deploying AI assistants or automated Q&A systems.
No commits in the last 6 months.
Use this if you need to fine-tune an LLM for classification tasks like multiple-choice QA and want well-calibrated predictions plus out-of-distribution detection.
Not ideal if your task involves open-ended text generation: this implementation is currently limited to classification-style question answering.
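The calibration and out-of-distribution detection described above rest on evidential deep learning: instead of a single softmax distribution, the model outputs non-negative evidence per answer choice, which parameterizes a Dirichlet distribution whose total strength encodes uncertainty. A minimal sketch of the standard EDL formulation (illustrative only; the repository's information-theoretic variant differs in the training objective, and the function name is hypothetical):

```python
def edl_uncertainty(evidence):
    """Dirichlet-based prediction and uncertainty from per-class evidence
    (standard EDL sketch, not this repo's exact implementation).
    alpha_k = evidence_k + 1; vacuity u = K / sum(alpha)."""
    alpha = [e + 1.0 for e in evidence]   # Dirichlet parameters
    S = sum(alpha)                        # total Dirichlet strength
    probs = [a / S for a in alpha]        # expected class probabilities
    u = len(alpha) / S                    # u -> 1 when no evidence is collected
    return probs, u

# Confident in-distribution answer: strong evidence for choice A
probs, u = edl_uncertainty([20.0, 0.5, 0.5, 0.5])

# Out-of-scope question: near-zero evidence for every choice, so u is near 1
_, u_ood = edl_uncertainty([0.1, 0.1, 0.1, 0.1])
```

Thresholding the vacuity `u` is one common way such a model abstains on questions outside its training scope.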
Stars: 17
Forks: 4
Language: Python
License: MIT
Category:
Last pushed: Mar 02, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sandylaker/ib-edl"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
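To query the endpoint from code rather than the shell, the path in the curl command above generalizes to other repositories. A hedged sketch (the response schema is not documented here, so only the URL construction is shown; the `ecosystem` segment name is an assumption based on the example path):

```python
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    # Mirrors the curl example: ecosystem (e.g. "transformers") is assumed
    # to be the first path segment, followed by the GitHub owner and repo.
    return f"{BASE}/{ecosystem}/{owner}/{repo}"

url = quality_url("transformers", "sandylaker", "ib-edl")
```

The resulting URL can then be fetched with any HTTP client, subject to the rate limits above.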
Higher-rated alternatives
google-deepmind/long-form-factuality
Benchmarking long-form factuality in large language models. Original code for our paper...
gnai-creator/aletheion-llm-v2
Decoder-only LLM with integrated epistemic tomography. Knows what it doesn't know.
nightdessert/Retrieval_Head
open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality
MLD3/steerability
An open-source evaluation framework for measuring LLM steerability.
kazemihabib/Mitigating-Reasoning-LLM-Social-Bias
A novel approach to mitigating social bias in Large Language Models through a multi-judge...