sandylaker/ib-edl
Calibrating LLMs with Information-Theoretic Evidential Deep Learning (ICLR 2025)
This repository helps data scientists and machine-learning engineers fine-tune large language models (LLMs) for multiple-choice question answering. It takes a pre-trained LLM and a dataset such as OBQA and produces a better-calibrated model that can also flag questions outside its training scope. It is aimed at professionals building and deploying AI assistants or automated Q&A systems.
No commits in the last 6 months.
Use this if you need to fine-tune an LLM for classification tasks like multiple-choice QA and want well-calibrated predictions plus out-of-distribution detection.
Not ideal if your task involves open-ended text generation: this implementation is currently limited to classification-style question answering.
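The calibration and out-of-distribution detection described above rest on evidential deep learning: instead of a single softmax distribution, the model outputs non-negative evidence per answer choice, which parameterizes a Dirichlet distribution whose total strength encodes uncertainty. A minimal sketch of the standard EDL formulation (illustrative only; the repository's information-theoretic variant differs in the training objective, and the function name is hypothetical):

```python
def edl_uncertainty(evidence):
    """Dirichlet-based prediction and uncertainty from per-class evidence
    (standard EDL sketch, not this repo's exact implementation).
    alpha_k = evidence_k + 1; vacuity u = K / sum(alpha)."""
    alpha = [e + 1.0 for e in evidence]   # Dirichlet parameters
    S = sum(alpha)                        # total Dirichlet strength
    probs = [a / S for a in alpha]        # expected class probabilities
    u = len(alpha) / S                    # u -> 1 when no evidence is collected
    return probs, u

# Confident in-distribution answer: strong evidence for choice A
probs, u = edl_uncertainty([20.0, 0.5, 0.5, 0.5])

# Out-of-scope question: near-zero evidence for every choice, so u is near 1
_, u_ood = edl_uncertainty([0.1, 0.1, 0.1, 0.1])
```

Thresholding the vacuity `u` is one common way such a model abstains on questions outside its training scope.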
Stars: 17
Forks: 4
Language: Python
License: MIT
Category:
Last pushed: Mar 02, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sandylaker/ib-edl"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
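To query the endpoint from code rather than the shell, the path in the curl command above generalizes to other repositories. A hedged sketch (the response schema is not documented here, so only the URL construction is shown; the `ecosystem` segment name is an assumption based on the example path):

```python
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    # Mirrors the curl example: ecosystem (e.g. "transformers") is assumed
    # to be the first path segment, followed by the GitHub owner and repo.
    return f"{BASE}/{ecosystem}/{owner}/{repo}"

url = quality_url("transformers", "sandylaker", "ib-edl")
```

The resulting URL can then be fetched with any HTTP client, subject to the rate limits above.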
Higher-rated alternatives
google-deepmind/long-form-factuality
Benchmarking long-form factuality in large language models. Original code for our paper...
gnai-creator/aletheion-llm-v2
Decoder-only LLM with integrated epistemic tomography. Knows what it doesn't know.
nightdessert/Retrieval_Head
open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality
MLD3/steerability
An open-source evaluation framework for measuring LLM steerability.
kazemihabib/Mitigating-Reasoning-LLM-Social-Bias
A novel approach to mitigating social bias in Large Language Models through a multi-judge...