KID-22/Source-Bias

Code for "Neural Retrievers are Biased Towards LLM-Generated Content"

Experimental

This project helps researchers and developers measure how information retrieval models, particularly neural retrievers, rank documents when the corpus includes content generated by large language models (LLMs). It takes datasets containing both human-written and LLM-generated text and outputs evaluation metrics that show whether a retriever favors one source over the other. It is aimed at anyone building or evaluating search and retrieval systems whose corpora include AI-generated content.
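As a rough illustration of the kind of metric such an evaluation might report, the sketch below scores one mixed ranking separately against the human-written and the LLM-generated relevant documents, then compares the two scores. Everything here is an assumption for illustration: the function names `ndcg_at_k` and `relative_delta`, the toy document IDs, and the normalized percentage difference are hypothetical and are not taken from the repository's code.

```python
import math
from typing import Sequence, Set


def ndcg_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Binary-relevance NDCG@k for a single query."""
    # Discounted gain of every relevant document found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    # Best achievable DCG: all relevant documents placed at the top.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def relative_delta(metric_human: float, metric_llm: float) -> float:
    """Percentage gap between the two scores, normalized by their mean.

    Negative values mean the LLM-generated documents were ranked better.
    This formulation is an assumption made for this sketch.
    """
    mean = (metric_human + metric_llm) / 2
    return 0.0 if mean == 0 else (metric_human - metric_llm) / mean * 100


# Hypothetical toy ranking over a corpus that mixes both sources.
ranking = ["llm_1", "human_1", "human_2", "llm_2"]
human_relevant = {"human_1"}  # human-written relevant document
llm_relevant = {"llm_1"}      # LLM-generated counterpart of the same document

ndcg_human = ndcg_at_k(ranking, human_relevant, k=3)
ndcg_llm = ndcg_at_k(ranking, llm_relevant, k=3)
delta = relative_delta(ndcg_human, ndcg_llm)
print(f"NDCG@3 human={ndcg_human:.3f} llm={ndcg_llm:.3f} relative delta={delta:.1f}%")
```

In this toy case the LLM-generated document is ranked first, so the relative delta is negative. In practice the scores would be averaged over many queries before being compared.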

No commits in the last 6 months.

Use this if you are developing search engines or information retrieval systems and need to assess how well your models handle and rank LLM-generated content compared to human-written content.

Not ideal if you are looking for a general-purpose information retrieval system or a tool to generate content, as this focuses specifically on analyzing retrieval bias.

Information Retrieval, Search Engine Development, LLM Evaluation, AI Content Bias, Knowledge Discovery

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 10 / 25

Language: Python

License: none

Related tools

luka-group/Causal-View-of-Entity-Bias

[EMNLP 2023] A Causal View of Entity Bias in (Large) Language Models

d-lab/ecir26-qd-dense-vector-llm-rel-jud-bias-analysis

Code and experiments for Query–Document Dense Vectors for LLM Relevance Judgment Bias Analysis (ECIR2026)
