KID-22/Source-Bias

Code for "Neural Retrievers are Biased Towards LLM-Generated Content"

23 / 100
Experimental

This project helps researchers and developers measure how information retrieval models, especially neural retrievers, behave when the document collection includes content generated by large language models (LLMs). It takes datasets containing both human-written and LLM-generated text and outputs evaluation metrics that show whether a model favors one source over the other. It is aimed at anyone building or evaluating search and retrieval systems, particularly systems that index AI-generated content.

No commits in the last 6 months.

Use this if you are developing search engines or information retrieval systems and need to assess how well your models handle and rank LLM-generated content compared to human-written content.

Not ideal if you are looking for a general-purpose information retrieval system or a tool to generate content, as this focuses specifically on analyzing retrieval bias.
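As a rough illustration of what "evaluation metrics showing any bias" can look like, the sketch below scores one ranked list twice, once counting only the human-written relevant document and once counting only its LLM-generated counterpart, and then reports the relative difference between the two scores. This is a minimal sketch under stated assumptions: the function names, the toy document IDs, and the choice of NDCG@k with a relative-difference score are illustrative and are not confirmed to match the repository's implementation.

```python
import math
from typing import Sequence, Set


def ndcg_at_k(ranking: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG@k for a single query."""
    # Discounted gain of every relevant document found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranking[:k])
        if doc_id in relevant
    )
    # Ideal DCG: all relevant documents placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    if ideal_hits == 0:
        return 0.0
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / ideal


def relative_delta(metric_human: float, metric_llm: float) -> float:
    """Percentage difference relative to the mean of the two scores.

    Positive values mean human-written documents rank higher,
    negative values mean LLM-generated documents rank higher.
    """
    mean = (metric_human + metric_llm) / 2
    if mean == 0:
        return 0.0
    return (metric_human - metric_llm) / mean * 100


# Hypothetical ranking returned by a retriever over a mixed corpus.
ranking = ["llm_1", "human_1", "llm_2", "human_2"]
human_relevant = {"human_1"}  # human-written version of the relevant document
llm_relevant = {"llm_1"}      # LLM-generated version of the same document

ndcg_human = ndcg_at_k(ranking, human_relevant, k=3)
ndcg_llm = ndcg_at_k(ranking, llm_relevant, k=3)
delta = relative_delta(ndcg_human, ndcg_llm)
print(f"NDCG@3 human={ndcg_human:.3f} llm={ndcg_llm:.3f} delta={delta:.1f}%")
```

In this toy ranking the LLM-generated document sits above its human-written counterpart, so the relative difference comes out negative. Averaging such scores over many queries would give a corpus-level estimate.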

Information Retrieval · Search Engine Development · LLM Evaluation · AI Content Bias · Knowledge Discovery
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 10 / 25


Stars: 14
Forks: 2
Language: Python
License: None
Last pushed: Oct 18, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/KID-22/Source-Bias"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
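The same request can be made from Python using only the standard library. This is a sketch, not documented client code: the helper names `quality_url` and `fetch_quality` are hypothetical, the URL pattern is inferred from the single curl example above, and the response is assumed (not confirmed) to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is
# assumed to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key.

    Assumes the endpoint returns a JSON body; raises
    urllib.error.HTTPError on a non-2xx response such as a
    rate-limit rejection.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "KID-22", "Source-Bias")` issues the same GET request as the curl command. Because the keyless tier allows 100 requests/day, caching the result locally is advisable when polling many repositories.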