KID-22/Source-Bias
Code for "Neural Retrievers are Biased Towards LLM-Generated Content"
This project helps researchers and developers measure how information retrieval models, especially neural retrievers, rank documents when the corpus mixes human-written and LLM-generated text. It takes datasets containing both kinds of text and outputs evaluation metrics that expose any bias toward one source. It is aimed at anyone building or evaluating search and retrieval systems whose corpora include AI-generated content.
No commits in the last 6 months.
Use this if you are developing search engines or information retrieval systems and need to assess how well your models handle and rank LLM-generated content compared to human-written content.
Not ideal if you are looking for a general-purpose information retrieval system or a tool to generate content, as this focuses specifically on analyzing retrieval bias.
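The comparison described above can be sketched in a few lines. This is a minimal illustration only, not the repository's code: the function names `ndcg_at_k` and `relative_delta`, the document IDs, and the choice of a relative percentage difference as the bias score are all assumptions made for the example. The idea is to score the same mixed ranking twice, once against the human-written relevant documents and once against their LLM-generated counterparts, then compare the two scores.

```python
import math
from typing import Sequence, Set


def ndcg_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Binary-relevance NDCG@k for one query."""
    # Discounted gain of every relevant document found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    # Best achievable DCG: all relevant documents ranked first.
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, len(relevant_ids))))
    return dcg / ideal if ideal else 0.0


def relative_delta(metric_human: float, metric_llm: float) -> float:
    """Relative percentage difference between the two scores (assumed bias score).

    Negative values mean the LLM-generated documents were ranked better.
    """
    mean = (metric_human + metric_llm) / 2
    return 0.0 if mean == 0 else (metric_human - metric_llm) / mean * 100


# Hypothetical ranking over a corpus that mixes both sources.
ranking = ["llm_1", "hum_1", "hum_2", "llm_2"]
human_score = ndcg_at_k(ranking, {"hum_1"}, k=3)
llm_score = ndcg_at_k(ranking, {"llm_1"}, k=3)
bias = relative_delta(human_score, llm_score)
```

In this toy ranking the LLM-generated copy sits above the human-written one, so `bias` comes out negative. In practice the scores would be averaged over many queries before being compared.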
Stars: 14
Forks: 2
Language: Python
License: —
Category:
Last pushed: Oct 18, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/KID-22/Source-Bias"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
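The same request can be made from Python with only the standard library. This is a hedged sketch: the helper names `build_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page. Only the URL pattern is taken from the curl command above.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository, escaping each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumes the response body is JSON, which this page does not state.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network call, so it is left commented out):
# data = fetch_quality("KID-22", "Source-Bias")
```

Keeping the URL construction separate from the network call makes the helper easy to check offline and easy to reuse within the anonymous request limit.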