LivingFutureLab/UQABench
[KDD 2025] The source code for UQABench
This benchmark helps e-commerce platforms and personalized recommendation systems evaluate how well their large language models (LLMs) can answer customer questions in a personalized way. It takes historical user interaction data, like past purchases or clicks, and outputs metrics showing how accurately an LLM can provide tailored answers. This is for researchers and engineers working on enhancing personalized customer service or product recommendations.
No commits in the last 6 months.
Use this if you need a standardized way to test and compare different methods of personalizing LLM responses for individual users based on their historical behavior.
Not ideal if you are looking for a plug-and-play LLM solution or a general-purpose question-answering system without a strong focus on user personalization benchmarks.
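As a rough illustration of the input/output shape described above (user history in, accuracy metric out), here is a minimal sketch. It is not taken from the UQABench repository: the function names, the example records, and the choice of exact-match accuracy are all assumptions made for illustration, and the listing does not say which metrics the benchmark actually reports.

```python
from collections import Counter

def most_frequent_item(history, question):
    """Hypothetical stand-in for an LLM: answer with the most common history item."""
    return Counter(history).most_common(1)[0][0]

def personalized_accuracy(examples, answer_fn):
    """Exact-match accuracy of answer_fn over (history, question, answer) records.

    The record layout and the metric are assumptions, not UQABench's format.
    """
    if not examples:
        raise ValueError("examples must not be empty")
    correct = 0
    for ex in examples:
        predicted = answer_fn(ex["history"], ex["question"])
        # Compare case-insensitively, ignoring surrounding whitespace.
        if predicted.strip().lower() == ex["answer"].strip().lower():
            correct += 1
    return correct / len(examples)

# Made-up records for illustration only.
EXAMPLES = [
    {"history": ["shoes", "shoes", "socks"],
     "question": "Which category does this user favor?",
     "answer": "shoes"},
    {"history": ["tea", "coffee", "coffee"],
     "question": "Which drink does this user buy most?",
     "answer": "coffee"},
    {"history": ["lamp", "lamp", "desk"],
     "question": "What will this user click next?",
     "answer": "desk"},
]

score = personalized_accuracy(EXAMPLES, most_frequent_item)
```

Swapping `most_frequent_item` for a function that calls a real model would let different personalization methods be compared on the same records, which is the kind of comparison the listing describes.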
Stars: 13
Forks: 2
Language: Python
License: none listed
Category:
Last pushed: Aug 18, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/LivingFutureLab/UQABench"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
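The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the helper names `build_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request

# Endpoint prefix copied from the curl command above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"

def build_url(repo: str) -> str:
    """Return the endpoint URL for an 'owner/name' repository slug."""
    owner, _, name = repo.strip("/").partition("/")
    if not owner or not name:
        raise ValueError(f"expected 'owner/name', got {repo!r}")
    return f"{API_BASE}/{owner}/{name}"

def fetch_quality(repo: str, timeout: float = 10.0):
    """Fetch the listing data for a repository.

    Assumption: the body is JSON. If it is not, json.loads raises an error.
    """
    with urllib.request.urlopen(build_url(repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("LivingFutureLab/UQABench")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching results rather than refetching on every run.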
Higher-rated alternatives
ContextualAI/gritlm
Generative Representational Instruction Tuning
xlang-ai/instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
liuqidong07/LLMEmb
[AAAI'25 Oral] The official implementation code of LLMEmb
hpcaitech/CachedEmbedding
A memory efficient DLRM training solution using ColossalAI
ritesh-modi/embedding-hallucinations
This repo shows how foundational model hallucinates and how we can fix such hallucinations using...