LLM Evaluation Frameworks: Transformer Models

7 LLM evaluation framework models are tracked. The highest-rated is FuxiaoLiu/LRV-Instruction, which scores 37/100 and has 297 stars.

Get all 7 projects as JSON:

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-evaluation-frameworks&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
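The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_json` are hypothetical, and because this page does not document the response schema, the code returns the parsed JSON as-is without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="llm-evaluation-frameworks",
              limit=20):
    """Build the query URL (hypothetical helper, not part of any official SDK)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_json(url, timeout=10):
    """Fetch the URL and parse the body as JSON.

    The response structure is an unknown here, so the parsed
    object is returned unchanged for the caller to inspect.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(build_url())` issues one request, which counts against the daily limit above. Inspecting the returned object (for example with `json.dumps(data, indent=2)`) is a reasonable first step before relying on any particular key.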

#	Model	Score	Tier	Stars	Language
1	FuxiaoLiu/LRV-Instruction [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust...	37	Emerging	297	Python
2	kiyoshisasano/llm-failure-atlas A graph-based failure modeling and deterministic detection system for LLM...	32	Emerging	1	Python
3	mary-lev/llm-ocr LLM-powered OCR evaluation and correction package that supports multiple...	32	Emerging	4	Python
4	gwasiakshay/llm-eval-benchmark LLM evaluation & benchmarking framework using LLM-as-a-judge scoring,...	21	Experimental	—	Python
5	useentropy/llmkit LLM Kit - Python Large Language Model Kit for generating data of your choice	19	Experimental	4	Python
6	flamehaven01/CRoM-EfficientLLM A Python toolkit to optimize LLM context by intelligently selecting,...	17	Experimental	—	Python
7	IreneCalle/LLM_COMPARATOR_commonGround An off-the-shelf LLM comparator that allows you to assess the performance of...	12	Experimental	1	Jupyter Notebook