krystalan/chatgpt_as_nlg_evaluator

Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study

Quality score: 19 / 100 (Experimental)

This project helps researchers and developers assess the quality of text generated by large language models, focusing on summarization and story-generation tasks. Given human-written or model-generated text, it outputs quantitative correlation scores indicating how well ChatGPT's evaluations align with human judgments. It is aimed at Natural Language Generation (NLG) researchers and practitioners who develop or compare text generation models.

No commits in the last 6 months.

Use this if you need to understand how reliable ChatGPT is for automatically scoring aspects like coherence, relevance, or fluency of generated text, without extensive human evaluation.

Not ideal if you are looking for a tool that generates text, or if you need human-level evaluation with nuanced qualitative feedback rather than quantitative correlation scores.

Tags: natural-language-generation, text-summarization, story-generation, model-evaluation, AI-research
Badges: No License · Stale (6 months) · No Package · No Dependents
Score breakdown:
- Maintenance: 0 / 25
- Adoption: 8 / 25
- Maturity: 8 / 25
- Community: 3 / 25


Repository stats:
- Stars: 43
- Forks: 1
- Language: Python
- License: None
- Last pushed: Mar 08, 2023
- Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/krystalan/chatgpt_as_nlg_evaluator"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
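The same request can be made from Python. A minimal sketch, assuming only the endpoint pattern shown in the curl example above; the JSON response shape is not documented here, so the script simply prints the raw payload:

```python
import json
import urllib.request

# Endpoint base taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the quality-score endpoint URL for an owner/repo pair."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality report; assumes the endpoint returns JSON."""
    with urllib.request.urlopen(build_quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Network call -- counts against the 100 requests/day anonymous limit.
    report = fetch_quality("krystalan", "chatgpt_as_nlg_evaluator")
    print(json.dumps(report, indent=2))
```

For higher throughput, a free API key raises the limit to 1,000 requests/day; how the key is passed (header or query parameter) is not shown on this page.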