bloomberg/MixCE-acl2023

Implementation of the MixCE method from the ACL 2023 paper "MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies" by Zhang et al.

Quality score: 34 / 100 (Emerging)

This project offers an improved way to train autoregressive language models such as GPT-2 so that they generate more natural, contextually relevant text. Instead of training only with the standard forward cross-entropy (learning to imitate human-written text), it mixes in a reverse cross-entropy term that also penalizes probability mass the model places on unnatural continuations, with the aim of producing higher-quality outputs. It is designed for researchers and practitioners who are building or fine-tuning text generation systems and need to improve their output quality.
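As a rough illustration of the idea, the sketch below mixes the standard forward cross-entropy with a self-weighted term approximating the reverse cross-entropy, following the paper's high-level description. The function name, the mixing weight eta, and the detached weighting are illustrative assumptions, not this repository's API.

import torch
import torch.nn.functional as F

def mixce_loss(logits: torch.Tensor, targets: torch.Tensor, eta: float = 0.5) -> torch.Tensor:
    """Illustrative MixCE-style loss (a sketch, not the repo's implementation).

    logits:  (batch, seq_len, vocab_size) next-token scores
    targets: (batch, seq_len) gold token ids
    eta:     mixing ratio between forward CE (eta) and the reverse-CE
             approximation (1 - eta); the default here is arbitrary
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Log-probability the model assigns to each gold token.
    tok_logp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Forward CE contributes -log p per token; the reverse-CE side is
    # approximated by additionally weighting -log p with the model's own
    # probability p (detached so the weight is treated as a constant,
    # an implementation choice made in this sketch).
    weights = eta + (1.0 - eta) * tok_logp.exp().detach()
    return -(weights * tok_logp).mean()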

No commits in the last 6 months.

Use this if you are a researcher or engineer working on autoregressive language models and want to enhance their training process for better text generation quality.

Not ideal if you are looking for a ready-to-use application or a simpler method for basic text generation tasks without deep involvement in model training specifics.

Topics: natural-language-generation, language-model-training, computational-linguistics, text-synthesis, AI-research
Status: Stale (6 months) · No package published · No dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 12 / 25


Stars: 20
Forks: 3
Language: Python
License: Apache-2.0
Last pushed: May 29, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/bloomberg/MixCE-acl2023"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
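For scripted access, here is a minimal Python sketch of the same request. It assumes the endpoint returns JSON and uses only the URL shown above; how an API key would be supplied is not documented here, so none is sent.

import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/bloomberg/MixCE-acl2023"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors
data = resp.json()       # assumption: the endpoint responds with JSON
print(data)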