oooranz/Baby-CoThought
🍼 Baby's CoThought: Leveraging LLMs for Enhanced Reasoning in Compact Models (BabyLM Challenge)
This project helps AI researchers and developers train more efficient, compact language models from human-scale data. It takes diverse, smaller text corpora, processes them with larger language models to generate new natural language understanding examples, and then uses those examples to pretrain a smaller RoBERTa-like model. The output is a "Baby Language Model" that demonstrates enhanced reasoning capabilities despite less training data.
No commits in the last 6 months.
Use this if you are an NLP researcher or machine learning engineer looking to develop small, sample-efficient language models that still possess strong reasoning abilities, mirroring human language acquisition.
Not ideal if you need to train a full-scale, cutting-edge large language model for production use, as this project focuses on compact models and sample efficiency rather than maximizing overall performance.
Stars: 17
Forks: 3
Language: Python
License: —
Category:
Last pushed: Jan 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/oooranz/Baby-CoThought"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
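For scripted access, the endpoint can be built programmatically. A minimal sketch follows; `quality_url` is a hypothetical helper, and the `/api/v1/quality/<category>/<owner>/<repo>` path layout is inferred from the curl example above rather than from published API documentation:

```python
# Hypothetical helper for the pt-edge quality API.
# Assumption: the path pattern /api/v1/quality/<category>/<owner>/<repo>
# is generalized from the single curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Return the quality-data endpoint for a given repository."""
    return f"{BASE}/{category}/{owner}/{repo}"

print(quality_url("transformers", "oooranz", "Baby-CoThought"))
# → https://pt-edge.onrender.com/api/v1/quality/transformers/oooranz/Baby-CoThought
```

The returned URL can then be fetched with any HTTP client (for example, the curl command shown above); the response schema is not documented here, so treat field names as unknown until inspected.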
Higher-rated alternatives
ShiZhengyan/InstructionModelling
[NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"
raymin0223/fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized...
SALT-NLP/Adaptive-Compositional-Modules
Code for the ACL 2022 paper "Continual Sequence Generation with Adaptive Compositional Modules"
joisino/zeh
Code for "Even GPT-5.2 Can’t Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs"
yhy1117/X-Mixup
Implementation of ICLR 2022 paper "Enhancing Cross-lingual Transfer by Manifold Mixup".