rhubarbwu/linguistic-collapse

Codebase for Linguistic Collapse: Neural Collapse in (Large) Language Models [NeurIPS 2024] [arXiv:2405.17767]


Experimental

This project provides tools and scripts to train and evaluate large language models (LLMs) such as GPT-Neo, and then to analyze their internal representations for a phenomenon called neural collapse. You supply model configurations and training data; the scripts produce trained models and analysis results, including several metrics of linguistic collapse and visualization notebooks. It is designed for machine learning researchers and academics studying the fundamental behavior of LLMs.
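To give a concrete sense of what a collapse metric can look like, here is a minimal sketch in plain Python. It is not taken from this repository: the function names `class_means_and_variances` and `variability_ratio` are hypothetical, and the quantity computed (within-class variance divided by the squared distance between class means, averaged over class pairs) is only one variability measure used in the neural collapse literature. It assumes each feature vector is a final-layer representation and each label is the class it belongs to, which in a language model would be the next token.

```python
from collections import defaultdict
from itertools import combinations


def class_means_and_variances(features, labels):
    """Group vectors by label; return each class mean and its mean squared distance to that mean."""
    groups = defaultdict(list)
    for vec, label in zip(features, labels):
        groups[label].append(vec)

    means, variances = {}, {}
    for label, vecs in groups.items():
        dim = len(vecs[0])
        mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        var = sum(
            sum((v[i] - mean[i]) ** 2 for i in range(dim)) for v in vecs
        ) / len(vecs)
        means[label] = mean
        variances[label] = var
    return means, variances


def variability_ratio(features, labels):
    """Average over class pairs of (var_a + var_b) / (2 * ||mean_a - mean_b||^2).

    Smaller values mean classes are tight relative to how far apart they sit.
    """
    means, variances = class_means_and_variances(features, labels)
    ratios = []
    for a, b in combinations(sorted(means), 2):
        dist_sq = sum((x - y) ** 2 for x, y in zip(means[a], means[b]))
        if dist_sq == 0:
            continue  # skip pairs whose means coincide (ratio undefined)
        ratios.append((variances[a] + variances[b]) / (2 * dist_sq))
    return sum(ratios) / len(ratios) if ratios else float("nan")


if __name__ == "__main__":
    # Toy 2-D "representations" for two classes (illustrative data only).
    labels = [0, 0, 1, 1]
    tight = [[0.0, 0.0], [0.0, 0.2], [1.0, 1.0], [1.0, 1.2]]
    loose = [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 2.0]]
    print("tight clusters:", variability_ratio(tight, labels))
    print("loose clusters:", variability_ratio(loose, labels))
```

In this toy setup the tighter clusters yield the smaller ratio, which is the direction associated with stronger collapse. The metrics actually computed by the project may be defined differently.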

No commits in the last 6 months.

Use this if you are a machine learning researcher who wants to rigorously study the internal mechanics of how large language models learn and represent information.

Not ideal if you are looking for an off-the-shelf solution to train a production-ready LLM or to apply LLMs for practical tasks without deep architectural analysis.

machine-learning-research natural-language-processing neural-networks computational-linguistics model-analysis

No License · Stale 6m · No Package · No Dependents

Maintenance 2 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 12 / 25

Language: Python

License: —

Higher-rated alternatives

jncraton/languagemodels

Explore large language models in 512MB of RAM

microsoft/unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

haizelabs/verdict

Inference-time scaling for LLMs-as-a-judge.

albertan017/LLM4Decompile

Reverse Engineering: Decompiling Binary Code with Large Language Models

bytedance/Sa2VA

Official Repo For Pixel-LLM Codebase
