xinzhanguo/hellollm

Pre-train a new LLM

44 / 100 (Emerging)

This project helps machine learning engineers pre-train a new large language model (LLM) from scratch using custom textual data. You provide raw text files, and it produces a fine-tuned tokenizer and a new language model capable of generating text based on your input data's patterns and vocabulary. This tool is for developers building specialized conversational AI or text generation systems.
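The two-stage flow described above (text in, tokenizer and generative model out) can be illustrated with a deliberately tiny stand-in. This is a toy sketch, not the project's code: it swaps the neural network for a word-level vocabulary and a bigram count table, and every name in it (`train_tokenizer`, `train_bigram_model`, `generate`) is hypothetical.

```python
from collections import Counter, defaultdict


def train_tokenizer(corpus):
    """Build a word-level vocabulary mapping each word to an integer id."""
    vocab = {"<unk>": 0}  # id 0 is reserved for words never seen in training
    for line in corpus:
        for word in line.split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn text into a list of token ids, using <unk> for unknown words."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.split()]


def train_bigram_model(corpus, vocab):
    """Count, for each token id, which token ids follow it in the corpus."""
    counts = defaultdict(Counter)
    for line in corpus:
        ids = encode(line, vocab)
        for prev, nxt in zip(ids, ids[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(prompt, vocab, counts, max_new_tokens=5):
    """Greedily append the most frequent follower of the last token."""
    id_to_word = {i: w for w, i in vocab.items()}
    ids = encode(prompt, vocab)
    if not ids:
        return ""
    for _ in range(max_new_tokens):
        followers = counts.get(ids[-1])
        if not followers:
            break  # the last token was never followed by anything
        ids.append(followers.most_common(1)[0][0])
    return " ".join(id_to_word[i] for i in ids)


corpus = ["the cat sat on the mat", "the cat ate the fish"]
vocab = train_tokenizer(corpus)
model = train_bigram_model(corpus, vocab)
print(generate("the", vocab, model, max_new_tokens=1))
```

A real pre-training run replaces the count table with a neural network and the word split with a subword tokenizer, but the shape is the same: the vocabulary and the model are both derived only from the text you supply.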

No commits in the last 6 months.

Use this if you need to create a brand-new large language model tailored specifically to your unique domain's text data, rather than adapting an existing general-purpose model.

Not ideal if you want to fine-tune an existing, pre-trained large language model, or if you don't have the technical expertise to set up and manage a deep learning training pipeline.

Machine Learning Engineering, Natural Language Processing, AI Model Training, Custom LLM Development, Text Generation
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 19 / 25
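
The page does not state how the overall score is derived. On these numbers it matches the plain sum of the four category scores, which the following check confirms. Simple addition is an assumption here, not a documented formula.

```python
# Category scores as listed on this page, each out of 25.
sub_scores = {"Maintenance": 0, "Adoption": 9, "Maturity": 16, "Community": 19}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(sub_scores.values())
max_total = MAX_PER_CATEGORY * len(sub_scores)

print(f"{total} / {max_total}")  # prints "44 / 100"
```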


Stars: 73
Forks: 22
Language: Python
License: MIT
Last pushed: Jan 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/xinzhanguo/hellollm"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
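
The same request can be made from Python with the standard library. This is a minimal sketch: the URL is the one in the curl command above, but the helper names `quality_url` and `fetch_quality` are hypothetical, and it assumes (unconfirmed by this page) that the endpoint returns JSON.

```python
import json
import urllib.request

# Base path taken from the curl example; the trailing segments are owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner, repo):
    """Build the endpoint URL for a given repository."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the quality record. Assumes the response body is JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("xinzhanguo", "hellollm")` performs the network request. No key is passed, so the call counts against the keyless daily limit.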