SteveKGYang/MetaAligner
Models, data, and code for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models
This project helps developers fine-tune large language models (LLMs) to better align with specific goals like harmlessness, helpfulness, or professionalism. It takes an existing LLM and a dataset of preferences, then outputs a more refined LLM that adheres to multiple objectives simultaneously. AI researchers and developers building conversational AI or specialized language applications would use this.
No commits in the last 6 months.
Use this if you need to quickly and efficiently adjust a large language model's behavior to meet several desired objectives without extensive retraining.
Not ideal if you are a non-developer seeking a ready-to-use application, as this project provides models and code for technical implementation.
Stars: 24
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Sep 26, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/SteveKGYang/MetaAligner"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
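For scripted use, the same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which this page does not confirm. The path segments (`transformers`, then owner, then repository) are taken directly from the URL shown above.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is
# category / owner / repository.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example."""
    # Percent-encode each segment so unusual characters cannot break the path.
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError (json.JSONDecodeError).
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("transformers", "SteveKGYang", "MetaAligner")` issues the same request as the curl command. Without a key, the 100 requests/day limit noted above applies, so repeated lookups are worth caching locally.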
Higher-rated alternatives
steering-vectors/steering-vectors
Steering vectors for transformer language models in PyTorch / Hugging Face
jianghoucheng/AlphaEdit
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)
kmeng01/memit
Mass-editing thousands of facts into a transformer memory (ICLR 2023)
boyiwei/alignment-attribution-code
[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
jianghoucheng/AnyEdit
AnyEdit: Edit Any Knowledge Encoded in Language Models, ICML 2025