knoveleng/steering

Official repo for the paper: "Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection"

Score: 28 / 100 · Experimental

This project helps AI developers and researchers precisely control the behavior of Large Language Models (LLMs), for example to make them more helpful or more resistant to misuse. By applying "steering" techniques to an existing LLM, it lets you shape the model's outputs as intended without losing its core capabilities. It is aimed at AI practitioners who need to tune model responses for safety, alignment, or specific tasks.

Use this if you need to reliably modify an LLM's output behavior, such as making it more or less prone to generating certain types of content, while ensuring the model maintains its overall quality and understanding.

Not ideal if you are a non-technical user looking for a ready-to-use application, or if you only need coarse-grained control over an LLM's responses.

LLM-fine-tuning AI-safety model-alignment AI-behavioral-control NLP-research
No License · No Package · No Dependents
Maintenance 10 / 25
Adoption 5 / 25
Maturity 5 / 25
Community 8 / 25


Stars: 9
Forks: 1
Language: Jupyter Notebook
License: none
Last pushed: Feb 20, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/knoveleng/steering"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
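The same request can be made programmatically; below is a minimal Python sketch using only the standard library. It assumes the endpoint shown in the curl example returns JSON; the response schema is not documented on this page, so inspect the result yourself.

```python
import json
from urllib.request import urlopen

# Quality-report endpoint taken from the curl example above.
QUALITY_URL = (
    "https://pt-edge.onrender.com/api/v1/quality/transformers/knoveleng/steering"
)

def fetch_quality_report(url: str = QUALITY_URL) -> dict:
    """Fetch the quality report and parse it as JSON.

    Assumes the endpoint returns a JSON object; the exact fields are
    not documented here, so treat the parsed dict as opaque data.
    """
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)

# Usage (hits the live API and counts against the daily quota):
# report = fetch_quality_report()
# print(json.dumps(report, indent=2))
```

Each anonymous call counts against the 100-requests/day limit, so cache the response if you poll regularly.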