chaoluond/safetyllama

Fine-tune LLaMA-2-7b-chat to perform safety evaluation of user-bot conversations

Score: 23 / 100 (Experimental)

This project helps ensure AI chatbots give safe and appropriate responses to user prompts. It takes a conversation between a human and a chatbot as input and outputs an evaluation stating whether the chatbot's response adheres to a set of safety guidelines. It is aimed at AI product managers, trust & safety teams, and developers building AI applications.

No commits in the last 6 months.

Use this if you need to automatically monitor and detect unsafe or inappropriate content generated by your AI chatbot before it reaches users.

Not ideal if you are looking for a general-purpose content moderation tool that flags user-generated content directly, rather than chatbot outputs.

Tags: AI Safety · Content Moderation · Chatbot Development · Trust & Safety · AI Ethics
Stale (6m) · No Package · No Dependents
Maintenance 2 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 0 / 25


Stars: 11
Forks:
Language: Python
License: MIT
Last pushed: Jun 02, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/chaoluond/safetyllama"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
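For programmatic use, the curl call above can be reproduced with Python's standard library. This is a minimal sketch: the endpoint path (`/api/v1/quality/<registry>/<owner>/<repo>`) is taken from the example above, but the response schema and any API-key header name are not documented here, so the response is returned as raw parsed JSON rather than typed fields.

```python
import json
import urllib.request
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(registry: str, repo: str) -> str:
    """Build the quality-API URL for a registry and an owner/repo slug."""
    # Percent-encode each part; keep the "/" between owner and repo name.
    return f"{BASE}/{quote(registry)}/{quote(repo, safe='/')}"

def fetch_quality(registry: str, repo: str) -> dict:
    """Fetch the quality report. Anonymous access: 100 requests/day."""
    with urllib.request.urlopen(quality_url(registry, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Prints the raw JSON report for this project.
    print(fetch_quality("transformers", "chaoluond/safetyllama"))
```

The URL builder is separated from the network call so it can be tested offline and reused for other repos on the same service.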