jacobwarren/social-media-ai-engineering-etl
Real-world AI engineering dataset creation, SFT fine-tuning, and GRPO alignment ETL pipeline.
This project helps AI engineers and machine learning practitioners transform raw social media posts into highly structured datasets, ready for fine-tuning large language models. You feed in social media data, and it outputs meticulously organized training splits (SFT and DPO) for building specialized AI models. It's designed for individuals creating custom LLMs, especially those focused on specific writing styles or social media content.
No commits in the last 6 months.
Use this if you need to quickly and reliably create high-quality, task-specific datasets from social media content for training or fine-tuning large language models on an NVIDIA GPU.
Not ideal if you don't have access to a data-center NVIDIA GPU or if your primary goal is general-purpose LLM training without specific social media data or style constraints.
Stars: 33
Forks: 2
Language: Python
License: Apache-2.0
Category
Last pushed: Aug 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/jacobwarren/social-media-ai-engineering-etl"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
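The same request can be made from Python. This is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which the listing does not state. The URL pattern is copied from the curl example above, with the owner and repository name as the last two path segments.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example in this listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the listing URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the URL.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """GET the listing and parse it.

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the raw bytes should be inspected instead.
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("jacobwarren", "social-media-ai-engineering-etl")` performs one request, which counts against the keyless limit of 100 requests/day, so it is worth caching the result when polling several repositories.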
Higher-rated alternatives
daveebbelaar/ai-cookbook
Examples and tutorials to help developers build AI systems
Explorer-Dong/wiki
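Personal knowledge base with systematic study notes on CS/AI fundamentals, data structures and algorithms, software development, large models, and more. Continuously updated.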
PetroIvaniuk/llms-tools
A list of LLMs Tools & Projects
CrankAddict/section-11
Evidence-based endurance coaching protocol for any AI/LLM. Deterministic training guidance with...
liguodongiot/ai-system
LLM/MLOps/LLMOps