jacobwarren/social-media-ai-engineering-etl

Real-world AI engineering dataset creation, SFT fine-tuning, and GRPO alignment ETL pipeline.

/ 100

Emerging

This project helps AI engineers and machine learning practitioners turn raw social media posts into structured datasets for fine-tuning large language models. You feed in social media data, and it outputs training splits for SFT and DPO. It is aimed at people building custom LLMs, especially ones tuned to a specific writing style or to social media content.
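As a rough illustration of what "posts in, SFT and DPO splits out" can look like, here is a minimal sketch. It is not taken from the repository: the post fields (`topic`, `text`, `likes`), the prompt template, the use of like counts as a preference signal, and all function names are assumptions made for this example only.

```python
import random

# Hypothetical input: each post has a topic, its text, and a like count.
posts = [
    {"topic": "hiring", "text": "We are hiring! Here is why you should join.", "likes": 120},
    {"topic": "hiring", "text": "job open. apply.", "likes": 3},
    {"topic": "python", "text": "Three Python tips I wish I had known sooner.", "likes": 80},
    {"topic": "python", "text": "python is ok", "likes": 5},
]

def make_prompt(topic):
    # Assumed prompt template; a real pipeline would likely be richer.
    return f"Write a social media post about {topic}."

def build_sft_records(posts):
    """One prompt/completion pair per post (supervised fine-tuning)."""
    return [{"prompt": make_prompt(p["topic"]), "completion": p["text"]} for p in posts]

def build_dpo_records(posts):
    """One preference pair per topic: most-liked post vs. least-liked post."""
    by_topic = {}
    for p in posts:
        by_topic.setdefault(p["topic"], []).append(p)
    records = []
    for topic, group in by_topic.items():
        ranked = sorted(group, key=lambda p: p["likes"], reverse=True)
        # Skip topics with no clear preference signal.
        if len(ranked) >= 2 and ranked[0]["likes"] > ranked[-1]["likes"]:
            records.append({
                "prompt": make_prompt(topic),
                "chosen": ranked[0]["text"],
                "rejected": ranked[-1]["text"],
            })
    return records

def train_eval_split(records, eval_fraction=0.25, seed=0):
    """Deterministically shuffle, then hold out a fraction for evaluation."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]

sft = build_sft_records(posts)
dpo = build_dpo_records(posts)
sft_train, sft_eval = train_eval_split(sft)
```

The `prompt`/`chosen`/`rejected` layout follows a common convention for preference data; the project's actual schema and ranking signal may differ.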

No commits in the last 6 months.

Use this if you need to quickly and reliably create high-quality, task-specific datasets from social media content for training or fine-tuning large language models on an NVIDIA GPU.

Not ideal if you don't have access to a data-center NVIDIA GPU or if your primary goal is general-purpose LLM training without specific social media data or style constraints.

AI-engineering LLM-fine-tuning social-media-data-preparation machine-learning-engineering natural-language-processing

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 7 / 25

Maturity 15 / 25

Community 6 / 25


Stars

Forks

Language: Python

License: Apache-2.0

Higher-rated alternatives

daveebbelaar/ai-cookbook

Examples and tutorials to help developers build AI systems

Explorer-Dong/wiki

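A personal knowledge base of systematic study notes covering CS/AI fundamentals, data structures and algorithms, software development, and large language models. Continuously updated.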

PetroIvaniuk/llms-tools

A list of LLMs Tools & Projects

CrankAddict/section-11

Evidence-based endurance coaching protocol for any AI/LLM. Deterministic training guidance with...

liguodongiot/ai-system

LLM/MLOps/LLMOps
