ScienceOne-AI/DeepSeek-671B-SFT-Guide

An open-source solution for full-parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, along with practical experience and conclusions gathered along the way.

Overall score: 46 / 100 (Emerging)

This project offers a comprehensive guide and tools for developers and machine learning engineers to fine-tune the DeepSeek-V3/R1 671B language model. It provides all necessary code and scripts, from setting up the training environment to performing the actual training and inference. The project takes raw text data, formatted with specific roles and optional reasoning content, and produces a specialized, fine-tuned DeepSeek model ready for deployment.
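To make the input format described above concrete, a supervised fine-tuning sample might look like the sketch below. The field names (`messages`, `role`, `content`, `reasoning_content`) are assumptions based on common chat-style SFT data layouts, not taken from the repository itself.

```python
import json

# A hypothetical SFT training sample: a list of role-tagged messages,
# where the assistant turn may carry optional chain-of-thought in
# "reasoning_content" (all field names are illustrative assumptions).
sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 17 * 24?"},
        {
            "role": "assistant",
            "content": "17 * 24 = 408.",
            "reasoning_content": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
        },
    ]
}

# Chat-style SFT corpora are typically stored one JSON object per line (JSONL).
line = json.dumps(sample, ensure_ascii=False)
roundtrip = json.loads(line)
print(roundtrip["messages"][-1]["role"])  # -> assistant
```

A real dataset would hold many such lines; the optional `reasoning_content` field is what distinguishes reasoning-style data from plain instruction data.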

796 stars. No commits in the last 6 months.

Use this if you are an AI/ML engineer or researcher working with large language models and need to perform full parameter fine-tuning on the DeepSeek-V3/R1 671B model, especially when dealing with complex reasoning data.

Not ideal if you are an end-user looking for a pre-trained model to use directly, or if you lack the extensive computational resources and expertise in distributed machine learning required for such a large-scale fine-tuning task.

Tags: large-language-model-training, deep-learning-fine-tuning, distributed-ai-training, model-optimization, natural-language-processing
Status: Stale (6 months) · No package published · No dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 20 / 25


Stars: 796
Forks: 96
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2025
Commits (30d): 0

Get this data via API:

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ScienceOne-AI/DeepSeek-671B-SFT-Guide"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
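The endpoint presumably returns these metrics as JSON. The sketch below shows how such a response could be consumed; the payload shape (a `scores` object with the four component keys) is an assumption for illustration, not the API's documented schema.

```python
import json

# Hypothetical response payload mirroring the score breakdown above;
# the actual schema of the quality API is not documented on this page.
payload = """
{
  "repo": "ScienceOne-AI/DeepSeek-671B-SFT-Guide",
  "scores": {"maintenance": 0, "adoption": 10, "maturity": 16, "community": 20}
}
"""

data = json.loads(payload)
# Each component is scored out of 25, so the four sum to the overall /100 score.
total = sum(data["scores"].values())
print(f"{data['repo']}: {total} / 100")  # -> ScienceOne-AI/DeepSeek-671B-SFT-Guide: 46 / 100
```

In a live script, `payload` would come from an HTTP GET against the URL in the curl command above.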